Friday, January 30, 2015

Lego, sampling and bad-behaving confidence intervals

Yesterday, during the second lecture of our Introduction to Data Science course for students in non-quantitative program. We did a sampling demo adapted from Andrew Gelman and Deborah Nolan's teaching book (a bag of tricks).

Change from candies to legos. The original teaching recipe uses candies. A side effect of that is the instructor will always get so much left over of candies as the students are getting more and more health conscious. So this time, I decided to use lego pieces. One advantage of this change is that we can save the kitchen scale and just count the number of studs (or "points") on the lego pieces.

Preparation. The night before I counted two bags of 100 lego pieces: population A and population B. Population A consists of about 30 large pieces and 70 tiny pieces. Population B consists of 100 similar pieces (4 studs, 6 studs and 8 studs).

In-Class demo. At the beginning of the lecture, we explained to the students what they need to do and passed one bag to half of the class, and the other bag to the other half, along with  data recording sheets.

Results. Before class, I asked a MA student, Ke Shen, in our program who is very good at visualization and R to create a RShiny app for this demo, where I can quickly key in the numbers and display the confidence intervals.

Here are population A samples.
Here are population B samples. 

Conclusion. Several things we noticed from this demo:
  1. sampling lego pieces can be pretty noisy. 
  2. all samples of population A over-estimated the true population mean (the red line). samples of population B seemed to be doing better. 
  3. population variation affects the width of the confidence intervals. 
  4. but even wider confidence intervals were wrong due to large bias. 

1 comment:

Bill Jefferys said...

I like the idea of handing two differently prepared bags to the two halves of the class; I've taught a seminar-style class on decision theory where we sit around a table, and it's quite convenient to do this sort of thing in that context. One thing I've done is to hand out a "quiz" on the second day where some of the questions are subtly different in the left half of the table from the right half (for example, the "trolley problem":

where half the class is asked the decision in the context of pulling the switch, and half in the context of pushing the fat man off the bridge. Works every time even though objectively the utilities are the same!).

Discussion of the reasons why some questions are responded to differently is very illuminating at this early point in the class.