This is the second part of a series of posts on sample-to-population inferences and on progressively developing students’ understanding.
The need for a confidence interval (NZ Curriculum Level 7/Year 12-ish onwards)…
The next step in developing students’ appreciation of sampling variation is for them to understand that making a sample-to-population inference where they give a point estimate of the population parameter is setting them up for failure – they will always be wrong. Instead they need to use an interval of plausible/believable values within which the population parameter is most likely to fall, based on our best estimate of the size of the sampling variation.
My lesson progression, outlined briefly below in relation to developing the Level 7 informal confidence interval, is adapted from others’ work including Pip Arnold (Pip’s work) and Lindsay Smith (Lindsay’s work).
- Introduction to/ recap of sampling methods including how and why we sample.
I set a silly homework assignment the period before this lesson, then get the students to brainstorm all the different ways we could select five students to check their homework. Generally students come up with the standard sampling methods we need (e.g. simple random, systematic, stratified, convenience…), occasionally with a little prompting from me. We then “check homework”, which always involves students receiving a lolly if they get selected. This does lead into interesting conversations on RANDOM (see my previous blog on this!) and our perception of RANDOM. It does pay to save a few lollies for the end for any students that haven’t been selected!
- The need for confidence intervals – meet the Kiwis…
Students work in pairs, each taking four samples (n=15) from the kiwi population, and use their samples (after creating dot plots, box plots and summary statistics) to finish the following sentence: “From my sample data I estimate that the median weight for all New Zealand kiwis is…”. Usually students who have worked together will give the same/similar answers but we get a variety of answers across the class. Sometimes we even get a student or two who give an interval for their answer – send them to stand in the back corner of the room immediately! The big reveal, of course, comes with the teacher-prompt of “but your answers are all different – who is right?”. Generally your more confident personalities will claim that they’re right, while the quiet students sit quietly looking concerned. Pad this out for as long as you need to make your point that “they are all wrong”. How can they know EXACTLY what the population median will be?
NOTE: Exactly how you deal with this will depend on the relationship you have with your class – it may be better to soften this statement to “you might be right, you might be wrong – in the real world you would never know…”. I enjoy stirring my class a little and the reactions I get are always lots of fun!
We now bring back into the room the students sent to the back of the class and praise them for being three steps ahead of the rest and putting an interval around where they thought the population median would be.
We could use our medians from our repeated sampling (collect the class results on a big graph) and read off where most of the medians lie to give us a good idea of the interval for the population median. This works fine in the classroom where we have easy access to the population to actually complete repeated samples, but in reality we want methods that we’re happy working with when we only have one sample.
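If you want to rehearse the “big graph” of class medians before the lesson, the repeated sampling can be sketched in a few lines of Python. This is only a rough sketch: the population below is made-up data standing in for the kiwi population (it is NOT the real data set), and the 120 samples assume something like 30 students taking four samples each.

```python
import random
import statistics

def sample_medians(population, n, repeats, rng):
    """Median of each of `repeats` simple random samples of size n."""
    return [statistics.median(rng.sample(population, n)) for _ in range(repeats)]

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# ASSUMPTION: 700 invented weights in kg, a stand-in for the kiwi population
population = [round(rng.gauss(2.5, 0.6), 2) for _ in range(700)]

# ASSUMPTION: 120 samples, e.g. 30 students x 4 samples each, n = 15
medians = sorted(sample_medians(population, n=15, repeats=120, rng=rng))

# "Read off where most of the medians lie": drop the lowest and highest 5%
cut = len(medians) // 20
low, high = medians[cut], medians[-cut - 1]

print(f"population median: {statistics.median(population):.2f} kg")
print(f"most sample medians fall between {low:.2f} and {high:.2f} kg")
```

The point for students is the same as on the class graph: no single sample median can be trusted to hit the population median, but the bulk of them sit in a band around it.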
- Increased sample sizes = decreased sampling variation. Therefore our interval should get narrower as our sample size increases. See PART 1 for how I reinforce this with students. Hand gestures are very useful again here: If your confidence interval is THIS wide for this sample size – what happens when you INCREASE your sample size? – make sure you include a sound effect as your confidence interval gets narrower with the increased sample size!
- Increased spread in the population = increased sampling variation. Therefore our interval should get wider as our spread increases. I use the simple situation (from Pip) of comparing student heights for new furniture in an intermediate school (Year 7 & 8) or a middle school (Year 7, 8, 9 & 10). We ask which teacher is likely to get a closer estimate of the students’ heights.
Bring back your hand visuals again: If your confidence interval is THIS wide for this spread – what happens when you INCREASE your spread? – make sure you include a sound effect as your confidence interval gets wider with the increased spread! Of course, both here and for increased sample size we can start showing the ICI visually with a (red) line on our box plot, centered about our median.
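For anyone who likes numbers behind the hand gestures, here is a small simulation sketch of both ideas: bigger samples give less variable medians, and a more spread-out population gives more variable medians. The heights are invented for illustration (they are not real school data), and the function name is just one I made up for this sketch.

```python
import random
import statistics

def spread_of_sample_medians(population, n, rng, repeats=500):
    """Standard deviation of the medians of many random samples of size n."""
    medians = [statistics.median(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(medians)

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# ASSUMPTION: invented heights in cm, chosen only so one school is more spread out
intermediate = [rng.gauss(152, 8) for _ in range(1000)]    # Years 7-8: less spread
middle_school = [rng.gauss(160, 14) for _ in range(1000)]  # Years 7-10: more spread

small_n = spread_of_sample_medians(intermediate, 15, rng)
large_n = spread_of_sample_medians(intermediate, 60, rng)
wide_pop = spread_of_sample_medians(middle_school, 15, rng)

print(f"intermediate, n=15: medians vary by about {small_n:.2f} cm")
print(f"intermediate, n=60: medians vary by about {large_n:.2f} cm")
print(f"middle school, n=15: medians vary by about {wide_pop:.2f} cm")
```

Any interval we build should behave the same way: narrower for the larger sample, wider for the more spread-out school.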
- Formula for informal confidence interval (ICI). I introduce the suggested formula for students to use, and we check that it meets our requirements. Yes, the bit we add on or subtract does get smaller when we increase the sample size (as we’re dividing by a bigger number); yes, the bit we add on or subtract does get bigger when the spread (interquartile range, IQR) increases (as we’re multiplying by a bigger number).
- Checking our informal confidence interval formula works most of the time. We have a class set of 100 different samples from the kiwi population (here if you want it) where students calculate 5 different ICIs themselves, then we collect these to check whether they captured the population median or not (collection sheet is here). I am very clear to reinforce with students that we are working in TEACHING WORLD so we can do exactly what we’re doing – testing our ideas to convince ourselves they work.
- Reinforcing that the ICI has been developed and tested by people who know what they’re doing, ready for us to use this year in class. It is a step along our sample-to-population inference journey which culminates next year with the introduction of formal methods for constructing confidence intervals.
- EVERY TIME!!!!! you construct a confidence interval you should be interpreting it (even if it’s just in your head!). This is almost a mantra in my class. Students need to continually remind themselves what the point of creating the confidence interval is, and by interpreting it carefully we are doing this. Remember, every confidence interval interpretation should include the statistic, the population, “pretty sure” (or equivalent indication of uncertainty), the variable, numbers and units. For example: “we’re pretty sure that the population median height of all Year 12 boys in New Zealand is somewhere between 173cm and 182cm”.
Other formative assessment questions such as “Would it be believable that the median height of Year 12 boys in New Zealand is 185cm? Why? Why not?” are also super-important to check students’ understanding. And of course it is worth slipping in annoying questions such as “Why do we use confidence intervals?”.
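For teachers who want to try the “does the ICI work most of the time?” check on a computer, here is a rough sketch. I am assuming the ICI formula in its commonly used Level 7 form, median ± 1.5 × IQR / √n, which matches the description above (divide by something that grows with sample size, multiply by the IQR). The population is invented stand-in data, not the real kiwi data set, and the sample size of 30 is my own choice for illustration.

```python
import math
import random
import statistics

def informal_ci(sample):
    """Informal confidence interval: median +/- 1.5 * IQR / sqrt(n).

    ASSUMPTION: quartiles come from statistics.quantiles (default method);
    hand-calculated classroom quartiles may differ slightly.
    """
    n = len(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    margin = 1.5 * (q3 - q1) / math.sqrt(n)
    med = statistics.median(sample)
    return med - margin, med + margin

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# ASSUMPTION: 700 invented weights in kg, a stand-in for the kiwi population
population = [round(rng.gauss(2.5, 0.6), 2) for _ in range(700)]
true_median = statistics.median(population)

# TEACHING WORLD: we know the population, so we can check each interval
captured = 0
for _ in range(100):
    low, high = informal_ci(rng.sample(population, 30))
    if low <= true_median <= high:
        captured += 1

print(f"{captured} of 100 intervals captured the population median")
```

Not every interval will capture the population median, which is exactly the conversation to have with students: “pretty sure” does not mean “certain”.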