## Prepping for the 2013 AP Stats test: common student errors Part 1

At my school, we gave our AP Statistics students a final assignment (notice I don’t quite say assessment) to help them prepare for the 2013 AP Statistics exam on May 10. This was not an attempt to replicate an actual AP exam in the traditional sense; we didn’t want to. What we really wanted was to give students an incentive to sit down, prepare individually, and synthesize what they learned this year in a quiet, focused environment.

What we did:

• We used old AP questions, but concealed how and where we got them. I retyped them and changed the numbers and context when possible. I strongly advise against teachers using a single AP exam as a tool for summative assessment: the answers are out there and easily found with minimal effort. At our school we are very transparent about how to go to AP Central and find practice questions and rubrics. So if we use real AP questions as a source, we mix the questions up, conceal how and where we got them, and change real AP questions into questions “inspired by AP.”
• Students were given 5 class periods to complete the 40 MC questions, 5 FR questions, and one investigative task. This is 225 minutes: 45 minutes more than a typical exam.
• Students were able to bring in one 8.5 x 11 inch paper with any HAND-WRITTEN notes. This had to be brought in on the first day of the assignment, and could not leave the classroom for the rest of the week.
• Students also could not bring the assignment home: all work was done individually in the classroom. Students were allowed to discuss with others, but not “get answers.” They could also “look up” topics, concepts, or other problems that might lead to insight. But nothing could be brought into the classroom after the first day.
• This was worth 10% of their course grade. We will give formative feedback using rubrics from the AP. Grades will be assigned to their work. AP score equivalents will inform, but not dictate, how we assign grades to students’ work.

Below is a log of “common errors” from the first two  FR questions I gave them.

Question 1: Students had to explain how to estimate a median value from a histogram, compare two groups by analyzing a quantitative variable in context, and explain the relationship between a mean and a median in skewed and symmetric distributions. The topic: teacher/pupil ratios in each state. States were grouped as Western or Eastern.

• Students read histograms like dot plots.  They use the endpoints of each interval as substitutes for individual data values.  Some didn’t communicate or recognize that non-integer values are possible. As a result they erroneously reported the median value in a group as “15” instead of “some number between 15 and 16.”
• Students need to make explicit comparisons. When comparing two groups, students need to answer the question, “Who wins? Is it a tie?” When asked to compare distributions, students need to make more explicit comparative statements: “____’s average is larger than _____’s average,” “the same as,” “more skewed.”
• My students could improve their language for describing the shape of a distribution. “Evenly distributed” is vague; I think it means uniformly distributed. “Unimodal” means “one clump of data.” Many symmetric distributions are not “even.” Many symmetric distributions are not unimodal.
• Students mix up which features of a distribution are “shape” and which are “spread.” Example: “The western states were skewed right and had a wider range than the Eastern states.” They never mention anything about the Eastern states’ shape (symmetric? roughly normal? skewed left?). Another example: “The wide range causes the mean to be pulled up.” It’s the skewness, not the wide range, that causes this.
• Students do not give a correct definition of “range.” “The P-T ratios in the east produce a range from 12-22” is NOT correct. The range for this group is 22 - 12 = 10. This didn’t necessarily result in a point deduction on this problem, but it could in a problem where it matters.
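To make three of these points concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the bin counts, the small data lists, and the helper names `median_interval` and `data_range` are my assumptions, not the actual exam data. It locates the histogram bin that must contain the median, reports the range as a single number, and shows that skewness, not a wide range, separates the mean from the median.

```python
from statistics import mean, median


def median_interval(bin_edges, counts):
    """Return (low, high) bounds for the median of histogram data.

    bin_edges has one more entry than counts; counts[i] is the number of
    values falling between bin_edges[i] and bin_edges[i + 1].
    """
    n = sum(counts)
    # 1-based positions of the middle value(s) in the ordered data.
    # For odd n the two positions coincide.
    lower_pos, upper_pos = (n + 1) // 2, n // 2 + 1

    def bin_containing(position):
        running = 0
        for i, count in enumerate(counts):
            running += count
            if running >= position:
                return i
        raise ValueError("position exceeds the number of observations")

    first, last = bin_containing(lower_pos), bin_containing(upper_pos)
    return bin_edges[first], bin_edges[last + 1]


def data_range(values):
    """The range is a single number: maximum minus minimum."""
    return max(values) - min(values)


# Hypothetical histogram for the Eastern group (these counts are made up).
edges = list(range(12, 23))               # bin edges 12, 13, ..., 22
counts = [2, 4, 5, 6, 4, 2, 1, 1, 0, 1]   # 26 states in this invented example
print(median_interval(edges, counts))     # a bin, not a single endpoint

# Range as one number rather than "from 12 to 22".
print(data_range([12, 15.4, 22]))

# Invented data: a right-skewed set versus a wide but symmetric set.
right_skewed = [13, 14, 14, 15, 15, 16, 24]
wide_symmetric = [5, 10, 15, 20, 25]
print(mean(right_skewed) > median(right_skewed))      # skew pulls the mean up
print(mean(wide_symmetric) == median(wide_symmetric)) # wide range alone does not
```

With these made-up counts the 13th and 14th ordered values both land in the 15 to 16 bin, so the honest answer is “some number between 15 and 16,” which is all a histogram can tell us.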

Question 2:  In this question, students had to describe how to execute a simple random sample,  identify/justify an effective way to stratify a random sample, and explain a statistical advantage of a stratified sample over a simple random sample.  (Topic:  a survey about a new lunch program in a school district)

Common errors

• Students must describe more specific plans that another person can execute without having to ask questions. “I will have a random number generator select numbers from 1 to 2500, with no repeats.”
• Students mistakenly say that SRSs are biased and that stratification removes bias. If this were true, it would mean that simple random samples systematically over-estimate (or under-estimate) population parameters. This is a very common error. Stratification does two things: 1. Stratification ensures sufficient representation of each group. Note: the benefit is not in EQUAL representation, but in sufficient representation. Suppose high schoolers make up only 10% of the population. An “unlucky” SRS of 200 students may contain a disproportionate number of high schoolers (maybe only 7 or 8 out of 200, maybe 40 out of 200). This causes the values of our sample estimates to “swing,” or vary, more than if we had exactly 40 high schoolers in our sample. We could “set” this value with a stratified sample. We could then use proper scaling to get a good estimate of the satisfaction level in the population. As a result… 2. Estimates from stratified samples typically show less variability (more precision) than estimates from simple random samples. But both simple random samples and stratified random samples are unbiased methods of sampling.
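A quick simulation can make the last point visible. The sketch below is only an illustration under assumptions I am choosing myself: a district of 2500 students, 10% of them high schoolers, and invented satisfaction rates of 20% among high schoolers and 90% among everyone else. It draws many simple random samples of 200 and many stratified samples (40 high schoolers, 160 others, reweighted by each group's share of the population) and compares the two sets of estimates.

```python
import random

rng = random.Random(2013)   # fixed seed so the run is repeatable

N, N_HS, n = 2500, 250, 200  # population, high schoolers (10%), sample size

# Hypothetical population: 1 = satisfied, 0 = not satisfied.
# The 20% and 90% satisfaction rates are invented for illustration.
high_school = [1] * 50 + [0] * 200          # 20% of 250 satisfied
others = [1] * 2025 + [0] * 225             # 90% of 2250 satisfied
population = high_school + others
true_p = sum(population) / N                # 0.83 by construction

# An SRS plan someone else could execute: 200 distinct ID numbers from 1-2500.
srs_ids = rng.sample(range(1, N + 1), n)


def srs_estimate():
    sample = rng.sample(population, n)
    return sum(sample) / n


def stratified_estimate(n_hs=40):
    # "Set" the number of high schoolers, then scale each group's sample
    # proportion by that group's share of the population.
    p_hs = sum(rng.sample(high_school, n_hs)) / n_hs
    p_other = sum(rng.sample(others, n - n_hs)) / (n - n_hs)
    return (N_HS * p_hs + (N - N_HS) * p_other) / N


def center_and_spread(estimates):
    m = sum(estimates) / len(estimates)
    sd = (sum((e - m) ** 2 for e in estimates) / len(estimates)) ** 0.5
    return m, sd


reps = 3000
srs_mean, srs_sd = center_and_spread([srs_estimate() for _ in range(reps)])
str_mean, str_sd = center_and_spread([stratified_estimate() for _ in range(reps)])

print(f"true proportion: {true_p:.3f}")
print(f"SRS:        mean {srs_mean:.3f}, sd {srs_sd:.4f}")
print(f"stratified: mean {str_mean:.3f}, sd {str_sd:.4f}")
```

Both sets of estimates should center on the true proportion, which is what “unbiased” means. The stratified estimates should simply be packed more tightly around it. How much tighter depends heavily on how different the groups are, which is why I picked such different satisfaction rates here.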