Seventy-five female college students and 24 male college students each reported the cost (in dollars) of their most recent haircut. The resulting data are summarized in the following table.
Statistic | Females | Males |
---|---|---|
No. of observations | 75 | 24 |
Minimum | 0 | 0 |
Maximum | 150 | 35 |
1st Quartile | 20 | 9.25 |
Median | 31 | 17 |
3rd Quartile | 75 | 20 |
Mean | 52.53 | 20.13 |
Students can sketch a basic boxplot with whiskers extending to the minimum and maximum, a box extending from the first quartile to the third quartile, and a line at the median, as shown below. To compare haircut costs of males and females, the two boxplots should be plotted side by side on the same scale.
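For readers who want to check the sketch against the table, the following is a minimal, illustrative way to draw both boxplots on one shared scale using plain text. It assumes only the five-number summaries from the table; the function name `text_boxplot`, the 61-column width, and the characters used for whiskers and boxes are arbitrary choices made for this example, not part of the original task.

```python
# Five-number summaries copied from the table (dollars).
summaries = {
    "Females": {"min": 0, "q1": 20, "median": 31, "q3": 75, "max": 150},
    "Males": {"min": 0, "q1": 9.25, "median": 17, "q3": 20, "max": 35},
}


def text_boxplot(s, lo, hi, width=61):
    """Render one five-number summary as a row of characters on the scale [lo, hi].

    Hypothetical helper for illustration: '-' marks whiskers, '=' the box,
    '|' the minimum and maximum, and 'M' the median.
    """
    def col(x):
        # Map a dollar value to a column index on the shared scale.
        return round((x - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for i in range(col(s["min"]), col(s["max"]) + 1):
        row[i] = "-"  # whiskers run from the minimum to the maximum
    for i in range(col(s["q1"]), col(s["q3"]) + 1):
        row[i] = "="  # box runs from the first to the third quartile
    row[col(s["min"])] = "|"
    row[col(s["max"])] = "|"
    row[col(s["median"])] = "M"  # line at the median
    return "".join(row)


# Use one scale for both groups so the plots are directly comparable.
lo = min(s["min"] for s in summaries.values())
hi = max(s["max"] for s in summaries.values())

rows = {name: text_boxplot(s, lo, hi) for name, s in summaries.items()}
for name, row in rows.items():
    print(f"{name:>8} {row}")
```

Because both rows share the 0 to 150 dollar scale, the male boxplot occupies only the left quarter of the line, which makes the difference in spread visible at a glance.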
Commentary
This problem could serve as an introductory lesson on group comparisons and engage students with a question they may find amusing and interesting. More generally, the lesson could serve as a template for a project in which students develop a questionnaire, sample students at their school, and report on their findings.
Being able to use data to compare two groups is an important skill. These distributions have similarities (both appear to be skewed); we can also see that haircut cost tends to be greater for females than for males (a median of $31 versus $17) and that there is more variability in haircut cost for females (an interquartile range of $55 versus $10.75).
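The comparison of center and spread can be checked with a few lines of arithmetic. The sketch below uses only the values in the table; the dictionary layout and variable names are assumptions made for illustration.

```python
# Summary statistics copied from the table (dollars).
stats = {
    "Females": {"min": 0, "q1": 20, "median": 31, "q3": 75, "max": 150},
    "Males": {"min": 0, "q1": 9.25, "median": 17, "q3": 20, "max": 35},
}

spread = {}
for group, s in stats.items():
    spread[group] = {
        "median": s["median"],          # measure of center
        "range": s["max"] - s["min"],   # total spread
        "iqr": s["q3"] - s["q1"],       # spread of the middle half
    }
    print(group, spread[group])
```

By either measure of spread, the range or the interquartile range, the female costs vary more than the male costs, and the female median is also higher.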
The data can also be used to start (or continue) a discussion about what we should report as a “typical” haircut cost. The data distributions appear to be skewed (for the females more so than the males). Comparing the mean with the median in each group shows how extreme values in the data “pull” the mean toward the high end of haircut costs. With strongly skewed data, measures such as the mean and standard deviation aren’t very useful for describing a typical value.
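One rough way to see the “pull” on the mean is to compute the gap between the mean and the median for each group. This is a sketch using the table values only; the names `centers` and `gaps` are chosen for this example.

```python
# Means and medians copied from the table (dollars).
centers = {
    "Females": {"mean": 52.53, "median": 31},
    "Males": {"mean": 20.13, "median": 17},
}

# A mean well above the median suggests high values are pulling the mean upward.
gaps = {
    group: round(c["mean"] - c["median"], 2)
    for group, c in centers.items()
}

for group, gap in gaps.items():
    print(f"{group}: mean exceeds median by ${gap:.2f}")
```

The gap is much larger for the females than for the males, which is consistent with the female distribution being the more strongly skewed of the two.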
These data came from a survey given to a college class of introductory statistics students.