Let’s summarise what we’ve got so far:
| | n | average | difference | standard deviation (SD) |
|---|---|---|---|---|
| Fish-2-Whale | 8 | 267 g | 94 g | 44 g |
| Control | 8 | 173 g | | 28 g |
We said earlier that the accuracy of our estimate of the means depends on two things:
- the amount of variation in our population and
- the size of our samples.
The standard deviations of 44 g and 28 g tell us something about the widths of the distributions in the graph below.
Standard deviation tells us how wide each distribution is. Notice that the SD for Fish-2-Whale is bigger than for the control – 44 g vs 28 g – and this corresponds to the wider distribution on the graph above. However, it still doesn’t tell us how accurately we are pinpointing the location of the peaks; that depends both on SD and on sample size. Statisticians have worked out a standard way to estimate the error in the location of the peaks called the Standard Error in the Mean, which is abbreviated as SEM. You may see this abbreviation a lot in research papers and on websites. The formula for SEM has the SD on the top of the fraction and the square root of the sample size on the bottom:

$$\mathrm{SEM} = \frac{\mathrm{SD}}{\sqrt{n}}$$
Think about this for a moment. As your sample size (n) goes up, SEM will go down – but because of the square root sign, you need pretty big sample sizes to really push your SEM down very far.
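To make the square-root effect concrete, here is a minimal Python sketch of the SEM calculation described above. The function name `sem` and the larger sample sizes (32 and 128) are our own illustrative assumptions; only the SDs of 44 g and 28 g and n = 8 come from the experiment.

```python
import math

def sem(sd, n):
    """Standard error in the mean: SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Values from our experiment: n = 8 in each group
print(round(sem(44, 8), 1))  # Fish-2-Whale
print(round(sem(28, 8), 1))  # Control

# Hypothetical larger samples, assuming the SD stays at 44 g
for n in (8, 32, 128):
    print(n, round(sem(44, n), 1))
```

Each time the sample size is quadrupled, the SEM only halves, which is why you need a lot more data to shrink it substantially.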
| | n | mean | standard deviation | standard error in the mean (SEM) |
|---|---|---|---|---|
| Treatment | 8 | 267 g | 44 g | 15.6 g |
| Control | 8 | 173 g | 28 g | 9.9 g |
We could use this information in writing our report to our boss (or an article for publication in a journal), so we could write something like:
The average weight gain for the control group was 173±10 g and the average weight gain for those fed with Fish-2-Whale was 267±16 g.
Note that we don’t bother including too many digits in our final written report, since it’s just an estimate of the error and a couple of digits is adequate.
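If you are producing such a report from a script, the rounding can be done at the point of formatting. The sketch below is one possible way to do it; the helper name `format_mean_sem` is hypothetical, and rounding both numbers to whole grams is simply the choice made in the sentence above, not a general rule.

```python
import math

def format_mean_sem(mean, sd, n, unit="g"):
    """Return 'mean±SEM unit', with both values rounded to whole numbers."""
    sem = sd / math.sqrt(n)  # standard error in the mean
    return f"{mean:.0f}±{sem:.0f} {unit}"

# Values from the summary table (n = 8 in each group)
control = format_mean_sem(173, 28, 8)
treatment = format_mean_sem(267, 44, 8)

print(f"Control: {control}; Fish-2-Whale: {treatment}")
```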