4: More about normal distributions

And here are some specific (and real) normal distributions:

Heights of American women follow a normal distribution.

SAT scores follow a normal distribution.
The age of women attending this seminar is affected by a variety of factors:

  • topic of the seminar
  • where it is held
  • what time of day
  • how it was advertised…
Blood pressure readings are affected by a variety of factors:

  • stress levels
  • diet
  • duration and frequency of exercise…

An ideal, or theoretical, normal distribution is symmetrical and shaped a bit like a bell (it's also called a bell curve). Of course, your data will never follow an ideal normal distribution exactly, but many datasets do approximate a normal distribution.

An amazing fact is that a distribution that is exactly normal can be completely described by just two parameters. These two parameters are:

  1. where the distribution is centered – or, the value at the peak.
  2. how wide the distribution is – or, how much variability there is in the thing you’re measuring.
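
To make the "two parameters" idea concrete, here is a minimal sketch of the ideal bell curve written as a function. The name `normal_pdf` and the parameter names `center` and `width` are made up for this illustration, and the numbers are arbitrary example values, not real data. Once you pick the center and the width, the height of the curve at every point is fixed.

```python
import math

def normal_pdf(x, center, width):
    """Height of the ideal normal curve at x, given its center and width."""
    z = (x - center) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2 * math.pi))

# Two illustrative curves with the same center but different widths
# (arbitrary example values, not taken from any dataset).
for x in (85, 100, 115):
    narrow = normal_pdf(x, center=100, width=5)
    wide = normal_pdf(x, center=100, width=15)
    print(f"x={x}: narrow curve {narrow:.4f}, wide curve {wide:.4f}")
```

Both curves peak at the center value and are mirror images on either side of it; the narrow one is taller at the peak and drops off faster.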

The centre of the distribution is called the mean. Because normal distributions are symmetric, we can simply find the peak and read off its position on the horizontal axis (this is known as the x-coordinate); that’s your mean. Here are the two distributions from above:

Mean height of women is 64.5 inches.

Mean SAT score is 500.


It's harder to measure how wide the distribution is. Very big or very small fish, very high or low blood pressures, and very old or young ages do occur, at least with a small probability. So instead of measuring the entire width, we determine where the middle two-thirds of the data lie (actually the middle 68%, for reasons of mathematical theory). This measure is called the standard deviation, or SD. Again, the distributions from above:

Mean height of women is 64.5 inches.

Mean SAT score is 500.
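
If you'd like to see the "middle 68%" idea on actual numbers, the sketch below simulates a large sample and compares two things: the SD computed by Python's `statistics` module, and half the width of the range that holds the middle 68% of the sample. The mean of 64.5 comes from the height example; the SD of 2.5 inches is an assumption made up for this illustration, and the data are simulated, not real measurements.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Simulated heights: mean 64.5 as in the text; the SD of 2.5 is an assumption.
sample = [random.gauss(64.5, 2.5) for _ in range(100_000)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)

# With n=100 we get 99 cut points; cuts[k - 1] is the k-th percentile.
# The 16th and 84th percentiles bracket the middle 68% of the sample.
cuts = statistics.quantiles(sample, n=100)
low, high = cuts[15], cuts[83]

print(f"mean = {mean:.2f}, SD = {sd:.2f}")
print(f"middle 68% runs from {low:.2f} to {high:.2f}")
print(f"half the width of that range = {(high - low) / 2:.2f}")
```

The half-width of the middle 68% should come out very close to the SD, which is exactly what the SD is meant to capture.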


So, 68% of the observations fall between plus and minus 1 SD. Another way of saying this is that if you measure plus and minus 1 SD from the mean, you will shade 68% of the area under the curve. Most of us are not very good at eyeballing 68% of a curvy shape, so it helps that there is a mathematical formula for determining the standard deviation. If you already know it, good for you. If not, we’ll discuss it later. For now, just remember that the standard deviation (SD) measures how far in each direction you have to go FROM THE MEAN along the x-axis to encompass 68% of the population. In other words, the SD measures how variable the population is.
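
For the curious, the 68% figure can be checked directly. For an ideal normal distribution, the fraction of the area within k SDs of the mean is erf(k / √2), where erf is the standard error function available in Python's `math` module. The helper name `fraction_within` is made up for this sketch.

```python
import math

def fraction_within(k):
    """Fraction of an ideal normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

print(f"within 1 SD of the mean: {fraction_within(1):.4f}")  # prints 0.6827
```

Note that the answer does not depend on the particular mean or SD: for any ideal normal distribution, about 68% of the area lies within 1 SD of the mean.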