General Steps | In the Dilbert example… |
1. Decide on a null hypothesis: a "model" that the data should fit | Dilbert’s null hypothesis was that the sick days were randomly distributed across the weekdays. |
2. Note your "expected" and "observed" values | Since 40% of weekdays fall on Monday or Friday, the same should be true of sick days, so the expected value is 40 out of 100. The observed value was 42 out of 100. |
3. Simulate lots of data | We simulated 100 trials with the applet. |
4. Decide what your “threshold of pain” is (otherwise known as the significance level, the cutoff that a p-value is compared against). Note: technically this should come before simulating your data! | We picked a threshold of 5%, or 5 out of 100 trials. |
5. Determine whether the fraction of simulated trials that are at least as extreme as the observed value exceeds the threshold. If so, we say the model fits the data well. | Since the simulated data showed many more than 5% of trials with at least 42 Monday/Friday sick days, we decide that the model (random sick days) fits the data. |
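
The five steps above can be sketched in a few lines of Python. This is only an illustration, not the applet used in the example: it assumes each of the 100 sick days independently falls on a Monday or Friday with probability 0.4, and the function name `simulate_fraction`, its parameters, and the fixed seed are all hypothetical choices made here. It also runs 10,000 trials rather than the 100 used above, purely to make the estimated fraction more stable.

```python
import random


def simulate_fraction(n_trials=10_000, n_days=100, p_mon_fri=0.4,
                      observed=42, seed=0):
    """Estimate how often random sick days give at least `observed`
    Monday/Friday sick days out of `n_days` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    at_least_observed = 0
    for _ in range(n_trials):
        # Step 3: simulate one trial of n_days sick days under the null model.
        mon_fri_count = sum(rng.random() < p_mon_fri for _ in range(n_days))
        if mon_fri_count >= observed:
            at_least_observed += 1
    return at_least_observed / n_trials


# Step 4: the threshold, chosen before looking at the simulation.
threshold = 0.05

# Step 5: compare the simulated fraction with the threshold.
fraction = simulate_fraction()
model_fits = fraction > threshold

print(f"Fraction of trials with at least 42 Mon/Fri sick days: {fraction:.3f}")
print("Model (random sick days) fits the data:", model_fits)
```

Under these assumptions the fraction comes out far above the 5% threshold, which matches the conclusion in the table: 42 out of 100 is not surprising if sick days are random.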