Histograms of data in each group
When data are collected from two groups, a histogram can be used to graphically display the distribution of values in each group.
End-of-year bonuses paid to lower-level executives
A company has a generous but rather complicated policy on end-of-year bonuses for its lower-level managerial personnel. A key factor of the policy is a subjective judgement of 'contribution to corporate goals'. The diagram below shows the bonuses awarded to the 24 female and 36 male executives. The crosses have been jittered a little (randomly moved) to separate them in the scatterplot.
This diagram is 3-dimensional. Position the mouse in the middle of the diagram and drag towards the top left of the screen to rotate the plot (or click the 3D rotation button). The histogram within each group describes the distribution of bonuses awarded to that gender.
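The jittering mentioned above can be sketched in a few lines. This is only an illustration, not the code behind the diagram: the function name `jitter`, the `width` parameter, and the group positions 0 and 1 are all assumptions made for the example.

```python
import random

def jitter(positions, width=0.2, seed=None):
    """Return a copy of positions with a small uniform random offset added to each.

    width is a hypothetical maximum offset, chosen here only for illustration.
    """
    rng = random.Random(seed)
    return [p + rng.uniform(-width, width) for p in positions]

# Hypothetical group positions: 0 for one group, 1 for the other.
group_axis = [0] * 24 + [1] * 36
jittered_axis = jitter(group_axis, width=0.2, seed=1)
```

Only the group coordinate is jittered; the bonus values themselves would be plotted unchanged, so the display still shows the true distribution within each group.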
Model for each group
A single batch of numerical values is usually modelled as a random sample from some hypothetical infinite population -- often a normal distribution. In a similar way, data sets that consist of measurements from two groups are often modelled as two independent random samples from two underlying hypothetical infinite populations. Normal distributions are again commonly used as models.
(The assumption of normality should be checked from graphical displays of the sample data. If the data are noticeably skew, a transformation may provide values that can be adequately modelled by normal distributions.)
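As a rough illustration of how a transformation can reduce skewness, the sketch below compares a moment-based skewness measure before and after a log transformation. The data are simulated from a lognormal distribution purely for the example (they are not the bonus data), and `sample_skewness` is a helper name introduced here. A numerical summary like this supplements a graphical check; it does not replace one.

```python
import math
import random
import statistics

def sample_skewness(values):
    """Moment-based skewness: mean cubed deviation divided by the cubed SD."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

# Simulated right-skewed values (an assumption for illustration only).
rng = random.Random(1)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

# Log transformation; valid here because every simulated value is positive.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_logged = sample_skewness(logged)
print("skewness before:", round(skew_raw, 2))
print("skewness after: ", round(skew_logged, 2))
```

A skewness near zero after transformation suggests that a normal model may be adequate on the transformed scale.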
The histograms of bonuses paid to male and female executives both seemed fairly symmetrical, so normal distributions are reasonable models within the two groups. The diagram below shows a possible model for the bonus data.
Click Take Sample to select a random sample from each of the two normal distributions. The model claims that the real data set consists of random samples from distributions like these.
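Taking a sample from the model can be imitated with a short simulation. The sketch below draws two independent random samples of sizes 24 and 36 from two normal distributions. The means and standard deviations are placeholder values chosen for illustration; they are not taken from the bonus data or from the model in the diagram, and `take_sample` is a hypothetical helper name.

```python
import random
import statistics

# Placeholder model parameters (assumptions, not estimates from the real data).
FEMALE_MEAN, FEMALE_SD = 10.0, 2.0
MALE_MEAN, MALE_SD = 12.0, 2.5

def take_sample(rng, n, mean, sd):
    """Draw n independent values from a normal(mean, sd) distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]

rng = random.Random(2024)  # fixed seed so the run is reproducible
female = take_sample(rng, 24, FEMALE_MEAN, FEMALE_SD)
male = take_sample(rng, 36, MALE_MEAN, MALE_SD)

for label, sample in (("female", female), ("male", male)):
    print(label, "n =", len(sample),
          "mean =", round(statistics.fmean(sample), 2),
          "sd =", round(statistics.stdev(sample), 2))
```

Changing the seed gives a different pair of samples each time, which mirrors the sample-to-sample variability that the model attributes to the real data.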