The average score for
the village or town is therefore 1.5,
apparently a very satisfactory
result, but the residents of the last
two areas would probably not
agree. The average for these two areas
is 8.5. The above treatment of
the statistical data disobeys a
fundamental law of statistics.
When taking the average of a collection
(or population, in
statistical terminology) of data, all
the contributory data should
fall on a single peak. For normally
distributed data the width of that peak
can then be calculated
(standard deviation or standard
error). Note that the average value for
the first 28 districts is 1.0,
quite different from the average for
the last two districts. This
example illustrates how simple it is to
bury or disguise unfavourable
data by using the statistically illegal
technique of lumping together
data from different populations.
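
The arithmetic behind this example can be checked with a short Python
sketch. It rests on one assumption that the text does not state: every
district is given exactly its group average (1.0 for the first 28
districts, 8.5 for the last two), since the individual district scores
are not listed. The variable names are illustrative only.

```python
from statistics import mean, pstdev

# Assumption: each district scores exactly its group average, because
# the individual district scores are not given in the text.
first_28_districts = [1.0] * 28
last_2_districts = [8.5] * 2
all_districts = first_28_districts + last_2_districts

# Averages taken within each group, then over the lumped-together data.
mean_first_28 = mean(first_28_districts)
mean_last_2 = mean(last_2_districts)
mean_overall = mean(all_districts)

# Population standard deviation of the lumped data.
spread_overall = pstdev(all_districts)

print(f"first 28 districts: {mean_first_28}")    # 1.0
print(f"last 2 districts:   {mean_last_2}")      # 8.5
print(f"all 30 districts:   {mean_overall}")     # 1.5
print(f"spread of all 30:   {spread_overall:.2f}")
```

Under that assumption the overall figure of 1.5 describes neither
group: it sits above every one of the first 28 districts and far below
the last two, and the spread of the combined data is larger than the
combined average itself, a sign that the data do not fall on a single
peak.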
Another illustration would be to
measure the height of all the
inhabitants of a village including
neonates and members of the local rugby
XV. The result would probably
come out at around 3 ft, a result
consistent with a race of pygmies.
Again, the stupid result is due to
combining the results of groups of
data possessing quite different
averages.
To conclude, be wary of
any presentation of statistics that hides
disparate groups in a larger
group in order to support a
particular viewpoint; such a presentation is
statistically invalid. This is not the
only method of misusing
statistics in order to delude the
uninitiated, but it is probably the
most prevalent
technique.