Correlation and Application of Statistics to Problems of Heredity 63
Oxford, and the three last were two men who committed suicide under circumstances of great disgrace and Palmer, the Rugeley murderer, who was hanged. There is possibly little knowledge to be obtained from the result for a single medical school, but comparative statistics for several would be of considerable value.
Chapter V deals with Normal Variability, and Galton shows how the distribution depends only on the two constants, the median and the quartile, and further that if two individuals whose grades are known be actually measured, then the median and quartile, and so the whole distribution of variation, can be discovered (p. 62, footnote, and cf. our Vol. II, p. 385). The origin of the normal distribution is illustrated mechanically by aid of the "quincunx" (see our pp. 9 and 10). Nor is Galton able to avoid becoming poetically enthusiastic in a paragraph headed The Charms of Statistics, for he writes:
"It is difficult to understand why statisticians commonly limit their inquiries to averages and do not revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of the native of one of our flat English counties, whose retrospect of Switzerland was that, if its mountains could be thrown into its lakes, two nuisances would be got rid of at once. An average is but a solitary fact, whereas if a single other fact be added to it, an entire Normal Scheme, which nearly corresponds to the observed one, starts potentially into existence.
"Some people hate the very name of statistics, but I find them full of beauty and interest. Whenever they are not brutalised, but delicately handled by the higher methods, and are warily interpreted, their power of dealing with complicated phenomena is extraordinary. They are the only tools by which an opening can be cut through the formidable thicket of difficulties that bars the path of those who pursue the Science of Man." (pp. 62-63.)
Galton at the end of his Chapter V gives the two fundamental propositions on which his normal surface for the distribution of characters in two relatives depends. He envisages it in the following manner.
"(1) Bullets are fired by a man who aims at the centre of a target, which we will call its M, and we will suppose the marks that the bullets make to be painted red, for the sake of distinction. The system of lateral deviations of these red marks from the centre M will be approximately Normal, whose Q [Probable Error] we will call c. [This is the distribution of the first relative.] Then another man takes aim, not at the centre of the target, but at one or other of the red marks, selecting these at random. We will suppose his shots to be painted green. The lateral distance of any green shot from the red mark at which it was aimed will have a Probable Error, that we will call b. Now if the lateral distance of a particular green mark from M is given [a], what is the most probable distance from M of the red mark at which it was aimed?
It is $\frac{c^2}{c^2 + b^2}\,a$*.
"(2) What is the Probable Error of this determination? In other words, if estimates have been made for a great many distances founded upon the formula in (1), they would be correct on the average, though erroneous in particular cases. The errors thus made would form a normal system whose Q [Probable Error] it is desired to determine. Its value is $\frac{bc}{\sqrt{b^2 + c^2}}$†."
* Unfortunately Galton has the value V O + b21 which is very liable to confuse the reader.
† In more modern notation, this may be looked upon as the variability of the array of the second relative $= c^2(1 - r^2)$; therefore $r = \sqrt{c^2/(c^2 + b^2)}$. Hence the regression of first relative on second relative $= rc/\sqrt{c^2 + b^2} \times a = \frac{c^2}{c^2 + b^2} \times a$. Again the variance of the difference in character between the two relatives $= c^2 + (c^2 + b^2) - 2c\sqrt{c^2 + b^2}\,r = b^2$, or $b$ has for physical meaning the probable error of the distribution of the difference in character between the two relatives.
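The two propositions of the target illustration can be checked numerically. The following is a minimal sketch, not Galton's own procedure: the function names, the values b = 3 and c = 4, the number of shots and the seed are all assumptions made for illustration. The probable error of the simulated errors is estimated as their median absolute value, and the regression of red on green is fitted through the centre M by least squares.

```python
import random
from math import sqrt
from statistics import NormalDist, median

# One probable error (Q) expressed in standard-deviation units, about 0.6745.
Q_PER_SD = NormalDist().inv_cdf(0.75)


def most_probable_red(a, b, c):
    """Proposition (1): most probable distance from M of the red mark,
    given a green mark at distance a."""
    return c * c / (c * c + b * b) * a


def probable_error_of_estimate(b, c):
    """Proposition (2): probable error of the estimate in proposition (1)."""
    return b * c / sqrt(b * b + c * c)


def simulate(b, c, shots=50_000, seed=1):
    """Fire red shots at M and green shots at the red marks, then return
    the fitted regression slope of red on green and the observed probable
    error of the estimates (hypothetical check with an assumed seed)."""
    rng = random.Random(seed)
    sd_c, sd_b = c / Q_PER_SD, b / Q_PER_SD   # convert Q to s.d.
    reds = [rng.gauss(0.0, sd_c) for _ in range(shots)]
    greens = [r + rng.gauss(0.0, sd_b) for r in reds]
    slope = sum(r * g for r, g in zip(reds, greens)) / sum(g * g for g in greens)
    pe = median(abs(r - slope * g) for r, g in zip(reds, greens))
    return slope, pe


if __name__ == "__main__":
    b, c = 3.0, 4.0
    print("formula slope:", c * c / (c * c + b * b))
    print("formula probable error:", probable_error_of_estimate(b, c))
    print("simulated (slope, probable error):", simulate(b, c))
```

With b = 3 and c = 4 the formulae give a regression of 16/25 = 0.64 and a probable error of 12/5 = 2.4, so that a green mark at a = 10 points to a most probable red mark at about 6.4. The simulated values should lie close to these, differing only by sampling error.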