Galton’s Bayesian Inference Engine

Monday, March 7th, 2011

In 1877, Francis Galton built a Bayesian inference engine — although he thought of it as displaying the action of natural selection in a model for inheritance of quantitative characteristics:

The machine is reproduced in Figure 1 from the original publication. It depicts the fundamental calculation of Bayesian inference: the determination of a posterior distribution from a prior distribution and a likelihood function. Look carefully at the picture — notice it shows the upper portion as three-dimensional, with a glass front and a depth of about four inches. There are cardboard dividers to keep the beads from settling into a flat pattern, and the drawing exaggerates the smoothness of the heap from left to right, something like a normal curve. We could think of the top layer as showing the prior distribution p(x) as a population of beads representing, say, potential values for x, from low (left) to high (right).

The machine does the computation with gravity providing the motive force. There is a knob at the right-hand side of each of two levels. When the platform supporting the top level of beads is withdrawn by pulling the upper knob at the right, the beads fall to the next lower level. On that second level, you can see what is intended to be a vertical screen, or wall, that is close to the glass front at both the left and the right, but recedes to the rear in the middle. If viewed from above, that screen would look something like a normal curve. The vertical screen represents the likelihood function; in this position, it reflects high likelihood for values of x in the middle, but if moved to the right, it would represent high likelihood for larger values of x. Similarly, if moved to the left, it would represent high likelihood for smaller values of x.

The way the machine works its magic is that those beads to the front of the screen are retained as shown; those falling behind are rejected and discarded. (You might think of this stage as doing rejection sampling from the upper stage.) The surviving beads are shown at this level as a sort of nonstandard histogram, nonstandard because the depths of the compartments vary, with those toward the middle being deeper than those in the extremes.
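The rejection step described above can be imitated in a few lines of Python. This is only a rough sketch under assumptions of my own, not anything taken from Galton's drawing: the prior is a standard normal, the screen is a normal-shaped curve centred at 1 with unit spread, and the function name `galton_machine` and all of its parameters are made up for illustration.

```python
import math
import random
import statistics


def galton_machine(n_beads=100_000, prior_mean=0.0, prior_sd=1.0,
                   lik_center=1.0, lik_sd=1.0, seed=0):
    """Simulate the machine's levels by rejection sampling (illustrative only)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_beads):
        # Top level: the bead's left-right position x is drawn from the prior.
        x = rng.gauss(prior_mean, prior_sd)
        # Middle level: the screen's distance from the glass at x is taken
        # to be proportional to the likelihood, scaled so its maximum is 1.
        depth = math.exp(-0.5 * ((x - lik_center) / lik_sd) ** 2)
        # The bead's front-to-back position is uniform on [0, 1); it is
        # retained only if it lands in front of the screen.
        if rng.random() < depth:
            kept.append(x)
    # Bottom level: the retained x values, ready to be binned at uniform depth.
    return kept


kept = galton_machine()
print(f"retained {len(kept)} of 100000 beads")
print(f"mean {statistics.fmean(kept):.3f}, sd {statistics.pstdev(kept):.3f}")
```

With these assumed settings the exact posterior is normal with mean 0.5 and standard deviation about 0.707, so the printed summary should land close to those values, and a histogram of `kept` plays the role of the lowest level of the machine.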

The final stage turns this into a standard histogram: The second support platform is removed by pulling to the right on its knob, and the beads fall to a slanted platform immediately below, rolling then to the lowest level, where the depth is again uniform, about one inch deep from the glass in front. This simply rescales the retained beads, resulting in a distribution that again looks somewhat like another normal curve, one a bit less dispersed than the prior distribution at the top. The magic of the machine is that this lowest level is proportional to the posterior distribution!
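To see why the bottom heap comes out narrower than the top one, it may help to write the calculation down for the case where both curves are exactly normal. The symbols here are my own assumptions for illustration: a prior with mean \mu_0 and variance \sigma_0^2, and a screen (likelihood) centred at c with spread s^2.

```latex
\begin{align*}
p(x \mid \text{data}) &\propto p(x)\, L(x) \\
  &\propto \exp\!\left(-\frac{(x-\mu_0)^2}{2\sigma_0^2}\right)
           \exp\!\left(-\frac{(x-c)^2}{2 s^2}\right) \\
  &\propto \exp\!\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right),
\qquad
\frac{1}{\sigma_1^2} = \frac{1}{\sigma_0^2} + \frac{1}{s^2},
\quad
\mu_1 = \sigma_1^2\left(\frac{\mu_0}{\sigma_0^2} + \frac{c}{s^2}\right).
\end{align*}
```

Because 1/\sigma_1^2 is larger than 1/\sigma_0^2, the posterior variance \sigma_1^2 is always smaller than the prior variance \sigma_0^2, and the posterior mean \mu_1 falls between the prior mean and the centre of the screen.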

(Hat tip to Alex Tabarrok.)
