Thursday, November 22, 2007

Applications of Benford's Law

Tim Harford:

Benford’s Law does not apply to every set of numbers. It does not apply, for example, to post codes or national insurance numbers, which are assigned by bureaucratic processes. But all sorts of “natural” processes should produce Benford data. And because the units in which many quantities are measured are arbitrary (grams or ounces, miles or millimetres, dollars or yen), a law about first digits can only hold widely if it survives a change of units. Benford’s Law does: converting to a different unit of measurement preserves it.
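To see the unit-conversion point in action, here is a minimal Python sketch. It uses the first 1,000 powers of two, a sequence known to follow Benford’s Law closely, as a stand-in for “natural” data. That choice, the miles-to-kilometres example and the function names are illustrative assumptions, not anything taken from the column. Under Benford’s Law the expected share of numbers with first digit d is log10(1 + 1/d).

```python
import math

def first_digit(x):
    """Return the leading decimal digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def digit_shares(values):
    """Share of values that start with each digit from 1 to 9."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[first_digit(v)] += 1
    return {d: counts[d] / len(values) for d in counts}

# Assumption: powers of two stand in for "natural" Benford-like data.
miles = [2.0 ** n for n in range(1, 1001)]
# Convert to a different, equally arbitrary unit (1 mile = 1.609344 km).
kilometres = [m * 1.609344 for m in miles]

# Expected first-digit shares under Benford's Law.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
shares_miles = digit_shares(miles)
shares_km = digit_shares(kilometres)

# Columns: digit, Benford share, share in miles, share in kilometres.
for d in range(1, 10):
    print(d, round(benford[d], 3), round(shares_miles[d], 3), round(shares_km[d], 3))
```

The two observed columns should sit close to each other and to the Benford column, which is what surviving a change of units looks like in practice.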

As an example, think about an economy that is growing from an initial value of $10bn. It must grow by 100 per cent before the first digit changes, at $20bn. Then it need only grow by 50 per cent to reach the next digit at $30bn, which is likely to happen more quickly. To grow from $90bn to $100bn requires just over 11 per cent growth; but then to change the first digit from one to two, at $200bn, requires that the economy grow by 100 per cent again. That sort of story suggests why Benford data may be common, although quite why Benford’s Law holds so widely is not yet settled.
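The growth story can be checked with a short Python sketch. It assumes a constant growth rate of 0.1 per cent per step and an economy growing from 10 to 10,000 (in $bn); both figures and all function names are made up for illustration. It prints the percentage growth needed to leave each first digit, the share of steps the economy spends on that digit, and the Benford share log10(1 + 1/d) for comparison.

```python
import math

def first_digit(x):
    """Return the leading decimal digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def growth_needed(digit):
    """Percentage growth that moves the first digit from `digit` to the next one."""
    return 100.0 / digit

def time_shares(rate=0.001, start=10.0, end=10_000.0):
    """Grow `start` by `rate` per step until it reaches `end`.

    Returns the share of steps spent on each first digit.
    The default rate and range are illustrative assumptions.
    """
    counts = {d: 0 for d in range(1, 10)}
    value, steps = start, 0
    while value < end:
        counts[first_digit(value)] += 1
        value *= 1 + rate
        steps += 1
    return {d: counts[d] / steps for d in counts}

shares = time_shares()
# Columns: digit, growth needed (%), share of time observed, Benford share.
for d in range(1, 10):
    print(d, round(growth_needed(d), 1), round(shares[d], 3),
          round(math.log10(1 + 1 / d), 3))
```

Under steady growth the economy lingers longest where the most growth is needed to change the digit, so the time spent on each digit lines up with the Benford shares.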

Regardless, the pattern is a useful test of the plausibility of data. In the early 1970s, Hal Varian, now chief economist at Google, argued that if economic data satisfied Benford’s Law on the way into an economic model but not on the way out, it was worth taking a second look at the model itself.

And Mark Nigrini, an accountancy professor, found fame in the 1990s by using Benford’s Law to discover accounting scams, frauds and tax dodges, such as inventing invoices that were just under some threshold for managerial approval.
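A first-digit check of this sort can be sketched in a few lines of Python. This is not a reconstruction of Nigrini’s actual procedure: the invoice amounts are synthetic, the 5,000 approval threshold is hypothetical, and the chi-square statistic compared with a 5 per cent critical value of 15.507 (eight degrees of freedom) is just one common way of measuring the fit.

```python
import math

def first_digit(x):
    """Return the leading decimal digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_chi_square(values):
    """Chi-square statistic of observed first digits against Benford's Law."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[first_digit(v)] += 1
    n = len(values)
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic

# 5 per cent critical value of the chi-square distribution, 8 degrees of freedom.
CRITICAL_5_PERCENT = 15.507

# Synthetic amounts spread evenly on a log scale from 100 to 100,000.
honest = [100 * 10 ** (3 * n / 1000) for n in range(1000)]
# Hypothetical padding: 150 invented invoices just under a 5,000 threshold.
padded = honest + [4900 + (7 * n) % 100 for n in range(150)]

for name, amounts in (("honest", honest), ("padded", padded)):
    statistic = benford_chi_square(amounts)
    print(name, round(statistic, 1), "suspicious" if statistic > CRITICAL_5_PERCENT else "plausible")
```

The padded set piles up amounts beginning with a four, which pushes the statistic far above the critical value. A high score is only a prompt for a second look, not proof of fraud.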

John Nye and Charles Moul, two economists at Washington University in St Louis, have now checked some basic macroeconomic statistics using Benford’s Law. They find that OECD statistics fit the law quite well, suggesting that GDP data should follow Benford. But African GDP data do not fit. It is not possible to say whether the anomaly is due to fraud or underfunded statistical offices. But it is a reminder that some data should come with a health warning.
