Since I enjoyed writing about the St. Petersburg Paradox, here’s an interesting mathematical law for you.
I noticed a weird distribution looking at the market indexes the other day. Today, I see the S&P trading at 1099, Dow at 10273, and NASDAQ at 2266. Strange. Two indexes begin with the number 1 and the third with a 2. Looking at the other indexes, I see many more beginning with a 1 (1860, 1865, and 11566, for example). There are indexes that start with most of the other digits, but none starting with 5, 8, or 9. Tallying the first digits:
- 29% of the indexes begin with a 1
- 12% with a 2
- 18% with a 3
- 12% with a 4
- 0% with a 5
- 24% with a 6 (3 of the 4 are Russell indexes, which I bet is not a coincidence)
- 6% with a 7
- 0% with an 8
- 0% with a 9
Though not a perfect distribution, you clearly see a lot more of the first three digits (59%) than the last three (6%). Why is this?
It’s Benford’s law at work!
Benford’s law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 almost one third of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than one time in twenty. This distribution of first digits arises whenever a set of values has logarithms that are distributed uniformly, as is approximately the case with many measurements of real-world values.
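To see the "uniformly distributed logarithms" condition in action, here is a small simulation sketch. It assumes we can stand in for real-world data with random values whose base-10 logarithm is uniform over a few orders of magnitude; the function names, the sample size, and the six-decade range are my own illustrative choices, not anything from the definition above.

```python
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """Return the first significant digit of a positive number."""
    # Scientific notation puts the first significant digit first, e.g. '3.140000e+02'.
    return int(f"{x:e}"[0])

def simulate_log_uniform(n: int = 100_000, decades: int = 6, seed: int = 0) -> dict:
    """Draw n values whose log10 is uniform on [0, decades) and return first-digit shares."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same tally
    counts = Counter(leading_digit(10 ** rng.uniform(0, decades)) for _ in range(n))
    return {d: counts[d] / n for d in range(1, 10)}

if __name__ == "__main__":
    for digit, share in simulate_log_uniform().items():
        print(f"{digit}: {share:.1%}")
```

With a sample this large, the share of 1s should land close to 30% and the share of 9s below 5%, in line with the law as stated.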
This counter-intuitive result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). The result holds regardless of the base in which the numbers are expressed (except for trivial bases), although the exact proportions change.
In base 10, the expected proportions for each leading digit are:
- 1: 30.1%
- 2: 17.6%
- 3: 12.5%
- 4: 9.7%
- 5: 7.9%
- 6: 6.7%
- 7: 5.8%
- 8: 5.1%
- 9: 4.6%
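These percentages come from the standard closed form of the law: in base b, the leading digit d is expected with probability log_b(1 + 1/d). Here is a minimal sketch that reproduces the base-10 figures and lets you try other bases; `benford_expected` is just a name I picked for illustration.

```python
import math

def benford_expected(base: int = 10) -> dict:
    """Expected share of each leading digit d in the given base: log_base(1 + 1/d)."""
    if base < 2:
        raise ValueError("base must be at least 2")
    # In base 2 the only possible leading digit is 1, so the result is trivially {1: 1.0}.
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}

if __name__ == "__main__":
    for digit, share in benford_expected(10).items():
        print(f"{digit}: {share:.1%}")
```

Calling `benford_expected(16)` shows how the proportions shift in hexadecimal: the digit 1 drops to 25%, but the falling pattern is the same.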
For example, if a quantity increases continuously and doubles every year, then it will be twice its original value after one year, four times its original value after two years, eight times its original value after three years, and so on. When this quantity reaches a value of 100, the value will have a leading digit of 1 for a year, reaching 200 at the end of the year. Over the course of the next year, the value increases from 200 to 400; it will have a leading digit of 2 for a little over seven months, and 3 for the remaining five months. In the third year, the leading digit will pass through 4, 5, 6, and 7, spending less and less time with each succeeding digit, reaching 800 at the end of the year. Early in the fourth year, the leading digit will pass through 8 and 9. The leading digit returns to 1 when the value reaches 1000, and the process starts again, taking a year to double from 1000 to 2000. From this example, it can be seen that if the value is sampled at uniformly distributed random times throughout those years, it is more likely to be measured when the leading digit is 1, and successively less likely to be measured with higher leading digits.
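The timings in the doubling example can be checked directly. A quantity that doubles every year takes log2((d+1)/d) years to grow from d×100 to (d+1)×100, so the sketch below converts that to months and to a share of one full tenfold cycle. The function name and the 12-months-per-year simplification are my assumptions.

```python
import math

MONTHS_PER_YEAR = 12

def months_with_leading_digit(d: int, doubling_time_years: float = 1.0) -> float:
    """Months a continuously doubling quantity spends with leading digit d in one cycle."""
    years = doubling_time_years * math.log2((d + 1) / d)
    return years * MONTHS_PER_YEAR

if __name__ == "__main__":
    # One full cycle is a factor of 10, e.g. from 100 up to 1000.
    cycle = sum(months_with_leading_digit(d) for d in range(1, 10))
    for d in range(1, 10):
        months = months_with_leading_digit(d)
        print(f"digit {d}: {months:5.2f} months, {months / cycle:.1%} of the cycle")
```

The digit 1 takes exactly 12 months out of a cycle of about 39.9 months, which is where the 30.1% comes from; the digits 2 and 3 take roughly 7.0 and 5.0 months, matching the description above.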
One of the more interesting applications of Benford’s law is fraud detection:
Based on the plausible assumption that people who make up figures tend to distribute their digits fairly uniformly, a simple comparison of first-digit frequency distribution from the data with the expected distribution according to Benford’s law ought to show up any anomalous results. Following this idea, Mark Nigrini showed that Benford’s law could be used as an indicator of accounting and expenses fraud. In the United States, evidence based on Benford’s law is legally admissible in criminal cases at the federal, state, and local levels.
Benford’s law has been invoked as evidence of fraud in the 2009 Iranian elections.
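The first-digit comparison behind this kind of screening can be sketched in a few lines. This is only an illustration, not Nigrini's actual procedure: I am assuming a plain Pearson chi-square statistic as the measure of mismatch, and the two sample data sets are made up for the demonstration.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (nine digit categories minus one).
CRITICAL_5_PERCENT = 15.507

def first_digit(value: float) -> int:
    """First significant digit of a nonzero number (sign ignored)."""
    return int(f"{abs(value):e}"[0])

def benford_chi_square(values) -> float:
    """Pearson chi-square statistic of observed first digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    observed = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (observed[d] - expected) ** 2 / expected
    return stat

if __name__ == "__main__":
    growth = [100 * 1.05 ** k for k in range(500)]  # made-up steady 5% growth series
    evenly_spread = list(range(100, 1000))          # made-up figures with uniform first digits
    for name, data in [("growth", growth), ("evenly spread", evenly_spread)]:
        stat = benford_chi_square(data)
        verdict = "looks unusual" if stat > CRITICAL_5_PERCENT else "consistent with Benford"
        print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A large statistic only says the digits depart from Benford's distribution; plenty of honest data sets (prices capped in a narrow range, for instance) do that too, so it is a flag for a closer look rather than proof of anything.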
I am always looking out for cool mathematical laws and puzzles. If you know about any, forward them to me or leave a comment.