Digit analysis using Benford's law
1. Digit analysis using Benford's Law
Welcome to the last chapter, and congratulations on making it this far! In this lesson we'll introduce Benford's law, which describes a surprising and fascinating fact about the distribution of the first digits of numbers.
2. Introduction
Open a newspaper to a random page and circle all the numbers. How many of them start with the digit 1? With the digit 2? With the digit 9?
3. Introduction
If all nine possible first digits, 1 through 9, were equally likely, we would expect each of them to appear as the first digit in about 1 out of 9 cases, or roughly 11%.
4. Introduction
Benford's law, however, predicts a different distribution for the first digit of a number. According to this law, the probability that the first digit equals 1 is about 30%, while it is only about 4.6% for the digit 9.
5. Newcomb and Benford
This surprising phenomenon was first discovered by Newcomb in 1881 and rediscovered by Benford in 1938. Both noticed that in a book of logarithm tables the first pages, covering numbers with low first digits, were dirtier, and therefore more heavily used, than the last pages covering first digits 7, 8 and 9. In those days logarithm tables were frequently used to speed up the multiplication of two numbers. Benford analyzed the distribution of the first digit in 20 different tables containing information about populations, molecular weights, mathematical sequences and death rates. He showed that in these datasets proportionally more numbers start with 1 than with 2, more with 2 than with 3, and so on. Moreover, averaged over the datasets, the first digits follow a specific distribution.
6. Benford's law for the first digit
A dataset satisfies Benford's law if the first digit d_1 appears with probability log10(1 + 1/d_1), that is, the base-10 logarithm of 1 plus 1 divided by d_1, for d_1 = 1, 2, ..., 9. This law for the first digit can be extended to the second digit, the third digit, and so on up to the last digit, and even to combinations of digits. We will use the law for the first two digits in the next lesson. A generalization to another base is also possible (for example, base 16 instead of base 10). Benford's law is also scale invariant, meaning that it does not depend on the unit of measurement: if the law is observed in a dataset expressed in euros, it should still hold after converting the data to dollars.
7. Benford's law for the first digit
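The first-digit probability stated above, log10(1 + 1/d_1), can be sketched in R as follows. This is a minimal illustration rather than the course's own code: the function name `benford_first_digit` and the variables `digits` and `expected` are assumptions made here for the example.

```r
# Hypothetical helper (the name is an assumption): expected probability
# that the first digit equals d under Benford's law, for d in 1..9.
benford_first_digit <- function(d) {
  stopifnot(all(d %in% 1:9))
  log10(1 + 1 / d)
}

digits <- 1:9
expected <- benford_first_digit(digits)
names(expected) <- digits

# Expected frequencies, rounded for display
print(round(expected, 3))

# The nine probabilities form a proper distribution (they sum to 1)
print(sum(expected))

# Bar plot of the expected first-digit frequencies
barplot(expected,
        xlab = "First digit",
        ylab = "Expected frequency",
        main = "Benford's law for the first digit")
```

Because the function is vectorized, a single call returns all nine expected frequencies at once, and these can be passed straight to `barplot`.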
We implement Benford's law for the first digit as a function, so that we can calculate the expected frequency of each digit. For example, the digit 1 gives approximately 30%. We then plot the expected frequencies for the digits 1, 2, ..., 9 in a bar plot.
8. Generating Fibonacci numbers and powers of 2
We generate two well-known mathematical sequences, the Fibonacci numbers and the powers of 2, and test whether they conform to Benford's law.
9. Function `benford` from package `benford.analysis`
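The two sequences mentioned above could be generated and checked along the following lines. This is a sketch under stated assumptions: the sequence length `n = 1000`, the variable names, and the base-R helper `first_digit` are choices made for this example and do not come from the lesson. Only the `benford()` function and its `number.of.digits` argument are taken from the `benford.analysis` package, and that part is skipped if the package is not installed.

```r
n <- 1000  # assumed sequence length, chosen for illustration

# Fibonacci numbers: each term is the sum of the previous two
fibnum <- numeric(n)
fibnum[1:2] <- 1
for (i in 3:n) {
  fibnum[i] <- fibnum[i - 1] + fibnum[i - 2]
}

# Powers of 2: 2^1, 2^2, ..., 2^n
pow2 <- 2^(1:n)

# Hypothetical helper: leading digit read off the scientific notation
first_digit <- function(x) as.integer(substr(sprintf("%.10e", x), 1, 1))

# Observed first-digit proportions versus Benford's expected frequencies
expected <- log10(1 + 1 / (1:9))
obs_fib  <- as.numeric(table(factor(first_digit(fibnum), levels = 1:9))) / n
obs_pow2 <- as.numeric(table(factor(first_digit(pow2), levels = 1:9))) / n
print(round(cbind(digit = 1:9, expected, obs_fib, obs_pow2), 3))

# The same check with benford.analysis, if it is available
if (requireNamespace("benford.analysis", quietly = TRUE)) {
  bfd_fib  <- benford.analysis::benford(fibnum, number.of.digits = 1)
  bfd_pow2 <- benford.analysis::benford(pow2, number.of.digits = 1)
  plot(bfd_fib)
  plot(bfd_pow2)
}
```

The base-R comparison is included so that the example still shows something useful without the package. The terms are stored as double-precision numbers, so their low-order digits are not exact for large terms, which should not matter when only the leading digit is examined.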
To validate the Fibonacci numbers and the powers of 2 against Benford's law, we will use the `benford.analysis` package. We set `number.of.digits = 1` because we want to check Benford's law for the first digit. The figures shown are the first panels produced by the `plot` function. The blue histogram is based on the data and the red line shows the expected frequencies under Benford's law. We see that the fit is very close for both datasets.
10. Let's practice!
Now let's try some examples.