this post was submitted on 15 Aug 2025

Math history

What is Benford’s Law?

Benford’s law describes the relative frequency distribution of leading digits in many datasets: leading digits with smaller values occur more frequently than those with larger values. The law states that approximately 30% of numbers start with a 1, while fewer than 5% start with a 9. In other words, leading 1s appear about 6.5 times as often as leading 9s! Benford’s law is also known as the First Digit Law.

If the leading digits 1 through 9 were equally likely, each would occur 11.1% of the time. However, many datasets do not behave this way. The graph displays the distribution of leading digits according to Benford’s law.

Analysis of datasets shows that many follow Benford’s law. For example, analysts have found that stock prices, population numbers, death rates, sports statistics, TikTok likes, financial and tax information, and billing amounts often have leading digits that follow this distribution. Below is a table that Benford produced for his 1938 study, which shows the different types of data he evaluated.

While Benford popularized the law in 1938, he didn’t actually discover it. Simon Newcomb first found the distribution in 1881. Hence, some analysts refer to it as the Newcomb-Benford Law.

In this post, learn about Benford’s law, its formula, how it works, and the types of datasets it applies to. Additionally, I’ll work through an example where I assess how well Benford’s law applies to a real dataset. And you’ll learn how to use Excel to assess it yourself!

Uses for Benford’s Law

Benford’s law is an intriguing, counterintuitive distribution, but can you use it for practical purposes?

Analysts have used it extensively to look for fraud and manipulation in financial records, tax returns, applications, and decision-making documents. They compare the distribution of leading digits in these datasets to Benford’s law. When the leading digits don’t follow the distribution, it’s a red flag for fraud in some datasets.
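As an illustrative sketch of such a comparison (Python, standard library only; the function names are mine, not from any auditing tool), a basic check tallies leading digits and measures their discrepancy from Benford's expected distribution with a chi-square statistic:

```python
import math
from collections import Counter

# Expected Benford probabilities for leading digits 1-9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Return the leading non-zero digit of a nonzero number."""
    s = f"{abs(x):.10e}"          # scientific notation, e.g. '9.9000000000e+04'
    return int(s[0])

def benford_chi_square(values):
    """Pearson chi-square statistic of observed first digits vs. Benford."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )
```

A statistic well above the chi-square critical value for 8 degrees of freedom (about 15.5 at the 5% level) would flag the dataset for closer inspection; as noted below, that is a red flag, not proof of fraud.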

The idea behind why this works is straightforward. When people fabricate numbers, they don’t track the frequencies of their fake leading digits, producing an unnatural distribution. In some cases, they might systematically keep values just below a particular threshold. For example, if there is a $100,000 limit on a transaction type, fraudsters might record many amounts such as $99,000, producing an excess of leading 9s.

Using Benford’s law to find fraud is admissible in local, state, and federal US courts. In the past, it has detected irregularities in Greece’s EU application and investment return data for Ponzi schemes, such as Bernie Madoff’s.

However, there are several important caveats.

When a dataset you expect should follow Benford’s curve does not, it’s only a red flag, not proof of fraud. You’ll still need to send in the auditors and investigators, but at least you can target them more effectively on questionable records.

Furthermore, not all data follow Benford’s law naturally. In those cases, leading digits that follow a different distribution aren’t signs of fraud. Consequently, it’s crucial to know which datasets are appropriate to compare to it—which takes us to the next section.

Benford’s Law Formula

Benford’s law formula is the following:

P(d) = log10(1 + 1/d)

where d is the value of the leading digit, from 1 to 9, and P(d) is the probability that a number’s leading digit equals d.
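A minimal sketch of the formula in Python (standard library only):

```python
import math

def benford_probability(d: int) -> float:
    """Probability that the leading digit equals d, per Benford's law."""
    return math.log10(1 + 1 / d)

# The nine probabilities telescope to exactly 1:
# log10(2/1) + log10(3/2) + ... + log10(10/9) = log10(10) = 1
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

Running this reproduces the familiar figures: about 30.1% for a leading 1 down to about 4.6% for a leading 9.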

https://statisticsbyjim.com/probability/benfords-law/

Note. NBL = Newcomb-Benford law.

Many real-world datasets approximately conform to NBL (Hill, 1998; Nigrini, 2012). For instance, the distances between Earth and known stars (Alexopoulos & Leontsinis, 2014) or exoplanets (Aron, 2013), crime statistics (Hickman & Rice, 2010), the number of daily-recorded religious activities (Mir, 2014), earthquake depths (Sambridge, Tkalcic, & Arroucau, 2011), interventional radiology Dose-Area Product data (Cournane, Sheehy, & Cooke, 2014), financial variables (Clippe & Ausloos, 2012), and internet traffic data (Arshadi & Jahangir, 2014) were found to conform to NBL. In psychology, NBL was found relevant in the study of gambling behaviors (Chou, Kong, Teo, Wang, & Zheng, 2009), brain activity recordings (Kreuzer et al., 2014), language (Dehaene & Mehler, 1992; Delahaye & Gauvrit, 2013), and perception (Beeli, Esslen, & Jäncke, 2007).

Although NBL is ubiquitous, not all random variables or datasets conform to it. Scott and Fasli (2001) studied 230 sets of data and found that fewer than 13% of them conformed precisely to NBL. Diekmann and Jann (2010), Bonache, Maurice, and Moris (2010), and Lolbert (2008) have also warned against overconfidence in NBL. NBL is not, they argue, a universal law but a property that appears in certain specific (albeit numerous) contexts.

The Sensitivity and Specificity of Benford Analysis

Human pseudorandom productions are in many ways different from true randomness (Nickerson, 2002). For instance, participants’ productions show an excess of alternations (Vandierendonck, 2000) or are overly uniform (Falk & Konold, 1997). As a consequence, fabricated data might fit NBL to a lesser extent than genuine data (Banks & Hill, 1974). Haferkorn (2013) compared algorithm-based and human-based trade orders and concluded that algorithm-based orders approximated NBL better than human-based orders. Hales, Chakravorty, and Sridharan (2009) showed that NBL is efficient in detecting fraudulent data in an industrial supply-chain context.

These results support the so-called Benford analysis, which uses a measure of discrepancy from NBL to detect fraudulent or erroneous data (Bolton & Hand, 2002; Kumar, 2013; Nigrini, 2012). It has been used to audit industrial and financial data (Rauch, Göttsche, Brähler, & Kronfeld, 2014; Rauch, Göttsche, & El Mouaaouy, 2013), to gauge the scientific publication process (de Vries & Murk, 2013), to separate natural from computer-generated images (Tong, Yang, & Xie, 2013), or to detect hidden messages in images’ .jpeg files (Andriotis, Oikonomou, & Tryfonas, 2013).

As a rule, Benford analysis focuses on the distribution of the first digit and compares it to the normative logarithmic distribution. However, a more conservative version of Benford’s law states that the numerical values of a variable X should satisfy the following property: Frac(Log(X)) should follow a uniform distribution on [0, 1). Here, Frac(x) stands for x − Floor(x), where Floor(x) is the largest integer less than or equal to x. The logarithmic distribution of the first digit is a mathematical consequence of this version (Raimi, 1976).
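A sketch of this conservative check (Python, standard library; the function names are my own): the fractional parts of log10(x) should look uniform on [0, 1), and the first digit can be read back off the mantissa, which is what links the two versions:

```python
import math

def log_mantissa(x: float) -> float:
    """Frac(log10(x)): uniform on [0, 1) under the conservative version of NBL."""
    return math.log10(x) % 1.0

def first_digit_from_mantissa(x: float) -> int:
    """The first digit equals Floor(10 ** Frac(log10(x)))."""
    return int(10 ** log_mantissa(x))
```

For example, `first_digit_from_mantissa(3456)` gives 3, since Frac(log10(3456)) ≈ 0.539 and 10^0.539 ≈ 3.456.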

Hsü (1948), Kubovy (1977), and Hill (1988) provided direct experimental evidence that human-produced data conform poorly to NBL. However, in their experiments, participants were instructed to produce numbers with a given number (four or six) of digits. Specifying such a constraint could well induce participants to attempt to generate uniformly chosen numbers or to use a digit-by-digit strategy (repeatedly picking a random digit). Researchers who study the situations in which NBL appears often conclude that one important empirical condition is that the numerical data cover several orders of magnitude (e.g., Fewster, 2009; for a more detailed mathematical account, see Valadier, 2012). Consequently, any set of numbers that are bound to lie in the thousands scale (four digits) or in the hundreds of thousands scale (six digits) will probably not conform to NBL, whether produced by humans or not. Participants’ failure to produce data conforming to NBL could just be a consequence of the instructions they were given.
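The order-of-magnitude condition is easy to check directly. As a sketch (Python, standard library only), uniform four-digit integers span a single order of magnitude, so every leading digit gets probability exactly 1/9 ≈ 11.1%, far from Benford's 30.1% for a leading 1:

```python
import math
from collections import Counter

# Every four-digit integer 1000-9999, i.e. a uniform draw over one order of magnitude.
counts = Counter(str(n)[0] for n in range(1000, 10000))

uniform_share = counts["1"] / 9000   # exactly 1/9, about 11.1%
benford_share = math.log10(2)        # about 30.1% under NBL
```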

Furthermore, these studies were decontextualized. Participants were asked to give either the “first number that came to mind” or a number chosen “at random”, without being told what the numbers were supposed to represent. A “random number with four digits” usually implicitly refers to the uniform distribution (Gauvrit & Morsanyi, 2014)—and the uniform distribution does not conform to NBL. Therefore, the lack of context could prime a non-Benford response, even if participants are, in fact, able to produce series conforming to NBL.

For these reasons, the idea that fabricated numerical data will usually not follow NBL has been questioned. Using a more contextualized design, Diekmann (2007) asked social science students to create plausible regression coefficients, with four-digit precision. Note that, contrary to the case of a four-digit integer, which is bound to fall between 1,000 and 9,999, covering only one order of magnitude, here, the coefficients could run between .0001 and 1, covering four orders of magnitude. Diekmann found that, in this case, the first digit does approximately conform to NBL and concluded that researchers should not only consider the first digit as relevant to detecting fraud but should also look beyond the first digit, toward the conservative version of NBL. Using correlation coefficients makes the task meaningful, and this may explain why Diekmann’s participants are not bad at producing plausible rs, whereas other samples suggested that human participants would be unable to mimic NBL.

Another study went even further in formulating meaningful tasks by using the type of data known to exhibit a Benford distribution. Burns (2009) asked participants to guess real-world values, such as the US gross national debt or the peak summer electricity consumption in Melbourne. He found that although participants’ first digit responses did not perfectly follow a logarithmic law, they conformed to the logarithmic distribution better than to the uniform distribution. Burns concluded that participants are not too bad at producing a distribution that conforms to NBL as soon as the task involves the type of real-world data that do follow NBL.

One limitation of Burns’ (2009) study is that it only works at a population level. We cannot know from his data whether a particular individual would succeed in producing a pseudorandom series conforming to NBL, since each participant produced a single value. Nevertheless, his and Diekmann’s studies certainly suggest that using Benford’s law to detect fraud is questionable in general, since humans may be able to produce data conforming to NBL, in which case a Benford test will yield many undetected frauds, lacking sensitivity. As mentioned above, not all random variables or real-world datasets conform to NBL (and when they do, it is generally only in an approximate manner). Because many real-world datasets do not conform to NBL, a Benford test used to detect fraud not only may have low sensitivity but may also have low specificity.

Generalized Benford’s law

Several researchers (e.g., Leemis, Schmeiser, & Evans, 2000) have studied conditions under which a distribution seems more likely to satisfy NBL. Fewster (2009) provided an intuitive explanation of why and when the law applies and concluded that any dataset that smoothly spans several orders of magnitude tends to conform to NBL. Data limited to one or two orders of magnitude would generally not conform to the law.

To pursue further the question of why so many datasets conform to NBL, the conservative version of NBL may be a better starting point than the mere first-digit analysis. Recall that in the conservative version, a random variable X conforms to NBL if Frac(Log(X)) ~ U([0, 1)). In an attempt to show that the roots of NBL’s ubiquity should not be sought in the specific properties of the logarithm, Gauvrit and Delahaye (2008, 2009) defined a generalized Benford’s law (GBL) associated with any function f as follows: a random variable X conforms to the GBL associated with function f if Frac(f(X)) ~ U([0, 1)). The classical NBL thus appears as a special case of GBL, associated with the function Log.
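Under this definition, a GBL check simply replaces log10 with an arbitrary f. A sketch (Python, standard library; the Kolmogorov-Smirnov helper is my own addition, not from the paper):

```python
import math

def gbl_fractional_parts(values, f):
    """Frac(f(x)) for each x; the GBL associated with f holds when these are ~ U([0, 1))."""
    return [f(x) % 1.0 for x in values]

def ks_distance_to_uniform(fracs):
    """Kolmogorov-Smirnov distance between the empirical CDF and U([0, 1))."""
    xs = sorted(fracs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

# Classical NBL is the special case f = log10:
# ks_distance_to_uniform(gbl_fractional_parts(data, math.log10))
```

Other candidate functions plug in the same way, e.g. `lambda x: math.pi * x ** 2` or `math.sqrt`; a small distance to uniform indicates conformity to the GBL for that f.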

Testing several mathematical and real-world datasets, Gauvrit and Delahaye (2011) found that several of them fit GBL better than NBL. Of the 12 datasets they studied, six conformed to the classical NBL, while 10 conformed to the GBL with the function f(x) = π × x², and nine with the square-root function. On the other hand, none conformed to the GBL with the function Log ∘ Log. These findings suggest that a GBL associated with the relevant function—depending on the context—might yield more specific or sensitive tests for detecting fraudulent or erroneous data.

We addressed this question in two studies. In both studies, each participant produced a whole series of values, allowing the resulting distributions to be analyzed at the individual level. In Study 1, we examined three versions of GBL in four different situations in order to compare the sensitivity and specificity of different types of GBL analyses. Study 2 explored the potential effects of varying familiarity with the material and cognitive effort on participants’ productions.

Concluding Discussion

We performed the first investigation of generalized Benford analysis, an equivalent of the classical Benford analysis based on the broader GBL. Results from Study 1 provided mild support for generalized Benford analysis, including the classical Benford analysis. They also drew attention to the fact that different types of data yielded different outcomes, suggesting that the best way to detect fraud using a GBL associated with some function f would be either to find the function f that best matches the particular data at hand or to combine different analyses. Although the classical Benford analysis was validated in our studies, it occasionally failed to detect human-produced data as efficiently as other generalized Benford analyses.

The present positive results could have stemmed from the characteristics of our sample: participants, unlike real swindlers, might have put little effort into the task since the stakes were low, and they were not highly familiar with the material at hand. To rule out the possibility that our results depended on such features, and that GBL would therefore be inapplicable in real situations, Study 2 aimed to demonstrate that cognitive effort and familiarity with the material have little effect on participants’ responses. The data supported this view, although further studies (including higher levels of cognitive pressure and true experts) would be recommended.

With Benford analysis having become more common in fraud detection, new complementary analyses are needed (Nigrini & Miller, 2009). GBL analysis potentially provides a whole set of such fraud-detection methods, making it more difficult even for informed swindlers who intentionally conform to NBL to remain undetected.

https://pmc.ncbi.nlm.nih.gov/articles/PMC5504535/
