pydit.statistics.benford
Module to compute the Benford’s Law frequencies for a column in a dataframe.
This is a common audit test to find indications (not conclusive) of fraud or errors in a population. Benford’s Law gives the expected distribution of the “first n digits” of a magnitude.
It applies to natural magnitudes (please do your research before applying it), typically quantities that span several orders of magnitude, such as the lengths of rivers or the populations of towns; narrowly ranged measures such as the heights of people do not follow it. Because it posits that low leading digits are more common, it tends to highlight fabricated transactions: people inventing amounts tend to spread the leading digits evenly across low and high values, whereas under Benford’s Law a transaction starting with 8 or 9 is disproportionately less likely to occur.
Also, where there is an artificial limit (for example, approvals are needed over a certain amount), there is a tendency to see an excess of transactions just under that limit (e.g. $4,980 rather than $4,000 for a limit of $5,000), which inflates the frequency of the corresponding leading digits.
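The expected distribution described above follows the standard Benford formula, where a leading digit sequence d (for example 1, 9 or 49) has probability log10(1 + 1/d). The following is a minimal sketch of that formula using only the standard library; the names `benford_probability` and `expected_distribution` are hypothetical helpers chosen for illustration and are not taken from this module.

```python
import math


def benford_probability(leading):
    """Expected Benford frequency of a leading digit sequence (e.g. 1, 9 or 49).

    Hypothetical helper for illustration, not the module's own function.
    """
    if leading < 1:
        raise ValueError("leading digits must be a positive integer")
    return math.log10(1 + 1 / leading)


def expected_distribution(n_digits=1):
    """Map every possible sequence of the first n digits to its expected frequency."""
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    return {d: benford_probability(d) for d in range(start, stop)}


first_digit = expected_distribution(1)
print(round(first_digit[1], 3))  # 0.301
print(round(first_digit[9], 3))  # 0.046
```

A leading 1 is therefore expected in roughly 30% of values and a leading 9 in under 5%, which is why an even spread of leading digits stands out.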
Functions
Returns the Benford's Law frequencies expected and actual for a column of values.
Returns the Mean Absolute Deviation (MAD) of the Benford's Law frequencies.
Returns the Benford's Law probability for the first n digits provided.
Returns a summary with the expected and actual Benford's Law frequency.
Plots the histogram with Benford's Law expected and the actual frequencies.
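To illustrate how actual frequencies and the Mean Absolute Deviation relate, here is a rough standard-library sketch. It does not use this module's functions (their names and signatures are not shown here); `first_digits` and `benford_mad` are hypothetical helpers, and the MAD is assumed to be the average absolute gap between actual and expected frequencies across all possible leading digit sequences.

```python
import math
from collections import Counter


def first_digits(value, n_digits=1):
    """Return the first n significant digits of a number, or None for zero."""
    if value == 0:
        return None
    # Scientific notation exposes the significant digits regardless of scale.
    mantissa = f"{abs(value):.15e}".split("e")[0].replace(".", "")
    return int(mantissa[:n_digits])


def benford_mad(values, n_digits=1):
    """Mean absolute deviation between actual and Benford expected frequencies."""
    leading = [d for d in (first_digits(v, n_digits) for v in values) if d is not None]
    if not leading:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(leading)
    total = len(leading)
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    deviations = [
        abs(counts.get(d, 0) / total - math.log10(1 + 1 / d))
        for d in range(start, stop)
    ]
    return sum(deviations) / len(deviations)


# Powers of two track Benford's Law closely; evenly spread values do not.
powers = [2 ** k for k in range(1, 101)]
evenly_spread = list(range(100, 1000))
print(benford_mad(powers) < benford_mad(evenly_spread))  # True
```

A lower MAD indicates closer agreement with Benford's Law; as noted above, a high value is only an indication and not conclusive evidence of fraud or error.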