Benford's Law is one of the most intriguing properties in data science. It states that the first digits in many naturally occurring sets of numbers are not evenly distributed. Instead, 30.1% of the numbers begin with a one, 17.6% with a two, 12.5% with a three, and so on. I created an SPSS Extension that makes graphing Benford's Law easy, and since then I have been testing it on new datasets, most recently the price and volume of a few cryptocurrencies.
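Those percentages come from the standard Benford formula, where the expected share of leading digit d is log10(1 + 1/d). Here is a minimal Python sketch of that calculation (the function name `benford_expected` is just an illustrative choice, and this is not the SPSS Extension's code):

```python
import math

def benford_expected():
    """Return Benford's expected share for each leading digit 1-9."""
    # P(d) = log10(1 + 1/d), the standard statement of Benford's Law.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_expected()
for digit, share in expected.items():
    # Print each share as a percentage with one decimal place.
    print(f"{digit}: {share:.1%}")
```

The nine shares sum to one, and they fall off steadily from digit one to digit nine, which is the curve each chart is compared against.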
[caption id="attachment_7350" align="alignnone" width="2668"]
Benford's Law prediction (green line) vs. Bitcoin Daily High Prices isn't a great match[/caption]
First, I compared Benford's Law to the Bitcoin daily high price. The daily high prices do not closely match Benford's Law. That isn't too surprising, since many parties have attempted to apply Benford's Law to stock prices without much success.
Next I graphed Bitcoin Volume.
[caption id="attachment_7354" align="alignnone" width="2668"]
Bitcoin USD-Volume on the Kraken exchange and Benford's Law match amazingly well[/caption]
For this dataset, I used the Bitcoin volume on the Kraken Exchange. It's remarkable how well Bitcoin's volume matches the values predicted by Benford's Law. I tried this with volume measured in Bitcoin and again in USD, and in both cases Benford's Law holds very well.
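For anyone who wants to reproduce this kind of comparison outside SPSS, the observed side of the chart is just a tally of leading digits. The sketch below shows one way to do that in Python. The volume numbers are made up for illustration (they are not the Kraken data), and the helper names `first_digit` and `first_digit_shares` are hypothetical.

```python
from collections import Counter

def first_digit(value):
    """Return the leading non-zero digit of a positive number, else None."""
    if value <= 0:
        return None
    # Scientific notation puts the leading significant digit first,
    # so 0.0421 becomes "4.21...e-02" and 1523.7 becomes "1.5237...e+03".
    return int(f"{value:.16e}"[0])

def first_digit_shares(values):
    """Return the observed share of each leading digit 1-9."""
    digits = [d for d in map(first_digit, values) if d is not None]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Made-up daily volumes, purely for illustration.
volumes = [1523.7, 0.0421, 18.9, 240.0, 1999.0, 3.2]
shares = first_digit_shares(volumes)
print(shares)
```

Because only the leading significant digit matters, the same tally works whether volume is recorded in Bitcoin or in USD. A real comparison needs far more than six values, of course.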
Next I tried some altcoins on a completely different exchange, examining their volume on the Poloniex exchange.
[caption id="attachment_7395" align="alignnone" width="3558"]
Poloniex quoteVolume for these altcoins matches the Benford's Law ideal very well[/caption]
It's amazing how well Benford's Law holds up across multiple exchanges and multiple cryptos. Of all these cryptos, only Bitcoin Cash (BCH) has a noticeable difference. As a follow-up, it'd be interesting to see if BCH shows similar behavior across other exchanges.
Because the match between crypto volume and Benford's Law is so strong, it could be used to detect fraudulent cryptos or to verify exchange data. Provided you can obtain the volume data, it would serve as a great sanity check that the volume figures have not been tampered with or entirely falsified.
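One rough way to turn that sanity check into a number is a chi-square goodness-of-fit statistic against the Benford shares. The sketch below is an assumption about how such a check might look, not something taken from the analysis above. Both datasets are synthetic, the function names are hypothetical, and 15.507 is the standard chi-square critical value for 8 degrees of freedom at the 5% level.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_DF8_P05 = 15.507

def first_digit(value):
    """Return the leading significant digit of a positive number."""
    return int(f"{value:.16e}"[0])

def benford_chi_square(values):
    """Chi-square statistic of observed leading digits vs. Benford's Law."""
    digits = [first_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    n = len(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # expected count for digit d
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Synthetic data, for illustration only.
n = 6000
# Values spread evenly on a log scale over six orders of magnitude.
log_spread = [10 ** (6 * i / n) for i in range(n)]
# Every integer from 1000 to 9999, so each leading digit is equally common.
evenly_spread = list(range(1000, 10000))

for name, data in [("log_spread", log_spread), ("evenly_spread", evenly_spread)]:
    stat = benford_chi_square(data)
    verdict = "consistent with" if stat < CHI2_CRITICAL_DF8_P05 else "deviates from"
    print(f"{name}: chi-square = {stat:.1f}, {verdict} Benford's Law")
```

Treat the threshold as a rough guide only. With very large samples, a chi-square test will flag even tiny deviations, so a failed check is a reason to look closer rather than proof of tampering.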
This entire analysis was completed with SPSS Statistics Subscription.