Recently when taking a look at the early decimal digits of I made the following observations:
- The first run of seven distinct digits (8327950, shown underlined) appears in the 26th to 32nd decimal place. Curiously the third such run (5923078, also underlined) in decimal places 61 to 67 contains the same seven digits. (There is also a run of seven distinct digits in places 51 to 57 with 5820974.)
- Decimal digits 60 to 69 (shown in bold) are distinct (i.e. all digits are represented once in this streak). The same is true for digits 61 to 70 as both digits 60 and 70 are ‘4’.
Assume the digits of are generated independently from a uniform distribution. Firstly, how often would we expect to see a run of 7 distinct digits? Places to are distinct with probability
Hence we expect runs of 7 distinct digits to appear places apart. This includes the possibility of runs such as 12345678 which contain two runs of 7 distinct digits that are only 1 place apart.
How often would we expect to see the same 7 digits appearing in a run as we did in places 26-32 and 61-67? Furthermore let’s assume the two runs have no overlap, so we discount possibilities such as 12345678 which have a six-digit overlap. We expect a given sequence (e.g. 1234567, in that order) to appear of the time. There are permutations of such a sequence, but of these has overlap 6 (2345671), has overlap 5 (3456712 or 3456721), has overlap 4, …, has overlap 1 with the original sequence. This leaves us with possible choices of the next run to have the same 7 digits but non-overlapping (or to appear in precisely the same order – overlap 7). Hence we expect the same 7 digits to recur (no overlap with the original run) after approximately places apart so what we saw in the first 100 places was remarkable.
Now let us turn to runs of all ten distinct digits. Repeating the argument above, such runs occur every places. According to  the next time we see ten distinct digits is in decimal places 5470 to 5479.
To answer the question of when we would expect to see the first occurrence of ten distinct digits, we adopt an argument from renewal-reward theory based on Sec 7.9.2 of  (there also exist approaches based on setting up recurrence relations, or martingale theory (modelling a fair casino), see -). Firstly we let be the first time we get 10 consecutive distinct values – we wish to find where denotes the expected value operator. Note that this will be more than the 2756 answer we obtained above since we make no assumption of starting with a run of ten distinct digits – there is no way could be 1 for example, but we could have two runs of ten distinct digits that are 1 apart.
From a sequence of digits we first define a renewal process in which after we get 10 consecutive distinct values (at time ) we start over and wait for the next run of 10 consecutive distinct values without using any of the values up to time . Such a process will then have an expected length of cycle of .
Next, suppose we earn a reward of $1 every time the last 10 digits of the sequence are distinct (so we would have obtained $1 at each of decimal places 69 and 70 in the example). By an important result in renewal-reward theory, the long run average reward is equal to the expected reward in a cycle divided by the expected length of a cycle.
In a cycle we will obtain
- $1 at the end
- $1 at time 1 in the cycle with probability (if that digit is the same as ten digits before it)
- $1 at time 2 in the cycle with probability (if the last two digits match those ten places before it)
- $1 at time 9 in the cycle with probability (if the last nine digits match those ten places before it)
Hence the expected reward in a cycle is given by
We have already seen that the long run average reward is at each decimal place. Hence the expected length of a cycle (i.e. the expected number of digits before we expect the first run of ten consecutive digits) is given by
Hence it is pretty cool that we see it so early in the decimal digits of . 🙂
 S. Ross, Introduction to Probability Models, Academic Press, 2014.
 A Collection of Dice Problems – madandmoonly.com