What? Of course there's a reason! It's because you only post top runs! That's like saying "there's no reason lottery winners will have hit more winning numbers than anyone else", but they're lottery winners exactly because they did!
The longer each game session goes, the closer the RNG will be to average. This will happen even if you're only posting one-in-a-million runs. One-in-a-million when you pick 10 numbers is extremely biased. One-in-a-million when you pick a thousand numbers is only subtly biased.
Yes, the longer the game is the closer the average will be to the true average. But given a fixed or bounded game length, you can play enough games to get as much deviation from the average as you want.
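Both effects can be seen in a toy simulation. This is only a sketch under loud assumptions: each "RNG call" is an independent uniform value in [0, 1), a "run" is scored by its average, and only the luckiest of 200 runs is reported. None of this models Donkey Kong's actual RNG, and `best_run_deviation` is a made-up helper name.

```python
import random

def best_run_deviation(num_runs, calls_per_run, rng):
    """How far the luckiest run's average sits above the true mean of 0.5."""
    best = max(
        sum(rng.random() for _ in range(calls_per_run)) / calls_per_run
        for _ in range(num_runs)
    )
    return best - 0.5

rng = random.Random(12345)  # fixed seed so the sketch is repeatable

# Same number of runs, same "post only the best one" rule,
# but a very different number of RNG calls per run.
short_dev = best_run_deviation(200, 10, rng)
long_dev = best_run_deviation(200, 1000, rng)

print(f"best of 200 runs, 10 calls each:   {short_dev:+.3f}")
print(f"best of 200 runs, 1000 calls each: {long_dev:+.3f}")
```

Raising the number of runs pushes both numbers up, and raising the calls per run pulls them back toward zero, which is the trade-off being argued about here.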
Again, the numbers need to be run. There's no way to tell without them; the truth could go either way.
> But given a fixed or bounded game length, you can play enough games to get as much deviation from the average as you want.
If "you" is some abstract entity with infinite time, perhaps. Even if you do ten three-hour runs a day, you can't actually get a very high deviation from average in any given century. The number of runs needed to hit a given deviation grows exponentially with the number of RNG calls per run.
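A rough way to put numbers on that is Hoeffding's inequality. The sketch below assumes each RNG call is an independent value in [0, 1] and that a run is judged by its average. Those are assumptions for illustration, not facts about the game, and `log10_min_expected_runs` is a hypothetical helper name.

```python
import math

def log10_min_expected_runs(calls_per_run, deviation):
    """log10 of a lower bound on the expected number of runs before one run's
    average beats the true mean by at least `deviation`.

    Hoeffding's inequality for independent values in [0, 1] gives
    P(run average - mean >= t) <= exp(-2 * n * t**2), so the expected
    number of runs until that happens is at least exp(2 * n * t**2).
    Working in log10 avoids overflow for large n.
    """
    return 2 * calls_per_run * deviation ** 2 / math.log(10)

# Budget taken from the comment above: ten runs a day for a century.
century_budget = 10 * 365 * 100
print(f"runs available in a century: about 10^{math.log10(century_budget):.1f}")

for calls in (100, 1_000, 10_000):
    exponent = log10_min_expected_runs(calls, 0.05)
    print(f"{calls:>6} calls per run: at least 10^{exponent:.1f} runs expected")
```

The bound is loose for small runs, but the exponent is linear in the number of calls, so a deviation that is cheap with a few hundred calls falls out of reach of any century-sized budget once runs have many thousands.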
You're assuming those are random numbers. As the article says, they are only semi-random (that's one of the big surprises they found).
And you can see from the numbers that it's not just Mitchell's values that deviate from the expected mean. The others' do as well (his are just more extreme). That also shows we can't expect them all to converge to the mean; there's some more complex statistical process going on.
His deviations being more extreme might be explained by him playing more games, or it might be cheating. So far we can't tell.
Exactly. If a game has an element of chance, then a player's top scores will tend to be top scores in part because of luck helping there (same reason as why regression toward the mean [1] exists).
In such a game if each player can choose how many times they play then they can gain an advantage simply by playing more. And their top games will look more and more statistically unlikely. That can be especially surprising if they only report those top games, which is the case here.
What would be interesting to compute is how many more games a player would need to play in order to reasonably get the results in the article. If it's 2x, it's not proof of cheating. But if it's 1,000x then maybe it is, because who has that much free time?
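The shape of that calculation can be sketched as follows, assuming every game is independent and has the same chance `p_single` of being at least as lucky as the reported one. The probabilities below are placeholders, not estimates from the article, and `games_needed` is a hypothetical helper.

```python
import math

def games_needed(p_single, target=0.5):
    """Games required for at least a `target` chance that one or more of them
    is as lucky as a result whose per-game probability is `p_single`.

    Solves 1 - (1 - p_single)**n >= target for the smallest integer n.
    """
    return math.ceil(math.log(1 - target) / math.log1p(-p_single))

# Placeholder luck levels: a 1-in-1,000 game versus a 1-in-1,000,000 game.
ordinary = games_needed(1e-3)
extreme = games_needed(1e-6)
print(f"1-in-1,000 game:     about {ordinary} games")
print(f"1-in-1,000,000 game: about {extreme} games")
print(f"ratio: roughly {extreme / ordinary:.0f}x more games")
```

Plugging in real per-game probabilities would need a model of the game's scoring, which is exactly the part nobody in this thread has.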
Yet every run still has many thousands of RNG invocations for that particular metric. For the result to be off by a few percent, it's safe to say the RNG would have to fail any randomness test.
Are there thousands of smashes per game? (I'm not a Donkey Kong player.)
Yes, the RNG is not perfect, as the article says,
> the points derived from those enemy smashes are assigned semi-randomly
(which apparently solves an old mystery there).
In any case, the numbers for total smashes are 380 for Mitchell and 242-371 for the other players. Percent of score from smashes is 17.7 versus 10.6-14.8. In both cases the large variance among the other players shows that while he's an outlier, we can't expect the numbers to be close to a specific average. This isn't a simple statistical process where we average out thousands of independent variables.
That's assuming that the random number generator is truly random, and not a deterministic pseudo-random number generator, which will _necessarily_ produce a sequence of numbers that satisfies certain statistical properties over a long enough run.
It's unlikely, but possible, to roll a 6-sided die 100 times in a row and get 100 6's.
In basically every PRNG algorithm, there is _no_ reachable state which will result in 100 6's in a row.
Are you sure about this? The period of a Mersenne twister is 2^19937 - 1. I'm pretty sure somewhere along that inconceivably large number of states we can find 100 6s in a row.
Expect to get a run of length 1 in a number 10 digits long, 2 in a number 100 digits long (10²), 3 in a number 1000 digits long (10³)... So expect to find a run of 100 6s in a random number about 10^100 (a googol) digits long, I think.
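That rule of thumb can be checked against the exact expectation for an ideal source. The sketch assumes independent, uniformly distributed symbols, so it says how long a truly random sequence has to be on average. It does not settle whether any particular PRNG's state sequence contains such a run. `expected_symbols_until_run` is a hypothetical helper name.

```python
def expected_symbols_until_run(alphabet_size, run_length):
    """Expected number of uniform random symbols drawn until one fixed symbol
    appears `run_length` times in a row.

    Closed form: (b**(n + 1) - b) / (b - 1), which is always an integer,
    so exact integer arithmetic works even for huge values.
    """
    b, n = alphabet_size, run_length
    return (b ** (n + 1) - b) // (b - 1)

# Decimal digits, as in the comment: roughly 10**n, plus lower-order terms.
for n in (1, 2, 3):
    print(n, expected_symbols_until_run(10, n))

# A fair six-sided die and a run of 100 sixes.
die_rolls = expected_symbols_until_run(6, 100)
mt_period = 2 ** 19937 - 1  # Mersenne Twister period quoted above
print(f"rolls needed: a {len(str(die_rolls))}-digit number")
print(f"MT period:    a {len(str(mt_period))}-digit number")
```

For a die the base is 6, not 10, so the expected wait is on the order of 6^100, which is still enormous but far smaller than the quoted period.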