Non-Rigged Elections, Or how I learned to stop worrying and love Kemal

Day 2,422, 00:44. Published in USA by Aren Perry

A recent article by Kemal showed simulations of an election under the independent voting assumption and found that, without any rigging, the end outcome is frequently decided by the first third of the votes. Many people called this "good math" but "bad results" because voting isn't i.i.d. (independent and identically distributed). Some called it an abuse of the Law of Large Numbers. I will not address that in detail here, except to say it was a perfect use of it. Those who thought otherwise were confusing Kemal's plot of the sample proportion with a plot of individual results... I hope.

This all missed the main point: The behavior of the first third predicting the end outcome is NOT evidence of rigging because such behavior is possible and plausible when we do not have any rigging whatsoever.

So, in support of my brother in Statistics (though technically in Economics), I thought I would show two things below:

One: The first third predicting the vote total is not what we would generally expect from vote rigging by, say, MUs or any group of individuals (bot or not) voting together in blocs or systematically in clusters.

Two: Kemal's simulations did not include dependence, and that was not the point (see the main point above), but let's add some dependence and see what happens anyway.

I have included tl/dr tidbits you can look for if you are not interested in the details.

Assumptions: In all cases I assumed that the "true" end vote percentage for one candidate is 55%. The vertical line marks where the first third of the voters have voted. The horizontal line is the "true" end vote percentage on average (55%). The different colored lines are separate simulations, so we can see what happens more than once.
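For the simplest case (independent votes), the setup above can be sketched roughly as follows. This is a minimal sketch, not the code behind the graphs: the number of voters, the number of runs, and the seed are assumptions, and only the 55% figure comes from the article.

```python
import random

TRUE_SHARE = 0.55   # "true" end vote share for one candidate (from the article)
N_VOTES = 3000      # assumed number of voters (not given in the article)
N_RUNS = 8          # assumed number of simulated elections


def running_share_iid(n_votes, p, rng):
    """Running vote share when each vote is an independent coin flip with bias p."""
    shares, yes = [], 0
    for i in range(1, n_votes + 1):
        yes += rng.random() < p      # True counts as 1 vote for the candidate
        shares.append(yes / i)       # share after i votes
    return shares


rng = random.Random(2422)            # arbitrary seed, for repeatability only
third = N_VOTES // 3                 # position of the "vertical line"
for run in range(N_RUNS):
    shares = running_share_iid(N_VOTES, TRUE_SHARE, rng)
    print(f"run {run}: share at 1/3 mark = {shares[third - 1]:.3f}, "
          f"final = {shares[-1]:.3f}")
```

Each list of running shares corresponds to one colored line; plotting it against the vote count would give a graph of the kind described.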


One:


First, as a baseline, here is the same thing Kemal did:



Things are very "tight," as expected for this ideal situation. It also shows that the first third predicting the end outcome is not unexpected, even with a close margin of a 5% difference. Compare this to the random walk model we expect a fixed election to (possibly) follow:

The simplest (but effective!) model we can simulate in which similar votes cluster together is one where the probability of the NEXT vote going to a candidate is linked to the probability that the PREVIOUS vote went to that candidate.

That is, if the previous vote was for DMJ with a probability of, say, 80%, then the probability that the next vote is for DMJ should be around 80% as well. It is called the "locally clustered" model, as referenced by my major professor. These clusters can represent either gradual systematic bot programming or "hit at once" scenarios just by changing the variability associated with the movements. Either way, the behavior is still the same.
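One way to sketch the model just described is to let the per-vote probability take a small random step after every vote. The Gaussian step, the two step sizes, the clamping to [0, 1], and the number of voters are all assumptions made for illustration; the article does not state the values used for its graphs.

```python
import random


def running_share_random_walk(n_votes, p_start, step_sd, rng):
    """Running vote share when the per-vote probability drifts step by step."""
    p, yes, shares = p_start, 0, []
    for i in range(1, n_votes + 1):
        # Next probability = previous probability + a small Gaussian step,
        # clamped so it stays a valid probability (an assumed choice).
        p = min(1.0, max(0.0, p + rng.gauss(0.0, step_sd)))
        yes += rng.random() < p
        shares.append(yes / i)
    return shares


rng = random.Random(7)               # arbitrary seed
n_votes = 3000                       # assumed number of voters
third = n_votes // 3

# Small steps mimic gradual, systematic drift; large steps mimic
# "hit at once" swings. Both step sizes are hypothetical.
gradual = running_share_random_walk(n_votes, 0.55, 0.005, rng)
abrupt = running_share_random_walk(n_votes, 0.55, 0.05, rng)
print(f"gradual: 1/3 mark {gradual[third - 1]:.3f}, final {gradual[-1]:.3f}")
print(f"abrupt:  1/3 mark {abrupt[third - 1]:.3f}, final {abrupt[-1]:.3f}")
```

Because the probability itself wanders in this sketch, the end share is no longer pinned near 55%; it only starts there.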



This is called a Random Walk model on the mean of the binomial process. See the graph here:





As you can see, there is much more variability in the movement of the overall voting proportion. Also, at the 1/3 mark, two of the 8 simulations (both green ones, by total coincidence) are NOT with the end winner. In this small sample, that is roughly a 25% chance that the end result is not correctly predicted at the 1/3 mark. Thus, it seems that the ending NOT being predictable is better evidence for rigging!
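Eight runs give only a rough estimate, so the comparison can be checked with many more simulated elections. The sketch below counts how often the leader at the 1/3 mark differs from the final winner under independent voting and under a drifting probability. The number of voters, the number of runs, and the step size of 0.01 are assumptions, so the resulting rates illustrate the idea rather than reproduce the 25% above.

```python
import random


def mispredict_rate(step_sd, n_runs=1000, n_votes=900, p_start=0.55, seed=1):
    """Fraction of runs where the leader at the 1/3 mark is not the final winner.

    step_sd = 0 gives independent votes; step_sd > 0 lets the per-vote
    probability drift as a clamped random walk (assumed rigging stand-in).
    """
    rng = random.Random(seed)
    third = n_votes // 3
    misses = 0
    for _ in range(n_runs):
        p, yes, yes_at_third = p_start, 0, 0
        for i in range(1, n_votes + 1):
            if step_sd > 0:
                p = min(1.0, max(0.0, p + rng.gauss(0.0, step_sd)))
            yes += rng.random() < p
            if i == third:
                yes_at_third = yes
        # A candidate "leads" only with a strict majority; ties count as not leading.
        leads_early = 2 * yes_at_third > third
        wins = 2 * yes > n_votes
        misses += leads_early != wins
    return misses / n_runs


iid_rate = mispredict_rate(step_sd=0.0)
rw_rate = mispredict_rate(step_sd=0.01)
print(f"independent votes: first third wrong in {iid_rate:.1%} of runs")
print(f"random walk:       first third wrong in {rw_rate:.1%} of runs")
```

Under these assumptions one would expect the random walk rate to come out well above the independent one, which is the point of the comparison.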



The tl/dr for the above: the second graph is representative of a possible rigging mechanism, either by bots or by mass voting. The first represents Kemal's simulations. The first graph says that independent votes are a VERY plausible explanation for why we can predict the outcome from the first third. The second says that under a rigged election, prediction based on the first third is plausible, but LESS plausible than under totally random voting! The voting behavior we have observed suggests that we (most likely) have a random voting procedure!




Two: Some criticized Kemal not so much because he didn't simulate a "rigged" case, but because the independence assumption isn't realistic: voting is a process that has some dependence. Thus, here are two types of models that have dependence but are not indicative of the behavior you would expect from rigging.

They are the AR (autoregressive) and MA (moving average) time series models applied to the mean of the binomial process. If you want more detail I will gladly give it (I will say I chose dependence parameters of 80%), but here they are:





tl/dr
The above show dependence but not rigging. Compare these to the two prior graphs. As I commented on Kemal's article, they show less certainty about the end result in the first third (a smaller effective sample size), but much more consistency than the "rigged" model.
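The article does not spell out how the AR and MA models were applied, so the following is only one possible reading: the per-vote probability is 55% plus a deviation that follows an AR(1) or MA(1) process with a coefficient of 0.8. The noise size, the clamping, and the number of voters are assumptions.

```python
import random


def clamp(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def running_share_ar1(n_votes, mean, phi, noise_sd, rng):
    """Vote probability = mean + x_t, where x_t = phi * x_{t-1} + e_t (AR(1))."""
    x, yes, shares = 0.0, 0, []
    for i in range(1, n_votes + 1):
        x = phi * x + rng.gauss(0.0, noise_sd)
        yes += rng.random() < clamp(mean + x)
        shares.append(yes / i)
    return shares


def running_share_ma1(n_votes, mean, theta, noise_sd, rng):
    """Vote probability = mean + x_t, where x_t = e_t + theta * e_{t-1} (MA(1))."""
    prev_e, yes, shares = 0.0, 0, []
    for i in range(1, n_votes + 1):
        e = rng.gauss(0.0, noise_sd)
        x = e + theta * prev_e
        prev_e = e
        yes += rng.random() < clamp(mean + x)
        shares.append(yes / i)
    return shares


rng = random.Random(80)              # arbitrary seed
n_votes = 3000                       # assumed number of voters
third = n_votes // 3
ar = running_share_ar1(n_votes, 0.55, 0.8, 0.05, rng)
ma = running_share_ma1(n_votes, 0.55, 0.8, 0.05, rng)
print(f"AR(1): 1/3 mark {ar[third - 1]:.3f}, final {ar[-1]:.3f}")
print(f"MA(1): 1/3 mark {ma[third - 1]:.3f}, final {ma[-1]:.3f}")
```

Unlike the random walk, both deviations keep pulling back toward zero, so the probability stays centered on 55% and the running share settles down much as it does in the independent case.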



In conclusion: Voting with dependence results in the same general behavior Kemal observed. That validates his simulations and his conclusion that being able to predict the winner from the first third is very, very plausible without rigging.

Also, rigged elections are more likely than non-rigged ones to be hard to predict from the first third. Thus, consistency within the first third is evidence FOR a non-rigged election more than for a rigged one. Just for the sticklers, I will note that one could build more complicated models to mimic any of the (literally infinite) possible rigged scenarios more closely. My example is just one of those possibilities, and statistically speaking it is the most general and most generally plausible scenario (random walk).

tl/dr: Kemal was right: dependence or not, his methods AND conclusions were appropriate.



Aren Perry
Frustrated Stat Guy