A Guaranteed Way to Win the Lottery
The clash between probability and number of trials
I always enjoy talking to people about probability because most people have weird ideas about how it works and in general our lizard brains aren’t really prepared to deal with probability, especially at extremes. You may have heard about the most common human misconception about probability called the gambler’s fallacy — basically the idea that random events like numbers on a particular die roll have “memory” — for example, if you roll a bunch of 4’s then 4’s are cold and are less likely to occur. This is a fallacy of course because the die has no memory of past events. If you roll a 4 six times in a row you still only have a 1/6 chance of rolling a 4 on the next roll. But humans are so wonderful at recognizing patterns that we see them even when they aren’t there, so you can see how this fallacy originated.
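If you would rather check the "no memory" claim than take my word for it, here is a minimal sketch that counts outcomes exactly instead of simulating them. The function name `chance_of_face_after_streak` and its parameters are my own illustration, and it assumes a fair six-sided die.

```python
from fractions import Fraction
from itertools import product

def chance_of_face_after_streak(face=4, streak=3):
    """Exact chance that the next roll shows `face`, given the previous
    `streak` rolls all showed `face` (assumes a fair six-sided die)."""
    # Every equally likely sequence of streak + 1 rolls.
    sequences = product(range(1, 7), repeat=streak + 1)
    # Keep only the sequences that start with the streak we care about.
    after_streak = [s for s in sequences if all(r == face for r in s[:streak])]
    # Of those, count how many continue the streak on the final roll.
    hits = sum(1 for s in after_streak if s[-1] == face)
    return Fraction(hits, len(after_streak))

print(chance_of_face_after_streak(face=4, streak=3))
print(chance_of_face_after_streak(face=4, streak=6))
```

Whatever streak length you pick, the answer comes out the same as for a fresh roll, which is the whole point.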
Talking about things that might happen is always tough for humans; we would really prefer to know what will happen. But probability is the only mathematical way to talk about the future. I have found that when I talk to people about probability, they often confuse the probability of an event on a single trial with the probability of that event happening at some point over a lot of trials. This does have a lot to do with biometrics, I promise, and I will get to that after a quick refresher.
If you have two dice, the probability of different rolls works out like the following:
There are 36 different ways you can roll 2 dice. There are lots of ways to roll a 7 (six of them), but only one way to roll snake eyes (a total of 2). So the probability of rolling a 7 is 6/36 = 1/6, but the probability of rolling snake eyes is 1/36. Snake eyes is a much less probable outcome than a 7. But if you roll snake eyes, your probability of rolling snake eyes again is still 1/36; the fact that you just did it is totally irrelevant to future outcomes.
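The two-dice numbers can be reproduced by listing all 36 outcomes and counting. This is just a sketch; `two_dice_distribution` is a name I made up, and it assumes two fair six-sided dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Return {total: exact probability} for the sum of two fair dice."""
    rolls = list(product(range(1, 7), repeat=2))   # all 36 ordered outcomes
    counts = Counter(a + b for a, b in rolls)      # ways to make each total
    return {total: Fraction(n, len(rolls)) for total, n in sorted(counts.items())}

dist = two_dice_distribution()
for total, p in dist.items():
    print(f"{total:>2}: {p}")
```

The entry for 7 is 1/6 and the entry for 2 is 1/36, matching the figures above.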
Where people get confused is in mixing up the chance of a sequence of events with the gambler’s fallacy. The probability of rolling snake eyes on a single roll is always 1/36. But the probability of rolling snake eyes two times in a row is a totally different thing. When looking at a sequence of independent events, the probability of the sequence is the product of the probabilities of the individual events. So the probability of rolling snake eyes twice in a row is (1/36)*(1/36), or 1/1296, which works out to about 0.08%, which is pretty improbable. But this doesn’t contradict the point about the gambler’s fallacy: the probability of a whole sequence of events is different from the probability of a single event.
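The product rule for a run of independent events is short enough to write down directly. A small sketch, with `sequence_probability` as a hypothetical helper name:

```python
from fractions import Fraction

def sequence_probability(p_single, repeats):
    """Probability that an independent event with probability p_single
    happens `repeats` times in a row (product of the single-event odds)."""
    return p_single ** repeats

snake_eyes = Fraction(1, 36)
twice = sequence_probability(snake_eyes, 2)
print(twice)                  # 1/1296
print(f"{float(twice):.4%}")  # 0.0772%
```

Note that this only works because each roll is independent of the others; that is exactly why the odds on the next single roll never change.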
To add more complexity to the mix, there is a second difference between the probability of an event happening and how likely that event is to happen after a number of trials. Keeping with our dice example, rolling snake eyes is relatively improbable, but if you keep rolling dice you will probably roll snake eyes eventually. Note my caveats — you might not ever roll snake eyes, ever. I can’t tell you what will happen, but I can tell you what will probably happen. Nothing in the universe prevents you from rolling dice 10,000,000 times and never getting snake eyes. Anything can happen. But things that are improbable will likely happen with enough trials. Winning the lottery is extremely improbable (1 in 292 million) for a person, but someone wins the lottery almost every week because there are so many trials. However, buying 292 million random lottery tickets will not guarantee that you will win (although you probably will win). To guarantee you win the lottery you need to actually buy one of each ticket. That’s the only way to win for sure. Feel free to send me 10% when you use this strategy.
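To put rough numbers on "probably", the chance of seeing an event at least once in n independent trials is 1 - (1 - p)^n. Here is a sketch of that formula. The helper name `at_least_once` is mine, and the lottery case assumes every ticket is an independent random pick with a 1 in 292 million chance.

```python
import math

def at_least_once(p_single, trials):
    """Chance an event with per-trial probability p_single (0 <= p < 1)
    happens at least once in `trials` independent trials: 1 - (1 - p)**trials.
    log1p/expm1 keep the arithmetic accurate when p is tiny."""
    return -math.expm1(trials * math.log1p(-p_single))

# Snake eyes on two dice: 1/36 per roll.
for rolls in (1, 25, 100, 1000):
    print(rolls, f"{at_least_once(1 / 36, rolls):.4f}")

# 292 million independent random tickets at 1-in-292-million odds each.
lottery = at_least_once(1 / 292_000_000, 292_000_000)
print(f"{lottery:.3f}")
```

Under those assumptions the random-ticket strategy wins a bit less than two times out of three, which is why buying one of each ticket is the only sure thing.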
So back to biometrics. When we look at biometric matching, we never know for sure whether two biometrics match — we just have a probability. This is usually expressed as a dimensionless number which is a similarity score. But it traces back to a probability, calculated by the people who wrote the algorithm. For example, in most of our systems a score of 60 is representative of a .001% probability of an error. We call this kind of error a false match. In other words, if I get a match score of 60 between two biometrics, there is only a .001% chance it isn’t a match. It is important to note that the numbers are just a convenience — a score of 60 on our system may be the same as a score of 9,000 on another system. It is the probability that matters.
With a really low probability of .001% of being wrong, it seems like we are safe declaring this to be a real match, right? .001% is a 1 out of 100,000 chance.
How could we be wrong?
Well, if we were doing a 1:1 comparison, like when you use your fingerprint to unlock your phone, this would be a great score and we could call it a day. This is called biometric verification.
But the problem of biometric identification is harder. For identification, you are comparing a single biometric against every record in the data set. In our projects, this is at least 1,000,000 background records, often far more. Remember, even improbable events happen if you have enough trials. So if we compare a biometric to a million records at that threshold, we can expect about one false match for every 100,000 records compared. With a million-record data set, we will probably get around 10 false matches. Not a useful score at all.
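The arithmetic behind that is the same "enough trials" math as the dice. The sketch below assumes each comparison is an independent trial with the same 0.001% false match probability, which is a simplification I am making for illustration, not a description of how any particular matcher behaves. The function names are hypothetical.

```python
import math

def expected_false_matches(false_match_rate, gallery_size):
    """Average number of false matches when one probe is compared
    against every record in a gallery of the given size."""
    return false_match_rate * gallery_size

def prob_any_false_match(false_match_rate, gallery_size):
    """Chance of at least one false match across the whole gallery,
    assuming independent comparisons (an illustrative simplification)."""
    return -math.expm1(gallery_size * math.log1p(-false_match_rate))

fmr = 0.001 / 100  # 0.001% expressed as a probability (1 in 100,000)

for gallery in (1, 100_000, 1_000_000):
    print(
        gallery,
        round(expected_false_matches(fmr, gallery), 4),
        f"{prob_any_false_match(fmr, gallery):.5f}",
    )
```

For a single 1:1 comparison the chance of a false match is negligible, but at a million records at least one false match is close to certain.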
Originally published at www.tacticalinfosys.com.