How I Would Use Machine Learning To Build A Really Good Lie Detector
Shouldn’t machine learning technology be able to create an almost perfect lie detector?
If I were Palantir or the Director of the CIA or some similar agency, I certainly would have used machine learning to try to build a highly reliable lie detector. If I had done that, though, I probably wouldn’t have told anyone about it.
Here’s how I would do it.
Start With Professional Liars
I would start by searching for the very best liars in the world — con-men, scammers, frauds and the like. There is no shortage of very successful liars.
I probably would want to recruit as many psychopaths as I could find as well. I would want at least a hundred of them and maybe, eventually, a thousand.
I would also need an equal number of bad liars — thoroughly honest people with little to no expertise in lying.
Let’s say that I was able to find 100 excellent liars and 100 terrible liars.
Expose Each To A Factual Situation
I would run each of them through a little “play” where they would enter a room and be interviewed by an actor who gave them information from a script. At several points events would occur — a glass would fall to the floor and break; someone would enter the room; a phone would ring; there would be a prominent poster on the wall; etc.
Question Them While Measuring Everything
Each of the subjects would then be seated where sensors would record all of the components of their body language, any flushing of their skin, the dilation of their pupils, the micro-tremors in their voice, their blood pressure, heart and breath rates, their galvanic skin resistance and any other factors we could think of to record and measure.
During the first test each “good liar” would be instructed to tell the absolute truth to the best of their recollection. We would delete any data associated with any inaccurate responses.
Each of their truthful answers together with all the recorded data would be included in the training data and labeled as “true” and “good liar.”
We would repeat this with each “bad liar.”
A new round of interviews would be held, and during the questioning each participant would be instructed to lie as convincingly as possible about one or more specific facts.
We would build in a system of rewards, e.g., giving each person $100 for every lie that fooled the interviewer.
Again, we would create a separate data set for each good liar and each bad liar. That data would be added to the ML database and the true and false answers appropriately labeled.
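The labeling scheme described above could be sketched as a list of records, one per answer, each carrying the sensor readings plus the two labels. The channel names and values here are purely hypothetical placeholders, not anything the essay specifies:

```python
# A sketch of labeled training records, assuming hypothetical
# sensor-channel names; real channels would come from the hardware.
def make_record(subject_id, liar_type, answer_is_true, sensors):
    """Bundle one answer's sensor readings with its two labels."""
    return {
        "subject": subject_id,
        "liar_type": liar_type,          # "good_liar" or "bad_liar"
        "label": "true" if answer_is_true else "lie",
        "features": dict(sensors),       # e.g. pupil_mm, heart_bpm, gsr_uS
    }

dataset = [
    make_record("S001", "good_liar", True,
                {"pupil_mm": 3.1, "heart_bpm": 64, "gsr_uS": 2.0}),
    make_record("S001", "good_liar", False,
                {"pupil_mm": 3.8, "heart_bpm": 71, "gsr_uS": 2.9}),
]
```

Keeping the "good liar"/"bad liar" tag separate from the true/lie tag lets the same records later serve both the lie classifier and the liar-quality rating.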
Challenge The System To Separate Lies From Truth
A third set of little plays would take place, and the subjects would then be interrogated again, this time instructed to lie about one specific fact and to be truthful about a different one.
Now we would ask the machine-learning system to differentiate the truthful answers from the lies.
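As a toy stand-in for whatever model the real system would use, the separation step might look like this nearest-centroid sketch on synthetic readings. The feature names, the size of the "lie shift," and the noise level are all invented for illustration:

```python
import math
import random

random.seed(0)

# Synthetic stand-in data: lying shifts the (hypothetical) readings upward.
def sample(label):
    base = [3.0, 65.0, 2.0]            # pupil_mm, heart_bpm, gsr_uS
    shift = 0.8 if label == "lie" else 0.0
    return [b + shift + random.gauss(0, 0.2) for b in base], label

train = [sample(lab) for lab in ["true", "lie"] * 50]

# Nearest-centroid "model": average the feature vectors per label,
# then classify a new answer by whichever centroid it sits closest to.
def centroid(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

cents = {lab: centroid([x for x, l in train if l == lab])
         for lab in ("true", "lie")}

def predict(x):
    return min(cents, key=lambda lab: math.dist(x, cents[lab]))

held_out = [sample(lab) for lab in ["true", "lie"] * 20]
accuracy = sum(predict(x) == lab for x, lab in held_out) / len(held_out)
```

A real system would use a far richer model, but the contract is the same: features in, a true/lie verdict out, scored against held-out answers.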
Eliminate Irrelevant Data Points
We would want to determine which data points were irrelevant or unnecessary. Maybe breath rate or sweating turned out to be unreliable, correlating with the truth or falsity of an answer no better than chance. If so, stop measuring those unreliable, or more accurately, unhelpful data points.
Trim down the qualities we would want to measure to only those that strongly correlated with truth or lies.
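The trimming step could be sketched by correlating each channel with the lie/true label and keeping only the strong ones. The channel names, readings, and the 0.5 cutoff are all assumptions for illustration:

```python
# Keep only channels whose readings correlate with the lie label.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy readings: pupil size tracks the label, breath rate is mostly noise.
labels = [0, 1, 0, 1, 0, 1, 0, 1]                     # 1 = lie
channels = {
    "pupil_mm":    [3.0, 3.9, 3.1, 3.8, 2.9, 4.0, 3.0, 3.7],
    "breath_rate": [15, 14, 15, 16, 14, 15, 16, 14],
}
kept = [name for name, vals in channels.items()
        if abs(pearson(vals, labels)) >= 0.5]
```

In practice a feature could matter only in combination with others, so a real system would prune with the model in the loop rather than with raw correlations alone; this just shows the shape of the idea.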
Repeat with new scenarios and new lies until a minimum required level of accuracy was achieved.
Check For Drugs Or Alcohol Usage
Repeat the process after giving each subject one of the drugs people commonly use to try to beat polygraphs, e.g. certain tranquilizers and/or alcohol.
Train the system to determine if a subject appears to have taken alcohol or one of these drugs prior to the test.
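One minimal way to support this would be to carry a third label on each record for the subject’s condition, so the same data can also train a sober-or-not classifier. The condition names here are hypothetical:

```python
# Attach a (hypothetical) condition label to an existing labeled record.
def with_condition(record, condition):
    assert condition in ("sober", "alcohol", "tranquilizer")
    return {**record, "condition": condition}

r = with_condition({"subject": "S001", "label": "lie"}, "alcohol")
```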
Develop A Baseline Test
We would need to develop a short test to build a baseline for a new subject. For example, while being monitored, ask every subject to look at a black or white flash card and state if the card is black or white, and tell them to lie about its color for the second and fifth card or some such thing.
The system would use this data to create a rough baseline for the subject before beginning questioning on the topic at hand.
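The baseline step might amount to z-scoring each interview reading against the warm-up readings, i.e., expressing it in "how many of this subject’s own standard deviations from their own normal." The numbers below are invented for illustration:

```python
import statistics

# Readings taken during the flash-card warm-up form the subject's baseline.
def baseline_stats(readings):
    return statistics.mean(readings), statistics.stdev(readings)

def normalize(value, mean, stdev):
    """Express a reading in baseline standard deviations."""
    return (value - mean) / stdev

card_heart_bpm = [62, 64, 63, 65, 61, 64]   # hypothetical warm-up readings
m, s = baseline_stats(card_heart_bpm)
z = normalize(78, m, s)                      # a reading during questioning
```

A heart rate of 78 is unremarkable in the abstract, but relative to this subject’s calm baseline it is many standard deviations out, which is exactly the kind of signal the per-subject baseline is meant to surface.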
Rate On A Zero-To-100 Big-Fat-Liar Scale
Using a more detailed test interview, I would want the system to be able to rate a subject on a zero-to-100 scale between scrupulously truthful and pathological liar. Yes, truthful people can lie and liars can tell the truth, but you would like to know going in what sort of witness you are dealing with.
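One naive way to sketch such a scale, assuming per-answer verdicts from an already-trained classifier, is to score the fraction of probe answers flagged as lies:

```python
# Score a subject 0-100 from per-answer verdicts (True = flagged as a lie).
# The verdicts are assumed to come from a trained classifier, not shown here.
def liar_score(verdicts):
    if not verdicts:
        return 0
    return round(100 * sum(verdicts) / len(verdicts))

score = liar_score([False, False, True, False, True, False, False, False])
```

A real rating would likely also weight the classifier’s confidence on each answer rather than counting flat flags, but the output contract is the same: one number per subject.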
Once you think you’ve got the system trained, start all over with a different set of liars and a different set of test scenarios and see how the system does with fresh data.
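When retesting on fresh data, the detail that matters is holding out whole subjects rather than individual answers, so the system is scored on people it has never seen and cannot simply recognize. A sketch of that split, with hypothetical subject IDs:

```python
import random

random.seed(1)

# Split by subject, not by answer, so no person appears in both sets.
subjects = [f"S{i:03d}" for i in range(10)]
random.shuffle(subjects)
train_subjects = set(subjects[:7])
test_subjects = set(subjects[7:])

# Three answers per subject, as a stand-in for the real interview records.
records = [{"subject": s, "answer": k} for s in subjects for k in range(3)]
train = [r for r in records if r["subject"] in train_subjects]
test = [r for r in records if r["subject"] in test_subjects]
```

Splitting per answer instead would leak each subject’s physiological quirks into both sets and inflate the accuracy numbers, which is exactly the failure the fresh-liars retest is meant to catch.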
Standardize The Subject’s Environment
Of course, you would want to standardize as much of the process as possible. If I were the CIA and if I were relying on this system to discover if some intelligence officer was a Russian mole, I would have the subject take off all of their clothes and make them take the test wearing nothing but a pair of sandals and a jumpsuit.
Would It Actually Work Or Is This A Fool’s Errand?
Creating this system would take a great deal of time and money, and maybe I’m underestimating the difficulty of differentiating a lie from the truth when the subject is a skilled, pathological liar. Still, I would think that, eventually, an ML system trained on data from questioning pathological liars, and using a combination of body language, pupil dilation, voice readings, and the like, should be able to pick out a lie 98% of the time.
There’s one way to find out whether that’s just a silly notion based on a naive faith in machine learning coupled with body language and other physiological readings: build one and see if it works.