A.I. will weed out Human Biases
~ by learning to imitate a person, the A.I. is our informant ~
TL;DR: Reward a neural network when it accurately imitates the decisions of a particular person. That artificial intelligence can then be tested for bias in ways that the person would have evaded. So, we can spot when humans are biased, and rely only upon the humans who show no such bias, randomly sampled, as checks and balances upon other artificial intelligences.
We fear artificial intelligence’s bias, whether as an unending totalitarian itself or in the hands of totalitarians. Human history is bent around the calamities and atrocities that those same biases brought out of us. So, are we doomed to human bias, to artificial intelligence’s bias, or to a combination of both? No. We can plan carefully, and construct methods which reveal any bias we care to look for, without complication or any inherent limit on which biases can be checked. We can be remarkably confident in those results, as well: not certain enough to indict someone for a thought, yet with enough credence to warrant barring them from holding executive or legal decision-making powers, after seeing their historical decisions.
You may scoff — “how can you eliminate all bias?” I don’t claim that. Instead: we can, with high confidence, identify people who are biased, in whichever form of bias we bother to check for. Justice is still up to us; I’ll be explaining how neural networks are a reliable tool for spotting each bias, in each person with authority.
Taking that tool a step further: once biased humans have been removed from their posts as gatekeepers, judges, and such, the remaining trusted people can be randomly assigned each issue or task. Each query goes to multiple people, which lets us spot when a particular adjudicator is becoming lax, or giving preference. These ongoing, independent, human-in-the-loop checks feed better metrics of those people’s biases, and they also provide high-quality data for training the A.I. assistants that flag and review the torrent of incoming information and monitor the humans’ work.
First, how would an A.I. imitation of a person act as our informant?
It Forgets to Lie
The key concept here is that neural networks change only while they are being trained on data. Once they are done learning, all the connections in the neural network are ‘frozen’: they are locked, unable to change any more. We only put the neural network to work once it has been ‘frozen’ this way.
Because a neural network learns by changing the connections inside itself, a ‘frozen’ network CANNOT LEARN. This is important: we want the A.I. system to do what we expect it to do; it should behave how it did when we had finished training it. Any change could cause unexpected errors. So, keep the neural network ‘frozen’ when you deploy it!
That also means: once you are done teaching a neural network, and it is ‘frozen’, it cannot form ANY new memories. This is completely unlike our own brains, and that difference is critical for this next step: ask the neural network to judge two cases which are copies of each other, except for some aspect of the person’s identity. A biased human would remember both cases, and notice that you are trying to SPOT their bias! They would intentionally give both cases the same verdict, knowing that you are watching for any differences.
Neural networks, unlike us, don’t know what you did with them a moment earlier; they don’t have any place to store the memory. So, as long as each case is handed to them as its own separate query, they don’t remember what verdict they gave before when you ask for a verdict on the second case. Because they can’t remember, it’s impossible for them to keep a lie straight. (They also have no motives except the ones we specifically reward, but the key take-away is that an intelligence needs memory to make a lie self-consistent!)
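To make the ‘no memory’ point concrete, here is a minimal sketch, assuming the frozen network can be treated as a fixed function of its input. The weights, the three case features, and the name `frozen_verdict` are all hypothetical, invented for illustration; a real network would be far larger, but the property is the same.

```python
import math

# Hypothetical 'frozen' parameters: fixed once training ends, never updated again.
FROZEN_WEIGHTS = (0.8, -0.5, 1.2)
FROZEN_BIAS = -0.3

def frozen_verdict(features):
    """Return the probability of 'red', using only this input and the fixed weights.

    Nothing is written anywhere, so the function keeps no record of earlier queries.
    """
    score = FROZEN_BIAS + sum(w * x for w, x in zip(FROZEN_WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-score))

case_a = (1.0, 0.0, 0.5)
case_b = (1.0, 1.0, 0.5)  # same case, with one (made-up) identity feature changed

first = frozen_verdict(case_a)
frozen_verdict(case_b)            # an intervening query about the twin case
second = frozen_verdict(case_a)   # the original case, asked again

# The answer for case_a is unaffected by having seen case_b in between.
print(first == second)
```

Because the answer depends only on the case in front of it, the order and number of questions cannot tip the network off.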
So, how do you build a neural network like this, and what would it look like, applied to the courtroom for instance?
Judging the Judges
Judge Bucephalus Pontus Milquetoast has had a distinguished career of 35 years, imprisoning hundreds. But, was he just? To find out, we first gather all the records from all the cases that Judge B.M. ruled upon, and we randomly withdraw a tenth of them (these cases are called the ‘validation set’, and we use them later to check that the neural network is learning). The remaining 90% of Judge B.M.’s case history becomes the training data for a neural network.
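The 90/10 split can be sketched in a few lines. This is only an illustration: the case records are stood in for by plain ID numbers, and `split_cases` is a name I made up, not part of any library.

```python
import random

def split_cases(cases, validation_fraction=0.1, seed=0):
    """Shuffle the cases, then set aside a fraction of them as the validation set."""
    rng = random.Random(seed)        # fixed seed so the split can be reproduced
    shuffled = list(cases)           # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical stand-in for Judge B.M.'s case records: 1000 case IDs.
case_ids = list(range(1000))
training_set, validation_set = split_cases(case_ids)

print(len(training_set), len(validation_set))
```

The validation cases must stay hidden from the network during training; otherwise the later check only measures memorization.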
Millions of times over, this network is presented with a case, and asked “Red or Green?” The neural network does NOT know what ‘red’ and ‘green’ mean. All it knows is, each case is supposed to be ‘red’ or ‘green’, and when the neural network guesses correctly, it gets a treat! Over and over, finding whichever patterns help it guess correctly, the neural network learns to predict what the honorable Judge B.M. ruled in each case. The neural network is NOT learning real justice. The A.I. is learning how to be like Judge B.M.
Once the neural network seems to be doing well, we test its performance on the ‘validation set’ mentioned above. If the neural network truly does imitate Judge B. Milquetoast, then it should give the same rulings he did on most, if not all, of the cases in the ‘validation set’, none of which it saw during training. That’s the bar that the neural network must pass, to be useful for spotting any bias in Judge B.M.
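Here is a toy version of that train-then-check loop, under loud assumptions: the ‘judge’ is a synthetic rule I invented (`synthetic_judge`), each case is reduced to three made-up numbers, and a tiny logistic-regression model trained by gradient descent stands in for the neural network. It shows the shape of the procedure, not a real imitator.

```python
import math
import random

rng = random.Random(0)

def synthetic_judge(case):
    """Hypothetical stand-in for Judge B.M.'s recorded rulings: 1 = 'red', 0 = 'green'."""
    severity, priors, group = case
    return 1 if 2.0 * severity + 1.0 * priors + 0.8 * group > 1.9 else 0

# 1000 made-up cases: (severity, prior offenses, identity group), each paired with a ruling.
cases = [(rng.random(), rng.random(), float(rng.randint(0, 1))) for _ in range(1000)]
labelled = [(case, synthetic_judge(case)) for case in cases]
validation, training = labelled[:100], labelled[100:]   # hold out a tenth

def predict_prob(weights, bias, case):
    """Probability that the imitator says 'red' for this case."""
    score = bias + sum(w * x for w, x in zip(weights, case))
    return 1.0 / (1.0 + math.exp(-score))

def train(examples, epochs=200, learning_rate=0.1):
    """Nudge the weights toward the judge's ruling, one case at a time."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    order = list(examples)
    for _ in range(epochs):
        rng.shuffle(order)
        for case, ruling in order:
            error = predict_prob(weights, bias, case) - ruling
            weights = [w - learning_rate * error * x for w, x in zip(weights, case)]
            bias -= learning_rate * error
    return weights, bias

def agreement(weights, bias, dataset):
    """Fraction of cases where the imitator's verdict matches the judge's ruling."""
    hits = sum((predict_prob(weights, bias, case) >= 0.5) == bool(ruling)
               for case, ruling in dataset)
    return hits / len(dataset)

weights, bias = train(training)          # after this line the model is 'frozen'
validation_agreement = agreement(weights, bias, validation)
print(f"agreement with the judge on held-out cases: {validation_agreement:.0%}")
```

How high the agreement must be before the imitator is trusted is a policy choice; the lower it is, the less any bias found in the network can be attributed to the judge.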
Now, finally, we can give this neural network two cases which differ only in a person’s or group’s identity, or any other factor we wish to check! The neural network does not remember that it was asked about two nearly-identical cases, so it gives answers as if Judge B.M. didn’t know he was being observed, and couldn’t remember the previous case. His bias will be on display. Observation of strong bias can be used to justify removal from the courts, and that should be done. To be clear, the A.I. does NOT decide cases; that task is still done by humans. Yet, they’re humans you can trust just a little bit more.
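The paired test itself is simple to sketch. Everything named here is hypothetical: `imitator` is a hand-written stand-in for a trained, frozen imitator of a biased judge, `fair_imitator` is a contrast case that ignores identity, and the case fields are invented. The measure is just the share of cases whose verdict flips when only the identity field is changed.

```python
import random

def imitator(case):
    """Hypothetical stand-in for the frozen imitator of a biased judge."""
    score = 2.0 * case["severity"] + 1.0 * case["priors"] + 0.8 * case["group"]
    return "red" if score > 1.9 else "green"

def fair_imitator(case):
    """Contrast case: an imitator whose verdict ignores the identity field."""
    score = 2.0 * case["severity"] + 1.0 * case["priors"]
    return "red" if score > 1.5 else "green"

def counterfactual_flip_rate(model, cases, attribute, values=(0, 1)):
    """Fraction of cases whose verdict changes when only `attribute` is changed."""
    flips = 0
    for case in cases:
        verdict_a = model({**case, attribute: values[0]})   # twin with identity A
        verdict_b = model({**case, attribute: values[1]})   # twin with identity B
        if verdict_a != verdict_b:
            flips += 1
    return flips / len(cases)

rng = random.Random(0)
cases = [{"severity": rng.random(), "priors": rng.random(), "group": 0}
         for _ in range(1000)]

flip_rate = counterfactual_flip_rate(imitator, cases, "group")
fair_flip_rate = counterfactual_flip_rate(fair_imitator, cases, "group")
print(f"biased imitator: {flip_rate:.0%} of verdicts flip")
print(f"fair imitator:   {fair_flip_rate:.0%} of verdicts flip")
```

Any other factor can be checked the same way by naming a different field; what flip rate counts as ‘strong bias’ is a judgment that stays with people.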
Random Assignment and Reinforcement
We can advance to the next layer, building upon that assurance of reduced bias in each measured way. Now, when a court case must be decided, it is sent at random to five HUMAN judges who have passed the ‘bias-network’ test, and who lack any way to find out who else is judging that case. Those five judges make their determinations independently. If a few disagree with the majority, that deserves further attention. And, if the disagreements from one of those judges begin to follow a pattern of bias (detected by the imitator networks, again), that judge can be excluded.
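A small sketch of the panel mechanics, with invented names throughout (`assign_judges`, `review_panel`, and the judge labels are illustrations only). It draws five judges at random from the vetted pool, then reports the majority verdict and who dissented.

```python
import random
from collections import Counter

def assign_judges(judge_pool, panel_size=5, rng=random):
    """Draw a panel at random, without repeats, from the bias-tested pool."""
    return rng.sample(judge_pool, panel_size)

def review_panel(verdicts):
    """Given {judge: verdict}, return the majority verdict and the dissenting judges.

    With five judges and two possible verdicts there can be no tie.
    """
    counts = Counter(verdicts.values())
    majority, _ = counts.most_common(1)[0]
    dissenters = sorted(judge for judge, verdict in verdicts.items()
                        if verdict != majority)
    return majority, dissenters

rng = random.Random(0)
judge_pool = [f"judge_{i}" for i in range(20)]   # hypothetical vetted judges
panel = assign_judges(judge_pool, rng=rng)

# Made-up determinations for one case, to show how dissent is flagged.
example_verdicts = {
    "judge_1": "red", "judge_4": "red", "judge_7": "green",
    "judge_9": "red", "judge_12": "red",
}
majority, dissenters = review_panel(example_verdicts)
print(majority, dissenters)
```

A single dissent proves nothing by itself; it is the running tally of each judge’s dissents, checked for a pattern, that would justify a closer look.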
In this way, humans still mete out justice themselves. Yet, we can use A.I. to weed out the humans who shouldn’t be entrusted with it. Further use of A.I. is possible, though not required: as a first pass, reviewing the cases to make a preliminary determination, and as a sanity check on the judges’ determinations, in case new, widespread biases appear. These techniques overcome the objections and concerns about A.I. bias that I have encountered, which generally reduce to ‘biased data’, ‘A.I. in-ultimate-control’, ‘brittle models’ and ‘ulterior motives of clients’. Do you know of any others?