There are generally two good approaches to building data-driven solutions, with a bad one in between.
Good approach #1: Formulate the problem along with measurable metrics, then employ machine learning, and be open-minded in interpreting and acting upon what the machine learning has discovered, however controversial or contradictory it is.
Good approach #2: Decide up front that the problem is too socially sensitive, and do not use machine learning to solve it. Collect data, involve humans, and have a human, or a team, responsible for each decision made. Use data science in the back office to make sure humans are not making obvious blunders, but don’t mix the two.
Bad approach: Claim AI / ML is used, train and validate some models, and then, upon receiving results that differ from what was expected, tweak the way those results are used.
Generally speaking, building ML- and AI-based software reduces the risk that the developer, product manager, or founder incorporates their own biases into the product.
TL;DR how to do it right:
- If you decide to use AI/ML, be honest about it from day one, and do so openly and transparently.
- If the AI/ML solution misbehaves, don’t try to tweak its outputs to have it “conform” to some “standards” you’d like to hold it to. Instead, admit that the original goal set for the ML/AI did not account for certain types of skewed solutions, adjust the objective function, re-train, iterate, repeat.
One human’s conviction that “outcome Y is better than outcome X” is worthless compared to humans’ agreement that “for problem P, formulation B is better than formulation A”.
If you used ML/AI, did not like outcome X, and are ready to argue that Y is a better outcome, don’t argue about Y being better than X.
Rather, if X is the machine-discovered solution to the problem formulated as A, find an alternative formulation, B, that yields Y or something closer to Y than to X, and then argue that B, not A, is the right problem to solve.
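To make the A-versus-B idea concrete, here is a minimal sketch on a made-up toy dataset. Everything in it is an assumption for illustration: the `Record` type, the eight data points, the single-threshold "model", and the choice of an approval-rate gap as the constraint. Formulation A asks only for the most correct decisions; formulation B asks for the same thing subject to an explicit limit on the gap between the two groups. The solver is identical in both cases; only the problem statement changes.

```python
from typing import Callable, List, NamedTuple


class Record(NamedTuple):
    score: int   # hypothetical model score, 0..100
    group: str   # hypothetical group label
    label: int   # 1 = should be approved, 0 = should be rejected


# Toy data, invented purely for this illustration.
DATA = [
    Record(90, "a", 1), Record(85, "a", 1), Record(80, "a", 1), Record(75, "a", 1),
    Record(70, "b", 0), Record(65, "b", 0), Record(60, "b", 1), Record(50, "b", 0),
]


def correct_count(threshold: int, data: List[Record]) -> int:
    """Number of records where 'approve if score >= threshold' matches the label."""
    return sum((r.score >= threshold) == bool(r.label) for r in data)


def approval_gap(threshold: int, data: List[Record]) -> float:
    """Difference between the highest and lowest per-group approval rate."""
    rates = []
    for g in sorted({r.group for r in data}):
        members = [r for r in data if r.group == g]
        rates.append(sum(r.score >= threshold for r in members) / len(members))
    return max(rates) - min(rates)


def solve(data: List[Record], is_feasible: Callable[[int], bool]) -> int:
    """Brute-force search: the feasible threshold with the most correct decisions."""
    candidates = sorted({r.score for r in data})
    feasible = [t for t in candidates if is_feasible(t)]
    return max(feasible, key=lambda t: correct_count(t, data))


# Formulation A: maximize correct decisions, no other constraint.
formulation_a = solve(DATA, lambda t: True)

# Formulation B: same objective, but the approval-rate gap must not exceed 0.25.
formulation_b = solve(DATA, lambda t: approval_gap(t, DATA) <= 0.25)

for name, t in [("A", formulation_a), ("B", formulation_b)]:
    print(name, "threshold:", t,
          "correct:", correct_count(t, DATA),
          "gap:", approval_gap(t, DATA))
```

On this toy data, A picks the threshold 75 (7 of 8 correct, gap 1.0) and B picks 60 (6 of 8 correct, gap 0.25). The point of the sketch is that the disagreement is now about whether the gap limit belongs in the problem statement, not about hand-editing what the solver returned.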
An attempt to apply human judgement to the output of the ML/AI solution is a poor way to start an argument. When it comes to finding the best solution given a set of well-formulated constraints, the machine is better than us; and that is to be seen as the ultimate blessing, not as a source of trouble.
In order to preserve and multiply the value that ML and AI bring into our human world, humans do best to apply their intuition and expertise at the problem statement level. This is where morals, biases, and other constraints can and should come into play. This is the level where we can have a deep, thoughtful conversation about what the right direction is, and what is to be avoided as morally or socially unacceptable.
Then, keep tweaking the constraints until the machine-generated solution fits all of them and doesn’t raise eyebrows. And be ready to embrace the solution, no matter what your inner voice, with its own imperfections and biases, tells you about how unconventional it might be.
May the data be with you.