Why Using AI To Determine Relevance Or Block Offensive Content May Be Hopeless

The Basic Concepts, The Challenges, And The Nature Of The System

Decision-First AI
Published in Comprehension 360
5 min read · Sep 12, 2018

I have been building artificial intelligence for decades, although the buzzwords have changed repeatedly along the way. Today, the buzz is all about “AI” and the mighty “Algorithm”. That probably isn’t helpful… but more on that later.

I am a huge proponent of AI, in certain situations. Industry 2.0 is one of the most compelling. The data is cleaner, the tasks more focused, and the control systems more mature. It is also an environment where (no offense) the human mind tends to get lazy. It is repetitive, precise, and droning. This is where AI can thrive.


I am more critical…

…of ideas like a robot art critic. I think it is a fun experiment. There is plenty to learn here. But let’s run this thought experiment a little further.

What if artificial intelligence were used to judge all art? Of course, this would not be done with a set of rules; that is not how real AI works. Instead, artificial intelligence would be designed to learn from humans.

Perhaps, like Berenson here, it would do so by weighting a sample of human reactions to the art. Berenson, at its most basic level, runs a census of smiles vs. frowns. That might work… right?
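
To make the thought experiment concrete, here is a minimal sketch of that smile census. The function name, the reaction labels, and the 0.5 approval threshold are my own assumptions for illustration, not anything published about Berenson.

```python
# A hypothetical sketch of the "census" idea: tally smiles vs. frowns,
# then approve the piece if the smile share clears a threshold.
from collections import Counter

def berenson_verdict(reactions, approval_threshold=0.5):
    """Return (passed, approval share) for a sample of reactions."""
    counts = Counter(reactions)                # e.g. {"smile": 61, "frown": 39}
    total = counts["smile"] + counts["frown"]
    if total == 0:
        return False, 0.0                      # no data, no approval
    approval = counts["smile"] / total         # share of positive reactions
    return approval >= approval_threshold, approval

# A sample audience of 100 reactions, skewed slightly toward smiles.
sample = ["smile"] * 61 + ["frown"] * 39
passed, score = berenson_verdict(sample)
print(passed, round(score, 2))                 # True 0.61
```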

Unfortunately, before you dig too deeply into whether that is an optimal or pleasing concept, we need to add one other component. Suppose that only art that passed Berenson’s algorithm was allowed to be displayed to the public. In other words, museums and other venues would only display the artwork that sample populations said was good.

Let me pause. We could consider the implications of smiles as our measurement. We could debate sample sizes. But all of that is overshadowed by the added risk/implication. A machine taught to judge is being used to censor. In our experiment, when Berenson doesn’t like it — you don’t get to see it.

This is the model that social media companies are using. They are judging content for either its relevance or its offensiveness and then censoring. They are not offering badges. It is not a matter of things being AI approved. What the AI finds irrelevant or offensive — you never see.

A little clarification…

Facebook, Twitter, and the like aren’t counting smiles. They are counting thumbs (likes). In fairness, they are also counting reactions, shares, and other symbolic actions. It is a great experiment. There is plenty to learn. But the stakes are far too high!

For those who may not remember: once upon a time, Google shared its ranking score. Well beyond our Berenson-approved badge model, they actually let users see the final weighted score. People quickly used it to game the system and reverse engineer Google’s IP. That public score is no more.

So while the badge solution seems like a great compromise, it is a no-go. It is not in the interest of the corporation to share. Transparency is aggressively discouraged in this model; it leads to less effective models but, more importantly, to lower revenues. But wait, it gets worse…

I have often railed…

…that the value of this data and these algorithms is over-hyped. But the former wasn’t exactly true, precisely because of the latter.

Cambridge Analytica didn’t gain direct leverage on Facebook users from the data they gathered. They did gain direct leverage on the algorithm. Their influence on users came from that. Let me break that down.

Cambridge was able to use the data not to learn more about the Facebook audience, but to learn how to game the Facebook algorithm. And unlike Google’s earliest algorithms, Facebook’s was a learning algorithm. Early Google algorithms were tested and then replaced with new ones engineered by their modeling teams. Today, many algorithms are machine-learned; they are constantly adjusting and re-weighting.

Cambridge Analytica used their data leverage to influence how those algorithms learned. They didn’t just game what was there. They changed it to further their own needs!
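
To see why a constantly re-weighting algorithm is a different kind of target, here is a toy sketch under my own assumptions (three topics, a naive additive update). It illustrates the dynamic of poisoned feedback; it is not Facebook’s ranking code.

```python
# A toy, hypothetical feedback loop -- not any real ranking system.
# Topic weights start equal; every engagement nudges that topic's
# weight upward, so it ranks higher for everyone afterwards.
LEARNING_RATE = 0.01
weights = {"cooking": 1.0, "sports": 1.0, "politics": 1.0}

def record_engagement(topic):
    """The 'learning' step: engagement raises a topic's future ranking."""
    weights[topic] += LEARNING_RATE

def rank(posts):
    """Order posts by the learned weight of their topic."""
    return sorted(posts, key=lambda post: weights[post["topic"]], reverse=True)

# Organic traffic: a mixed audience engages roughly evenly.
for topic in ["cooking", "sports", "politics"] * 100:
    record_engagement(topic)

# Coordinated campaign: 500 extra engagements aimed at one topic.
for _ in range(500):
    record_engagement("politics")

posts = [{"topic": t} for t in ("cooking", "sports", "politics")]
print([p["topic"] for p in rank(posts)])   # politics now leads for every user
```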

You didn’t exactly hear Zuckerberg tell that to Congress. Facebook prefers a storyline where their “valuable” data was hijacked, not their vulnerable AI. Silicon Valley has drawn some dangerous battle lines. They prefer that people focus on the data, not the models. They prefer that people think of this as one giant “algorithm”. The likelihood is that it is many. But the result is that AI has been placed on a pedestal and must be protected at all costs.

At the risk of over-simplifying…

We have a system where the machine must be a black box. Algorithms are just counters and keyword lists translated to calculus. While that is a bit too simple, it emphasizes that they can be reverse-engineered. It also means that if you can see behind the curtain, you can tamper with it (or steal it).
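
As a caricature of that point, here is a sketch of a scorer reduced to counters and a keyword list; both the words and the weights are invented. The moment those weights are visible, the score can be maximized mechanically, or the whole list copied outright.

```python
# A deliberately over-simple relevance scorer: keyword counts times
# fixed weights. Both the keywords and the weights are hypothetical.
WEIGHTS = {"breaking": 3.0, "exclusive": 2.0, "video": 1.5}

def relevance(text):
    """Score a post by summing the weights of the keywords it contains."""
    return sum(WEIGHTS.get(word, 0.0) for word in text.lower().split())

honest_post = "Local council approves new park budget"
gamed_post = "BREAKING exclusive video BREAKING exclusive video"

print(relevance(honest_post))   # 0.0
print(relevance(gamed_post))    # 13.0 -- keyword stuffing wins once
                                # the weights are out in the open
```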

It means we have a system where companies would rather pretend that their data is powerful and their actions are fully intentional than admit their algorithms are vulnerable and often make mistakes. The fact that you can’t learn without making mistakes doesn’t fit their brand.

The idea that artificial intelligence is a good tool for controlling relevance and removing offensive material is likely hopeless. But hopefully you understand that this is not because of AI itself; it is because of the system we are trying to use it in. Personal preference, binary decisions, lack of transparency, and the need to pretend and deflect all work to destroy the governing feedback that might allow this system to flourish and learn.

AI will soon come to dominate our factories. It may learn to steer our ships, fly our planes, and even drive our cars. But the path to controlling what we see and where we see it is only going to get bumpier. Thanks for reading!
