Technical limitations and ethical challenges for a secure future with increasingly advanced AIs

Bernardo do Amaral Teodosio
3 min readMar 28, 2023

--

This article has been originally published by me in Brazilian Portuguese, and later translated into English with the help of ChatGPT.

Recently, a friend sent me a news article about how GPT-4 “lied” to cheat a human being, and what the ethical implications of this are. I dare to say that in a few years, society will begin to face real problems created by Artificial Intelligence. And I am not referring here to issues such as false content and fake news generated by AI based on human demand — something that already exists today and is used to influence people — but rather to problems created by the AIs themselves, involving ethical dilemmas, such as this question of an AI that lies.

We live in a society where the ultimate goal is the pursuit of profit maximization — not happiness, not goodness, and not any other values that a human being may possess. Therefore, it is natural that an AI created by human beings and trained based on large datasets also produced by human beings may reproduce behaviors that a human being living in this same society would reproduce — even if the AI was not designed to directly achieve such an objective.

And this is not a simple problem to solve. No matter how good the team behind the creation of an AI is; no matter how good the developers who created it are and no matter how much they try to put blocks and rules that prevent the AI from reproducing certain behaviors or performing inappropriate actions, I dare say that these will never be enough. Just as no security system is 100% secure, no advanced AI can be completely restricted by rules that prevent it from committing “possible undesirable actions”. Especially in the case of AIs that are constantly re-trained — they will eventually learn ways to circumvent existing blocks — even if unintentionally. It is not possible to predict with accuracy the response of an AI, nor is it possible to “pre-block” all undesirable behaviors — after all, it is not even possible to list all of them.

With the release of ChatGPT a few months ago and new models based on LLM emerging, this has become even more evident — given the proportion and media exposure that these topics have recently received. ChatGPT itself, in just a few weeks, suffered from loopholes found by users on Reddit that allowed them to replicate behaviors that should have been blocked by the protection measures imposed by OpenAI. Bing — from Microsoft, which has made major investments in OpenAI — also suffered from similar problems after its latest release. If these AIs — so advanced, but at the same time limited — already suffer from such problems, what guarantees us that more advanced versions in 5 or 10 years will not be generating more significant problems — which could indeed bring serious consequences? The answer is: we have no guarantee, and having “good intentions” behind the development of an AI — as OpenAI claims to possess — is not enough to prevent this situation.

--

--

Bernardo do Amaral Teodosio

Tech Lead & Mobile Developer @ Sandbox Group/PlayKids | BSc CS (UNICAMP) | MSc Candidate @UNICAMP