Why AI Won’t Be Able to Replace Social Interactions
Full disclaimer: the title may be a little misleading, but it definitely sounds way cooler than “Why Reinforcement Learning Agents Most Likely Can’t Train Their State Processing Algorithm to Understand the Nuances of Human-to-Human Interaction.” That, however, is exactly the topic of this article, where I hope to convince you that if anything is certain about the future, it is that your relationships with other humans can’t be replaced.
Reinforcement Learning (RL) is the practice of training models, or “agents,” to perform certain tasks by giving them a reward whenever they complete said task.
Usually, these models have a cost function, which tells them the cost of performing certain actions, and a reward function, which determines the rewards they get for certain tasks. Although the cost function is usually not available to the agent at the start, that doesn’t mean the agent will perform worse than humans. Sometimes agents even find ways to perform better than humans, which can be downright scary.
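To make that concrete, here is a minimal sketch of the RL loop in Python. Everything in it (the `ToyEnv` class, its states, rewards, and costs) is invented for illustration, not a real library:

```python
# A toy, gym-style environment: names and numbers are made up for illustration.
class ToyEnv:
    def reset(self):
        """Start a fresh episode and return the initial state."""
        return 0

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        if action == "complete_task":
            return 1, +1.0, True   # reward for completing the task
        return 0, -0.1, False      # every other action carries a small cost

env = ToyEnv()
state, done = env.reset(), False
while not done:
    action = "complete_task"       # a real agent would learn which action to pick
    state, reward, done = env.step(action)
```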
The problem with trying to get a model to understand and mimic human interaction comes in several layers.
First, real life is not episodic; that is, you can’t sample a set of actions, play them out to the end of an episode, and then start over. Life doesn’t repeat itself, so the model never gets a chance to see whether something else it could have done would have been better.
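Episodic training usually looks something like the loop below (reusing the hypothetical `ToyEnv` from earlier; `policy` here is a stand-in for whatever action-picker the agent has learned). The `reset()` call is exactly what life doesn’t give you:

```python
def policy(state):
    return "complete_task"              # placeholder for a learned policy

for episode in range(10_000):           # the agent gets thousands of do-overs
    state, done = env.reset(), False    # real life has no reset()
    while not done:
        state, reward, done = env.step(policy(state))
```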
Second, defining a reward function, or even a cost function, is extremely difficult. Consider this example:
It is common etiquette to insist on paying for other people’s food. In the eyes of an RL agent, this is not just suboptimal, it is actively bad: you are wasting money that someone else was willing to save you. Then again, if you tell the agent that social interactions are more important than money, it may start spending all of its money on social interactions, since that is what earns the higher reward. Eventually the model goes bankrupt and realizes that its history of actions was not such a good idea.
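A naive reward function for this situation might look like the sketch below. The weights are, of course, numbers I made up; the point is that any fixed weighting invites the agent to game it:

```python
def reward(money_spent, social_points):
    # "Social interactions matter more than money," encoded as an arbitrary weight.
    return 10.0 * social_points - money_spent

# The agent quickly discovers that paying for everyone's food maximizes reward...
print(reward(money_spent=50, social_points=10))   # 50 -> great, keep spending!
# ...because nothing in the function models a finite bank balance.
```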
But this makes the model less likely to spend money at all. One way to fix it is to weight decisions made close to a state as more important than decisions made farther away. In reinforcement learning this is handled by the discount factor, and you will often hear about the “discounted sum of rewards” as the standard way to score the rewards of an episode.
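In code, the discounted sum of rewards is just a weighted sum, where each step into the future gets multiplied by another factor of gamma (the 0.99 below is a typical but arbitrary choice):

```python
def discounted_return(rewards, gamma=0.99):
    # G = r0 + gamma*r1 + gamma^2*r2 + ..., computed back-to-front.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

print(discounted_return([1.0, 1.0, 1.0]))   # 1 + 0.99 + 0.9801 = 2.9701
```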
But discounting has its flaws too. If you are extremely kind to a person and then don’t meet them for two years, the RL model will treat the actions you took two years ago as obsolete. The next time you meet, they remember your kindness and you get a reward, but there is no way to associate that reward with the actions you took two years earlier (in RL terms, a long-term credit assignment problem).
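You can see why with some quick arithmetic. Suppose one timestep is a day and gamma is a fairly patient 0.99 (both numbers invented for illustration):

```python
gamma = 0.99
days = 2 * 365          # two years of daily timesteps
print(gamma ** days)    # ~0.00065: the old kindness is worth almost nothing
```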
To fix this, you might think of modeling each relationship separately. Yet again, this has its flaws: Person A might gossip about you to Person B, so your interactions with Person A affect your interactions (and hence your reward) with Person B.
And finally, suppose a model overcame all of this. The biggest problem would then be acceptance. It’s hard to imagine (as of now) that a robot could be your best friend. Even if the robot did everything your friend does, your own mental barriers would prevent it from ever developing the same kind of relationship you have with your friend.
If you don’t believe me, consider this: most of us have a smartphone, and most smartphones come with hyper-intelligent models in the form of virtual assistants. You would never consider one of them your counselor, or even a true assistant. Yet they probably know more about you than any other person does, simply because of how extensively we use our phones and apps.
So maybe four or five generations down the line, with a new generation fighting for AI equality and mingling with robots, RL agents will be able to form (or at least mimic) human-to-human interactions.
But until then, you can rest easy knowing that the relationships you developed with the people close to you are not going to be replaced… yet.
If you enjoyed reading this article, or learned something new, don’t forget to clap, leave a comment, and follow me!