Ethical Considerations of AI for Online Dating

James Neve
Published in Eureka Engineering · Dec 2, 2023

I’m James Neve, a member of the AI team at Pairs. I have a PhD in Artificial Intelligence and have published around ten papers in conferences and journals on the use of AI in online dating. For this second installment of the advent calendar, I’ll write about ethics in AI, which has become something of a hot topic recently. The exact implementation of our AI systems is often sensitive (especially for bad actor detection, where publishing details of our models might help malicious operators evade them), so I can’t go into implementation details. The ethics surrounding these systems, however, make for interesting discussion on their own, and will also let me introduce some of the machine learning work we do here.

Computer Science and AI Ethics

Ethics in computer science is a rapidly evolving and relatively contentious field, with a number of issues on which specialists have not reached consensus. When I was at university in the early 2010s, before streaming services, piracy and file-sharing were ubiquitous, and this made intellectual property the hot topic: is it ‘stealing’ to download a movie illegally, given that the person who made the movie still has it? As big tech grew in the following years, the focus shifted to personal data and privacy. A few companies control the personal information of billions of people, on the basis of users clicking “Agree” to a 50,000-word Terms & Conditions statement that none of them read. Is that data theirs to do with as they please? What are the consequences for a company that fails to protect it against hackers? And so on.

In the last couple of years, that focus seems to have shifted to AI. Part of this is due to Large Language Models (LLMs), which are trained on vast swathes of text and mimic human conversation far more effectively than their predecessors. Other advances in generative models have become increasingly effective at simulating creativity (something often considered uniquely human) by producing original visual art, music and stories that are difficult to distinguish from those made by people. The unfortunate consequences of these fascinating developments have been sensationalist newspaper articles about imminent AI takeover, and uninformed opinions from large numbers of people in power, from politicians to social media influencers.

To those of us who work with AI, the soundbites produced by public figures may not seem important, but the ethical guidelines produced by government science committees often reflect the public mood, which, as developers of consumer-facing products, we have an obligation to consider. Widely cited studies have surveyed the terms commonly found in official guidelines such as government documents (Jobin et al., 2019) and in guidelines published by companies (Hagendorff, 2020). A number of terms recur across these guidelines, implying that they matter both to the companies that develop AI and to the public at large:

  • Transparency: where AI is used, and how it reaches its results, should be clearly stated on the service.
  • Privacy: AI models are often trained on large amounts of user data. This data should be protected, and it should not be possible to reverse-engineer it from the model.
  • Safety: models should not cause foreseeable or unintentional harm to users.
  • Fairness: models should not show bias — for instance, towards or away from particular races, genders or other social groups.

In the next few sections, I’ll introduce some of the areas where online dating services use AI, within the context of the above principles.

Moderation

Moderation is one area where online services have an ethical obligation to provide the best tools possible. We have a responsibility to protect users from certain types of content, so that they feel comfortable and positive about using the service. This is especially true in Japan, where, more than in other countries, online dating services have a less-than-stellar reputation due to disreputable early services, and some users may therefore be apprehensive about signing up and interacting.

In some cases, rule-breaking messages simply represent misunderstandings on the sender’s part about the rules of the app, with no malicious intent. For instance, users are not allowed to request or send personal information such as phone numbers in the first message, as a preemptive measure against bad actors or anyone who might wish to harvest personal information from a large number of users. Many users break this rule by accident, and so systems are trained to remind them of it.
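To make that concrete, here is a minimal sketch of the kind of rule-based check that might sit behind such a reminder. It is purely illustrative: the patterns, keywords and function names are invented for this article, and a real system would pair learned models with rules like these.

```python
import re

# Toy patterns: Japanese-style mobile numbers (070/080/090) and a couple of
# contact keywords. Purely illustrative, not a production rule set.
PHONE_PATTERN = re.compile(r"0[789]0[-\s]?\d{4}[-\s]?\d{4}")
CONTACT_KEYWORDS = ("LINE", "電話番号")

def first_message_reminder(text: str) -> str | None:
    """Return a reminder if a first message appears to share contact
    information, otherwise None."""
    if PHONE_PATTERN.search(text) or any(k in text for k in CONTACT_KEYWORDS):
        return ("Sharing contact details in a first message is against "
                "the rules. Please get to know each other on the app first!")
    return None
```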

In other cases, messages are more malicious. Thankfully, only a vanishingly small number of users are outright aggressive in their messages. However, some do make sexual remarks that have the potential to make the user they matched with uncomfortable. Others may have entered the service with an intention besides forming a serious relationship, for instance using it to sell their company’s products. In either case, these messages undermine the user’s trust in the system: the objective of an online dating service is to help users find a serious relationship, and users who frequently have uncomfortable interactions that could never lead to one are more likely to give up on online dating.

In the context of the ethical principles above, moderation is relatively straightforward to evaluate, and all four principles apply. The rules for messages are clearly stated on the service, and text moderation models are trained on those rules; in that sense they are transparent, since users can see why a message was rejected. User messages are used to train models, but there is no way to reverse-engineer messages or personal information from them: the output is a binary “OK” or “NG”. To combat possible bias from the model, we also incorporate human moderation. The vast majority of messages break no rules and are immediately passed; some messages very clearly break rules and are immediately rejected; and the remaining messages, which fall into a grey area where the model is not fully confident, are reviewed by operators. This prevents innocent messages from being rejected, and users from being frustrated, when the model makes an error.
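That three-way split maps naturally onto confidence thresholds applied to a classifier’s output. Here is a minimal sketch, assuming a model that outputs a probability that a message violates the rules; the threshold values are invented for illustration and would in practice be tuned against precision targets.

```python
from enum import Enum

class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    HUMAN_REVIEW = "human_review"

# Illustrative thresholds: real values would be tuned so that automated
# decisions are almost never wrong, pushing uncertain cases to operators.
APPROVE_BELOW = 0.05
REJECT_ABOVE = 0.95

def route_message(violation_probability: float) -> Decision:
    """Route a message by the model's confidence that it breaks a rule.

    High-confidence cases are handled automatically; the grey area in
    between goes to human operators.
    """
    if violation_probability < APPROVE_BELOW:
        return Decision.APPROVE
    if violation_probability > REJECT_ABOVE:
        return Decision.REJECT
    return Decision.HUMAN_REVIEW
```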

Bad Actor Detection

A field closely related to moderation is bad actor detection. Bad actors are people who enter the service with objectives besides searching for a romantic partner: in most cases, trying to sell their own or their company’s products or services, attempting to recruit other users into a group such as a multilevel marketing scheme, or asking for money under the guise of being a genuine romantic prospect. Bad actor detection is less straightforward than moderation, and it raises an interesting ethical dilemma with regard to the principles above.

“Success” in the context of online dating means finding a serious relationship or even a life partner, which can only be seen as a positive outcome. Bad actors deprive people of that outcome, whether simply by wasting their time with pointless messages or, in the worst case, by enticing them to meet and even spend money under false pretenses. In addition, bad actors tend to be much more aggressive than users who send malicious or sexual messages: they will contact as many other users as they possibly can before being banned, so a single bad actor who evades our tools can damage a large number of users. We therefore have an ethical obligation to find and remove bad actors as quickly and effectively as possible.

Any knowledge that bad actors have of the models and techniques we use to catch them can be used to evade capture when making subsequent accounts. Telling them why they were banned won’t improve their behavior in future; it just tells them what behavior to change in order to evade detection. Privacy is also less of a concern: once a user is confirmed as a spammer, the personal information they submitted, including ID documents, becomes a useful resource for catching others like them in future. There are pitfalls in blindly applying overarching ethical principles to machine learning models: transparency, in this case, could prevent people from using the service normally and finding serious relationships. Whether that is more valuable than users’ comfort in interacting with an opaque model that estimates the likelihood of them being malicious is an interesting and difficult question.

Recommendation

Recommendation has existed in some form since the early years of the Internet, so ethical thinking in this area is relatively mature, and a number of papers provide concise and readable discussions of the ethics of recommendation (e.g. Milano et al., 2020). In the context of online dating, recommendation means presenting users with other users whom they might like and, crucially, who have a higher-than-usual probability of liking them back. This is called reciprocal recommendation, an interesting subcategory of recommender systems whose details are outside the scope of this article. I’ll discuss it in the broader context of person-to-person recommendation, without giving any specific details of how we recommend users to each other.
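Although the details are out of scope, the core idea of reciprocal recommendation is easy to illustrate. One aggregation commonly used in the research literature (e.g. the RECON system of Pizzato et al., 2010) is the harmonic mean of the two directional preference scores, which only ranks a pair highly when interest is plausible in both directions. The sketch below is a generic illustration of that idea, not a description of how Pairs scores recommendations:

```python
def reciprocal_score(p_u_likes_v: float, p_v_likes_u: float) -> float:
    """Harmonic mean of the two directional preference probabilities.

    The harmonic mean is dominated by its smaller argument, so a pair only
    scores highly when interest is plausible in both directions; strongly
    one-sided attraction is penalised heavily.
    """
    if p_u_likes_v == 0.0 or p_v_likes_u == 0.0:
        return 0.0
    return 2 * p_u_likes_v * p_v_likes_u / (p_u_likes_v + p_v_likes_u)

# One-sided interest scores poorly even when one direction is near-certain:
print(reciprocal_score(0.9, 0.9))   # 0.9
print(reciprocal_score(0.99, 0.1))  # ~0.18
```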

Consider the ethical principles outlined in the introduction. Where possible, we are transparent with recommendations: users are given some context for why they were shown another user (for instance, shared interests). It would be nice if this were always possible, but modern recommender systems are often based on large-scale preference correlations that are difficult to distill into a single communicable reason. It may also not be ethical to explain exactly how recommendation models work: doing so sometimes leads users to try to game the system to get themselves recommended to certain other users, even when they are not suitable matches.

While the principles of privacy and safety overlap somewhat, a recommender system is usually trained on data users have made publicly visible on the service, so privacy is less of a concern here. Safety, on the other hand, is a central concern of recommender system design, because of the potential for the system to directly harm the user. If a user is recommended other users who are less suited to them, or to whom they are less suited, than the users who appeared in their search results, the recommender system has had a negative impact: it has reduced the user’s chance of finding a serious relationship by suggesting suboptimal connections. While the average effect is easy to measure, the outliers present an interesting question: if a recommender system improves the average user’s success rate but negatively impacts certain users, has it caused harm, and if so, does this represent an ethical violation? Or is the average user’s experience what matters most?
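One way to make that question concrete is to look beyond the mean when evaluating a change to a recommender: report not just the average uplift but also the share of users who were made worse off. The sketch below is hypothetical; the numbers and the assumption of matched per-user estimates (say, from a crossover experiment or a counterfactual model) are mine, not a description of how we evaluate our systems.

```python
import statistics

def evaluate_uplift(baseline: list[float], treatment: list[float]) -> dict:
    """Compare matched per-user success-rate estimates under an old and a
    new recommender. The mean alone can hide a harmed minority."""
    deltas = [t - b for b, t in zip(baseline, treatment)]
    return {
        "mean_uplift": statistics.mean(deltas),
        "fraction_harmed": sum(d < 0 for d in deltas) / len(deltas),
    }

# A positive average can coexist with a sizeable harmed minority:
print(evaluate_uplift([0.10, 0.20, 0.30], [0.30, 0.35, 0.20]))
# {'mean_uplift': 0.0833..., 'fraction_harmed': 0.333...}
```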

The final principle, fairness, is a particularly difficult problem for developers of all recommender systems, but especially for those of us tasked with recommending people rather than objects. Recommender systems trained on large bodies of data have a natural bias towards recommending whatever is already popular. While this may not be a problem for an online shopping service, it is a big problem for any service that relies on people matching with each other. Initiating contact with an extremely popular user is often a waste of time for both parties: the popular user already has thousands of Likes and doesn’t want more, and the person who initiated never gets a response and feels disillusioned. Recommending popular users also induces bias towards certain groups, which in the case of online dating means the young, rich and attractive. There are proposed solutions to this problem in the research literature, for example improving reciprocal outcomes by training a model to balance the importance of the recommender and the receiver (Kleinerman et al., 2018), but this is very much an ongoing research area, both for dating recommendation and for recommender systems in general.
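As a simple illustration of one family of mitigations (distinct from the reciprocal-training approach cited above), a recommendation score can be discounted by the candidate’s current popularity so that attention is spread more evenly. This is a generic sketch of inverse-popularity weighting; the function and its parameters are invented for illustration and are not what we deploy:

```python
import math

def adjusted_score(base_score: float, pending_likes: int,
                   strength: float = 0.1) -> float:
    """Discount a recommendation score by the candidate's current popularity.

    Candidates already flooded with unanswered Likes are down-weighted,
    nudging attention towards users who are more likely to respond;
    `strength` controls how aggressive the correction is.
    """
    return base_score / (1.0 + strength * math.log1p(pending_likes))

# A user with 1,000 pending Likes is penalised relative to a similar user
# with 5, even though the raw model score is slightly higher:
print(adjusted_score(0.90, pending_likes=1000))  # ~0.53
print(adjusted_score(0.85, pending_likes=5))     # ~0.72
```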

Conclusions

Which ethical principles are appropriate for AI is an evolving discussion. It is clear that certain principles outlined by companies and governments are useful general guidelines in some situations: all four principles apply neatly to moderation, for example. It is just as clear that there are situations where they conflict: in the case of bad actor detection, increasing transparency decreases safety. Finally, there are cases where meeting the principles perfectly all the time is not possible, or compromises the effectiveness of the system. A perfectly fair recommender system shows you a completely random selection of users each time; a useful recommender system is not perfectly fair.

As AI engineers working with data that affects the general public, and with the spotlight currently on our field, it is more important than ever that we continue to consider the ethics of the models we develop. However skeptical we may be of the imminent emergence of terrifying machines possessing general intelligence, very public accidents, such as those recently involving self-driving cars in California, stoke public fears of AI and encourage governments to rush through knee-jerk regulations. In order to continue researching and developing useful machine learning models that help people find serious relationships and life partners, we must weigh the ethical implications of the models we build, and mitigate as far as possible their potential to harm users.
