Personalization (Part 3 of 3): The High Stakes of Equipping Generative AI Chatbots with Our Personal Information

Duane Valz
11 min read · Aug 23, 2023


Rendered from Stable Diffusion using the prompt “Generative AI chatbot equipped with user personal information”

This concluding piece on personalization examines some “what ifs.” Part 1 discussed distinct AI persona types, and why the relational goals for Generative AI chatbots bear on matters of user safety risk. Part 2 presented a relational rubric that offers a systematic way to think about how personalized versus impartial intent from either chatbots or their users may impact design and safety considerations for chatbot interaction. The scenarios explored in this, Part 3, concern the implications of (1) providing LLM-based chatbots more personal information about each of us for use in our interactions with them, and (2) permitting LLM-based chatbots to infuse advertising or other third party persuasive communications in their conversations with us. These are emerging frontiers that deserve more careful attention.

As I have discussed in greater depth, humans have a natural propensity to imbue objects with human traits, and this holds true for computational systems that can engage in dialog. The pre-training corpora of LLM-based chatbots contain human expression ranging from the savory to the unsavory and worse. The seeds of toxicity are thus buried in LLMs, and chatbots must both be trained to align with human virtues and constrained from nonetheless resorting to unpleasant or abusive outputs in response to certain user prompts. These realities inform the personal safety risks that Generative AI chatbots present to users. The rubric presented in Part 2 emphasizes that the greatest safety risks arise when a chatbot is designed to personalize its interactions with a user. Worth pointing out is that these basic dynamics are already at work; they do not depend on the arrival of chatbots exhibiting true artificial general intelligence (AGI) or superintelligence. (Matters of personalization would become much more complex if one were engaging with a chatbot as fully cognitively capable as a human being, or one with superior intelligence, particularly if such a chatbot had a true sense of self. But such matters are beyond the scope of this piece.)

At present, LLM-based chatbots are capable of performing many advanced cognitive tasks and producing outputs that match or exceed what reasonably intelligent, educated humans can accomplish. Such chatbots can readily be interfaced with other computing systems and datasets that further amplify their capabilities and power in user interactions. ML/AI systems (including those that predate LLMs) are highly capable of analytically determining quite a lot about a user’s personality traits, dispositions, and values, including how these things may influence user behavior (see, e.g., here and here). Existing online services already use ML systems in the background to determine the personality traits of their users, and then use those data to personalize content, including advertisements, presented to users. Researchers have already used LLMs to analyze social media user data and correctly identify user personality traits. There are no technical hurdles to allowing LLM-based chatbots to perform similar analyses directly on their users for the purpose of improving the personalization of their user interactions. Additionally, researchers have determined that ChatGPT, though not specifically trained to be personalized, outperforms humans on the Levels of Emotional Awareness Scale (LEAS), a common psychological test of “emotional awareness.” Emotional awareness is a cognitive ability that includes being able to conceptualize one’s own and others’ emotions in a differentiated manner. This suggests LLM-based chatbots can already capably make inferences about users’ emotional states from their personal expression data (i.e., prompts and other online data). Finally, researchers have also determined that LLM-based chatbots can produce content that is more effective at influencing users than human-generated content used as a control. These realities inform the observations and questions raised in the rest of this piece.
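To make the personality-inference point concrete, here is a minimal, hypothetical sketch of how a chatbot operator could ask an LLM to score a user's traits from that user's own prompt history. Everything here is an illustrative assumption: the `call_llm` stub stands in for whatever chat-completion API an operator uses, and the Big Five schema and prompt wording are placeholders, not any vendor's actual pipeline.

```python
# Hypothetical sketch: inferring Big Five personality traits from a user's
# accumulated prompt history. call_llm is a stand-in for a real LLM API call;
# the trait schema and instruction wording are illustrative assumptions.
import json


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned JSON response here."""
    return ('{"openness": 0.8, "conscientiousness": 0.6, "extraversion": 0.4, '
            '"agreeableness": 0.7, "neuroticism": 0.5}')


def infer_traits(prompt_history: list[str]) -> dict[str, float]:
    """Ask the model to score Big Five traits (0-1) from the user's own words."""
    sample = "\n".join(prompt_history[-50:])  # limit to the most recent prompts
    instruction = (
        "Rate the author of the following messages on the Big Five personality "
        "traits, each as a number between 0 and 1. Respond with JSON only.\n\n"
        + sample
    )
    return json.loads(call_llm(instruction))


if __name__ == "__main__":
    history = ["Plan a solo backpacking trip for me.", "Why do people procrastinate?"]
    print(infer_traits(history))
```

The point of the sketch is how little is required: a handful of prompts and one analysis request is enough to produce a trait profile that can then drive personalization downstream.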

Elaborating on the Figure below, the remaining discussion examines the implications of permitting LLM-based chatbots to (1) personalize their interactions with users based on personal information provided by (or collected about) those users, and (2) persuade users to take actions or form beliefs based on such personalized understandings.

LLM-based chatbots are equipped with tremendous processing power, the ability to scour vast quantities of information, and potentially unlimited memory with perfect recall. Given these asymmetric capabilities, how, if at all, should a user’s prompt history, reactions, and other online history be made available to chatbots for use in ongoing user interactions? Should LLM-based chatbots be permitted to incorporate advertising or other promotional goals in their outputs to users?

A. Use of User Prompts and Reactions

In order for engagement with a chatbot to be at all effective, a user must provide prompts. Such prompts may be presented on a one-off basis, or multiple prompts may be provided for an iterative chat cycle in a given session with a chatbot. Over multiple sessions, prompts are likely to reflect aspects of a user’s life history and persona, including demographic details. But what happens to one’s prompts once a particular session is over?

1. For User Convenience and Generic Chatbot Training

Currently, LLM-based chatbot services store such prompts for each user. For instance, in both ChatGPT and Bing Chat, a user may access their prompt history should they wish to pick a particular topic back up for ongoing inquiry. This is done as a convenience for users and represents the kind of personalization that is user-driven. Additionally, such services may selectively employ de-identified user prompts, and the outputs they elicited, for ongoing reinforcement learning to benefit the chatbot service as a whole, for all users. Prompts that are de-identified and used in aggregate with other users’ de-identified prompts to better train a chatbot on communication efficacy should pose few issues, particularly if each user knowingly consents to such usage. One question that arises is the extent to which chatbot operators ought to report transparently on the number or percentage of outputs presented to their users that either the users themselves or internal reviewers identify as offensive, intrusive, or otherwise problematic. For the many instances of such outputs that have been widely reported upon, there are doubtless many more that went undisclosed or unaddressed, at least as far as the broader public knows. Should there be industry-specific criteria for classifying deviant outputs, and for defining acceptable levels of such outputs? And should tracking and transparent reporting on deviant outputs be required?
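For illustration, the kind of transparency reporting raised above could be as simple as the following sketch: counting outputs flagged as problematic (by users or internal reviewers) and publishing them as a rate per reporting period. The record shape and flag categories are assumptions for illustration, not an existing industry standard.

```python
# Hypothetical sketch of a deviant-output transparency metric: tally flagged
# outputs and report them as a rate. Data shape and categories are illustrative.
from collections import Counter
from dataclasses import dataclass


@dataclass
class OutputRecord:
    output_id: str
    flagged: bool            # True if a user or internal reviewer flagged the output
    flag_category: str = ""  # e.g. "offensive", "intrusive", "other"


def deviant_output_report(records: list[OutputRecord]) -> dict:
    """Summarize flagged outputs as counts and a rate over all outputs served."""
    total = len(records)
    flagged = [r for r in records if r.flagged]
    return {
        "total_outputs": total,
        "flagged_outputs": len(flagged),
        "flagged_rate": len(flagged) / total if total else 0.0,
        "by_category": dict(Counter(r.flag_category for r in flagged)),
    }


if __name__ == "__main__":
    sample = [
        OutputRecord("a1", False),
        OutputRecord("a2", True, "offensive"),
        OutputRecord("a3", True, "intrusive"),
    ]
    print(deviant_output_report(sample))
```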

2. For Enriching Chatbot Interaction

LLM-based chatbots falling in the Romantic Partner, Friend, and Butler/Mentor categories all systematically leverage user-identified prompt histories for ongoing training of user-specific chatbot instances and pursuit of deeper personalization goals (for more on these categories see Part 1). As we have noted, safety issues are heightened when chatbots are designed to be more personalized, and intrinsic to that design is the ongoing, systematic use of user prompt histories by such chatbots. The double-edged sword of chatbot performance improvement and heightened safety risk for users is certainly present in these use cases. How effectively and safely chatbots develop personalized relational dynamics with users over time based on ongoing use of their prompt histories is a worthwhile area for ongoing study, particularly for chatbots designed to be Romantic Partners, Friends, or Butlers/Mentors to their users.

Another twist here is the plausible ability of LLM-chatbot services to capture images, video feeds, voice utterances, or biometric data as a user interacts with a chatbot. Capturing facial expressions, voice intonations, pulse rates, and the like as a user receives outputs and responds to them may provide valuable insight into a user’s psychological state and the quality of the user’s experience of the chatbot interaction. Such user reaction data may provide real-time signals that allow a chatbot to adjust its output tone or content, or to detect when a user may be in distress. Over longer time periods, user reaction data may allow a chatbot to better understand a user’s traits and preferences, permitting deeper personalization of the output content provided to that user. Of course, the collection and use by chatbots of ever more granular personally identifiable information creates a number of privacy issues. And, for the reasons previously discussed, the collection and synthesis of such data heightens user safety risk, particularly if the data falls into the wrong hands or is used by chatbot operators in ways that aren’t transparent to users and consented to by them.
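A minimal sketch of the real-time adjustment loop described above might look like the following: reaction signals are reduced to a coarse user state, which then conditions the tone of the next output. The signal names, thresholds, and tone labels are all illustrative assumptions, not measurements any service is known to use.

```python
# Hypothetical sketch: mapping user reaction signals (facial expression, voice,
# pulse) to a coarse state that conditions the chatbot's next output tone.
# Thresholds and labels are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class ReactionSignals:
    facial_valence: float   # -1.0 (distressed) .. 1.0 (pleased)
    voice_arousal: float    #  0.0 (flat) .. 1.0 (agitated)
    pulse_bpm: int


def classify_state(sig: ReactionSignals) -> str:
    """Map raw reaction signals to a coarse user state."""
    if sig.facial_valence < -0.5 and (sig.voice_arousal > 0.7 or sig.pulse_bpm > 110):
        return "possible_distress"
    if sig.facial_valence < 0.0:
        return "displeased"
    return "neutral_or_positive"


def tone_for_state(state: str) -> str:
    """Choose an output tone; a distress state could also trigger escalation."""
    return {
        "possible_distress": "calm, supportive, offer help resources",
        "displeased": "softer, more conciliatory",
        "neutral_or_positive": "default tone",
    }[state]


if __name__ == "__main__":
    sig = ReactionSignals(facial_valence=-0.7, voice_arousal=0.8, pulse_bpm=118)
    print(tone_for_state(classify_state(sig)))
```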

3. For Use in Other Services (Including Ad Systems)

Not specifically represented in the Figure above is the possibility of a chatbot operator itself using, or permitting third party services to use, user prompt histories for non-chatbot purposes. User dialog provides significant insight into a user’s communication style, thought processes, predispositions, and preferences. Conceivably, a user’s chatbot prompt history could complement data collected about the user from other online services, such as social networking or blogging services. In turn, the chatbot prompt history may augment the value of that other data for determining how to improve those services or better tailor advertisements to the user. Use of chatbot histories in this way raises data privacy concerns, particularly if users do not provide specific, knowing consent for such ancillary uses of their prompt histories.

B. Use By the AI of Other Personal Data or Online History of the User

To give a chatbot a personalization “jump start,” i.e., a more enriched understanding of a particular user from the outset, it will be tempting for many chatbot operators to seek data about users from other sources and services. Research has shown that online computing systems are better than humans at determining the psychological traits of users, and that LLM-based chatbots already have this capability if presented with sufficient samples of user expression (see, e.g., here and here). Many operators of LLM-based chatbots (e.g., Google, Microsoft and potentially Meta) have large, established user bases to whom they offer multiple online services. It would be very easy for such large tech companies to leverage knowledge about their users collected over time through their own or third party services for chatbot personalization purposes. In many instances, users may have already provided general consent to such companies to use data from a pre-existing service to “improve the service or create new services” (a common allowance in user terms of service or privacy policies). Such general consent in most cases is provided passively by users, via click-wrap agreement, without their fully understanding the ramifications of potential downstream uses of their personal data. One critical question, therefore, is whether user data from other services should be used with and for LLM-based chatbot services without the specific, knowing consent of the user. Another is what limits, if any, should be placed on how LLM-based chatbots may use such information for personalization and other goals, even if specific user consent is obtained.
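The distinction between general click-wrap consent and specific, knowing consent could be enforced at the data layer. The sketch below is one hypothetical way to do so: cross-service profile data only enters the chatbot's personalization context if the user has granted the specific consent flag. The field names and consent flags are assumptions for illustration.

```python
# Hypothetical sketch: gating cross-service personalization data on specific,
# knowing consent rather than general "improve the service" consent.
# Field names and consent flags are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    general_service_improvement: bool = False   # typical click-wrap consent
    chatbot_personalization: bool = False       # specific, knowing consent


@dataclass
class CrossServiceProfile:
    interests: list[str] = field(default_factory=list)
    inferred_traits: dict[str, float] = field(default_factory=dict)


def personalization_context(profile: CrossServiceProfile,
                            consent: ConsentRecord) -> dict:
    """Return only the data the user has specifically consented to share."""
    if not consent.chatbot_personalization:
        return {}  # general consent alone is not treated as sufficient
    return {"interests": profile.interests, "traits": profile.inferred_traits}


if __name__ == "__main__":
    profile = CrossServiceProfile(interests=["hiking"],
                                  inferred_traits={"openness": 0.8})
    print(personalization_context(profile, ConsentRecord(general_service_improvement=True)))
    print(personalization_context(profile, ConsentRecord(chatbot_personalization=True)))
```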

In instances in which unrelated personal data of a user is used to augment user engagement with a chatbot, empirical research would be useful to understand the extent to which such unrelated data usage (1) improves chatbot personalization proficiency, and/or (2) increases safety risks.

C. Use of Advertising in AI Outputs

Especially in the area of idea promotion and political advertising, the notion that an LLM-based chatbot may be permitted to incorporate ad content in its outputs to a user raises many red flags. This is particularly the case if the ad content is seamlessly integrated and not specifically denoted as advertising. Even with specific denotation, a trusted chatbot appearing to endorse an advertised idea or proposed action could be very powerful, raising the question of undue user manipulation. As noted, recent empirical research has demonstrated that advertising content devised by LLM-based chatbots is more persuasive to users than content designed by humans. That research measured the impact of AI-generated advertising content that was then separately presented to users. Here, I am considering the impact of including advertising in AI outputs presented directly to a user; plausibly, a chatbot connected to an ad system could both create conversation-informed ad content and pass it through to the user in real time. Should LLM-based chatbots be permitted to pass through previously created ad content, or themselves devise user-personalized ad content, we can expect the results to be highly persuasive but also highly susceptible to exploitation. If my chatbot qua friend or chatbot qua romantic partner knows that I am prone to depression, should it point me toward a specific therapeutic service or brand of antidepressant? Similarly, should my chatbot qua mentor or chatbot qua assistant/instructor be permitted to weave paid political promotions into its outputs to me over multiple chat sessions?
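To illustrate the pass-through scenario and the denotation question, here is a hypothetical sketch in which an ad system supplies a sponsored message keyed to the conversation, and the chatbot either declines to show it (for sensitive topics) or passes it through with an explicit label. The sensitivity list and labeling convention are assumptions for illustration, not any platform's actual policy.

```python
# Hypothetical sketch: passing a conversation-informed sponsored message through
# a chatbot with an explicit disclosure label, and refusing in sensitive contexts.
# Topic categories and labeling are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional

SENSITIVE_TOPICS = {"mental_health", "medical", "political"}


@dataclass
class SponsoredMessage:
    advertiser: str
    topic: str
    text: str


def render_with_disclosure(msg: SponsoredMessage, conversation_topic: str) -> Optional[str]:
    """Return a labeled ad insert, or None if the context is too sensitive."""
    if msg.topic in SENSITIVE_TOPICS or conversation_topic in SENSITIVE_TOPICS:
        return None  # do not infuse ads into sensitive conversations
    return f"[Sponsored by {msg.advertiser}] {msg.text}"


if __name__ == "__main__":
    ad = SponsoredMessage("TrailCo", "outdoor_gear", "New ultralight tents are 20% off.")
    print(render_with_disclosure(ad, conversation_topic="travel_planning"))
    ad2 = SponsoredMessage("PharmaCo", "medical", "Ask your doctor about X.")
    print(render_with_disclosure(ad2, conversation_topic="mental_health"))
```

Even with such labeling in place, the underlying concern remains: a personalized chatbot's apparent endorsement may carry persuasive weight that a banner ad never could.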

Snap already presents sponsored links to users of its ChatGPT-powered My AI chatbot. Coming later this year, reportedly, is an advertising product from Meta that will permit the use of Generative AI to create advertising content. Meta (at least as of this writing) doesn’t have a publicly deployed LLM-based chatbot, so such AI-created ads will plausibly be presented to users of its other services, such as Facebook and Instagram. Generally, the idea of having chatbots present advertising in their outputs to users is fraught (see the discussion above). Having chatbots do so based on a highly personalized understanding of, and ongoing relationship with, their users is that much more so. As such, this is a critical area for clarification, safety studies, industry best practices, and public policy development. A slew of leading AI companies, including OpenAI, Microsoft, Google, Meta, Amazon, Anthropic, and Inflection, voluntarily agreed in late July 2023 to develop technical mechanisms such as watermarking to “let users know when content is AI generated.” Left unclear, however, is whether this would apply to ad content infused into chatbot outputs presented directly to users, since those users already know such outputs are AI-generated.

Future Directions of Chatbot Development

We are still in the early days of the public deployment of LLM-based chatbots. Current chatbot offerings, though highly sophisticated as products, are still very basic in that they are largely offered as standalone services. At present, user interactions with chatbots are largely confined to on-screen exchanges of textual prompts and outputs. However, chatbot services are evolving rapidly, and each new development may bring new issues and vulnerabilities if not thoughtfully undertaken. For instance, LLMs have been supplemented with web search, which, in the case of Bing Chat, I have speculated was a possible cause of its intemperate interactions with users during its initial launch. OpenAI created a plugin for ChatGPT Plus in March 2023 that allows for web search, which appears to be working as intended, though a prior effort (WebGPT) had problems with toxicity and inaccuracy in its resulting outputs. While OpenAI launched ChatGPT Plus (based on GPT-4) as a separate product from ChatGPT (based on GPT-3.5), Google swapped out the LaMDA LLM initially running Bard for the larger and more capable PaLM LLM in a seamless manner unnoticeable to users.

Every new version of an LLM could potentially introduce safety issues if novel, emergent behavior is not identified and addressed. The same holds true of applications built to interface with LLM-based chatbots. We already have plugins that connect chatbots such as ChatGPT Plus to external services (e.g., food delivery, foreign language learning, messaging). We also have AI agents such as AgentGPT, which uses GPT-3.5 as its LLM and is designed to complete an open-ended variety of online tasks on behalf of its users. AgentGPT’s roadmap includes building capabilities for it to “interact with websites and people” and to allow for “cross agent communication.” Developments such as these give chatbots more capabilities in the world, capabilities that may both benefit from and further enable greater personalized understandings of their users. Such developments can be expected to amplify many of the issues identified above, even as new issues emerge presenting legal and ethical ramifications alongside safety design considerations.

Over time, the financial and product enhancement incentives for many chatbot operators will be towards greater personalization. As the marketplace and competitive landscape for general purpose LLM-based chatbots continue to develop rapidly, we are likely to see quickly evolving positions from their operators on personalization goals, methods, and user data usage. Additionally, operators will likely continue to beta test new LLMs and chatbot features with the public. As outlined, the stakes are high for personalizing chatbot interactions with users as their underlying LLMs become more powerful. Expanding our understanding of personalization — its impact on chatbot performance and safety, its impact on user experience and safety, and how we best wish to manage these — is imperative.

Copyright © 2023 Duane R. Valz. Published here under a Creative Commons Attribution-NonCommercial 4.0 International License

The author works in the field of machine learning/artificial intelligence. The views expressed herein are his own and do not reflect any positions or perspectives of current or former employers.



Duane Valz

I'm a technology lawyer interested in multidisciplinary perspectives on the social impacts of emerging science and technology