Critical Summary on the Ethical Use of NLP

Safety Issues of Chatbots in the Healthcare Industry

Amy Yang
Apr 8, 2023

Introduction

Applications of Natural Language Processing (NLP) such as sentiment analysis, language translation, speech recognition, chatbots, named entity recognition and text summarisation have become increasingly prevalent over the last few years. Chatbots are among the most widely used of these applications, deployed across a variety of industries to automate repetitive tasks and increase customer engagement.

With the wide use of chatbots, several ethical issues have drawn attention and need to be addressed to ensure they are developed and used responsibly. The main ethical issues associated with chatbots are transparency, bias, safety and privacy.

Transparency — Being transparent about the status of a dialogue agent as non-human and about the limitations of its capabilities enables users to make informed decisions and helps to build trust with users (Ruane et al., 2019). This is especially important where users provide sensitive information such as financial details and health concerns.

Bias — Henderson et al. (2018) suggested that chatbots can inherit biases of gender, race, personal viewpoint and several others from the data used to train them, and that such biases are likely to be encoded in language models and dialogue systems.

Safety — In chatbots, safety risks lie in the possibility of offensive or harmful effects on human interlocutors (Henderson et al., 2018). As Henderson et al. (2018) explain, stability and coherent output are difficult to achieve in dialogue systems, so model divergence could lead to undesirable behaviours without performance guarantees. One example is Microsoft’s Tay chatbot, which in March 2016 generated offensive and inflammatory tweets, including racist, sexist and anti-Semitic statements, within hours of its launch. The Tay incident emphasised the importance of considering safety concerns when chatbots and other AI systems are designed and implemented.

Privacy — The interaction between humans and chatbots also raises privacy concerns about data collection, access, storage and usage. Ruane et al. (2019) indicated that users may be inclined to self-disclose because the user-agent interaction feels anonymous and because self-disclosure helps to improve the user experience. To prevent users’ information from being leaked or misused, chatbots should be designed and implemented to comply with legal requirements such as the GDPR and other privacy guidelines.

Admittedly, each issue described above can play a vital role in the development of ethical dialogue systems across industries. The following discussion concentrates on the safety issues of chatbot applications in the healthcare industry for these reasons:

  • The potential safety risks associated with chatbots are significant as they are likely to result in physical or emotional harm to users. For example, harmful medical advice provided by chatbots could put a user’s health at risk. Similarly, chatbots that are designed to manipulate or deceive users could have negative consequences for users’ emotional and psychological well-being such as anxiety, stress or mistrust.
  • In terms of ethical concerns for chatbots, safety issues are rarely discussed in the literature or in practice (Henderson et al., 2018). The exploration and discussion on how to improve the safety of chatbots could potentially contribute to the refinement of dialogue systems in the healthcare industry and make a positive impact on patients.

Approach

To carry out the initial research on the ethical use of chatbots, ChatGPT was used to gain a general idea of the types of ethical issues associated with chatbots and their impact on the healthcare industry. With a better understanding of the issues, a number of relevant journal articles were identified by searching keywords such as ‘safety issues of chatbots in healthcare’ in Google Scholar. By reviewing these articles and the existing research, key findings about the current development of chatbot applications in the healthcare industry were summarised and served as evidence for the Critical Discussion in the following section.

The approach combines the use of ChatGPT with the perusal of relevant research papers, taking the benefits and limitations of ChatGPT into consideration. On the one hand, ChatGPT is a powerful language model with a strong capability to generate responses to questions, summarise long text and help users explore information or topics (Tarapara, 2023). This serves as a more efficient way to understand an unfamiliar topic or area than researching from scratch and summarising the information ourselves. On the other hand, as George (2023) indicates, ChatGPT has some limitations for research writing, such as its inability to create original ideas, the risk of generating inaccurate or inconsistent research content, and limited knowledge of events that occurred after 2021. Considering these limitations, a detailed review of research papers was performed to guarantee the accuracy of the content and minimise the risk of plagiarism.

Critical Discussion

Findings

Researchers have raised several concerns about the safety risks of chatbots in healthcare. One main area of concern is in medical domains, where chatbots are used to manage emergency situations or provide medication recommendations (Bickmore et al., 2018). Bickmore et al. (2018) pointed out the potential safety risks when users act on incomplete or inaccurate information provided by chatbots without consulting medical professionals. In their study, where Siri, Alexa and Google Assistant were asked to decide on an action for medical problems, 29% of the reported actions were likely to cause patient harm, including 16% that could have resulted in death. Parviainen & Rantala (2021) suggested that chatbots lack the capability to assess patients accurately and may cause indirect harm by not knowing the full details of the personal factors associated with them.

Another concern is over-reliance on chatbots by users, who may become over-attached to chatbots and avoid visiting mental health professionals in person. Parviainen & Rantala (2021) expressed this concern: ‘There are risks involved when patients are expected to self-diagnose, such as a misdiagnosis provided by the chatbot or patients potentially lacking an understanding of the diagnosis’. Patients’ trust in a chatbot may affect how they value a doctor’s views and impair their trust in physicians. Long-term use of chatbots might also have a negative impact on patients’ behaviour, such as a lack of personal contact and a loss of the ability to manage conflict (Denecke et al., 2021).

Discussion

The safety issues of healthcare chatbots could have a widespread impact on society and affect different stakeholders in the healthcare industry including patients, healthcare providers, chatbot developers and regulatory agencies.

Patients — Chatbots could provide patients with inaccurate or incomplete information, resulting in misdiagnosis, incorrect treatment or delays in receiving necessary care. Such safety issues could also raise questions about the reliability and accuracy of chatbots and decrease patients’ trust in healthcare chatbots, or in technology-based healthcare solutions more broadly. In a study run by UserTesting of five healthcare chatbot apps (Ada, HealthTap, Mediktor, Your.MD and Symptomate), consumers did not fully trust the diagnoses and information provided by the chatbots (Dietsche, 2019). They were cautious about the unfamiliar drug brands recommended by the chatbots and questioned whether those brands complied with HIPAA standards. These potential safety risks could increase patients’ distrust of chatbots and discourage them from using chatbots for healthcare services.

Healthcare providers — Healthcare providers who use chatbots in clinical practice could be liable for any incorrect or harmful advice provided by chatbots. The limitations of chatbots may also impair trust in healthcare services (Parviainen & Rantala, 2021). In addition, as Parviainen & Rantala (2021) advise, the safety issues of chatbot applications are likely to increase the workload in clinical practice rather than reduce it: because patients self-diagnose, healthcare providers may need to spend time and effort correcting patients’ preliminary misjudgements, adding to their workloads. Apart from this, clinicians lack trust in the capabilities of chatbots and have concerns about potential clinical risks and issues of accountability (Parviainen & Rantala, 2021). This distrust and uncertainty about chatbots increase professionals’ ‘cognitive load’ (Zhou et al., 2017) and affect their working memory.

Chatbot developers — Chatbot developers are responsible for designing and programming chatbots that function safely, reliably and accurately. They may face legal action if their chatbots cause harm to patients or healthcare providers. At the same time, safety issues associated with chatbots can jeopardise the reputation of the developer and the company they work for. The loss of trust in chatbots among patients and healthcare providers may lead to decreased adoption and use of chatbots, which affects the developers and companies that specialise in chatbot development.

Regulatory agencies — The safety issues with healthcare chatbots can have an important impact on the regulatory agencies responsible for overseeing the development, deployment and use of chatbots. Regulatory agencies may enforce legal and financial penalties against developers and companies that fail to comply with regulatory requirements. There could also be increased regulatory scrutiny of chatbots to ensure that relevant laws, regulations and standards are complied with; for example, regulatory agencies could conduct more frequent and rigorous audits and evaluations. Meanwhile, regulatory agencies may develop new regulations or guidelines specifically for healthcare chatbots, given their significant impact on patients and healthcare providers.

Critical Analysis

There are several approaches to counter the effects of safety issues associated with chatbots in the healthcare industry. Pros and cons are identified for each approach below.

Developing a framework for governing chatbot use in healthcare

One of the key approaches is to develop a framework for governing the use of chatbots in healthcare services, which requires input from various stakeholders including healthcare providers, regulatory bodies, and technology experts. In December 2020, a framework named ‘Chatbots RESET’ was created to govern the responsible use of chatbots in healthcare by the World Economic Forum and Mitsubishi Chemical Holdings Corporation together (World Economic Forum, 2020). The framework introduced ten principles for the responsible use of chatbots and recommended actions that can be taken to implement the principles by three types of stakeholders including developers, providers, and regulators (Venkataraman & Arunima, 2020). Providers consist of hospitals, insurance companies, technology providers (direct to patients) and the government (as a provider).

The approach has both advantages and disadvantages. On the one hand, it could have a crucial impact on the improvement of healthcare services: a framework that includes appropriate safety measures can minimise the risk of inaccurate or unsafe medical advice provided by chatbots and build trust in healthcare chatbots among both patients and healthcare providers. At the same time, an established framework could give a variety of stakeholders better guidance on developing, deploying and regulating the responsible use of healthcare chatbots. On the other hand, developing and implementing such a framework may require significant resources in terms of time, money and personnel. Slow implementation of a framework, along with complex regulations and guidelines, could create barriers for healthcare providers and constrain the innovation of chatbot solutions.

Designing a reliable handoff system in healthcare chatbots

Another approach is to design a reliable handoff system in chatbots so that a patient can be transferred from a chatbot to a human healthcare provider when necessary, ensuring that patients receive the appropriate level of care and attention. There are several ways to trigger the handoff: the patient may request to talk with a human healthcare provider, or the chatbot may recognise its inability to address the patient’s needs. Alternatively, the transfer can proceed automatically when a certain amount of time has passed or when the patient’s concerns meet defined severity criteria. A minimal sketch of this trigger logic is shown below.
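
As a minimal sketch of such trigger logic, the following Python snippet uses a hypothetical should_hand_off helper and purely illustrative thresholds; a production system would encode validated clinical triage rules rather than these placeholder values.

```python
from dataclasses import dataclass

# Illustrative thresholds only; a real system would encode validated triage rules.
SEVERITY_THRESHOLD = 3      # e.g. 0 = informational ... 5 = emergency
MAX_SESSION_MINUTES = 15    # escalate if the conversation runs too long
LOW_CONFIDENCE = 0.6        # escalate if the bot is unsure of its own answer


@dataclass
class Turn:
    """One user turn, annotated by upstream components (field names are hypothetical)."""
    text: str                # what the patient typed
    severity: int            # estimated severity of the reported symptoms
    bot_confidence: float    # chatbot's confidence in its proposed response
    minutes_elapsed: float   # time since the conversation started


def should_hand_off(turn: Turn) -> tuple[bool, str]:
    """Return (True, reason) when the conversation should be routed to a human clinician."""
    if "human" in turn.text.lower() or "doctor" in turn.text.lower():
        return True, "patient requested a human provider"
    if turn.severity >= SEVERITY_THRESHOLD:
        return True, "symptoms meet the severity criteria"
    if turn.bot_confidence < LOW_CONFIDENCE:
        return True, "chatbot cannot confidently address the request"
    if turn.minutes_elapsed > MAX_SESSION_MINUTES:
        return True, "conversation has exceeded the time limit"
    return False, "continue automated dialogue"


if __name__ == "__main__":
    turn = Turn(text="I have severe chest pain", severity=5,
                bot_confidence=0.9, minutes_elapsed=2.0)
    print(should_hand_off(turn))  # (True, 'symptoms meet the severity criteria')
```

In practice, the escalation reason would also be logged and passed to the human provider so that the handoff preserves the conversational context.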

Designing a reliable handoff system in chatbots benefits both patients and healthcare providers. The handoff system could enable patients to choose the level of care they need and improve patient outcomes by providing proper medical advice or managing emergency situations. In addition, the system could improve the efficiency of healthcare services by letting chatbots handle routine or simple tasks while human healthcare providers focus on more complex cases. However, there are also drawbacks to this approach. Firstly, implementing a handoff system could be costly and require additional personnel to handle the handoff process. Secondly, the patient experience could be negatively impacted if a malfunctioning handoff process leads to delays in response times. There is also a potential risk of miscommunication between the chatbot and the human healthcare provider, which could cause errors in diagnosis or treatment.

Other approaches

Other approaches to reducing the effects of safety issues include providing clear disclaimers that clarify chatbots are not a substitute for professional medical advice, monitoring and testing healthcare chatbots regularly, and training healthcare providers to use and interact with chatbots; a simple sketch of the disclaimer approach follows.
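
As a small illustration of the disclaimer approach, the sketch below assumes a hypothetical generate_reply function standing in for the chatbot’s dialogue model and simply prepends a fixed notice to every response.

```python
def generate_reply(user_message: str) -> str:
    """Placeholder for the chatbot's actual dialogue model (hypothetical)."""
    return "Based on what you describe, rest and fluids may help with mild symptoms."


MEDICAL_DISCLAIMER = (
    "I am an automated assistant, not a medical professional. "
    "This information is not a substitute for professional medical advice."
)


def reply_with_disclaimer(user_message: str) -> str:
    """Prepend a standing disclaimer to every chatbot response."""
    return f"{MEDICAL_DISCLAIMER}\n\n{generate_reply(user_message)}"


print(reply_with_disclaimer("I have a headache."))
```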

Reflection on Ethical Use of NLP

With the rapid growth in NLP over the recent few years, an increasing number of organisations have applied NLP techniques to improve work efficiency, enhance the customer experience and optimise marketing strategies. Some commonly used cases which are perceived as acceptable include:

  • Chatbots — used to assist with customer service by answering customers’ queries, completing routine tasks and improving customer satisfaction.
  • Feedback Analysis — used to analyse customer feedback and generate insights, such as gauging customers’ sentiment about an event or understanding the pros and cons of a product based on reviews (a minimal sentiment-analysis sketch follows this list).
  • Social Media Monitoring — used to analyse social media data for information about a company’s products and services so that improvements can be made.
  • Text Summarisation — used to summarise useful information from long reports or other documents, enabling more efficient research and business decision-making.
  • Hiring and Recruitment — using information extraction and named entity recognition to extract candidates’ information during the selection process, contributing to higher efficiency and cost savings.

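To illustrate the feedback-analysis use case above, here is a minimal sentiment-analysis sketch using NLTK’s VADER analyser; the reviews are invented examples, and a real pipeline would add preprocessing and aggregation across many pieces of feedback.

```python
# pip install nltk
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-off download of the VADER lexicon

# Invented example reviews standing in for real customer feedback.
reviews = [
    "The checkout process was quick and the support team was lovely.",
    "The app kept crashing and nobody answered my emails.",
]

analyzer = SentimentIntensityAnalyzer()
for review in reviews:
    scores = analyzer.polarity_scores(review)  # returns neg/neu/pos/compound scores
    label = "positive" if scores["compound"] >= 0 else "negative"
    print(f"{label:8s} {scores['compound']:+.2f}  {review}")
```
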
However, organisations may also make unacceptable use of NLP to achieve their business goals. One example is that NLP can be used to generate fake news that deceives or manipulates customers in order to boost business growth and profit. Another example is that some companies may use NLP to eavesdrop on private conversations or analyse personal data without users’ consent. Additionally, NLP applications could be misused to discriminate against individuals or groups of a certain race, gender, religion or other characteristic when screening job applicants or evaluating loan applications.

In the long term, the development of NLP applications could raise several concerns around social connection, cultural diversity and innovation. Social connections between people could be weakened as conversational AI advances: individuals may feel more comfortable disclosing their true feelings to chatbots rather than to human friends, in order to avoid personal judgement or conflict.

Cultural diversity, especially linguistic diversity, could be undervalued and endangered. As Ruder (2020) pointed out, there are over 7,000 languages spoken worldwide, yet most NLP research focuses on English. Language models trained on English data may not reflect the cultural norms, common-sense knowledge and values of a specific country or language community (Ruder, 2020). The popularity of NLP applications may attract more people to learn English rather than study their local languages and cultures. Users may also be unaware of any cultural bias built into a model and thus develop misunderstandings of the cultures or values of specific regions.

In addition, innovation and originality could be discouraged. Since the advent of ChatGPT, we have seen its great impact on the way people research, write and communicate. Admittedly, it is more efficient to use ChatGPT to generate a well-written paragraph than to draft one ourselves. However, people may come to rely on the tool and become unwilling to learn new knowledge or generate new ideas themselves, and more weight could be attached to efficiency and productivity than to innovation and originality.

Conclusion

The report consists of three parts: a review of ethical issues in chatbots, an introduction to the approach used for this critical summary, and a detailed discussion of the safety issues of chatbots in the healthcare industry covering:

  • Findings about the safety issues in healthcare chatbots
  • Impact of safety issues on key stakeholders such as patients, healthcare providers, chatbot developers and regulatory agencies
  • Approaches to counter the effect of safety issues in healthcare services with analysis of pros and cons
  • Reflection on the acceptable and unacceptable use of NLP in organisations, as well as its potential implications and unintended consequences in the long run.

The key insights of this report are highlighted in the figure below. Potential improvements and future work for this critical summary include exploring the feasibility of each approach discussed under Critical Analysis and providing practical suggestions for healthcare services.

[Figure by the author summarising the report’s key insights]

References

[1] Bickmore, T. W., Trinh, H., Olafsson, S., O’Leary, T. K., Asadi, R., Rickles, N. M., & Cruz, R. (2018). Patient and consumer safety risks when using conversational assistants for medical information: An observational study of Siri, Alexa, and Google Assistant. Journal of Medical Internet Research, 20(9), e11510. https://doi.org/10.2196/11510

[2] Denecke, K., Abd-Alrazaq, A., & Househ, M. (2021). Artificial intelligence for chatbots in mental health: Opportunities and challenges. Multiple Perspectives on Artificial Intelligence in Healthcare, 115–128. https://doi.org/10.1007/978-3-030-67303-1_10

[3] Dietsche, E. (2019, January 21). Consumers don’t fully trust healthcare chatbots, study finds. MedCity News. https://medcitynews.com/2019/01/consumers-healthcare-chatbots/

[4] George, E. (2023, January 20). ChatGPT for research writing: Game changer or ethical risk? Researcher.Life. https://researcher.life/blog/article/chatgpt-for-research-writing-game-changer-or-ethical-risk/

[5] Henderson, P., Sinha, K., Angelard-Gontier, N., Ke, N. R., Fried, G., Lowe, R., & Pineau, J. (2018). Ethical challenges in data-driven dialogue systems. Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 123–129. https://doi.org/10.1145/3278721.3278777

[6] Parviainen, J., & Rantala, J. (2021). Chatbot breakthrough in the 2020s? An ethical reflection on the trend of automated consultations in health care. Medicine, Health Care and Philosophy, 25(1), 61–71. https://doi.org/10.1007/s11019-021-10049-w

[7] Preston, R. (2023, January 17). The shortage of US healthcare workers in 2023. Oracle. https://www.oracle.com/au/human-capital-management/healthcare-workforce-shortage/

[8] Ruane, E., Birhane, A., & Ventresque, A. (2019, December). Conversational AI: Social and Ethical Considerations [Paper presentation]. AICS — 27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science, Galway, Ireland. https://www.researchgate.net/publication/337925917_Conversational_AI_Social_and_Ethical_Considerations

[9] Ruder, S. (2020, August 3). Why you should do NLP beyond English. ruder.io. https://www.ruder.io/nlp-beyond-english/

[10] Tarapara, K. (2023, February 10). What is the difference between ChatGPT and search engines? BOSC Tech Labs. https://bosctechlabs.com/chatgpt-vs-search-engine/

[11] Venkataraman, S., & Arunima, S. (2020, December). Chatbots RESET: A framework for governing responsible use of conversational AI in healthcare. World Economic Forum. https://www3.weforum.org/docs/WEF_Governance_of_Chatbots_in_Healthcare_2020.pdf

[12] World Economic Forum. (2020, December 7). Chatbots RESET: A framework for governing responsible use of conversational AI in healthcare. https://www.weforum.org/reports/chatbots-reset-a-framework-for-governing-responsible-use-of-conversational-ai-in-healthcare/

[13] Zhou, J., Arshad, S. Z., Luo, S., & Chen, F. (2017). Effects of uncertainty and cognitive load on user trust in predictive decision making. Human-Computer Interaction — INTERACT 2017, 23–39. https://doi.org/10.1007/978-3-319-68059-0_2

