A Human-AI Knowledge Interaction Model for Trustworthy Generative AI

4th blog in the #AI4BetterScience series

Quentin Loisel
4 min read · Jul 3, 2024

In the previous blog of our #AI4BetterScience series, we introduced the transformative potential of artificial intelligence (AI) for science, using generative AI (GenAI) as the general case, and showed how current guidelines struggle to cover all usages, leaving extensive grey areas. From that blog, I also expect you to know the fundamentals of AI and GenAI, since I will be using slightly more specific vocabulary such as “model”, “output”, or “data”. Now that we have stated the situation, I want to explore some considerations and solutions for implementing GenAI optimally in science and beyond. Let’s start with one major problem: you might be unable to trust the output.

Indeed, these technologies react to our inputs by generating plausible outputs. However, despite their plausibility, they are sometimes factually wrong. For example, if you ask for the birth year of Charles Darwin, you might get 1810 (instead of 1809). While we call this model “hallucination”, it is a normal consequence of how these models process data. It is a lesser problem when factuality is less critical, for example when generating a poem or a fictional story. But sometimes the tiniest error can jeopardise the entire production, like a p-value in a scientific study.

While engineers and scientists are working hard to improve model accuracy and solve this hallucination problem, I have found only one solution to prevent damage to my work: proofreading the output. To do this, however, you must have the knowledge needed to assess the accuracy of the output. Moreover, the quality of the interaction between you and the model will depend not only on your knowledge of the topic but also on how the model reacts. We can therefore consider multiple interaction situations. To explore them and understand their implications, we will examine a personal model called the “Human-AI Knowledge Interaction Model”, or HAKIM.

Human-AI Knowledge Interaction Model (HAKIM)

The figure above represents our HAKIM model. The blue circle illustrates all of humanity’s knowledge, with a frontier delimiting it from the unknown. In orange, we have your knowledge. Then we have the model data, shown as the patterned area in the figure. You can observe that your knowledge and the model data may overlap. Nonetheless, both are contained within humanity’s knowledge, and neither covers it entirely.

It is worth noting that we speak about “data” to differentiate it from human knowledge. The AI model processes these data through specific learned parameters to produce an output. I prefer to speak about model data because this processing implies a “black box”: we don’t know how the data have been treated. If users want to assess the accuracy of an output, they will refer to the common knowledge domain, where the original data come from.

Diverse interaction situations

This model supports the identification of diverse human-AI interaction situations, which are displayed in yellow circles below.

These possible interaction situations can be organised in a matrix of the two variables we mentioned: user knowledge and model data.

HAKIM’s Practical Implications

As we said, our problem is that the AI model can hallucinate, so the user must know the topic to assess the accuracy of the output. Considering this and the preceding matrix, we can infer some implications. Firstly, an overlap between the model data and the user’s knowledge [S5] allows an optimal interaction: the model will likely provide relevant outputs, and the user can assess them. Secondly, interacting on a topic covered by the user’s knowledge but not by the model data [S2] will lead to irrelevant outputs, but a safe interaction, since the user can recognise them as such. Thirdly, a topic where the model has the data but the user lacks the knowledge [S7] will lead to relevant outputs; still, the user won’t be able to assess their accuracy or take responsibility for their use. Fourthly, a topic covered by neither the user’s knowledge nor the model data [S4] is simply a useless and unsafe interaction when accurate information is expected. Finally, situations involving only partial data and knowledge fall in between these corners, as the sketch below illustrates.
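To make the matrix concrete, here is a minimal, hypothetical sketch in Python (not part of the original figure): it treats user knowledge and model data as simple booleans and maps the four corner cases onto the situations described above. The function and label names are my own illustrative assumptions; in reality, knowledge and data are graded, so the intermediary situations sit between these corners.

```python
from enum import Enum

class Interaction(Enum):
    # The four corner situations described in the HAKIM matrix
    OPTIMAL = "Relevant output, and the user can verify it [S5]"
    SAFE_BUT_IRRELEVANT = "Irrelevant output, but the user can spot it [S2]"
    RISKY = "Plausible output the user cannot verify [S7]"
    USELESS_AND_UNSAFE = "Neither relevant nor verifiable [S4]"

def assess_interaction(user_knows_topic: bool, model_has_data: bool) -> Interaction:
    """Map the two HAKIM variables onto the four corner situations.
    Partial knowledge or data would fall between these corners
    (the "intermediary situations")."""
    if user_knows_topic and model_has_data:
        return Interaction.OPTIMAL
    if user_knows_topic and not model_has_data:
        return Interaction.SAFE_BUT_IRRELEVANT
    if not user_knows_topic and model_has_data:
        return Interaction.RISKY
    return Interaction.USELESS_AND_UNSAFE

# Example: asking a model about a topic you do not know yourself
print(assess_interaction(user_knows_topic=False, model_has_data=True).value)
```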

What comes next?

This model can help you identify how to interact optimally with a generative AI model when you expect information from its output. It still needs improvement, but you now have the practical substance and can use it to question your own use of generative AI.

If you want me to develop this model further, let me know! Many more blogs are coming to continue our journey: the importance of not humanising AI, how it taught me humility, the three keystone questions, the accessibility illusion, the importance of double expertise, and more. There is more than enough for the coming weeks, and I hope to see you there!

Thank you for reading. Please comment below or on LinkedIn/Twitter to share your experience, constructive feedback on this blog, and suggestions for future issues. See you soon!


Keep going with the following issue: Do Not Humanise Artificial Intelligence (in research, at least)
Or read the previous issue: Before we dive deeper… What’s generative AI already?

