Why AI Writing Tools Are Useless for Science News
There has been some discussion recently about ChatGPT and its competitors being used to take the place of human writers. I have good news: if you write about issues related to science, health or technology, you have nothing to worry about. These tools are simply not up to the task (yet). I want to talk about why.
Chatbots and other tools that use artificial intelligence (AI) to generate written content have garnered an enormous amount of attention, largely due to media coverage of ChatGPT and its competitors. The most recent version of ChatGPT, for example, is apparently capable of identifying billing errors, writing computer code, and even suggesting accurate courses of medical treatment. All of which is remarkable.
However, existing AI writing tools have some key limitations. And those limitations make the tools effectively useless for writing about important subjects in general, and about new or forthcoming research findings in particular.
These limitations — particularly for science, health and tech writing — all revolve around the fact that information in AI-written material is unreliable.
Sources of Information
When asked to write about a specific topic, ChatGPT and other AI content generation tools draw on a wide variety of data sources to inform their writing. For example, as BBC’s Science Focus explains: “[ChatGPT] was trained using text databases from the internet. This included a whopping 570GB of data obtained from books, webtexts, Wikipedia, articles and other pieces of writing on the internet.”
But while these tools draw information from a variety of sources, it is currently impossible for users to know how reliable any of that information is. There is, for example, a lot of inaccurate information on the internet. People often put things online — or in books, for that matter — that are designed to intentionally mislead.
What’s more, even information from reliable sources is subject to change. New discoveries often reveal that things we thought we knew were wrong. Journal articles are retracted because of human error or outright fraud.
And it’s not clear how, if at all, AI writing tools distinguish between any of these sources. A good science writer, for example, can be relied upon to do the critical thinking necessary to spot wildly inaccurate information in an online story, or to avoid citing a journal article that has been retracted. We have no reason to believe that AI content generators do the same. In fact, at present, we have every reason to believe they are very bad at sorting reliable information from misinformation.
Things also get tricky when AI tools are asked to write about topics whose names or terms are shared by unrelated people, places or concepts. Many people share the same name, for example (I share a name with both a voice actor and a musician). It is easy for an AI tool to fold details about one individual into a piece of writing about a completely different person with the same name. And this problem is particularly acute when writing about technical subjects, in which widely used words take on very specific meanings.
Anyone who has written about science knows that something can be statistically significant and also relatively unimportant; context matters. But because the primary definition of the word “significant” in conversational language is “important” or “noteworthy,” this is a distinction that AI tools are unlikely to make. In other words, when drawing data from a journal article, an AI content generator may decide that something described as “significant” (in statistical terms) is “significant” (important), regardless of whether it is actually noteworthy.
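To make that concrete, here is a minimal sketch in Python (my own illustration, using made-up numbers rather than any particular study) of how a trivially small difference between two groups can come out “statistically significant” once the sample is large enough.

```python
# Illustrative sketch with made-up data: a tiny effect can still clear the
# p < 0.05 bar if the sample is large enough.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two groups whose true means differ by 0.1 on a scale of about 100,
# a 0.1% difference that almost certainly doesn't matter in practice.
group_a = rng.normal(loc=100.0, scale=10.0, size=500_000)
group_b = rng.normal(loc=100.1, scale=10.0, size=500_000)

t_stat, p_value = stats.ttest_ind(group_a, group_b)

print(f"t = {t_stat:.2f}, p = {p_value:.2g}")  # p is typically far below 0.05
print(f"difference in means: {group_b.mean() - group_a.mean():.3f}")  # about 0.1
```

A careful science writer would note that the 0.1-point difference here, while “significant” in the statistical sense, is almost certainly too small to matter to anyone. A tool that only knows the conversational meaning of the word has no obvious way to make that call.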
If you don’t like that example, just think of some of the terms used in physics, which have wildly different meanings in conversational language: flavors, strange, charm. For that matter, think of all the words in the English language that mean two opposite things — such as cleave (to join or to split) or sanction (to approve or to penalize). Professional science writers tend to be pretty good at distinguishing between these things. The same cannot be said for the existing suite of automated tools.
Making Stuff Up
What is more troubling is that some AI writing tools also have a habit of fabricating information. When users ask AI content generators to write about a subject, the tools don’t always confine themselves to the data that is available. They will extrapolate from the existing data or, from the user’s point of view, simply make things up. This is called “hallucination.” (And, yes, even the most recent version of ChatGPT still hallucinates.)
For example, an acquaintance of mine asked an AI content generator to write a brief biosketch about him. On its first try, the AI said he had attended two universities. Both were wrong. On its second try, the AI said he attended two different universities. Both of those were also wrong. Out of curiosity, I tried to figure out where he had gone to school. Using a search engine, I found the correct information in about 15 seconds.
For any nonfiction writer, incorporating incorrect information into a story is bad. For people who write about research, getting the facts wrong is completely unacceptable. Science writers are tasked with communicating clearly, effectively, responsibly and accurately. And for people who write about subjects like public health or medical care, giving people wildly wrong information can have serious consequences.
New Knowledge
The problems of unreliable data and fabricated information are exacerbated when AI content generators are asked to write about new research. That’s because there is little or no pre-existing data for the AI to draw on.
To be clear, new research findings do not emerge from a vacuum. Science is an iterative process, and any new findings build on previous work. However, research findings are inherently new. They mean that researchers have learned or discovered something that was previously unknown. It is, quite literally, new knowledge.
For example, I write for a university. One aspect of my job requires me to work with researchers to write news releases about forthcoming research. I love this aspect of my work, because it means that I am one of the first people on the planet to learn whatever it is the researchers have discovered. I can read the journal article they’ve written, even though it is not yet published, and I can ask them questions about the work to make sure that I understand it properly.
If you asked an AI content generator to write about the subject, it would have nothing to go on. The findings are not yet online. At best, the tool would notify users that it could not write about the subject. At worst, it would hallucinate, and write something that may or may not have any basis in reality.
Now, think of an instance where the new research has just been published. The journal article is online. An AI writing tool would at least have some data to draw on, but that data set would be small. It would be irresponsible, to put it mildly, to assume that the tool could place the work in context and explain to readers what the findings are and why they matter.
Critical Thinking Matters
AI is credulous. It believes what it is told. It is not going to interrogate the data. It won’t ask why researchers did something or question how researchers reached their conclusions.
Good human writers do those things.
Years ago, I wrote this about journalism: Journalism is NOT the simple act of passing along information. It is more than the regurgitation of facts. Journalism should offer the reader a service. It should not only share information, but place that information in context. Does the information come from a trustworthy source? Could it be verified by other sources? Why is it important? Who says it is important? Who is it important to? Why are you telling me about this now? How did this happen? What might happen next?
Journalism should answer our questions. It should answer questions we haven’t even thought of yet. Because journalists should be skeptical critical thinkers, putting facts through the mental wringer before passing them on to their readers/viewers/listeners. They should help us sort the facts from the half-truths and the lies. They should help us figure out the angles that people or organizations are playing when they give us information.
All of that was true then, and it’s true now. What’s more, I’d argue that most (if not all) of what I just wrote is critical for good nonfiction writing in general. And, at this point, AI writing tools are not up to the job.