LLM, explain yourself!

Dean Allemang
9 min read · Jun 2, 2023

One of the things that you hear about the dangers of LLMs is that they can’t explain themselves. There’s a lot of truth to this, but I’ve found that there is actually a lot about its own behavior that an LLM can explain.

Understanding the Capabilities of LLMs

Probably my most productive use of LLMs is having ChatGPT write programs and queries, and I have found it quite adept at explaining itself along the way. For example, in my blog about Evaluating linked data using the Good Growth Plan, I created a couple of heat maps; a snapshot of one of them is shown here.

Heatmap of Syngenta’s Good Growth Plan implementation

This was created by telling ChatGPT to pick an open source solution that would let me plot this data on a heat map of the world and put it in a web page. I got these images by asking ChatGPT (3.5) to do the following:

Build a web page that will show a map of the world, where each country is 
color-coded in grayscale; the darkest ones have the largest sum of
benefitedhectares, and the lightest have the least benefitedhectares.
Use whatever open source mapping software you please.
Make it into a web page I can just drop into my browser and see the map.

It was able to generate a page that did just that. This is an example of a sort of Deus ex Machina by the LLM; I ask a question, it gives me an answer, and I hope it’s right. But not surprisingly, it wasn’t right at first. When I ran it, I got an error. I reported the error to ChatGPT:

Dean: On that line where you defined worldGeoJSON, I got an error that said 
"Invalid GeoJSON Object". Can you fix that?

ChatGPT:
I apologize for the oversight. It seems the GeoJSON data provided by the
world-atlas package is not directly usable with Leaflet.js. To resolve
this issue, we will use the TopoJSON library to convert the TopoJSON
data from world-atlas into the GeoJSON format that Leaflet.js can work with.

Now, I don’t know about you, but that sure sounds like an explanation of what went wrong, a suggestion for how to fix it, and an explanation for why we think the solution will work.
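For the curious, here is a minimal sketch of the conversion ChatGPT is describing. The CDN URL, data file name, and object key are my assumptions about a typical Leaflet and world-atlas setup, not a transcript of ChatGPT’s actual code:

// A sketch of the fix: convert the world-atlas TopoJSON into GeoJSON
// before handing it to Leaflet. Assumes the Leaflet (L) and
// topojson-client (topojson) libraries are loaded in the page, and
// that `map` is an existing Leaflet map.
fetch('https://unpkg.com/world-atlas@2/countries-110m.json')
  .then(resp => resp.json())
  .then(topo => {
    // topojson.feature turns a TopoJSON object into a GeoJSON
    // FeatureCollection, which L.geoJSON understands; passing the raw
    // TopoJSON is what produced the "Invalid GeoJSON Object" error.
    const worldGeoJSON = topojson.feature(topo, topo.objects.countries);
    L.geoJSON(worldGeoJSON).addTo(map);
  });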

I could go on and on with this; this particular map took me over a day to program (still a lot less time than it would have taken me to become familiar enough with the API to do it myself). One of the problems was that the map of the world included countries that cross the 180° meridian, and the graphing package didn’t handle that correctly. ChatGPT tried a dozen different fixes, explaining at each step what it was trying to do and why. None of them worked. But I learned enough about the problem that I was able to write a program to fix it myself. Well, I didn’t write the program, of course; I don’t do that any more. I told ChatGPT to do the following:

I want some javascript code that will take an array of lat/long pairs,
and return a new array, just like the old one, but if any lat is less than 0,
it is replaced with 179.00

How did I know to do this? Because of the explanations that ChatGPT gave me while we were debugging the original. I also decided that I don’t really care about the bits of countries that cross the 180° meridian; if you look closely at the heat maps, you’ll see that the colors don’t go beyond a few degrees from 180.
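Here is a minimal sketch of the kind of function that prompt produces; the function name is mine. (Note that although my prompt says “lat”, a cutoff of 179.00 only makes sense for a longitude, since latitudes run from −90 to 90; and I assume the fix is applied only to the coordinate arrays of polygons near the meridian, since it would mangle anything in the western hemisphere otherwise.)

// A sketch of the requested fix. It takes an array of [lat, long]
// pairs and returns a new array in which any negative longitude is
// replaced with 179.00, pushing points that wrapped past the 180°
// meridian back onto its eastern side.
function clampAntimeridian(pairs) {
  return pairs.map(([lat, long]) => [lat, long < 0 ? 179.00 : long]);
}

// Example: the second point wrapped around to -179.5
console.log(clampAntimeridian([[65.0, 178.5], [65.0, -179.5]]));
// -> [[65, 178.5], [65, 179]]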

A user interacting with ChatGPT (image by Midjourney)

Beyond Programming: Explaining Queries

The sort of explanation that ChatGPT exhibits here is explanation of program code, but arguably it is actually explaining its own actions; it reports on what it is doing and gives a justification for it, a justification sufficient that I was able to pick up the code and start working on it myself. Another kind of explanation I have used it for is post-hoc justification of code. This is a request of the form, “explain what this query does”.

This is a pretty straightforward usage, but I want to show a surprising example of it. One of my colleagues approached me a while back, asking for a SPARQL query that would find a set of strings from the graph he was querying, sort them, and also include a variable that indicates the index of each item in that sorted list. This is difficult to do in SPARQL, and I asked ChatGPT to explain why. I don’t usually use ChatGPT to write my blogs, but its explanation is as good as or better than anything I could come up with:

SPARQL is a query language specifically designed for querying data stored in
the Resource Description Framework (RDF) format. RDF is a graph-based data
model, which is different from traditional relational databases.

In SQL, which is used for relational databases, it's common to use features
like ranking and sorting because the data is tabular and these operations
are built into the SQL standard. However, SPARQL operates on a fundamentally
different data model, where data is represented as a graph rather than a table.
This graph-based model has many strengths when it comes to representing and
querying complex, interconnected data, but it also means that some features
common in SQL, like ranking, are not directly supported.

In a graph, data is not naturally ordered as it is in a table, and concepts
like "row number" or "rank" don't really apply. That's why SPARQL doesn't
have built-in support for these types of operations.

These things aren’t supported in SPARQL, but is there a way to do it? I came up with one, which I gave to my colleague. I consider this query to be pathological; it uses a string as an auxiliary data store to pass information from one subquery to another.

SELECT ?who ?howmany
WHERE {
  VALUES ?who {"Mike" "Duy" "Rob" "Kurt" "Dean" "Gordon"}
  {
    SELECT (GROUP_CONCAT(?x) AS ?ixs)
    WHERE {
      {
        SELECT ?x
        WHERE {
          VALUES ?x {"Mike" "Duy" "Rob" "Kurt" "Dean" "Gordon"}
        }
        ORDER BY ?x
      }
    }
  }
  BIND (STRLEN(REPLACE(STRBEFORE(?ixs, ?who), "[^ ]", "")) + 1 AS ?howmany)
}

I used the VALUES statement to simulate actually getting results from the graph; my colleague already had the graph pattern for doing that. The part I was giving him was the logic to do a ranked sort.

This query is very difficult to understand; it sorts the items and puts them in a space-delimited string (the fact that the delimiter is a space is implicit; that’s the default delimiter for GROUP_CONCAT). Then, for each name, it takes the part of that string before the name, gets rid of everything but the spaces, and counts what remains to see where in the sorted list the name appears. I figure I could give this to an experienced SPARQL author and they would puzzle over it for quite a while. I asked ChatGPT what it does, and it came back right away:

The corrected query ranks individuals (Mike, Duy, Rob, Kurt, Dean, Gordon) 
based on their lexicographical order, returning each person's name and their
respective rank in the sorted list.

It refers to the “corrected” query because my original made use of some non-standard syntax that some SPARQL processors forgive. ChatGPT corrected my syntax, then it explained what the query does.
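To see how the string trick works, here is a small JavaScript sketch that mirrors the query’s logic step by step (the code and variable names are mine, not ChatGPT’s):

// A JavaScript mirror of what the SPARQL query does.
const names = ["Mike", "Duy", "Rob", "Kurt", "Dean", "Gordon"];

// The inner SELECT with GROUP_CONCAT: sort the names and join them
// with the default delimiter, a space.
const ixs = [...names].sort().join(" ");  // "Dean Duy Gordon Kurt Mike Rob"

for (const who of names) {
  // STRBEFORE(?ixs, ?who): everything in the sorted string before the name
  const prefix = ixs.substring(0, ixs.indexOf(who));
  // REPLACE(..., "[^ ]", ""): keep only the spaces, then STRLEN counts
  // them; the rank is the space count plus one.
  const howmany = prefix.replace(/[^ ]/g, "").length + 1;
  console.log(who, howmany);  // e.g. "Mike 5"
}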

This query has a lot of weaknesses besides its inscrutability. The most obvious one, to my mind, is that if one of the sorted strings were to include a space, it would get the count wrong. I asked ChatGPT for the error modes. It listed a bunch of them, all of them correct, including this one:

Special Characters: If the names contain special characters, especially 
spaces, the REPLACE function might interpret these as separate names,
leading to incorrect ranks.

You might wonder exactly what the error mode would look like. I asked it what would happen if one of the names had been “Anna Marie”.

If one of the names is "Anna Marie", this query will incorrectly interpret 
"Anna Marie" as two separate names, "Anna" and "Marie", because it uses a
space as a delimiter for GROUP_CONCAT. This will disrupt the lexicographic
sorting and the rank calculation.

Specifically, the REPLACE function will count the space within "Anna Marie"
as a delimiter, effectively treating "Anna Marie" as two names when
determining the rank. As a result, the ranks of all names that should come
after "Anna Marie" will be shifted by one, and "Anna Marie" itself will be
given two separate ranks instead of one.

I don’t think I could have explained that better myself.
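In fact, the failure is easy to demonstrate. Plugging “Anna Marie” into the JavaScript sketch above shows the ranks shifting exactly as ChatGPT predicted:

// Continuing the sketch above, with a name that contains a space.
const names2 = ["Mike", "Anna Marie", "Dean"];
const ixs2 = [...names2].sort().join(" ");  // "Anna Marie Dean Mike"

// Rank of "Dean": the space inside "Anna Marie" is counted as a
// delimiter, so Dean comes out third instead of second.
const prefix2 = ixs2.substring(0, ixs2.indexOf("Dean"));  // "Anna Marie "
console.log(prefix2.replace(/[^ ]/g, "").length + 1);     // 3, not 2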

LLMs and Common Knowledge: The Snow White Problem

So ChatGPT seems to be pretty good at explaining code, and at explaining what it is doing when it writes code. How about explaining other things? A few months ago (I can’t seem to find the exact date), the folks at Oxford Semantics published a blog about ChatGPT’s Snow White problem: ChatGPT very confidently answers the question, “What was the first feature-length animated movie?” with “Snow White and the Seven Dwarfs.”

Experiments of this sort are very easily replicated, so I asked GPT-3.5, “What was the first feature-length animated film?” It carefully replied:

The first feature-length animated film is widely considered to be 
"Snow White and the Seven Dwarfs," produced by Walt Disney Productions
and released in 1937. The film was based on the German fairy tale
"Snow White" and was a groundbreaking achievement in animation, as it was
the first feature-length animated film with synchronized sound, color,
and music. It was also the first animated film to be produced in the English
language and was a massive success, earning over $8 million at the box
office during its initial release.

That’s far from the sort of confident assertion that we often see GPT make. But this blog is about explanation, so I asked GPT-3.5 why it believed this. It gave quite a long and thoughtful response, which I might as well quote in full:

ChatGPT:
I believe that "Snow White and the Seven Dwarfs" is widely considered to be
the first feature-length animated film because it was a groundbreaking
achievement in animation and had a significant impact on the film industry.

Prior to the release of "Snow White and the Seven Dwarfs," animated films
were generally short, and many people did not consider animation to be a
serious form of filmmaking. However, Walt Disney and his team of animators
spent years developing new techniques for creating animated feature films,
such as the multiplane camera, which allowed for more complex and dynamic
shots.

When "Snow White and the Seven Dwarfs" was released in 1937, it was a
massive success both critically and commercially. The film received
widespread acclaim for its groundbreaking animation, memorable characters,
and charming story, and it became the highest-grossing film of the year.

In addition to its artistic and commercial success, "Snow White and the
Seven Dwarfs" was also significant because it paved the way for future
animated feature films. It demonstrated that animated films could be a
viable form of entertainment and inspired other studios to create their
own feature-length animated films. Overall, I believe that the combination
of its artistic achievements, commercial success, and industry impact make
"Snow White and the Seven Dwarfs" the first feature-length animated film.

This has everything you’d want from an explanation: it backs up its conclusions and expresses them clearly, giving you an opportunity to disagree; and if you do disagree, you know exactly what you are disagreeing with.

The Implications of Explanation Capabilities

These are but a few examples of how ChatGPT routinely explains itself, even when it isn’t explicitly asked to do so. In my earlier blog about the Love Song of Chat G.P.T., the LLM provides an explanation for every suggestion it makes. It even explains MY moves, which makes it seem a bit pedantic and repetitive.

What does this explanation capability mean for using LLMs for real tasks? Since it is possible for the LLM to explain why it did something, it is also possible for it to compare what it did to what was intended (by comparing its explanation to the original statement of what was desired). It also allows human beings to audit its work, tracking the explanation for why it did what it did. Contrary to common objections, ChatGPT is pretty good at explaining, and this can help us improve the reliability of its results.
