How good is ChatGPT at Chemistry?

Ivan Reznikov, PhD
4 min read · Dec 12, 2022

--

Recently, I have been testing ChatGPT's abilities on tasks that can, in theory, be solved by a language model. I previously published a LinkedIn post on using ChatGPT to solve pub quiz problems; it got 0 out of 5 of them right.

I have also written an article on using ChatGPT to explain an opening in chess.

I have seen successful experiments where ChatGPT was able to pass coding and other IT-related tests. I am curious to see how well the much-hyped language model handles a simple chemical equation. Specifically, I will ask ChatGPT to predict the outcome of a chemical reaction, classify it, and balance the coefficients.

Task 1. Predicting reaction products.

We will test a simple sodium combustion reaction. As you can see, ChatGPT correctly answers that “the reaction of sodium and oxygen will produce sodium oxide (Na2O)”. The system may not be precise when asked for specifics, but its general responses seem adequate.
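To make that claim concrete, here is a minimal Python sketch (my own addition, not part of the experiment) that encodes the predicted reaction as element-count maps and checks one necessary condition: the proposed product must not introduce any element that is absent from the reactants.

```python
# Sketch only: encode Na + O2 -> Na2O as element -> atom-count maps
# and check that the product uses no element absent from the reactants.
reactants = [{"Na": 1}, {"O": 2}]   # Na, O2
products = [{"Na": 2, "O": 1}]      # Na2O, as predicted by ChatGPT

reactant_elements = {e for species in reactants for e in species}
product_elements = {e for species in products for e in species}

# True: Na2O is at least consistent with the reactants
print(product_elements <= reactant_elements)
```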

Task 2. Balancing a chemical equation.

OK. Although the instructions on how to balance an equation are correct, the balancing itself is poor. To be fair, if the model genuinely believes there are 2 Na atoms on both sides, then by its own count it balanced correctly. Let's try to change ChatGPT's mind by pointing out the actual number of sodium atoms and asking it to rebalance the equation.

With 2 atoms of sodium on the left and 4 atoms of sodium on the right, will there be any change?

No. The ChatGPT text model is quite sure its balance is correct. Unfortunately, it is not.

From an ML perspective, the low likelihood of such an incorrect equation appearing in the training data can be taken as a sign that the model is not overfitting here. In other words, the coefficients are indeed being generated rather than recalled, which is interesting.
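For reference, balancing can be done mechanically with linear algebra: write one conservation equation per element and find the smallest integer solution. The sketch below (assuming sympy is installed; this is of course not how ChatGPT works internally) recovers the correct coefficients, 4 Na + O2 → 2 Na2O.

```python
from math import lcm
from sympy import Matrix

def balance(reactants, products):
    """Return integer coefficients for the given species.

    Each species is a dict mapping element symbol -> atom count.
    """
    species = reactants + products
    elements = sorted({e for s in species for e in s})
    # One conservation row per element; product columns are negated so
    # a balanced reaction corresponds to a null-space vector.
    matrix = Matrix([
        [s.get(e, 0) * (1 if i < len(reactants) else -1)
         for i, s in enumerate(species)]
        for e in elements
    ])
    coeffs = matrix.nullspace()[0]               # assumes a one-dimensional null space
    scale = lcm(*[int(c.q) for c in coeffs])     # clear denominators -> smallest integers
    return [int(c * scale) for c in coeffs]

# Na + O2 -> Na2O: prints [4, 1, 2], i.e. 4 Na + O2 -> 2 Na2O
print(balance([{"Na": 1}, {"O": 2}], [{"Na": 2, "O": 1}]))
```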

Task 3. Reaction classification.

To our surprise, the chat model seemed unaware of the reaction we were discussing previously. Let's remind it.

Despite initial doubts, the reaction was correctly classified and the explanation is sound. It is surprising that the model was able to score 2 correct answers out of 3. This may lead one to believe that the model has some knowledge of chemistry, but is this really the case?
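For context, the textbook classification here hinges on species counts: two reactants combining into a single product makes this a combination (synthesis) reaction, and since sodium is burning in oxygen it is also a combustion. A toy rule of thumb (a hypothetical helper of my own, not how ChatGPT reasons) might look like this:

```python
# Toy heuristic: classify a reaction from its species counts alone.
def classify(n_reactants: int, n_products: int) -> str:
    if n_reactants > 1 and n_products == 1:
        return "combination (synthesis)"
    if n_reactants == 1 and n_products > 1:
        return "decomposition"
    return "ambiguous from counts alone (replacement, combustion, etc.)"

# Na + O2 -> Na2O: two reactants, one product
print(classify(2, 1))  # combination (synthesis)
```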

Task 4. Bonus. Balancing an equation with artificial elements.

A chemist does not need to recognize the elements involved in order to balance an equation. We will create some fictional elements to throw off the model; there is no way it has seen our reaction before.

Miss.

Another miss. It can neither classify nor balance the fictional reaction, even though the answer would be obvious to any chemist.
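This is exactly where a mechanical approach keeps working: the balance() sketch from Task 2 only looks at atom counts, so it does not matter whether the symbols are real. With made-up placeholders (the post does not name its fictional elements, so “Xx” and “Yz” below are mine), an Na2O-like stoichiometry balances the same way:

```python
# Reuses balance() from the Task 2 sketch; "Xx" and "Yz" are invented placeholders.
# Xx + Yz2 -> Xx2Yz balances exactly like Na + O2 -> Na2O.
print(balance([{"Xx": 1}, {"Yz": 2}], [{"Xx": 2, "Yz": 1}]))  # [4, 1, 2]
```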

In my previous article “How good is ChatGPT at playing chess?” I discussed the concept of the uncanny valley in relation to machine learning models.

ChatGPT, for example, is designed to answer questions in a human-like manner, but it is not always accurate. It is crucial to carefully review and validate the output of language models to avoid falling into that valley, and not to use such models for purposes beyond their intended function.

As a reminder, the screenshot below shows an example of ChatGPT’s impressive but completely fake output.

Till next time!

#chatgpt #chatgpt3 #openai #machinelearning #datascience #ml #ds #nlp #language #chemistry #lifesciences #lifescience

About the author: PhD, Lead Data Scientist.
More content: https://www.linkedin.com/in/reznikovivan/
