Working with Structured Data on LLMs: ChatGPT, Bard and Bing

Luiz
Blue Orange Digital
10 min read · Nov 13, 2023

Leveraging Large Language Models (LLMs) to work with structured data has become increasingly feasible and practical. Models such as GPT, MS Bing, and Google Bard can process structured data by interpreting the context and relationships within the information. One approach is to express data tasks as natural language prompts that the LLM can comprehend. For instance, instead of writing a raw SQL query, a user can frame a question in plain English, and the LLM can generate the corresponding query. This makes the data more accessible to users who are not familiar with complex query languages.

Prompt engineering is a vital aspect of effectively utilizing Large Language Models (LLMs) for structured data tasks. It involves crafting clear and contextually rich prompts or queries that guide the LLM in generating accurate and relevant responses, such as SQL queries or data transformations. This process includes providing detailed instructions, iteratively refining prompts based on the model’s behavior, and incorporating domain-specific knowledge to enhance the model’s understanding of the structured data domain. By customizing language and refining prompts iteratively, users can optimize LLMs for specific data tasks, ensuring the generation of precise and contextually appropriate outputs in various structured data scenarios.
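As a minimal sketch of what a "clear and contextually rich" prompt can look like, the snippet below assembles a text-to-SQL prompt from a table schema and a plain-English question. The function name, the table `daily_quotes`, and its columns are hypothetical, chosen only for illustration; the wording of the instructions is one possible starting point to refine iteratively.

```python
def build_sql_prompt(table: str, columns: dict, question: str) -> str:
    """Assemble a text-to-SQL prompt from a table schema and a question."""
    schema_lines = "\n".join(
        f"- {name} ({dtype})" for name, dtype in columns.items()
    )
    return (
        "You are a data analyst. Write one SQL query that answers the question.\n"
        f"Table: {table}\n"
        f"Columns:\n{schema_lines}\n"
        f"Question: {question}\n"
        "Return only the SQL, with no explanation."
    )


# Hypothetical table of daily quotes; names are illustrative only.
prompt = build_sql_prompt(
    table="daily_quotes",
    columns={"date": "DATE", "usdbrl": "REAL", "ibov": "REAL"},
    question="What was the average USDBRL rate on days when IBOV closed above 120000?",
)
print(prompt)
```

Stating the schema explicitly gives the model the domain context it would otherwise have to guess, and the final instruction narrows the output so that it can be passed on to other tools.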

Additionally, integrating LLMs with existing data processing tools and frameworks provides a seamless workflow for working with structured data. This integration allows users to harness the power of LLMs within their preferred data environment, making it easier to incorporate natural language understanding into data analysis pipelines. The ability of LLMs to generate human-readable queries or provide insights in natural language facilitates collaboration between data scientists, analysts, and domain experts who may not have a deep technical background in data manipulation. Overall, the combination of framing data tasks as natural language prompts, tuning those prompts for specific structured data tasks, and integrating LLMs with existing data tools opens up new possibilities for efficient and accessible interaction with structured data.
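One way to picture this integration is a small pipeline step that takes the SQL text returned by a model and runs it against a local database. The sketch below uses Python's built-in `sqlite3` module with an in-memory table; the rows are made up, `generated_sql` is a hard-coded stand-in for a model response, and `run_readonly` is a hypothetical helper with a deliberately simple guard, not a complete safety check.

```python
import sqlite3

# Toy rows standing in for real market data (values are made up).
rows = [
    ("2023-01-02", 5.30, 106000.0),
    ("2023-04-03", 5.05, 101000.0),
    ("2023-07-03", 4.80, 119000.0),
    ("2023-10-02", 5.00, 115000.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_quotes (date TEXT, usdbrl REAL, ibov REAL)")
conn.executemany("INSERT INTO daily_quotes VALUES (?, ?, ?)", rows)

# Stand-in for the text an LLM might return for the question
# "What was the average USDBRL rate on days when IBOV closed above 110000?"
generated_sql = "SELECT AVG(usdbrl) FROM daily_quotes WHERE ibov > 110000"


def run_readonly(connection, sql):
    """Run a model-generated query, refusing anything but a single SELECT."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only single SELECT statements are allowed")
    return connection.execute(statement).fetchall()


result = run_readonly(conn, generated_sql)
print(result)
```

Keeping the execution step separate from the generation step means the generated query can be reviewed, logged, or rejected before it touches the data.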

A Practical Example using ChatGPT, MS Bing, and Google Bard

At the time of writing, the exchange rate of the US dollar against the Brazilian Real (USDBRL) was about 4.90, and the Brazilian stock market index (IBOV) was at 120,000 points. So, let’s ask the chat tools what their expectations are for the exchange rate in the near future.

Starting with ChatGPT:

Our first question is as follows: “What is the USDBRL quote for the expected value when IBOV reaches 130,000?”

“I’m sorry, but I don’t have real-time data access, and I can’t provide current or future exchange rates or financial market quotes. Exchange rates and market values are subject to constant fluctuations based on various factors, including economic indicators, geopolitical events, and market sentiment.

To obtain the most up-to-date information on the USD/BRL exchange rate and the expected value when the IBOV (Ibovespa, the main stock index of the São Paulo Stock Exchange) reaches 130,000, I recommend checking reliable financial news sources, consulting with a financial expert, or using a financial platform that provides real-time market data”

As ChatGPT is trained with data up to early 2022, it doesn’t have access to up-to-date information from the web. We can validate this with a follow-up question:

“Can you get the USDBRL and IBOV daily quotes from Jan to Oct?”

ChatGPT:

“I don’t have the capability to fetch real-time data or historical financial information. However, you can easily find historical USD/BRL exchange rates and IBOV (Ibovespa) daily quotes from various financial news websites, financial data providers, or financial platforms that offer historical data. These sources may include financial news websites, investment platforms, or central banks.

You can check financial news websites like Bloomberg, Reuters, or Yahoo Finance. Alternatively, financial data providers such as Investing.com or TradingView often provide historical price charts for various financial instruments.

If you have access to financial analysis tools or platforms, you can also retrieve historical data for the USD/BRL exchange rate and IBOV index for the specified time frame (January to October) from these sources.

Keep in mind that financial data is subject to copyright, and you may need to comply with the terms of use of the respective platforms when accessing and using historical data”

Moving to Microsoft Bing

As you may know, MS Bing Chat uses the capabilities of ChatGPT to understand and generate human-like text based on the user’s input, contextualized by up-to-date web content indexed by Bing. Let’s try the very same first question:

“The USD/BRL quote when the Ibovespa (IBOV) reaches 130,000 is not a straightforward calculation as it depends on a variety of factors. The USD/BRL exchange rate and the IBOV index have a positive correlation¹. This means that the Real (Brazil’s currency) tends to appreciate as funds flow into the market and stocks rise¹. However, other factors such as changes in interest rates, inflation, and economic policies in both the US and Brazil can also affect the USD/BRL exchange rate². Therefore, it’s not possible to provide an exact quote for USD/BRL when IBOV reaches 130,000 without considering these factors. For the most accurate information, I recommend consulting with a financial advisor or conducting a detailed financial analysis.”

Even with access to up-to-date financial information, Bing neither retrieves the data nor devises a process to infer the expected value. So, let’s ask the second question to request the data retrieval directly:

“Here are the daily quotes for USDBRL and IBOV from January to October 2023:

**USD/BRL:**

- Highest: 5.4799 BRL on January 3, 2023

- Average: 5.0110 BRL over this period

- Lowest: 4.7264 BRL on July 31, 2023²

**IBOV:**

- The IBOV index was at 120,294.69 BRL on November 13, 2023⁴

- For more detailed historical data, you can check the historical data section on Yahoo Finance⁵, ADVFN⁶, or Investing.com⁷.

Please note that these are historical data and the actual rates can vary based on market conditions and other factors. For the most accurate information, I recommend checking the latest data on a reliable financial news website or consulting with a financial advisor”

That is better already. Not only does Bing fetch the data, but it also shows a graph of it. Let’s now ask for an analysis of this structured data with a third question:

“What would be the expected USDBRL rate when IBOV reaches 130,000?”

Bing:

“The expected USD/BRL rate when the IBOV reaches 130,000 is not a straightforward calculation as it depends on a variety of factors. The USD/BRL exchange rate and the IBOV index have a positive correlation. This means that the Real (Brazil’s currency) tends to appreciate as funds flow into the market and stocks rise. However, other factors such as changes in interest rates, inflation, and economic policies in both the US and Brazil can also affect the USD/BRL exchange rate. Therefore, it’s not possible to provide an exact quote for USD/BRL when IBOV reaches 130,000 without considering these factors. For the most accurate information, I recommend consulting with a financial advisor or conducting a detailed financial analysis”

It couldn’t directly infer the value, but maybe we just need a little more prompt engineering: let’s direct it to create an inference model (probably a linear regressor) and use it to infer the expected value:

“Create an inference model and forecast the expected USDBRL rate when IBOV reaches 130,000”

Unfortunately, it couldn’t do it, so let’s move to Bard. After that, we can come back to Bing and try something else.

“I’m sorry, but as an AI, I don’t have the capability to create an inference model or forecast the expected USD/BRL rate when IBOV reaches 130,000. The relationship between the IBOV index and the USD/BRL exchange rate is complex and influenced by many factors, including economic policies, interest rates, and global market conditions. I recommend consulting with a financial advisor or using specialized financial software for this purpose. Please let me know if there’s anything else I can assist you with”

Let’s try Google Bard

To be fair, let’s repeat all four questions. Starting with the first one, here is Google’s response:

“To answer your question, I would need to know more about the specific context. For example, are you asking about the expected USD/BRL quote for the expected value when IBOV reaches 130,000 points in the next 12 months? Or are you asking about the expected USD/BRL quote for the expected value when IBOV reaches 130,000 points at any point in the future?

If you are asking about the expected USD/BRL quote for the expected value when IBOV reaches 130,000 points in the next 12 months, then the answer is that it is difficult to say with certainty. However, according to a recent survey of economists, the median forecast for the USD/BRL exchange rate in 12 months is 5.5. This means that there is a 50% chance that the USD/BRL exchange rate will be higher than 5.5 in 12 months, and a 50% chance that it will be lower than 5.5. …

If you are considering investing in Brazilian assets, you should consult with a financial advisor to discuss your individual circumstances and risk tolerance”

This is not a precise answer derived from processing structured data, but it is better, and fair enough for an LLM analysis based on unstructured data. Let’s ask specifically for the structured data with the second question.

It fetches the data and enables export to a Google Sheet so that we can do further analysis with that outside of Bard.
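As a sketch of what that further analysis outside of Bard might start with, the snippet below reads a CSV export and computes the highest, average, and lowest USDBRL quotes using only the Python standard library. The sample rows are made up, and the column names (`date`, `usdbrl`, `ibov`) are assumptions about the export rather than its actual layout.

```python
import csv
import io
from statistics import mean

# Made-up sample in the shape of a sheet exported as CSV; the column
# names (date, usdbrl, ibov) are assumptions, not Bard's actual export.
csv_text = """date,usdbrl,ibov
2023-01-03,5.45,104000
2023-04-03,5.06,101500
2023-07-31,4.73,121900
2023-10-02,5.06,115000
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
quotes = [(row["date"], float(row["usdbrl"])) for row in rows]

# Pick the (date, rate) pairs with the largest and smallest rate.
highest = max(quotes, key=lambda q: q[1])
lowest = min(quotes, key=lambda q: q[1])
average = mean(rate for _, rate in quotes)

print(f"Highest: {highest[1]:.4f} on {highest[0]}")
print(f"Average: {average:.4f}")
print(f"Lowest:  {lowest[1]:.4f} on {lowest[0]}")
```

With a real export, `io.StringIO(csv_text)` would be replaced by an open file handle for the downloaded CSV.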

Let’s now try the third question and see if it leverages this data for a more precise inference analysis.

“The expected USDBRL exchange rate when IBOV reaches 130,000 points is difficult to predict. However, based on the current trend of the Brazilian economy and the relationship between the USDBRL exchange rate and the IBOV index, it is possible that the USDBRL exchange rate could be between 6.00 and 6.50 when IBOV reaches 130,000 points. It is important to note that this is just a forecast, and the actual USDBRL exchange rate could be significantly higher or lower than this forecast …

It is important to consider all of these factors when forecasting the future USDBRL exchange rate. If you are considering investing in Brazilian assets, you should consult with a financial advisor to discuss your individual circumstances and risk tolerance.”

Bard is still not really processing the structured data, but it already gives a narrower and clearer expected value. Let’s be more direct and request an inference model to do the job, using the fourth question.

“To create an inference model to forecast the expected USDBRL rate when IBOV reaches 130,000, we can use the following steps: collect data, choose a machine learning algorithm, train the model, evaluate the model, and deploy the model. To forecast the expected USDBRL rate when IBOV reaches 130,000, we have to feed the IBOV index value of 130,000 into the trained model, giving us USDBRL = 4.68

Therefore, the expected USDBRL rate when IBOV reaches 130,000 is 4.68.”

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
# Collect data
df = pd.read_csv('https://raw.githubusercontent.com/lcbjrrr/btt/main/usd_brl%20-%20bov.csv')
# Choose a machine learning algorithm
model = LinearRegression()
# Train the model
model.fit(df[['ibov']], df['usdbrl'])
# Forecast the expected USDBRL rate when IBOV reaches 130,000
forecast = model.predict(pd.DataFrame({'ibov':[130000]}))
print('Expected USDBRL rate when IBOV reaches 130,000:', forecast)

That is great! Bard not only generated Python code to process the structured data but also executed it for us, providing a precise response. The value is still unrealistic, but it is precise given the limited data we asked it to process (see more about this in the final considerations section). So, let’s go back to Bing and try this code-based approach.
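Bard's answer lists an "evaluate the model" step, but the code it returned stops at the prediction. The sketch below shows one way such a check could look: it fits the same kind of one-variable linear regression by hand, reports R², and flags whether the target lies outside the observed IBOV range, in which case the forecast is an extrapolation. The data points are made up for illustration and are not the dataset used above; `fit_line` and `r_squared` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up (ibov, usdbrl) pairs for illustration; not the article's dataset.
ibov = [100000, 105000, 110000, 115000, 120000]
usdbrl = [5.30, 5.20, 5.05, 4.95, 4.90]

intercept, slope = fit_line(ibov, usdbrl)
target = 130000
forecast = intercept + slope * target
fit_quality = r_squared(ibov, usdbrl, intercept, slope)
extrapolating = not (min(ibov) <= target <= max(ibov))

print(f"forecast={forecast:.2f}  R^2={fit_quality:.3f}")
print("target is outside the observed IBOV range:", extrapolating)
```

A high R² on a handful of points says little about how the relationship behaves beyond the observed range, which is one reason a precise-looking forecast can still be unrealistic.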

Going back to Bing

We never explicitly asked Bing to create and run Python code for the inference model. Let’s give it a try with a fifth question, providing the data as a CSV file exported from the Google Sheet generated by Bard.

“Generate a Python code that creates an inference model (linear regression) and forecasts the expected USDBRL rate when IBOV reaches 130,000, using the following dataset https://raw.githubusercontent.com/lcbjrrr/btt/main/usd_brl%20-%20bov.csv”

Bing generated code almost identical to Bard’s, which is very good.

“Sure, here’s a simple example of how you might create a linear regression model using Python’s sklearn library and pandas to read your data. Please note that this is a simplified example and may not give accurate predictions for the USD/BRL rate when IBOV reaches 130,000 …”

Unfortunately, it couldn’t run the code on the fly. That is fine, though, since we can easily run it elsewhere.

“I’m sorry for the misunderstanding, but as an AI, I don’t have the capability to execute Python code or any other programming language code. I can help generate the code, but you’ll need to run it in your own Python environment. Please make sure you have the necessary permissions and environment to run the code. If you encounter any issues or errors, feel free to ask! I’m here to help. :)”

Final Considerations

This example demonstrates the potential of LLMs to work with structured data and perform complex tasks. However, it is essential to note that LLMs are still under development, and their performance on structured data tasks can vary depending on the specific task and on the amount and quality of training data. In the future, LLMs will likely play an increasingly important role in working with structured data, helping businesses automate tasks, improve efficiency, and gain new insights from their data. Examples include developing new data mining and machine learning algorithms, new data visualization tools, and new tools for data quality assurance.

In the next few years, we expect to see LLMs used in new and innovative ways to process structured data. One area where LLMs are likely to have a significant impact is the development of new data mining and machine learning algorithms. LLMs could learn from large structured datasets and help develop algorithms that are more accurate and efficient than existing ones, which could lead to breakthroughs in fields such as fraud detection, medical diagnosis, and customer segmentation. LLMs are also likely to play an essential role in developing new data visualization tools. They could generate interactive and informative visualizations of structured data, making it easier for businesses to understand and interpret their data, which in turn could lead to better decision-making and improved business outcomes.

Finally, LLMs are also likely to be used to develop new tools for data quality assurance. They could identify inconsistencies and errors in structured data, helping businesses improve the accuracy and reliability of their data. Overall, the potential of LLMs to change the way we process structured data is vast. As LLMs continue to develop, we can expect them to help businesses work with their data more effectively and efficiently.
