# Coding with ChatGPT

## Using OpenAI’s language model to write Python code for geo-data science

Recently, I found out about ChatGPT, an AI chatbot developed by OpenAI that is capable of responding to users in a remarkably human manner. ChatGPT doesn’t only respond in text; you can ask it to write code as well. In fact, an acquaintance recently told me that she used ChatGPT to do some of her homework for her Master’s in urban data science, both for writing essays and writing code.

As someone who spends most of her working hours coding, I find the idea of a chatbot writing code for me both incredibly exciting (think of all the hours I could save!) and a little bit scary (does this mean I’ll be out of a job in a few years?). I decided to try out the coding capabilities of ChatGPT, specifically for spatial data analysis with Python. This article shows some first results and my first thoughts.

For this article, I asked ChatGPT to do a task I do often: turn a polygon (for example the shape of a country, city, or municipality) into a grid of cells in which statistics can be easily aggregated.

Before we start, what is ChatGPT, actually? I put this question to the chatbot itself, and this is its response:

GPT (short for “Generative Pre-training Transformer”) is a type of language model developed by OpenAI. It is a neural network trained to generate human-like text by predicting the next word in a sequence given a large dataset of examples. Chat GPT refers to the use of this model in chat applications, where it can be used to generate responses to user messages in real time. It is designed to be able to carry on a conversation by understanding the context of the conversation and generating appropriate responses.

# Let’s start!

First, I asked ChatGPT, “make a geodataframe grid within a polygon”, and this was its response:

```python
import geopandas as gpd

# create a polygon
polygon = gpd.GeoDataFrame({'geometry': gpd.GeoSeries([Polygon([(0,0), (0,1), (1,1), (1,0)])])})

# create a grid of geodataframe within the polygon
grid = gpd.gridify(polygon, cell_size=0.1, geometry='Polygon')

# plot the grid
grid.plot()
```

I was quite impressed with the response, especially its suggestion to use `gpd.gridify()`, a `geopandas` function I’d never heard of before. However, when I tried to run the code, I found out that the module `geopandas` has no attribute `gridify`. In other words, the chatbot suggested that I use a tool that looks really handy, but doesn’t actually exist.
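One way to catch this kind of invented function before building on it is to check whether the module really exposes the name. The following is a minimal sketch; `module_has_attribute` is a hypothetical helper name chosen for illustration, not part of any library:

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing, so the attribute cannot exist either
        return False
    return hasattr(module, attribute)


# Check the chatbot's suggestion before relying on it
print(module_has_attribute("geopandas", "gridify"))
```

The same check works for any suggested name, for example `module_has_attribute("shapely", "gridify")`.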

There is, however, another Python module called `gridify`, so perhaps the chatbot got confused by that. It also forgot to import `Polygon` from `shapely` before using it, but that was just a minor issue.

The interesting thing about ChatGPT is that you can have a continuous conversation with it, and it will remember your previous interactions. So I asked a follow-up question to try to get a working piece of code: “I tried this code but got the error ‘module geopandas has no attribute gridify’. can you fix this for me?”

The bot claimed that `shapely` has the attribute `gridify`, but when I ran the code I found out that `shapely` doesn’t have it either.

So I asked another follow-up question, “I tried the code but got the error message “No module named ‘shapely.gridify’”. Any other options?”

The bot then proceeded to confidently tell me that `gridify` IS part of the `shapely` library, but has been removed in newer versions — that’s also not true as far as I know.

The code it wrote produced … a square loop, which is far from what I was looking for. Honestly I found it pretty hilarious.

# Second attempt: it worked!

My first conversation with ChatGPT clearly wasn’t going very well, so I decided to start again and ask my question more clearly. I asked it, “python code for generating a grid of cells as a geodataframe”, and here is its response:

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Define the bounds of the grid
min_x, min_y = 0, 0
max_x, max_y = 10, 10

# Define the size of the cells in the grid
cell_width, cell_height = 1, 1

# Define the projection to use
crs = "EPSG:4326"  # WGS84

# Initialize an empty list to store the grid cells
grid_cells = []

# Iterate over the bounds of the grid and create a Polygon object for each cell
for x in range(min_x, max_x, cell_width):
    for y in range(min_y, max_y, cell_height):
        # Define the coordinates of the cell
        cell_coords = [(x, y), (x+cell_width, y), (x+cell_width, y+cell_height), (x, y+cell_height)]

        # Create the Polygon object
        cell = Polygon(cell_coords)

        # Add the Polygon to the list
        grid_cells.append(cell)

# Create the GeoDataFrame
gdf = gpd.GeoDataFrame(geometry=grid_cells, crs=crs)

# Visualize the grid
gdf.plot()
```

…and it works! The code above created a grid of cells as a geodataframe, which could be cropped to fit a specific area (e.g. a country, municipality, or city). I could probably ask ChatGPT to do this next step as well, but unfortunately the bot is currently unavailable because too many people are using it, something that’s been happening more and more often lately.
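For reference, the cropping step could look roughly like the sketch below. This is my own minimal version, not ChatGPT’s output: the function name `grid_within_polygon`, its parameters, and the triangle used as the example area are all assumptions made for illustration.

```python
import math

import geopandas as gpd
from shapely.geometry import Polygon, box


def grid_within_polygon(polygon, cell_size, crs=None):
    """Build square cells over the polygon's bounding box and crop them to the polygon.

    Hypothetical helper for illustration; not part of geopandas or shapely.
    """
    min_x, min_y, max_x, max_y = polygon.bounds
    n_cols = math.ceil((max_x - min_x) / cell_size)
    n_rows = math.ceil((max_y - min_y) / cell_size)

    # One square cell per (column, row) position in the bounding box
    cells = [
        box(min_x + i * cell_size, min_y + j * cell_size,
            min_x + (i + 1) * cell_size, min_y + (j + 1) * cell_size)
        for i in range(n_cols)
        for j in range(n_rows)
    ]
    grid = gpd.GeoDataFrame(geometry=cells, crs=crs)

    # Crop every cell to the polygon, then drop cells with no area left
    grid["geometry"] = grid.geometry.intersection(polygon)
    grid = grid[grid.geometry.area > 0]
    return grid.reset_index(drop=True)


# Example area (an assumption): a right triangle with legs of length 4
triangle = Polygon([(0, 0), (4, 0), (0, 4)])
cropped_grid = grid_within_polygon(triangle, cell_size=1)
```

Calling `cropped_grid.plot()` shows the result. Cells on the edge of the area are cut to its outline, so their areas differ; to keep whole cells instead, the intersection step could be replaced by a filter such as `grid[grid.intersects(polygon)]`.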

So ChatGPT is indeed capable of writing simple Python code for spatial data science, as long as it is given a clear, well-written question.

# Final thoughts

This brief experiment with ChatGPT highlights two interesting things about the chatbot, and perhaps AI in general.

Firstly, we can see that AI can generate artifacts that look legitimate at first glance, but turn out to be wrong on closer inspection. A common example is AI-generated images of faces, which may look realistic at first glance, but turn out to have strange anomalies when looked at more closely.

When writing our code, ChatGPT did something similar by making up a `geopandas` function, `gpd.gridify()`, that seems to make sense but doesn’t actually exist.

Secondly, the purpose of ChatGPT (and perhaps other AI bots) is not to present accurate information, but rather what looks like accurate information. According to the FAQ page of ChatGPT, the model is trained to mimic human speech as much as possible, and not to provide us with the truth.

This is why it so confidently claims that `gridify` is part of the `geopandas` library, and when proven wrong, claims that `gridify` is part of the `shapely` library instead. Neither of these statements is true, but perhaps that is not the point of ChatGPT.

# What’s next?

I’m definitely very excited about the capabilities of ChatGPT, and will be using it to work on other, more difficult coding tasks in the future.

I also found that it is quite handy for generating ideas. I asked for data science project ideas combining sustainability and spatial data science, and while it didn’t produce ideas that were novel to me, it provided a good overview of the sustainability and spatial data science world, which I can imagine could be quite handy for someone new to the field.

Here’s ChatGPT’s response when I asked for project ideas:

• Sustainable transportation planning: Develop a tool that helps planners optimize public transportation networks to reduce greenhouse gas emissions and improve accessibility. This could involve analyzing patterns of travel demand, integrating with public transportation data sources, and implementing algorithms for network design and routing.
• Sustainable land use planning: Develop a tool that helps planners optimize land use patterns to minimize environmental impacts and enhance sustainability. This could involve analyzing patterns of land use and land cover, integrating with land use data sources, and implementing algorithms for land use allocation and optimization.
• Environmental impact assessment: Develop a tool that helps organizations assess the environmental impacts of their operations or projects. This could involve integrating with data sources on environmental indicators, such as air and water quality, and implementing algorithms for impact assessment and prediction.
• Renewable energy site selection: Develop a tool that helps organizations identify suitable sites for renewable energy generation, such as solar or wind farms. This could involve analyzing patterns of solar insolation, wind speed, and other relevant factors, and integrating with data sources on land use and topography.
• Sustainable supply chain optimization: Develop a tool that helps organizations optimize their supply chains to minimize environmental impacts and enhance sustainability. This could involve analyzing patterns of material and energy flows, integrating with supply chain data sources, and implementing algorithms for optimization and risk assessment.

For now, ChatGPT is open for everyone to use at chat.openai.com. All you need is an internet browser. Feel free to try it for yourself, and let me know what you find :)

# Acknowledgements

I’d like to thank my friends Alexey Ilyshkin and Alexis Oh for sharing their thoughts on ChatGPT and AI with me. Our conversations were integrated into the ‘Final thoughts’ section of this article.