Developing a GPT Tool for Game QA — Innovation Friday

Published in AI and Games, by Regression Games

7 min read · Nov 13, 2023

Introduction

At Regression Games, we’re always looking for ways to integrate new technologies into game development. Every Friday, our team members have the freedom to try out new models, experiments, AI, and any other idea that comes to mind related to our mission of building the best tools to make the jobs of game developers easier.

Last Friday, we experimented with OpenAI’s recent addition to ChatGPT — Custom GPTs. We created a GPT for quality assurance (QA) in game development, a tool designed to generate QA testing instructions based on game information. We managed to get a prototype up and running in just under an hour, with promising results!

You can try it out for yourself at the link below, and we’d love to hear your feedback at info@regression.gg. If you want to learn more about automating your QA processes with AI agents, make sure to visit our site at https://regression.gg

→ https://chat.openai.com/g/g-LrhUBrExn-game-qa-strategist (requires a ChatGPT Plus subscription)

The Concept

Our team started with a clear goal: to see if we could use GPT to automatically provide QA strategies in game development. We envisioned a system that could take a game’s description, screenshots, and recent code changes, and then provide a structured QA testing plan.

Brainstorming and Planning

We began by outlining what we wanted our tool to achieve.

Purpose: Create an AI tool to assist in QA process planning.
Inputs: Game description, screenshots, recent code changes.
Output: A QA plan for testing the game. This would be a list of specific actions a QA tester/engineer can take to test the recent changes.

Our challenge was figuring out how to guide the GPT to ask for and use this specific information, and then have the GPT come up with a useful plan.
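
To make the inputs and output above concrete, here is a minimal Python sketch of how they could be modeled. The class and field names (`QARequest`, `QAPlan`, `missing_inputs`) are our own illustration and are not part of the GPT itself; the check simply lists which inputs still need to be requested from the user.

```python
from dataclasses import dataclass, field


@dataclass
class QARequest:
    """Inputs the tool asks for. Names are hypothetical, for illustration."""
    game_description: str = ""
    screenshot_paths: list[str] = field(default_factory=list)
    recent_changes: str = ""  # e.g. pasted git history text

    def missing_inputs(self) -> list[str]:
        """Return the names of inputs the user has not supplied yet."""
        missing = []
        if not self.game_description.strip():
            missing.append("game description")
        if not self.screenshot_paths:
            missing.append("screenshots")
        if not self.recent_changes.strip():
            missing.append("recent code changes")
        return missing


@dataclass
class QAPlan:
    """Output: a list of specific actions a QA tester can take."""
    actions: list[str] = field(default_factory=list)


request = QARequest(game_description="A city simulator game.")
print(request.missing_inputs())  # ['screenshots', 'recent code changes']
```

A check like this mirrors the behavior we wanted from the GPT: ask for anything missing before producing a plan.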

Research and Inspiration

Before implementing the GPT, we looked around for existing GPTs, using social media and articles to discover what others were building with the new tooling. There were countless “Top 10 GPTs built this week” posts that gave us some great ideas, such as the post below:

https://twitter.com/NickADobos/status/1723746210543214799

The Implementation Phase

Once we gathered some ideas, we moved on to implementation. It was easy to get started — the interface for building a GPT is essentially a chat bot. We began by giving it some high-level information about what we wanted.

It prompted us with further requests for information — what should the name be? What sort of QA tests should we focus on? What content should we highlight? We told the model to do a few specific things:

1. Uploading Context and Knowledge on Effective QA Practices

We wanted the GPT to reference knowledge and materials about game QA. We copied information from sites that provide general guidance on effective QA testing, specifically in engines like Unity. Our hypothesis was that giving it this information would further refine its QA strategy.

2. Requesting Specific Context on the Game and Recent Changes

Next, we told the model to ask the user for three things: the game description, screenshots, and git history. This information can then be used to recommend tests targeted at a particular set of changes.

3. Iterating and Adjusting

With the GPT development experience, you can preview your GPT at any point. We found a few issues, such as the GPT not prompting the user for game-specific information and not connecting tests back to the commits that support them. We described these issues to the GPT builder, and it was able to correct them.
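
The second issue, tests that do not point back to a commit, is also something you could check outside the GPT. The sketch below is one possible approach and not something we built into the tool: it assumes abbreviated commit hashes are 7 to 40 lowercase hex characters, and the function name `check_traceability` is hypothetical.

```python
import re

# Assumption: abbreviated git hashes are 7 to 40 lowercase hex characters.
COMMIT_RE = re.compile(r"\b[0-9a-f]{7,40}\b")


def check_traceability(tests, known_commits):
    """Return (tests citing no commit, tests citing a commit we do not know)."""
    untraced, unknown = [], []
    for test in tests:
        cited = COMMIT_RE.findall(test)
        if not cited:
            untraced.append(test)
        elif any(commit not in known_commits for commit in cited):
            unknown.append(test)
    return untraced, unknown


tests = [
    "Validate police car functionality, especially after commits 36da340 and e5d5ed3.",
    "Check economic system for regressions after commit b93e5d6.",
    "Perform freeform play to uncover unexpected issues.",
]
# b93e5d6 is deliberately left out here to show the "unknown" case.
known = {"36da340", "e5d5ed3"}

untraced, unknown = check_traceability(tests, known)
print(untraced)  # ['Perform freeform play to uncover unexpected issues.']
print(unknown)   # ['Check economic system for regressions after commit b93e5d6.']
```

The regex is a heuristic: an ordinary word made only of the letters a to f would also match, so treat the result as a hint for review.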

The Results

Once we were satisfied with the results, we took our GPT for a spin with a city simulator game. We explained the game’s concept, provided a screenshot of its Steam page, and then grabbed a git history screenshot that mentioned specific game-logic changes. Overall, the GPT gave some great QA recommendations:

A screenshot of the GPT (left) and the git context we provided (right)

Specific QA Tests for CitySkylines:

- UI/UX Testing:
  - Test UI response to user inputs after commit 2de93914.
  - Verify UI compatibility across devices post commit ebdff92.
  - Assess UI accessibility following commit e2879f5.

- Integration Testing:
  - Integrate and test new building prefabs from commit 1257cce.
  - Confirm real-time UI updates from commit b04c5a8.

- Feature Testing:
  - Validate police car functionality, especially after commits 36da340 and e5d5ed3.
  - Check economic system for regressions after commit b93e5d6.

- Performance Testing:
  - Conduct load testing to assess game performance under heavy loads (reference commit 8ce6e54).
  - Execute stress tests on new UI elements from commit 6bd7d3d.

- Regression Testing:
  - Automate regression for bugs fixed in commits 1181d59 and 8f36d2a.
  - Run full regression tests to ensure existing functionality remains unaffected by new changes.

- System Testing:
  - Test scene transitions and ensure state maintenance after commit 6a8a0f71.
  - Validate canvas functionality and element scaling after commit b93e5d6.

- Exploratory Testing:
  - Perform freeform play to uncover unexpected issues.
  - Create and test in-game scenarios like natural disasters or economic challenges.

- Save System Testing:
  - Ensure save/load functionality is intact after UI changes from commit 8ce6e54.
  - Test for data integrity during save state conflicts.

- Multiplatform Testing:
  - Execute tests on different consoles and mobile devices, if applicable.

- User Feedback:
  - Implement beta testing and collect user feedback to inform QA processes.

Note: Each test should be clear and concise with expected outcomes for efficient issue reporting.

There were a few things that surprised us:

GPTs are fantastic at understanding multiple modalities of input. Rather than using git log to provide context on our recent code changes, a screenshot of the git history from the web worked just as well.
File manipulation via the Python interpreter is a great way to export data from the GPT. We found that it was very good at taking instructions to output a file with the information for download.
Using a chat bot to build a chat bot is surprisingly efficient. While it doesn’t let you do too much customization, it seems to be great for simple tasks.
The knowledge upload section of the GPT is a simple way to provide it with deep knowledge on the task at hand. Because our knowledge base has information about different types of QA testing (such as functional testing and regression testing), GPT was able to create compelling and useful tests.
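
For the file export point, the code the GPT runs is ordinary Python. We did not capture the exact script it generated, so the following is only a sketch of what such an export could look like: it flattens a plan into CSV rows using the standard `csv` module. The function name `plan_to_csv` and the file name `qa_plan.csv` are hypothetical.

```python
import csv
import io


def plan_to_csv(plan):
    """Flatten {category: [test, ...]} into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["category", "test"])
    for category, tests in plan.items():
        for test in tests:
            writer.writerow([category, test])
    return buffer.getvalue()


plan = {
    "Save System Testing": [
        "Ensure save/load functionality is intact after UI changes from commit 8ce6e54.",
        "Test for data integrity during save state conflicts.",
    ],
}

if __name__ == "__main__":
    # newline="" lets the csv module control line endings itself.
    with open("qa_plan.csv", "w", newline="") as handle:
        handle.write(plan_to_csv(plan))
```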

Limitations

The GPT needed to be reinforced a few times to prompt the user for specific information. For example, in our first preview test of the GPT, it provided QA recommendations without any game context, opting for generic QA testing advice. After we reinforced that it needs to ask for that information first, it seemed to get the point.
There were a few instances where the GPT seemed to glitch out during development, and it would respond to its own output. This seemed to be resolved if we stopped the generation and then provided our own responses.
The GPT can run code, but that code runs without internet access. This made some operations impractical, such as pulling the git history for a repository directly, so we uploaded it manually instead.

While these are not dealbreakers for getting a useful GPT up and running, you should keep them in mind when running your own experiments.
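
Since the GPT cannot fetch a repository's history itself, one workaround is to export it locally and paste or upload the text. The sketch below assumes `git` is installed and on your PATH; the function names and the sample subjects are hypothetical, and `parse_oneline` only handles the plain `hash subject` layout that `git log --oneline` prints.

```python
import subprocess


def recent_commits(repo_path=".", limit=20):
    """Return the last `limit` commits as text, one 'hash subject' per line."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-n", str(limit), "--oneline"],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. the path is not a repository
    )
    return result.stdout


def parse_oneline(log_text):
    """Turn `git log --oneline` text into (hash, subject) pairs."""
    pairs = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        commit, _, subject = line.partition(" ")
        pairs.append((commit, subject))
    return pairs


# Hypothetical sample text; in practice, pass in recent_commits() instead.
sample = "36da340 Example subject one\ne5d5ed3 Example subject two\n"
print(parse_oneline(sample))
```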

Next Steps

We had a lot of fun developing this GPT, but there is still a lot to explore! Some questions to tackle in the future:

How can we update the model to output tests in a format that can be ingested by existing QA teams’ processes? Can we send emails or make Jira tickets automatically?
Can the GPT generate code for a testing framework to implement some of these tests?
What other QA testing practices should the agent be aware of?
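
As a starting point for the Jira question, here is a rough sketch of converting one test into a ticket payload. It assumes the field layout of Jira's v2 create-issue REST endpoint (`project`, `summary`, `description`, `issuetype`), which you should verify against your own instance; the project key `QA`, the `Task` issue type, and the function name are hypothetical. The sketch only builds the JSON and does not send anything.

```python
import json


def to_jira_issue(category, test, project_key="QA"):
    """Build a create-issue payload for one test (not sent anywhere)."""
    return {
        "fields": {
            "project": {"key": project_key},  # hypothetical project key
            "summary": f"[{category}] {test}",
            "description": test,
            "issuetype": {"name": "Task"},  # assumed issue type
        }
    }


payload = to_jira_issue(
    "Regression Testing",
    "Automate regression for bugs fixed in commits 1181d59 and 8f36d2a.",
)
print(json.dumps(payload, indent=2))
```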

Join us every Friday for our Innovation Friday streams at twitch.tv/regressiongames, where we will be conducting more experiments within AI and games. In the meantime, visit our website at https://regression.gg, and we hope to see you next time!

Our website: https://regression.gg
Twitter/X: https://twitter.com/RegressionGG
Facebook: https://www.facebook.com/RegressionGG
YouTube: https://www.youtube.com/channel/UCzOma99Ix8tD2cX2lo5lO6A
Twitch: https://www.twitch.tv/regressiongames
Our documentation site: https://docs.regression.gg/