In my last article on the subject, I outlined a bot to answer trivia questions from HQ, a live-trivia app for mobile phones. I described the basic inner workings of how my bot accepts a screenshot of an HQ question, OCRs it to extract the question and three possible answers as text, then uses a variety of techniques to come up with the most likely answer.
The bot has grown quite a bit since then.
Before We Go On…
While this bot does run on live HQ games, I don’t use it to win, only to test the bot’s accuracy. I find the challenge of writing this tool far more rewarding than actually winning the game. My goal for the project is to make the ultimate HQ bot, not cheat to win (it’s worth noting that my winnings from HQ total a whopping $0). That all said, there is evidence of many players Googling to find answers, so I suppose cheating in HQ is inevitable anyway.
The bot has been completely rewritten in my favorite programming language, Scala. I switched because the project started heavily leaning on multithreading, and Python was unable to keep up. The port went quickly and in a few hours I was up and running again.
My old bot used OCR to extract the question and three possible answers. This was messy for two reasons: 1) human error in taking the screenshot, and 2) a half-second time cost for OCR. Point 1 is worth fixing since the less human interaction, the better. Since the goal for the bot is to return the correct answer in under 5 seconds, fixing point 2 frees up a meaningful share of that budget. If there were a way to get the data directly from HQ, I would save myself a lot of trouble and improve the bot’s response time.
The HQ API
When the HQ app starts, it contacts a REST endpoint to get live game information. If a live game is running, the server includes a broadcast object in the response that contains information about the currently running game, such as stream URLs, timestamps, game IDs, and a curious entry that points to a websocket.
I quickly wrote some code to connect to the websocket (setting the proper headers), and I started getting back a bunch of JSON encoded events.
They were chat messages (HQ has an in-game chat that thankfully you can hide). Everything was scrolling pretty quickly, but I could spot other types of events. I dumped all the output to a file and started digging.
Even My Side Projects Have Side Projects
To test my websocket code, I thought it would be interesting to make a word cloud out of the chat from a game. I quickly wrote a tool to aggregate chat messages and dump out word frequencies. I put that output into a word cloud generator and made this:
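As a rough illustration of the aggregation step (not the tool's actual code), word frequencies can be computed from a list of chat messages in a few lines of Scala. The object name, the stop-word list, and the tokenizing rule are all assumptions made for this sketch.

```scala
object ChatWordCloud {
  // Illustrative stop-word list (an assumption); a real one would be much longer.
  val stopWords: Set[String] = Set("the", "a", "an", "is", "it", "to", "and", "of", "i")

  /** Counts how often each word appears across all messages, most frequent first. */
  def wordFrequencies(messages: Seq[String]): Seq[(String, Int)] =
    messages
      .flatMap(_.toLowerCase.split("[^a-z0-9']+")) // split on anything that is not a word character
      .filter(word => word.nonEmpty && !stopWords.contains(word))
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }
      .toSeq
      .sortBy { case (word, count) => (-count, word) } // ties are broken alphabetically
}
```

The resulting (word, count) pairs are the kind of input most word cloud generators accept.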
This is all fun and good, but there’s a more important event for me to look at:
This is a pretty typical example of the question payload. This payload includes the question text, the category (which doesn’t show up in the app anywhere), and the three possible answers. Bingo. Decoding JSON is a lot quicker than OCRing a screenshot and also confers a few more advantages:
- My phone no longer has to be plugged in to the machine this code is running on
- I don’t have to wait for the question to render. On iOS, the question renders word by word, so you have to wait for the entire thing to draw before taking the screenshot (about 1 second!).
- I can run this bot on a cloud instance, which means my latency to Google’s search services and Amazon’s language services is much shorter.
To The Future[T]
All of the solvers have been reimplemented using Scala’s Futures. Each solver now extends a trait that forces the solver to define a name and a function that takes in a question and returns a Future[SolverResults]. A SolverResults object contains a few things: the name of the solver that produced it, the question information, and the weights that the solver assigned to each answer (these weights are not the same as probabilities, but can be mapped to them).
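A minimal sketch of what such a trait could look like follows. The field names, the toy EchoSolver, and the toProbabilities helper are assumptions for illustration; only the overall shape (a name, plus a function from a question to a Future of results) comes from the description above.

```scala
import scala.concurrent.{ExecutionContext, Future}

object Solvers {
  // Hypothetical shapes: field names are guesses based on the prose description.
  final case class Question(text: String, category: String, answers: Seq[String])
  final case class SolverResults(solverName: String, question: Question, weights: Map[String, Double])

  trait Solver {
    def name: String
    def solve(question: Question)(implicit ec: ExecutionContext): Future[SolverResults]
  }

  // A toy solver: gives weight 1.0 to any answer that appears verbatim in the
  // question text. Real solvers would call search or language APIs instead.
  object EchoSolver extends Solver {
    val name = "echo"
    def solve(question: Question)(implicit ec: ExecutionContext): Future[SolverResults] = Future {
      val text = question.text.toLowerCase
      val weights = question.answers.map { answer =>
        answer -> (if (text.contains(answer.toLowerCase)) 1.0 else 0.0)
      }.toMap
      SolverResults(name, question, weights)
    }
  }

  /** One possible mapping from weights to probabilities: normalise them, or fall back to uniform. */
  def toProbabilities(weights: Map[String, Double]): Map[String, Double] = {
    val total = weights.values.sum
    if (total == 0.0) weights.map { case (answer, _) => answer -> 1.0 / weights.size }
    else weights.map { case (answer, weight) => answer -> weight / total }
  }
}
```

Because solve returns a Future, every solver can be started at once and the slowest one sets the overall response time, rather than the sum of all of them.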
My quorum resolver is the bit of code that takes all of the results from the solvers and consolidates them down to one answer. The quorum has two components: the initializer and the resolver.
The initializer is run in parallel with all the solvers. It takes the text from the question and attempts to classify it so the resolver can better understand the results from the solvers (for example, the Wikipedia-based solver is more likely to be correct on questions where each answer is a proper noun). The resolver takes the data gathered during initialization and a list of SolverResults from all of the solvers and decides on a final answer.
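The initializer and resolver split might be sketched as below. Everything here is an assumption made for illustration: the capital-letter test stands in for real classification, the "wikipedia" solver name and the 2.0 trust multiplier are invented, and SolverResults is pared down to the two fields the resolver needs.

```scala
object Quorum {
  // Trimmed-down, hypothetical version of the solver output.
  final case class SolverResults(solverName: String, weights: Map[String, Double])

  // What the initializer learned about the question.
  final case class QuestionProfile(allAnswersProperNouns: Boolean)

  // Crude stand-in for classification: treats capitalised answers as proper nouns.
  def initialize(answers: Seq[String]): QuestionProfile =
    QuestionProfile(answers.nonEmpty && answers.forall(_.headOption.exists(_.isUpper)))

  // Hypothetical trust multiplier: favour the Wikipedia solver on proper-noun questions.
  def trust(solverName: String, profile: QuestionProfile): Double =
    if (solverName == "wikipedia" && profile.allAnswersProperNouns) 2.0 else 1.0

  /** Sums trust-scaled, normalised weights per answer; returns the best answer and its share of the total. */
  def resolve(profile: QuestionProfile, results: Seq[SolverResults]): Option[(String, Double)] = {
    val scores: Map[String, Double] = results
      .flatMap { r =>
        val total = r.weights.values.sum
        if (total <= 0.0) Seq.empty[(String, Double)] // a solver with no opinion is ignored
        else r.weights.toSeq.map { case (answer, w) => answer -> trust(r.solverName, profile) * w / total }
      }
      .groupBy(_._1)
      .map { case (answer, scored) => answer -> scored.map(_._2).sum }
    val grandTotal = scores.values.sum
    if (scores.isEmpty || grandTotal <= 0.0) None
    else {
      val (best, score) = scores.maxBy(_._2)
      Some(best -> score / grandTotal)
    }
  }
}
```

Keeping resolve a pure function of the profile and the solver outputs also means it can be exercised with recorded results, without calling any external API.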
The high-level workflow is now clear:
1. Contact the HQ API and find out if there is a game going on. If there is, go to step 2; otherwise, exit.
2. Listen to the game’s websocket and wait for a question to come through.
3. Decode the question from the HQ JSON into a Question object for the solvers.
4. Pass the Question object to every solver we have and await their responses. At the same time, pass the object to initialize the Quorum system.
5. Consolidate these results into one overall answer and confidence value.
6. Listen for the next question.
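The solve-and-consolidate part of this workflow can be sketched with Future.sequence and zip. This is an illustration under assumptions rather than the bot's real code: the type names, the Verdict class, and the choice to drop failed solvers instead of failing the whole question are all hypothetical.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.control.NonFatal

object Pipeline {
  // Hypothetical, pared-down types for the sketch.
  final case class Question(text: String, answers: Seq[String])
  final case class SolverResults(solverName: String, weights: Map[String, Double])
  final case class Verdict(answer: String, confidence: Double)

  type Solver = Question => Future[SolverResults]

  /** Runs the initializer and every solver concurrently, then hands both to the resolver. */
  def answer[P](
      question: Question,
      solvers: Seq[Solver],
      initialize: Question => Future[P],
      resolve: (P, Seq[SolverResults]) => Verdict
  )(implicit ec: ExecutionContext): Future[Verdict] = {
    // Start all the work before combining anything, so the futures overlap.
    val profile: Future[P] = initialize(question)
    val results: Future[Seq[SolverResults]] = Future
      .sequence(solvers.map { solve =>
        // Assumption: a failed solver is dropped rather than sinking the question.
        solve(question).map(r => Option(r)).recover { case NonFatal(_) => None }
      })
      .map(_.flatten)
    profile.zip(results).map { case (p, rs) => resolve(p, rs) }
  }
}
```

With this shape, the time to answer is bounded by the slowest solver or the initializer, whichever takes longer, which matters when the whole budget is under 5 seconds.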
Playing on the Go
Before we start this part, I need to explain something. The goal of this project is to create the Ultimate HQ bot…and the following idea came to me in the shower and was too good to ignore. I had never done any work with iOS Push Notifications, so I thought “why the hell not!?” And with that, on we go…
If this bot were to be used for actual gameplay, it wouldn’t work unless the user was in front of a computer. There’s no way for a player to both focus on the HQ app and read the bot’s response at the same time from their phone. Luckily, Apple provides a solution.
I created a small, stub iOS application with the sole purpose of receiving push notifications from the bot. Now, after the bot has figured out which answer is best, it can send a push notification to a phone, which pops up right on the screen.
The end result is something like this:
Defeating My Bot
Other articles discuss preventing OCR as the main way to stop botting. One article proposed disabling video output during games, another proposed obscuring the questions by changing fonts and using shading. Both of these are defeated by reading directly from the API.
Of course, obscuring the API is an option, but security through obscurity isn’t really secure. Instead, I believe it all comes down to the questions. There are question styles that my bot doesn’t handle well. For example, my bot is unable to answer the following question (which I have submitted to HQ using their “suggest a question” feature):
Based on their directors, which film is the odd one out? a) Inception, b) The Prestige, c) Arrival
There’s no concrete topic for my bot to latch on to. None of my solvers can handle questions where the answers all need to be taken in context together. It also has trouble with questions that impose a strict ordering (“Of these, who was the first…”, “Which of these is the biggest…”). This is where I am focusing my effort next, since questions like these represent the majority of what my bot gets incorrect.
Including 3 or 4 of these questions in each game would eliminate most of the threat from bots like mine. With three choices per question, randomly guessing 4 questions correctly has a (1/3)^4 = 1/81 chance, or about 1.2%.
Testing and Configuration
In my previous article, I talked about the question bank that I use for testing. That’s still there, but I am working on expanding it to include tests for the Quorum resolver. Now that my bot has many different solving strategies, I have discovered that running the bot through the question bank in its entirety costs a lot of API hits.
My bot now has several command line arguments that can be used to select questions to test against. For example, hqbot -t -c questions.csv -n 20 runs the question on line 20 of the file, and hqbot -t -c questions.csv --run-everything runs the entire file. I can also include a -p switch if I want to test push notifications.
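For illustration, the switches shown above could be parsed with a small pattern match over the argument list. The flag names come from the examples; the Config fields, the reading of -t as "test mode" and -c as "question file", and the error handling are assumptions of this sketch.

```scala
object Cli {
  // Hypothetical config: the meanings of -t and -c are guesses from the examples.
  final case class Config(
      testMode: Boolean = false,
      questionFile: Option[String] = None,
      line: Option[Int] = None,
      runEverything: Boolean = false,
      testPush: Boolean = false
  )

  /** Folds the argument list into a Config, or returns an error message for bad input. */
  @scala.annotation.tailrec
  def parse(args: List[String], config: Config = Config()): Either[String, Config] = args match {
    case Nil                        => Right(config)
    case "-t" :: rest               => parse(rest, config.copy(testMode = true))
    case "-p" :: rest               => parse(rest, config.copy(testPush = true))
    case "--run-everything" :: rest => parse(rest, config.copy(runEverything = true))
    case "-c" :: file :: rest       => parse(rest, config.copy(questionFile = Some(file)))
    // -n must be followed by a positive line number.
    case "-n" :: n :: rest if scala.util.Try(n.toInt).toOption.exists(_ > 0) =>
      parse(rest, config.copy(line = Some(n.toInt)))
    case other :: _                 => Left(s"Unrecognised or incomplete argument: $other")
  }
}
```

Returning an Either keeps bad input as a value the caller can report, rather than an exception thrown partway through startup.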
While this is great for testing the solvers, my quorum code has lots of room for improvement. My current plan for this is to aggregate several sets of SolverResults so I can test the quorum without actually hitting APIs. That said, the solvers and quorum are integrated parts, so testing them separately may introduce more issues. I’m not quite sure how to handle this at the moment.
Even with its limitations, my bot averages 10 out of 12 questions correct in any given game, topping out at 11 out of 12. I am still improving it, and my ultimate goal is to answer every question correctly in at least one game.
I like using side projects to learn, and writing an HQ Bot covered a lot of ground: Apple Push Notifications, Natural Language Processing, Statistics, Machine Learning, and a whole lot more. I’m not sure what I’ll be working on next, but there’s still a lot here to chew on.