Day 6 of the OpenAI Retro Contest: playback tooling

How to tell if your computer is really learning, or just slacking off.

Now that we have gotten some agents working in the OpenAI Retro Contest (the JERK agent on day 3 and a Rainbow DQN on days 4 and 5), I wanted to take a moment to build out some of the tooling we were using.

The agents are now too complex to watch in real time and still have them get anywhere in their learning, so I am going to have to check how my agent is playing Sonic a different way. Luckily, Gym Retro has a handy record feature that spits out a .bk2 file every time the game is played.
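For anyone who has not turned recording on yet, here is a minimal sketch of how it can be enabled through the `record` argument of `retro.make`. The helper names (`recording_kwargs`, `make_recording_env`) are hypothetical and only for illustration, and the game and state names are taken from the replay file name used later in this post. It is not necessarily how the contest tooling wires things up.

```python
import os


def recording_kwargs(record_dir,
                     game="SonicTheHedgehog-Genesis",
                     state="GreenHillZone.Act1"):
    """Build keyword arguments for retro.make() that turn on .bk2 recording.

    The helper itself is hypothetical; `record` is the retro.make() argument
    that names the directory where .bk2 files are written.
    """
    # Create the output directory up front; harmless if it already exists.
    os.makedirs(record_dir, exist_ok=True)
    return {"game": game, "state": state, "record": record_dir}


def make_recording_env(record_dir):
    """Create an environment that records its episodes into record_dir."""
    # Imported lazily: needs gym-retro installed and the Sonic ROM imported.
    import retro
    return retro.make(**recording_kwargs(record_dir))
```

Calling `make_recording_env("./results/bk2")` and then playing episodes as usual should leave the .bk2 files in that folder.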

Turning .bk2 into something useful

The .bk2 format seems to be a log of instructions that the environment can use to deterministically recreate the play-through. It is great in that it is super small (each one is only ~45 KB) and it can be used to replay a session for more learning. The downside is that you can’t just double-click it and watch your agent’s sad attempt to beat the first level of Sonic. (Though I would love for someone to make a QuickLook extension!)

Gym Retro provides a couple of tools for this. One is a script that will convert your .bk2 into an .mp4 with ffmpeg, like this one:

Wow! Pixels were never meant to be so large.

That is kind of neat, but the conversion to video is time consuming and also increases the size on disk by 100x or more. It did not really seem like a good option for watching very many replays. I did, however, take fellow contestant Lyons' suggestion and create a local script to make that conversion easier to invoke:

Now I can run

python3 ./scripts/ ./results/bk2/SonicTheHedgehog-Genesis-GreenHillZone.Act1-0001.bk2

to convert any of the runs into a .mp4
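A wrapper like that can be very small. The sketch below is not the original script. It assumes the converter is Gym Retro's `retro.scripts.playback_movie` module and that ffmpeg is on the PATH; the function names are hypothetical.

```python
import subprocess
import sys


def conversion_command(bk2_path):
    """Build the command line that converts one .bk2 replay to video.

    Assumption: the conversion is done by Gym Retro's
    retro.scripts.playback_movie module, run with the current interpreter.
    """
    return [sys.executable, "-m", "retro.scripts.playback_movie", bk2_path]


def convert_all(bk2_paths):
    """Convert each replay in turn, stopping at the first failure."""
    for path in bk2_paths:
        # check=True raises CalledProcessError if the converter exits non-zero.
        subprocess.run(conversion_command(path), check=True)
```

Passing `sys.argv[1:]` to `convert_all` gives a script that accepts one or more replay paths on the command line.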

A better solution

What I ended up using quite a bit more was the playback interface provided in the code for Gym Retro. You can load the file and step through it, watching the environment render. Playing one file at a time, though, made it slow to see the difference between runs, so I added some basic functionality to ingest a whole folder of .bk2 files and play them one after another:
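The idea can be sketched roughly as follows. This is a reconstruction and not the original script: the function names are hypothetical, and the playback loop follows the movie API as documented in later Gym Retro releases (`retro.Movie`, `get_key(i, p)`, `env.num_buttons`), which older releases spell slightly differently.

```python
import glob
import os
import time


def find_replays(folder):
    """Return every .bk2 file in folder, sorted by name.

    Recorded files end in a zero-padded counter, so name order is run order.
    """
    return sorted(glob.glob(os.path.join(folder, "*.bk2")))


def play_replay(path, framerate=120):
    """Step through one replay, rendering each frame."""
    # Imported lazily: needs gym-retro installed and the matching ROM imported.
    import retro

    movie = retro.Movie(path)
    movie.step()  # load the first frame so the game and start state are known

    env = retro.make(game=movie.get_game(), state=None,
                     use_restricted_actions=retro.Actions.ALL,
                     players=movie.players)
    env.initial_state = movie.get_state()
    env.reset()
    try:
        while movie.step():
            # Rebuild the button presses recorded for this frame.
            keys = [movie.get_key(i, p)
                    for p in range(movie.players)
                    for i in range(env.num_buttons)]
            env.step(keys)
            env.render()
            time.sleep(1.0 / framerate)  # higher framerate, faster playback
    finally:
        env.close()


def play_folder(folder, framerate=120):
    """Play every replay in folder, one after another."""
    for path in find_replays(folder):
        play_replay(path, framerate)
```

Keeping the file discovery separate from the playback makes it easy to filter the list, for example to watch only the last few runs.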

With python3 ./scripts/ ./results/bk2/ I can watch all of my replays one after another at a faster speed (depending on the framerate variable), like this:

I did run into one issue: my installed version of Gym Retro didn’t have this commit, so my script would open an additional window for each playback rendered. Luckily, one of the admins in the Discord pointed me to the fix. I tried upgrading with pip, but it seems the binaries don’t have the fix, so in the end I just applied the patch to my local code instead.

There was also a tiny bug that I spotted in the Gym Retro readme, which I fixed with this pull request:

What’s the plan for tomorrow?

In my mind there are a couple of workstreams for Ben and me on team Bobcats to move forward on.

  • I would like to spend a couple of days digging into the Rainbow agent’s code to start understanding it, hopefully putting us on the path to creating some new agents.
  • When running TensorFlow, a warning pops up: "Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA". It seems worthwhile to figure out how to compile TensorFlow in my Docker build process for the speed improvements, and hopefully learn something about Docker in the process.

Thanks for reading! I hope this helps others who are competing and, as always, if you have any questions, comments, concerns, or excitements, feel free to drop me a line.