OpenAI Retro Contest Day 3

Alternate title: running commands until they work.

Day 3 for the OpenAI Retro Contest started off with just buying the actual game from Steam, since all of the previous day’s work was devoted to avoiding a $5 fee. With that hurdle cleared, the rest of the official instructions were pretty easy to follow blindly, though that would come back to bite us. (It was awesome to see the docs embed our credentials into the instructions, since that is a touch you don’t see much even in professional API documentation.) After installing Docker and waiting a loooong time for Docker things to download, upload, build, or whatever Docker things Docker-do, we were able to get our random agent evaluated:

At this point 2 teams had something better than a random agent submitted

JERK it up

With our new knowledge, and break time over, we continued our learning journey by reading Gotta Learn Fast, which was super helpful despite its spelling mistakes, and by attempting to implement some of the other baselines. The JERK agent (Just Enough Retained Knowledge) seemed simple enough in its code that we could understand how it works, since it is ultimately just a slightly more advanced version of random controls.

backtracking due to negative reward: 0.000000
backtracking due to negative reward: 0.000000
backtracking due to negative reward: 0.000000
...
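For anyone who wants to see where those log lines come from, here is a minimal sketch of the "go right, and back up when the reward stalls" loop as we understood it. It is not the baseline’s actual code: ToyTrack is a made-up stand-in for a Sonic level (a ramp that needs a run-up), the step counts are arbitrary, and the random jumps and replay-the-best-run logic of the real agent are left out.

```python
class ToyTrack:
    """Hypothetical 1-D level: a ramp that can only be cleared with a run-up."""

    def __init__(self, length=60, ramp_at=30, run_up=10, start=27):
        self.length = length
        self.ramp_at = ramp_at
        self.run_up = run_up
        self.start = start
        self.reset()

    def reset(self):
        self.x = self.start
        self.best_x = self.start
        self.speed = 0
        return self.x

    def step(self, go_right):
        if go_right:
            self.speed += 1
            blocked = self.x + 1 == self.ramp_at and self.speed < self.run_up
            if blocked:
                self.speed = 0  # bounced off the ramp, momentum lost
            else:
                self.x += 1
        else:
            self.speed = 0
            self.x = max(0, self.x - 1)
        # Reward only new progress to the right (an assumption of this toy).
        reward = max(0, self.x - self.best_x)
        self.best_x = max(self.best_x, self.x)
        done = self.x >= self.length
        return reward, done


def move(env, steps, go_right=True):
    """Hold one direction for up to `steps` steps; return (reward, done)."""
    total, done = 0, False
    for _ in range(steps):
        reward, done = env.step(go_right)
        total += reward
        if done:
            break
    return total, done


def run_jerk_episode(env, right_steps=20, left_steps=15, max_chunks=50):
    """Go right in chunks; if a chunk earns nothing, back up and try again."""
    env.reset()
    total, backtracks = 0, 0
    for _ in range(max_chunks):
        reward, done = move(env, right_steps)
        total += reward
        if done:
            break
        if reward <= 0:
            print('backtracking due to negative reward: %f' % reward)
            backtracks += 1
            move(env, left_steps, go_right=False)
    return total, backtracks


if __name__ == "__main__":
    print(run_jerk_episode(ToyTrack()))
```

On this toy track the agent gets stuck at the ramp once, backs up far enough to build speed, and then clears it, which is roughly the behavior we watched Sonic stumble into.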
{
  "done": {
    "variables": {
      "lives": {
        "op": "zero"
      }
    }
  },
  "reward": {
    "variables": {
      "score": {
        "reward": 10.0
      }
    }
  }
}
{
  "done": {
    "variables": {
      "lives": {
        "op": "zero"
      }
    }
  },
  "reward": {
    "variables": {
      "x": {
        "reward": 75.0
      },
      "screen_x": {
        "reward": 100.0
      }
    }
  }
}
Always go right!
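To make the difference between those two reward blocks concrete, here is a rough sketch of how we picture them being applied. The big assumption, which we have not checked against the Gym Retro source, is that each "reward" weight is multiplied by how much its variable increased since the previous frame and that decreases earn nothing. The function name and the sample variable values are made up for illustration.

```python
import json

SCORE_SCENARIO = json.loads('{"reward": {"variables": {"score": {"reward": 10.0}}}}')
X_SCENARIO = json.loads(
    '{"reward": {"variables": {"x": {"reward": 75.0}, "screen_x": {"reward": 100.0}}}}'
)


def scenario_reward(scenario, prev_vars, curr_vars):
    """Sum weight * increase for every variable listed under "reward".

    Assumption: only positive changes count, and weights apply to
    frame-to-frame deltas rather than absolute values.
    """
    total = 0.0
    for name, cfg in scenario["reward"]["variables"].items():
        delta = curr_vars[name] - prev_vars[name]
        if delta > 0:
            total += cfg["reward"] * delta
    return total


if __name__ == "__main__":
    prev = {"score": 0, "x": 100, "screen_x": 50}
    curr = {"score": 100, "x": 103, "screen_x": 52}
    print(scenario_reward(SCORE_SCENARIO, prev, curr))  # pays for points
    print(scenario_reward(X_SCENARIO, prev, curr))      # pays for moving right
```

Under that assumption, the first block only pays out when the score goes up, while the second pays out on every frame Sonic moves right, which is a much denser signal for an agent that mostly just holds right.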

Beating ourselves into submission

The logical capstone to a day is submitting your agent for evaluation, so that is what we ended up spending the next half of our day doing. Our near-zero conceptual and practical knowledge of Docker was not an asset here. We started off by changing our agent to use the remote environment, and tried running the provided commands again. Here are just some of the issues we ran into while trying to start our victory celebrations.

  • Eventually we got everything packaged up and off to the evaluation server… but it errored out immediately. We decided we needed to try to run it locally first to see if it worked.
  • Local execution errored too, so I guess that is progress.
  • Went back and forth quite a bit about which env to run, since we kept getting a socket connectivity issue.
  • We didn’t realize that you had to build each time, so we spent a while making changes to our code and Dockerfile without rebuilding, and then not seeing any updated results.
  • retro-contest --help is totally worthless, but I can’t find the source to make a pull request. Turns out the undocumented --use-host-data flag is essential if you want to get Sonic to run.
  • Docker wants to use the Sonic data from the Python installed in your user directory. That is not where I have Python installed, but since I don’t know where to change that config, I just copied it to where Docker wants it to be.
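The two lessons that cost us the most time (always rebuild before running, and don’t forget --use-host-data) can be baked into a tiny helper so they are hard to skip. This is only a sketch: the image tag and Dockerfile name are placeholders, the game and state names are examples, and the overall shape of the retro-contest run command (--agent, --results-dir) is how we remember the contest instructions rather than something --help will confirm. The helper only assembles the commands; it does not run them.

```python
import shlex


def build_then_run(image_tag, dockerfile, game, state, results_dir="results"):
    """Return [build_cmd, run_cmd] in the order they have to be executed."""
    # Rebuild the image every time, otherwise code changes never reach the agent.
    build = ["docker", "build", "-f", dockerfile, "-t", image_tag, "."]
    # Assumed shape of the local evaluation command; --use-host-data is the
    # flag we needed to get Sonic to run.
    run = [
        "retro-contest", "run",
        "--agent", image_tag,
        "--results-dir", results_dir,
        "--use-host-data",
        game, state,
    ]
    return [build, run]


if __name__ == "__main__":
    commands = build_then_run(
        image_tag="local/jerk-agent:v1",   # placeholder tag
        dockerfile="jerk.docker",          # placeholder Dockerfile name
        game="SonicTheHedgehog-Genesis",   # example game name
        state="GreenHillZone.Act1",        # example level
    )
    for argv in commands:
        print(shlex.join(argv))
```

To actually execute them in order, each list can be handed to subprocess.run(argv, check=True), which stops at the first failure instead of happily evaluating a stale image.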
Looks like we were middle of the pack as far as jerks go.

🎉 Second Place 🎉

Now we were totally done for the night, but still had some open questions:

  • Should we make scripted reward functions at all, and if so, how?

Software Lead at NorthPoint Development. When I’m not helping automate a real estate company, I’m growing succulents in my back yard. https://tristansokol.com/