From Ballerina to AI Researcher: Part IX
OpenAI Five benchmark game and my tenth week as a scholar
This past Sunday was super fun! OpenAI held the benchmark game with Dota pros as part of the preparation for the main game at The International late in August. It's really hard to put into words all the emotions we experienced while preparing for the game. But it was truly a moment of reunification for the entire company, where everyone had made a contribution to make the benchmark game happen, from working on the OpenAI Five training itself to putting things together at the venue.
OpenAI Five won two games out of three playing against a team of top-99.95th percentile pro Dota players that included Blitz, Cap, Fogged, Merlini, and MoonMeander. The human team won game three after the audience was asked to adversarially select OpenAI Five’s heroes.
By the way, for this game the team revealed a new OpenAI Five capability — drafting, an extremely challenging part of Dota, since heroes interact with each other in complex ways.
Since compute plays a crucial role in AI advancement, we estimated how much of it went into training the various Dota systems:
- 1v1 model: 8 petaflop/s-days
- June 6th model: 40 petaflop/s-days
- Aug 5th model: 190 petaflop/s-days
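To give a sense of what these units mean: a petaflop/s-day is 10^15 operations per second sustained for one day. As a minimal sketch (all the inputs below — GPU count, per-GPU throughput, utilization, and duration — are hypothetical numbers for illustration, not the actual OpenAI Five training setup), you can back out a petaflop/s-day estimate from a cluster description like this:

```python
# A petaflop/s-day is 10^15 operations/second sustained for one day.
PFS_DAY = 1e15 * 86400  # ≈ 8.64e19 operations

def petaflops_days(num_gpus, flops_per_gpu, utilization, days):
    """Rough training-compute estimate; all inputs are hypothetical."""
    ops = num_gpus * flops_per_gpu * utilization * days * 86400
    return ops / PFS_DAY

# Hypothetical cluster: 256 GPUs at 10 TFLOP/s each, 30% utilization, 14 days
print(round(petaflops_days(256, 10e12, 0.30, 14), 1))  # → 10.8
```

The utilization factor matters a lot in practice: peak hardware FLOP/s are rarely sustained during real training runs.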
You can read the blog post with more highlights of Sunday's game here.
My tenth week as an OpenAI scholar
I’ve spent the last week building an LSTM model for a classification task (I’m sharing the major parts of it here).
LSTM preprocessing.
The model itself.
Running the model.
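The core of the model can be sketched in plain NumPy to make the mechanics explicit: each timestep updates a hidden and cell state through the four LSTM gates, and the final hidden state feeds a linear layer that produces class logits. This is a toy forward pass with random weights, not my actual training code — the dimensions, gate stacking order, and helper names are all illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM timestep; gates are stacked as [input, forget, cell, output]."""
    H = h.shape[0]
    z = W @ x + U @ h + b              # pre-activations for all four gates, shape (4H,)
    i = sigmoid(z[:H])                 # input gate
    f = sigmoid(z[H:2*H])              # forget gate
    g = np.tanh(z[2*H:3*H])            # candidate cell update
    o = sigmoid(z[3*H:])               # output gate
    c = f * c + i * g                  # new cell state
    h = o * np.tanh(c)                 # new hidden state
    return h, c

def lstm_classify(seq, W, U, b, W_out, b_out):
    """Run a sequence of input vectors through the LSTM, classify from the final hidden state."""
    H = b.shape[0] // 4
    h, c = np.zeros(H), np.zeros(H)
    for x in seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return W_out @ h + b_out           # class logits

# Toy sizes: 8-dim inputs, 16 hidden units, 2 classes
rng = np.random.default_rng(0)
D, H, C = 8, 16, 2
W = rng.normal(scale=0.1, size=(4*H, D))
U = rng.normal(scale=0.1, size=(4*H, H))
b = np.zeros(4*H)
W_out = rng.normal(scale=0.1, size=(C, H))
b_out = np.zeros(C)

seq = rng.normal(size=(10, D))         # a sequence of 10 (e.g. embedded) tokens
logits = lstm_classify(seq, W, U, b, W_out, b_out)
print(logits.shape)  # → (2,)
```

In a real pipeline the preprocessing step would map text to padded integer sequences and an embedding lookup would produce the input vectors, and a framework like PyTorch or TensorFlow would handle the backward pass.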
Working on the LSTM is part of my learning process in the NLP domain. Besides that, I’m looking carefully at interesting papers as inspiration for my future work. I’d like to highlight the DeepMimic paper released by the Berkeley AI Lab. The paper discusses how to apply deep reinforcement learning to train an agent to perform physics-based movements like martial arts. As a result, the authors train a humanoid to perform a cartwheel and an Atlas robot to do a spin kick. I highly recommend this paper to anyone who is interested in deep RL and plans to work with motion-capture datasets for physically simulated characters.
Also, I’ve continued my education in RL, and as a part of that learning process I’ve referred to the following papers, articles, and courses:
Proximal Policy Optimization Algorithms
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Lessons learned reproducing a deep reinforcement learning paper
If you have any questions or comments, feel free to ping me. You can learn more about me on Twitter.
You can check out my previous articles here:
From Ballerina to AI Researcher: Part VIII
From Ballerina to AI Researcher: Part VII
From Ballerina to AI Researcher: Part VI
From Ballerina to AI Researcher: Part V
From Ballerina to AI Researcher: Part IV
From Ballerina to AI Researcher: Part III