Speed up your embodied AI training with AI2-THOR 2.7.0

Eric Kolve
Feb 10 · 5 min read

AI2-THOR is AI2's open-source interactive environment for training and testing embodied AI. We’re pleased to announce the 2.7.0 release of AI2-THOR, which contains several performance enhancements that can dramatically reduce training time. This release introduces improvements to the IPC system between Unity and Python, a faster serialization/deserialization format, and new actions that provide better control of the metadata. Dig into the details below, or jump to our TL;DR at the bottom to grab the update command.

FIFOServer

With the latest release, the FIFOServer backend replaces the legacy HTTPServer (WSGIServer)/JSON backend. To see why this is significant, it helps to understand the relationship between the Unity and Python components of AI2-THOR. When the AI2-THOR Controller is launched, it starts a server that Unity uses to communicate camera output (depth, RGB, segmentation, etc.) and metadata about the scene after the agent performs an action.

Lifecycle of an AI2-THOR Action

Once an action has completed, a component within Unity collects the RGB frame from the camera, along with metadata about each object and agent in the scene. In the legacy backend, the metadata was serialized to JSON (note that the RGB frame is not encoded), and the entire payload was then sent to Python over HTTP. During performance analysis, both JSON serialization/deserialization and socket I/O were identified as bottlenecks. To address them, the serialization format was switched from JSON to MsgPack, and the WSGIServer was replaced with a named pipe server that uses a purpose-built protocol to handle the payload.

MsgPack was chosen for several reasons: it offers extremely fast serialization/deserialization in both Python and C# (Unity), it has robust, mature libraries, and it is schema-less. Because it is schema-less, the migration from JSON to MsgPack was easy to validate: both formats generate identical data structures, which could be compared against each other during development. Using MsgPack, we found that the serialized metadata size was reduced by 50%, serialization time by 40%, and deserialization time by 60%.

Named pipes were chosen for similar reasons, but primarily for speed. With small payloads (< 128 bytes), we observed performance reaching 100k messages per second (~10μs per message). During testing of the WSGIServer, we could achieve only around 1k messages per second (~1ms per message) with a similarly sized payload. Overall, depending on the scene and the type of action being performed, we have observed a 1.5x to 2x increase in FPS just by switching to the FIFOServer.
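To make the idea of a purpose-built protocol over a pipe more concrete, here is a minimal sketch of length-prefixed framing in Python. It is not AI2-THOR's actual wire format: the header layout, the field type constants, and the helper names (write_field, read_exact, read_message, demo) are all assumptions made for illustration. It also uses an anonymous os.pipe() rather than a named pipe so that it runs in a single process.

```python
import os
import struct

# Hypothetical header: 1-byte field type followed by a 4-byte payload length.
HEADER = struct.Struct("!BI")

# Hypothetical field types, invented for this sketch.
FIELD_METADATA = 1
FIELD_RGB = 2
FIELD_END = 255  # marks the end of one message


def write_field(fd, field_type, payload=b""):
    """Write one framed field: header first, then the raw payload bytes."""
    # Payloads here are tiny, so a single os.write is enough for the sketch.
    os.write(fd, HEADER.pack(field_type, len(payload)) + payload)


def read_exact(fd, n):
    """Read exactly n bytes, looping because os.read may return fewer."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(fd):
    """Collect fields into a dict until the end-of-message marker arrives."""
    fields = {}
    while True:
        field_type, length = HEADER.unpack(read_exact(fd, HEADER.size))
        if field_type == FIELD_END:
            return fields
        fields[field_type] = read_exact(fd, length)


def demo():
    """Send one message through a pipe and read it back."""
    read_fd, write_fd = os.pipe()
    try:
        write_field(write_fd, FIELD_METADATA, b'{"lastActionSuccess": true}')
        write_field(write_fd, FIELD_RGB, bytes(12))  # stand-in for raw pixels
        write_field(write_fd, FIELD_END)
        return read_message(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Because each field carries its own length, the reader never has to scan or parse the payload to find message boundaries, and raw frame bytes can be passed through without any encoding step.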

SetObjectFilter

name
position
rotation
visible
receptacle
toggleable
isToggled
breakable
isBroken
canFillWithLiquid
isFilledWithLiquid
dirtyable
isDirty
canBeUsedUp
isUsedUp
cookable
isCooked
ObjectTemperature
canChangeTempToHot
canChangeTempToCold
sliceable
isSliced
openable
isOpen
pickupable
isPickedUp
moveable
mass
salientMaterials
receptacleObjectIds
distance
objectType
objectId
parentReceptacles
isMoving
axisAlignedBoundingBox
objectOrientedBoundingBox

The properties listed above are collected for each object in the scene, then serialized and sent over the pipe to the Python controller for AI2-THOR. (For more details on the meaning of any of these properties, please consult the documentation.)

For tasks such as PointNav or ObjectNav, you may only care about zero or one object in the scene during an episode. To better support this use case a new action was added: SetObjectFilter.

controller.step('SetObjectFilter', objectIds=['Mug|0.1|0.2|0.3|'])

This limits the metadata to only include objects that have been explicitly specified. To remove the filter, simply call:

controller.step('ResetObjectFilter')

We have observed increases of 50% in FPS when using this filter, but this will vary depending on the number of objects in the scene and the types of actions being performed.
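To illustrate why filtering helps, the following pure-Python sketch mimics the effect of the filter on a list of per-object metadata. It is only a conceptual model: the filter_objects helper and the sample objects are made up for this example, the object IDs other than the mug are invented, and JSON is used as a stand-in for the real serialization format so that the sketch needs nothing beyond the standard library.

```python
import json


def filter_objects(objects, object_ids):
    """Keep only the metadata entries whose objectId was explicitly requested."""
    wanted = set(object_ids)
    return [obj for obj in objects if obj["objectId"] in wanted]


# Made-up sample metadata using a few of the properties listed earlier.
scene_objects = [
    {"objectId": "Mug|0.1|0.2|0.3|", "objectType": "Mug",
     "visible": True, "pickupable": True, "isPickedUp": False},
    {"objectId": "Apple|1.0|0.9|0.5|", "objectType": "Apple",
     "visible": False, "pickupable": True, "isPickedUp": False},
    {"objectId": "Fridge|2.0|0.0|1.5|", "objectType": "Fridge",
     "visible": True, "pickupable": False, "isPickedUp": False},
]

filtered = filter_objects(scene_objects, ["Mug|0.1|0.2|0.3|"])

# Less metadata means fewer bytes to serialize, send, and deserialize per step.
full_size = len(json.dumps(scene_objects))
filtered_size = len(json.dumps(filtered))
```

The serialized payload shrinks roughly in proportion to the number of objects dropped, which is consistent with the gain depending on how many objects the scene contains.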

FastActionEmit

AllenAct Results


TL;DR

pip install ai2thor --upgrade

Take a look at our documentation to learn more about how you can get started using AI2-THOR!

Follow @allen_ai on Twitter and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.

AI2 Blog

AI for the Common Good.
