Driven by Data. Enabled by Infrastructure.

From car to keyboard and back again: The average daily journey of a data package.

Published in

Five Blog

4 min read · Sep 13, 2019

Developing self-driving technology and bringing it safely onto public roads is one of the most significant research and engineering challenges of our time.

At Five, we’re building self-driving software and development platforms to help autonomy programs solve the industry’s greatest challenges.

We’re a diverse team of engineers solving a range of challenges: the design, development, configuration and management of cloud services, tools, simulation environments and pipeline orchestration, as well as local environments for crucial jobs like annotation, characterization, simulation, testing and data management. Because the system we’re building is scalable and distributed, this work benefits from running on highly available, cloud-based architectures.

Enabled by infrastructure.

Our infrastructure team enables people from many disciplines to work together virtually, in real time, across multiple sites. Transferring terabytes of data between those sites quickly and efficiently, and serving meaningful car data to our data-hungry ML engineers, is a significant challenge.

To enable the best work in our teams, we have embraced the DevOps culture, removing the barriers for our engineering teams to design, develop and deploy our systems quickly and easily. We have created squads with diverse skill sets to tackle challenges across a variety of domains, as well as guilds who share expertise across our different roles.

Vehicles: a primary source of data

Each day begins with our fleet of self-driving vehicles being initialized and set up for the day with their safety drivers.

Onboard, we have high-density compute making trillions of calculations each second. We capture data from a variety of sources, including cameras, LiDAR, radar, GPS, IMU and vehicle telemetry.

An hour later, the cars return to the garage, where they reconnect to our network. Our focus then shifts to extracting, tagging, prioritizing, storing and distributing terabytes of data as quickly as possible so our cars can get back out on the road.

Enabling every team with data

Each garage is equipped with our edge compute, powered by Kubernetes, which is the start of our data’s journey to the cloud. Data is filtered, prioritized and compressed before being transferred to our Data Platform running in AWS, giving our other teams efficient access to the latest test data.
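
As a rough illustration of the filter, prioritize and compress steps described above, here is a minimal Python sketch. It is not our production code: the `Recording` type, its `priority` and `keep` fields, the example recording names and the choice of gzip are all assumptions made for the example.

```python
import gzip
from dataclasses import dataclass


@dataclass
class Recording:
    """Hypothetical record describing one chunk of logged sensor data."""
    name: str
    priority: int   # assumed convention: lower number = more urgent
    keep: bool      # assumed flag set by an upstream filter rule
    payload: bytes


def prepare_for_upload(recordings):
    """Filter, prioritize and compress recordings before transfer."""
    kept = [r for r in recordings if r.keep]            # filter
    ordered = sorted(kept, key=lambda r: r.priority)    # prioritize
    # Compress each payload; gzip stands in for whatever codec is really used.
    return [(r.name, gzip.compress(r.payload)) for r in ordered]


recordings = [
    Recording("idle_in_garage", priority=9, keep=False, payload=b"\x00" * 1000),
    Recording("routine_drive", priority=5, keep=True, payload=b"\x01" * 1000),
    Recording("safety_driver_takeover", priority=1, keep=True, payload=b"\x02" * 1000),
]

upload_queue = prepare_for_upload(recordings)
upload_order = [name for name, _ in upload_queue]
print(upload_order)
```

Ordering the queue before transfer means the most urgent recordings reach the cloud first, even if the full upload takes a while.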

Here’s a snapshot of a few use cases we underpin:

Enabling Machine Learning Models
The data moves into our Data Platform, where it’s ingested into our Domain Analysis Library. Here all team members have direct access, and ingestion kicks off parallel processing pipelines. Our annotation pipeline feeds automated machine learning models that annotate the scenes and give context to the raw sensor data. Our in-house annotators validate and enhance subsets of the data, improving our annotation models.

Once the data is annotated and in context, we train our machine learning models, many in parallel, using Horovod and Spark.
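
To picture what "parallel pipelines" means here, the following Python sketch fans one ingested data package out to several independent pipelines at once. It is only an illustration: the pipeline functions, their names and the package identifier are hypothetical, and a thread pool stands in for a real orchestration system.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical pipeline stages; each takes a package id and returns a status.
def annotate(package_id):
    return f"{package_id}: annotated"


def extract_scenarios(package_id):
    return f"{package_id}: scenarios extracted"


def compute_metrics(package_id):
    return f"{package_id}: metrics computed"


def run_pipelines(package_id, pipelines):
    """Run every pipeline on the same package concurrently and collect results."""
    with ThreadPoolExecutor(max_workers=len(pipelines)) as pool:
        futures = {name: pool.submit(fn, package_id) for name, fn in pipelines.items()}
        # result() blocks until that pipeline has finished.
        return {name: future.result() for name, future in futures.items()}


pipelines = {
    "annotation": annotate,
    "scenarios": extract_scenarios,
    "metrics": compute_metrics,
}

results = run_pipelines("package-001", pipelines)
print(results)
```

Because the pipelines do not depend on one another, none of them has to wait for the others to finish before it starts.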

Enabling safety testing & verification
Data is also made available to the safety & verification team after each testing mission.

Enabling simulation
Another use case is in simulation, which you can read more about here. Running a broad range of scenarios provides software assurance, so each package enables us to test rigorously at scale.

Curious to learn more?

The humble journey of a data package is our lifeblood. Helping gather and move it quickly and efficiently powers our entire company.

This is the work of our Infrastructure team. They power every Five team member to operate and to solve what we consider to be the most significant and hardest engineering challenge of a generation.

If you’re excited by what you’ve read, and you want to be part of a growing team of engineers, then please get in touch via our Careers page, say hello to talent@five.ai or have a look at these open roles:

Senior Infrastructure engineer, Cambridge

Senior Infrastructure engineer, London

Infrastructure engineer, Cambridge

Written by Team Five