This is the first in a series of blogs we will be writing about some of the problems we are trying to solve at Playment and how we’re doing our bit to advance the AI age. You can read the rest of them here: https://medium.com/playment
Playment: Advancing the AI age
A lot of companies are trying to automate tasks that are currently done by humans but consume a lot of their time, energy, or money.
These tasks span a wide array of domains like Autonomous Vehicles, Delivery Drones, VR / AR applications, Robotics, Manufacturing, Medical Diagnosis, and Surveillance.
Applications of AI will revolutionise and advance numerous fields and industries, including finance, healthcare, education, transportation, and more.
Playment is building the core infrastructure that goes into AI applications. We provide high quality training data to companies and startups in the AI space so that they can train their models.
In the entire AI industry, nowhere is the need for good quality training data more evident than in the autonomous vehicle sector. The margin of error for an autonomous vehicle is very low: a single misprediction can put human life in grave danger.
At Playment, we aim to build out the training data infrastructure to accelerate the AI development process for autonomous vehicles.
Autonomous Vehicles
An autonomous vehicle is a vehicle that is capable of sensing its environment and moving with little or no human input.
Autonomous vehicles stand to revolutionise the transportation and logistics industry in innumerable ways. Benefits of autonomous vehicles include reduced costs, increased safety, increased mobility, and reduced crime. Automated cars are predicted to improve traffic flow and provide enhanced mobility for children, the elderly, and people with disabilities. They are also expected to increase the fuel efficiency of vehicles and facilitate business models for transportation as a service, especially via the sharing economy.
The disruption from autonomous vehicles has a very large footprint. It affects not just transportation but also the car manufacturing, insurance, travel, retail, and logistics industries. This is one of the reasons why investors and analysts have been so bullish about it.
This is a brief primer on how autonomous vehicles work, and on how training data is an integral part of developing this technology.
Broadly put, self-driving vehicles have four key functions:
Mapping and Localization
Locating the vehicle’s exact coordinates and position on the road. This gives the vehicle context about where it currently is as well as where it is driving towards.
Object Detection (Perception)
Advanced signal processing and ML algorithms to detect and classify objects as well as identify drivable space, lane boundaries, etc.
Trajectory Planning
A short- and long-term route planning system that uses data from maps, the driver, and perception models to generate trajectories and choose the optimal path dynamically.
Motion Control
Producing a “human-like” driving experience by actively managing the actuators — thereby controlling the steering wheel, brakes and accelerator.
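To make the hand-off between these four functions concrete, here is a minimal structural sketch in Python. The class and function names, and the placeholder bodies, are illustrative assumptions rather than a description of any particular vendor's stack.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pose:
    x: float          # metres, in the map frame
    y: float
    heading: float    # radians


@dataclass
class DetectedObject:
    label: str        # e.g. "car", "pedestrian"
    x: float          # position relative to the ego vehicle, metres
    y: float
    velocity: float   # metres per second


def localize(sensor_frame, hd_map) -> Pose:
    """Mapping & localization: estimate where the vehicle is on the map."""
    ...


def perceive(sensor_frame) -> List[DetectedObject]:
    """Perception: detect and classify objects around the vehicle."""
    ...


def plan_trajectory(pose: Pose, objects: List[DetectedObject], hd_map):
    """Trajectory planning: choose a safe, optimal path given the scene."""
    ...


def control(trajectory) -> dict:
    """Motion control: turn the planned path into steering, brake and throttle commands."""
    ...


def drive_step(sensor_frame, hd_map):
    # One tick of the loop: localize, perceive, plan, then actuate.
    pose = localize(sensor_frame, hd_map)
    objects = perceive(sensor_frame)
    trajectory = plan_trajectory(pose, objects, hd_map)
    return control(trajectory)
```

The rest of this post focuses on the second step, perception, which is where training data matters most.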
The Perception Problem
One of the key capabilities an autonomous vehicle needs is the ability to perceive the environment around itself, so that it can make informed decisions and plan its path accordingly.
That is the perception problem.
If you think about it, the self-driving car needs to accurately detect and track the objects around itself in all scenarios. It has to work perfectly in the glaring afternoon sun, as well as in pitch darkness. Even when it is snowing, or raining, or foggy — the perception systems need to work impeccably.
Autonomous vehicles usually have a slew of sensors attached to them. The most common ones are:
LiDAR Sensors
The LiDAR sensor fires rapid pulses of laser light, sometimes at up to 150,000 pulses per second. A sensor on the instrument measures the amount of time it takes for each pulse to bounce back, which is used to create 3D models and maps of objects and environments.
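The distance measurement behind each pulse is simple time-of-flight geometry: the round-trip time of the pulse, multiplied by the speed of light and halved, gives the range, and the beam's known firing angles turn that range into a 3D point. A minimal sketch (the angles and return time below are made-up example values):

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def lidar_point(return_time_s: float, azimuth_rad: float, elevation_rad: float):
    """Convert one LiDAR pulse return into a 3D point relative to the sensor."""
    # The pulse travels to the object and back, so halve the round-trip distance.
    distance = SPEED_OF_LIGHT * return_time_s / 2.0
    # Spherical to Cartesian coordinates using the beam's firing angles.
    x = distance * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = distance * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = distance * math.sin(elevation_rad)
    return x, y, z


# Example: a return after ~200 nanoseconds corresponds to an object ~30 m away.
print(lidar_point(200e-9, azimuth_rad=0.1, elevation_rad=0.0))
```

Repeating this for every pulse, many thousands of times per second, is what produces the dense 3D point clouds LiDAR is known for.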
The fast-spinning bob you see on top of most self-driving cars is the LiDAR device. It is by far the most expensive sensor mounted on self-driving cars.
A great video on how LiDAR works: https://www.youtube.com/watch?v=EYbhNSUnIdU
For a more in depth reading, you can visit: https://arstechnica.com/cars/2019/02/the-ars-technica-guide-to-the-lidar-industry/
Cameras
Autonomous vehicles usually have a suite of cameras to provide visibility of the environment around them. Wide-angle cameras provide broad visibility around the car, whereas narrow-field cameras provide a focused, long-range view of distant objects.
Radar Sensors
The RADAR system works in much the same way as LiDAR, with the only difference being that it uses radio waves instead of laser light. Since radio waves are absorbed less (due to their larger wavelength), radar can detect objects through fog, dust, rain, and snow. However, because of that large wavelength, the precision of objects detected by radar is quite low. Still, radars play an important role in detecting and responding to the motion of objects ahead.
Ultrasonic Sensors
Some cars also use ultrasonic sensors for close-range work. They help detect nearby cars in dense places, and also provide guidance when parking.
Each of these sensors produces a different type of data.
An average self-driving car with all the sensors captures data at the rate of 1 GB per second. All this data is fed into the perception models which then detect and track objects around the car.
This is where training data comes into play. The larger and more diverse the dataset the perception models are trained on, the better the vehicle is able to detect and track objects, and consequently, the better its performance on the road.
Annotation is a critical part of the self-driving cycle. For every one hour driven, it takes approximately 800 human hours to label it.
– Carol Reiley (Cofounder, Drive.ai)
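A quick back-of-the-envelope calculation shows how fast those two figures (1 GB of sensor data per second, and roughly 800 labelling hours per driven hour) add up. The six driving hours per day assumed below are purely illustrative:

```python
GB_PER_SECOND = 1                   # sensor data rate quoted above
LABEL_HOURS_PER_DRIVEN_HOUR = 800   # Drive.ai estimate quoted above

drive_hours_per_day = 6             # assumption for illustration only

data_per_day_tb = GB_PER_SECOND * 3600 * drive_hours_per_day / 1000
label_hours_per_day = LABEL_HOURS_PER_DRIVEN_HOUR * drive_hours_per_day

print(f"~{data_per_day_tb:.1f} TB of raw sensor data per vehicle per day")
print(f"~{label_hours_per_day:,} human labelling hours per vehicle per day")
```

Even a single test vehicle, under these assumptions, generates tens of terabytes of data and thousands of labelling hours every day, which is why annotation becomes a bottleneck so quickly.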
Training Data
Machine learning systems today rely heavily on deep learning algorithms. The catch here is that these algorithms have to be taught. They must digest vast amounts of labelled data (tagged and annotated photos, videos, etc.) before they are useful.
Using training data, algorithms can be developed to find relationships, detect patterns, understand complex problems and make decisions.
The better the training data, the better the algorithm or classifier performs.
Quality, variety, and quantity of training data determine the success of machine learning models.
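As a concrete, if toy, illustration of “teaching” an algorithm with labelled examples, here is a minimal supervised-learning sketch using scikit-learn. Real perception models are deep neural networks trained on annotated images rather than random forests on synthetic features, but the principle of learning from labelled data is the same:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in for a labelled dataset: each row is a feature vector (the "photo"),
# each label is the class a human annotator assigned (the "tag").
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)  # the model "digests" the labelled examples

print("held-out accuracy:", model.score(X_test, y_test))
```

With more (and more varied) labelled rows, the held-out accuracy generally improves, which is the whole argument for investing in training data.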
However, high quality training data is very scarce and is very difficult & time-consuming to generate.
How Playment helps
This is where Playment comes into the picture. Playment is a fully managed data labelling platform which provides high quality training data for computer vision models at scale.
Our community of 250,000+ skilled workers, who work through their mobile or laptop, our state-of-the-art tools, and our stringent quality assurance workflows enable us to create high quality, high volume training data sets.
We set up a training data pipeline which AI companies can then integrate with their development workflow.
On a broad level, the way the entire system functions is:
- The AI company shares the raw dataset & labelling guidelines.
- Playment sets up the project and distributes the work to thousands of annotators.
- The annotators complete the work, and earn money in return.
- The data is collated, checked for quality, and finally sent back to the AI company (a sketch of what the returned annotations might look like follows this list).
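The exact deliverable varies by project, but bounding-box annotations are typically returned in a structured format along these lines. This is a hypothetical example schema, not Playment's actual output format:

```python
# Hypothetical annotation payload for a single camera frame.
annotations = {
    "image": "frame_000123.jpg",
    "objects": [
        {"label": "car",        "bbox": [412, 230, 118, 64]},  # x, y, width, height in pixels
        {"label": "pedestrian", "bbox": [655, 241, 32, 88]},
    ],
}

# The AI company can load records like this straight into its training pipeline.
for obj in annotations["objects"]:
    x, y, w, h = obj["bbox"]
    print(f'{obj["label"]}: box at ({x}, {y}), size {w}x{h}')
```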
This is a win-win situation for everybody.
AI companies usually do not want the overhead of maintaining large operational teams for data labelling requirements (which often fluctuate).
Playment provides them with full tech support, annotation process consulting, and dedicated project management — for any of their training data needs.
On the other hand, our users in India mostly come from tier 2 and tier 3 cities. For them, an additional source of income is life-changing in most cases.
If you find these problems interesting and are looking for new challenges, we’d love to have you here at Playment. You can learn more about the open positions at Playment at https://playment.io/jobs/
Further reading & references
https://www.thinkful.com/blog/what-is-data-science/
To understand why machine learning requires so much data.
https://www.ben-evans.com/benedictevans/2017/3/20/cars-and-second-order-consequences
http://on-demand.gputechconf.com/gtc-il/2017/presentation/sil7142-efrat-rosenman-leveraging-ai-for-self-driving-cars-at-gm.pdf
https://medium.com/swlh/everything-about-self-driving-cars-explained-for-non-engineers-f73997dcb60c
https://www.sensorsmag.com/components/lidar-vs-radar
https://www.wired.com/story/guide-self-driving-cars/
Playment Blogs:
https://blog.playment.io/what-is-training-data/
https://blog.playment.io/autonomous-driving-levels-0-5-explained/
https://blog.playment.io/training-data-machine-learning/