Neuromation Story: From Synthetic Data to Knowledge Mining

Published in Neuromation · 8 min read · Jan 16, 2018

Sergey Nikolenko, Chief Research Officer at Neuromation, has told BS how artificial intelligence and neural networks help eliminate manual labor, and why it is becoming more efficient to use mining farms for data processing than for cryptocurrency mining.

My name is Sergey Nikolenko, and I am writing this as Neuromation’s Chief Research Officer. Our company is built on two main ideas, and there is an interesting story of how one followed from the other. In my opinion, this story reflects the two main problems that have to be solved in any applied machine learning project today, and it is this story that I want to tell here.

First Problem: Labeled Data

Neuromation began by working on computer vision models and algorithms based on deep neural networks (deep learning). The first big project for Neuromation was in retail: recognizing the goods on supermarket shelves. Modern object detection models are quite capable of analyzing shelf availability, finding free space on the shelves, and even tracking human interaction with the shelves. This is an important task both for supermarkets and for the suppliers themselves: big brands pay good money to ensure that their goods are present on the shelf, occupy an agreed-upon share of the shelf, and have the right side of the label facing the customers; all of these little things increase sales by double-digit percentages. Today, a huge staff of merchandisers goes from supermarket to supermarket making sure that everything on the shelves is right; of course, not all of their duties are “monkey jobs” like this, but it takes up a big part of the day for many real human beings.

Our idea for retail is to install cheap off-the-shelf cameras that capture and transmit to a server, say, one frame per minute for recognition. This is a very low frequency for an automated system, overloading neither the network nor the recognition model, but it is completely unattainable with manual checks, and it covers the practical monitoring needs of retail. Moreover, an automated monitoring system will save a lot of effort and do away with meaningless manual labor, a worthwhile goal in itself.

A specialist in artificial intelligence, especially in modern deep neural networks for computer vision, might think that this problem is basically solved already. Indeed, modern deep neural networks trained on large sets of labeled data can do object detection, and in this case the objects are relatively simple: cans, bottles, packages with bright labels. Of course, there are plenty of technical issues (for example, it is not easy to cope with hundreds of products in one photo; such models are usually trained to detect fairly large objects, only a few per image), but with a sufficiently large labeled dataset, i.e., photos with every item on the shelf annotated, we could successfully overcome them.

But where would such a labeled dataset come from? Imagine that you have a million photos of supermarket shelves (where to get them is, by the way, also a hard question), and you need to manually draw a bounding rectangle around every item in each of the million photos. It looks like a completely hopeless task. So far, manual labeling of large image sets has usually been done through crowdsourcing services such as Amazon Mechanical Turk. Manual work on such services is inexpensive, but it still does not scale well. We calculated that to label a dataset sufficient for recognizing all 170,000 items in the Russian retail inventory (a million photos, by the way, would not be enough for this), we would need years of labor and tens of millions of dollars.
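We will not reproduce the exact calculation here, but a back-of-the-envelope version with purely illustrative numbers (dataset size, price per box, and labeling speed are all assumptions, not the figures behind our estimate) already shows the order of magnitude:

```python
# Back-of-the-envelope estimate of manual labeling effort (illustrative assumptions only).
photos = 5_000_000        # shelf photos needed to cover ~170,000 SKUs in enough contexts
boxes_per_photo = 100     # densely packed shelves: ~100 bounding boxes per photo
cost_per_box = 0.03       # assumed crowdsourcing price per bounding box, USD
seconds_per_box = 10      # assumed time to draw and verify one box

total_boxes = photos * boxes_per_photo
total_cost = total_boxes * cost_per_box                           # ~$15 million
person_years = total_boxes * seconds_per_box / (8 * 3600 * 250)   # 8-hour days, 250 days/year

print(f"{total_cost:,.0f} USD and {person_years:,.0f} person-years of annotation")
```

Even spread across hundreds of annotators, this is years of work, and the dollar figure lands squarely in the tens of millions.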

Thus, we faced the first major challenge, the main “bottleneck” of modern artificial intelligence: where do you get labeled data?

Synthetic Data and the Second Challenge: Computing Power

This problem led to Neuromation’s first major idea: we decided to try to train deep neural networks for computer vision on synthetic data. In the retail project, this means that we create 3D models of goods and virtually “place them on the shelves”, getting perfectly labeled data for recognition.

Synthetic data has two main benefits:

  • first, it requires far less manual work; yes, you need to design a 3D model, but this is a one-time investment that then converts into an unlimited number of labeled “photos”; in the case of retail the situation is even better, since there are not that many different form factors of packaging, and you can reuse some 3D models by simply “attaching” different labels (textures) to them;
  • second, the resulting data is perfectly labeled, since we are in full control of the 3D scene; moreover, we can produce labeling that we would not be able to produce by hand: we know the exact distance from the camera to every object, the angle by which each bottle and each carton of juice is turned, and so on (a short sketch of what this looks like follows below).
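
To make this concrete, here is a minimal sketch of what generating one perfectly labeled synthetic sample might look like. The module synthetic_shelf and its helpers (load_shelf_scene, load_product_model, render) are hypothetical stand-ins for a real 3D engine such as Blender scripted through Python; the point is only that every bounding box, distance, and rotation angle falls out of the scene we constructed ourselves.

```python
import json
import random

# Hypothetical scene-construction helpers standing in for a real renderer
# (e.g. Blender scripted via Python); only the overall idea matters here.
from synthetic_shelf import load_shelf_scene, load_product_model, render

def generate_sample(product_ids, out_prefix):
    scene = load_shelf_scene()                   # empty shelf with lighting and a camera
    annotations = []
    for pid in product_ids:
        model = load_product_model(pid)          # 3D model; form factors can be reused with new textures
        # Random but fully known placement: shelf slot and rotation around the vertical axis.
        pose = scene.place(model,
                           slot=random.randrange(scene.num_slots),
                           yaw_degrees=random.uniform(-30.0, 30.0))
        annotations.append({
            "sku": pid,
            "bbox": scene.project_bbox(pose),    # exact 2D box in the rendered image, for free
            "distance_m": scene.camera_distance(pose),
            "yaw_degrees": pose.yaw_degrees,
        })
    render(scene, out_prefix + ".png")           # photorealistic render of the composed shelf
    with open(out_prefix + ".json", "w") as f:
        json.dump(annotations, f)                # labels we never had to draw by hand
```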

Of course, this approach is not perfect either: now you have to train networks on one kind of data (renderings) and then apply them to a different one (real photos). In machine learning this is known as transfer learning (in this synthetic-to-real setting, often called domain adaptation); it is a hard problem in general, but in this case we have been able to solve it successfully. Moreover, we have learned to produce very good photorealistic renderings; our retail partners even intend to use them in media materials and catalogs.
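Here is a minimal sketch of such a two-stage setup, assuming torchvision’s off-the-shelf Faster R-CNN detector and hypothetical dataset classes SyntheticShelves and RealShelves (an illustration, not our production pipeline): pretrain on abundant synthetic renders, then fine-tune on a much smaller set of real photos.

```python
import torch
from torch.utils.data import DataLoader
from torchvision.models.detection import fasterrcnn_resnet50_fpn

from our_datasets import SyntheticShelves, RealShelves   # hypothetical dataset classes

def train(model, dataset, epochs, lr, device="cuda"):
    loader = DataLoader(dataset, batch_size=4, shuffle=True,
                        collate_fn=lambda batch: tuple(zip(*batch)))
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for images, targets in loader:           # targets: bounding boxes + SKU labels per image
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            losses = model(images, targets)      # torchvision detectors return a dict of losses
            total = sum(losses.values())
            optimizer.zero_grad()
            total.backward()
            optimizer.step()

model = fasterrcnn_resnet50_fpn(num_classes=1_000 + 1)   # SKUs in this illustration, plus background

# Stage 1: pretrain on a large stream of perfectly labeled synthetic renders.
train(model, SyntheticShelves(), epochs=10, lr=0.005)
# Stage 2: fine-tune on a much smaller set of real photos at a lower learning rate,
# so the detector adapts from renderings to real camera images.
train(model, RealShelves(), epochs=3, lr=0.0005)
```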

The synthetic data approach has proved very successful, and the models trained by Neuromation are already being deployed in retail. However, it has created the need to process huge datasets of synthetic images. First they have to be generated (i.e., a 3D scene has to be rendered), and then they are used to train deep neural networks. Generating one photorealistic synthetic image usually takes a minute or two on a modern GPU, depending on the number of objects and the GPU model. And you need a lot of these images: millions, if not tens of millions.
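A quick estimate (with illustrative numbers taken from the paragraph above) shows the scale of the rendering workload alone:

```python
images = 10_000_000          # synthetic renders needed: "millions if not tens of millions"
minutes_per_image = 1.5      # "a minute or two" on one modern GPU

gpu_years = images * minutes_per_image / (60 * 24 * 365)
print(f"{gpu_years:.0f} GPU-years of rendering")   # ~29 GPU-years: about a year of work for a 30-GPU farm
```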

And this is only the first step: then we have to train modern deep neural networks on these images. In AI research, it is not enough to train a model once: you have to try many different architectures, train dozens of different models, and run hundreds of experiments. This, again, requires cutting-edge GPUs, and training deep networks takes even more computational time than data generation.

Thus, we at Neuromation have faced the second major challenge of modern artificial intelligence: where do we get computing power?

Neuromation Knowledge Mining Platform

Our first idea was, of course, simply to purchase a sufficient number of GPUs. However, it was the summer of 2017, the height of the cryptocurrency mining boom. It turned out that graphics cards with the latest NVIDIA chips were not just expensive but virtually impossible to get at all. After we tried to “mine” for some GPUs through our contacts in the US and realized that this way they would arrive only in a month or more, we switched to plan B.

Plan B involved using cloud services that rent out fully configured (often virtual) machines. The cloud especially popular with AI practitioners is Amazon Web Services: AWS has become a de facto industry standard, and many new AI startups rent computing power there for their development tasks. However, cloud services do not come cheap: renting a machine with several GPUs for training neural networks costs a few dollars per hour, and you need a lot of these hours.
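For a feel of how quickly this adds up, here is a rough calculation with assumed numbers (the per-hour price is illustrative, within the “few dollars per hour” range mentioned above):

```python
price_per_gpu_hour = 3.0     # assumed cloud price for a GPU instance, USD
hours_per_run = 24           # one training run of a detection model (assumption)
training_runs = 300          # many architectures, hyperparameters, reruns (assumption)

total = price_per_gpu_hour * hours_per_run * training_runs
print(f"${total:,.0f} for training runs alone")   # $21,600, before any rendering costs
```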

We at Neuromation have spent thousands of dollars renting computational power on Amazon, only to realize that we did not have to. The prices of cloud services are acceptable to buyers only in the absence of alternatives.

And when we started thinking about potential alternatives, we recalled the very reason we could not buy enough high-end GPUs. This led to the second main idea of Neuromation: repurposing GPU-based mining rigs for useful computing. ASIC chips designed specifically for Bitcoin mining are not suitable for other computing tasks, but the GPUs used to mine Ethereum (ETH) and other altcoins are exactly the GPUs we need to train neural networks. Moreover, cryptocurrency mining brings in an order of magnitude less income than the clouds charge for renting an equivalent GPU farm for the same period.

We realized that there is a very powerful business opportunity here, a huge gap between prices, and also simply an opportunity to make the world a better place by redirecting the vast computing resources currently spent on brute-force hash puzzles toward more useful calculations.

This is how the idea of the Neuromation platform was born. It is a universal marketplace for knowledge mining that would connect miners who want to earn more on their equipment with the customers who need computing power: AI startups, researchers, and basically any company that needs to process large datasets or train modern machine learning models.

We are already working with several mining farms, using their GPUs for useful computing. This is 5 to 10 times cheaper than renting server capacity from cloud services, and even at that price it is still much more attractive for the miners: with their GPU-based rigs, miners can earn 3 to 5 times more by “knowledge mining” than they would get from the same setup by mining cryptocurrency. And since the difficulty of cryptocurrency mining keeps growing over time, the benefits of “knowledge mining” will only increase.
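To see how both sides of this marketplace can win at the same time, here is the arithmetic with illustrative per-GPU-hour numbers chosen to match the ratios above: cloud rental charges roughly an order of magnitude more than mining earns, which leaves plenty of room to price knowledge mining in between.

```python
cloud_price = 3.00        # assumed cloud rental price per GPU-hour, USD
mining_income = 0.20      # assumed net mining income per GPU-hour (an order of magnitude lower)
knowledge_mining = 0.60   # assumed marketplace price per GPU-hour, set in between

print(f"customer pays {cloud_price / knowledge_mining:.0f}x less than cloud rental")   # 5x
print(f"miner earns {knowledge_mining / mining_income:.0f}x more than by mining")      # 3x
```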

Conclusion

Right now we are presenting the idea of this universal platform for useful computing to the global market. Using mining rigs for useful computing benefits both parties: miners will earn more, and numerous artificial intelligence researchers and entrepreneurs will get a considerably cheaper (several times over) and more convenient way to implement their ideas. We believe that this democratization of AI will lead to new breakthroughs and, ultimately, fuel the current revolution in artificial intelligence. Join us, and welcome to the revolution!
