How to Train Your AI

Published in Revain · 5 min read · Mar 28, 2019

By Sherise Tan

When most people think about artificial intelligence (AI), ideas of self-driving cars and robots often come to mind. However, many do not stop to think about the process of how AI works to make these conveniences come to life. By being fed large amounts of data, AI is trained through machine learning (ML) and deep learning to gather insights from data and automate tasks at scale. The machines learn how to analyse and make predictions — to “think” as much like humans as possible.

So how long does it take for AI to be trained? It could take hours, weeks, or even longer. The answer comes down to factors such as hardware, optimisation, the number of layers in the neural network, the size of your dataset, and more.

To get a better idea of this, let’s examine how AI training works and what it requires.

Machine Learning vs Deep Learning

The whole process of training AI is a highly complex and fascinating one. Within the field of AI research, machine learning is garnering much attention and accolades.

Machine learning is a subset of AI that allows computer systems to learn and improve automatically, without being explicitly programmed by a human. Machine learning uses algorithms that discover patterns in data and then adjust themselves as they are exposed to more of it, much like a child learning from experience.

Deep learning, in turn, is a more specialised approach to machine learning that uses artificial neural networks to mimic the way the human brain processes data. These systems learn through positive and negative reinforcement, relying on continual processing and feedback. Examples of deep learning applications include image recognition and the speech recognition behind voice assistants such as Siri or Cortana.

Deep learning takes its name from the many "deep" layers in its neural networks. Each neuron in the network is a mathematical function that transforms the data it receives and passes the result onward as output. The computer learns how to weight each connection between neurons in order to make successful predictions. Deep learning is especially helpful for solving complex problems with many variables.
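To make this concrete, here is a minimal sketch of a small multi-layer network using TensorFlow/Keras; the layer sizes, activations, and ten-class output are placeholder assumptions chosen purely for illustration.

```python
# Minimal sketch of a multi-layer ("deep") network in TensorFlow/Keras.
# Layer sizes, activations and the 10-class output are illustrative only.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),  # hidden layer 1
    tf.keras.layers.Dense(64, activation="relu"),                        # hidden layer 2
    tf.keras.layers.Dense(10, activation="softmax"),                     # output layer
])

# Training adjusts the weights on each connection between neurons so that
# the network's predictions gradually improve.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```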

In traditional machine learning, relevant features must be extracted from the images by hand, whereas deep learning is an end-to-end process in which features are extracted automatically. Deep learning also scales with data: the networks keep improving as the amount of data grows. However, with that increased input of data comes a greater need for computing power and longer training times.

What is AI Training Like?

The process of AI training involves three steps: training, validating, and testing. Data is fed into the computer system, which is trained to produce a particular prediction on each cycle. The parameters can be adjusted each time so that the predictions become more accurate with every training step.

The algorithm is then verified by running validation data against the trained model. New variables may need to be adjusted to improve the algorithm at this stage. Once it has passed the validation stage, the system can be tested with real-world data that have no tags or labels. This is the time to see if the algorithm is ready to be used for its intended purpose.
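As a minimal sketch of that train/validate/test workflow, the example below uses scikit-learn with randomly generated placeholder data; the dataset and the choice of model are illustrative assumptions, not a prescription.

```python
# Sketch of the train / validate / test workflow.
# The dataset (X, y) and the model are placeholders for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X = np.random.rand(1000, 20)            # dummy feature data
y = np.random.randint(0, 2, 1000)       # dummy labels

# Hold some data back for validation and for the final test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.3, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)             # training
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))  # validating
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))      # testing
```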

Of course, there are ways to shorten the timeframe of AI training. Creating a deep learning model from scratch can take days or weeks to train because of the sheer volume of data involved and the slow rate at which the model learns.

Instead, most deep learning applications use a process called transfer learning, in which a pre-trained model is adjusted for a new task. By tweaking an existing network, fresh data can be added and new outcomes or tasks can be trained. Not only does this require less data (thousands rather than millions of images), training times can also drop to minutes or hours.
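A rough sketch of transfer learning is to freeze a pre-trained image network and retrain only a small new output layer. The base model (MobileNetV2), the image size, and the five-class output below are assumptions made for illustration.

```python
# Sketch of transfer learning: reuse a pre-trained image network and retrain
# only a small new "head" for the new task. Base model, image size and the
# number of classes are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                          include_top=False,
                                          weights="imagenet")
base.trainable = False  # freeze the pre-trained weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # e.g. 5 new classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(new_images, new_labels, epochs=5)  # thousands of images, minutes or hours
```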

Feature extraction is another shortcut. It involves taking the output of a network layer that has learned to recognise certain features in the images and feeding those features into a separate machine learning model.
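As a sketch of that idea, one might run images through a pre-trained network, keep the pooled features it produces, and train a simple classical classifier on top. All data and model choices below are illustrative placeholders, and real images would normally be preprocessed first.

```python
# Sketch of feature extraction: use a pre-trained network as a fixed feature
# extractor and train a separate, simpler model on its outputs.
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

images = np.random.rand(8, 224, 224, 3).astype("float32")  # placeholder images
labels = np.random.randint(0, 2, 8)                          # placeholder labels

base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                          include_top=False,
                                          pooling="avg",
                                          weights="imagenet")

features = base.predict(images)            # feature vectors from the network
classifier = SVC().fit(features, labels)   # classical ML model trained on top
```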

What Needs to be in Place?

While the process of AI training can be time-consuming, several other requirements also affect how it goes:

Data

As data is an integral piece of the algorithm puzzle, having a clean and accurately labelled dataset is key. Feeding accurate data into the algorithm yields accurate outputs, which makes the training process more efficient and timely. For example, a dataset for driverless-car development can include millions of images and thousands of hours of video.

Hardware

Deep learning requires vast amounts of computing power. High-performance Graphics Processing Units (GPUs), combined with clusters or cloud computing, can reduce deep learning training time from weeks to hours. Because much of the work can be done in parallel, setting up a system that trains across multiple GPUs, or across a cluster, helps accelerate training.
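As one illustration, TensorFlow's MirroredStrategy is a common way to spread training across the GPUs in a single machine; the model, data, and batch size in this sketch are placeholders, and it simply falls back to one device if only one is available.

```python
# Sketch of data-parallel training across several GPUs with TensorFlow.
# Model, data and batch size are placeholders for illustration.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()   # replicates the model on each GPU
print("devices in use:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Each batch is split across the available GPUs, so wall-clock training time
# drops roughly in proportion to the number of devices.
# model.fit(X_train, y_train, batch_size=256, epochs=10)
```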

Specialised equipment such as the Nvidia Tesla V100 GPU and the DGX-1 server is essential for heavy-duty training needs. These can cost a pretty penny: roughly $10,000 USD for the GPU and $149,000 USD for the server. Alternatively, you can rent cloud-based hardware from providers such as Amazon Web Services, Google Cloud, and Microsoft Azure.

Software

When renting computing infrastructure, each cloud provider offers its own automated machine-learning software, such as Microsoft's Machine Learning Studio, Google's Cloud AutoML, and Amazon's SageMaker. Others may also use deep learning frameworks such as Google's TensorFlow and Facebook's PyTorch to design training models.
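For a sense of what designing a training model looks like in such a framework, here is a minimal PyTorch training loop with a toy network and random placeholder data; every name and hyper-parameter is an assumption made for illustration.

```python
# Minimal sketch of a training loop written directly in PyTorch.
# The network, data and hyper-parameters are illustrative placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.randn(128, 20)   # dummy inputs
y = torch.randn(128, 1)    # dummy targets

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)   # forward pass
    loss.backward()               # compute gradients
    optimizer.step()              # update the weights
    print(f"epoch {epoch}: loss = {loss.item():.4f}")
```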

Developers

There is a shortage of experienced AI developers worldwide, with an estimated fewer than 10,000 people holding the relevant skills for AI research. AI specialists are so highly sought after that they are paid anywhere from $300,000 to $500,000 per year in salary and company stock, and academics from top universities are being lured away to work for tech giants. Developers are expected not only to hold a Ph.D. in computer science, but also to have expertise in areas such as C++ programming and the STL, and often a background in physics or the life sciences.

How Blockchain Can Help with AI training

With all that said, blockchain is a newer technology that can expedite AI training. Since obtaining large datasets for AI training can be difficult, emerging blockchain startups are finding ways for blockchain and AI to work together to decentralise the ownership of data, making it available to the masses for collaboration and expansion.

For example, some startups like Datum, Synapse, and Computable are building data marketplaces, where participants receive tokens in exchange for sharing their data with businesses. Other startups like Numerai allow data scientists to propose models to solve machine learning problems in exchange for compensation with NMR tokens.

There are many benefits when blockchain and AI converge: blockchain can help decentralise AI, keeping it independent of any single corporation, while keeping the process protected with encrypted data.

Thus, while AI training requires technical expertise and often a huge investment, new innovations such as blockchain-based data marketplaces are emerging to supplement this area. The convergence of blockchain and AI seems the natural next step, and it could set a new pace of innovation in AI training.

Join us!

|Telegram | Bitcointalk | Facebook | Twitter | Reddit | Website | Slack|
