The Biggest Challenge in Machine Learning is Other People

Brian Ngo
Acerta

--

Written by Mahmoud Salem

The trend toward digital transformation has brought the manufacturing sector to the cutting edge of innovation: additive manufacturing, augmented reality, the Industrial Internet of Things (IIoT), and, of course, artificial intelligence (AI).

Any one of these technologies has the potential to revolutionize industrial production, but combining them all together is what’s truly game-changing; that’s why you often hear the digital transformation described as a fourth industrial revolution or Industry 4.0.

Ironically, as Alasdair Gilchrist points out in his recent book Digital Success: A Holistic Approach to Digital Transformation for Enterprises and Manufacturing, “One of the biggest issues with digitalization is the focus on technology and trying to be modernized.” Sounds crazy, right?

How can you effect a digital transformation if you don’t focus on the technology?

I think what Gilchrist is saying is that the fourth industrial revolution isn’t just about technology: you can have the best tech on the planet, but it won’t do you any good unless your people understand how to work with it and — more importantly — how to work with it together.

We’ve seen this at Acerta first-hand.

Data Analytics is All About Communication — What!?

Between our Data Science, Infrastructure, Implementation and UI teams, Acerta combines a diverse range of backgrounds encompassing data engineering, software and data science expertise. One might think that the hardest part of applying machine learning to data is implementing the right models: for this particular use case, do you need an autoencoder, a convolutional neural network, an LSTM or all three?

Believe it or not, that’s the easy part.

It may come as a surprise that, in my opinion, a much bigger challenge comes from communicating and interpreting the machine learning results. Indeed, the diversity of our team’s knowledge base means that communication has always been a necessary step on the path to success when we’re delivering the results of our machine learning models to meet our customers’ needs.

I have a computer engineering background, so the first few years of my studies were very general: mechanical, civil, etc. It wasn’t until my PhD that I started doing anomaly detection and focusing specifically on machine learning. That experience lets me work as a bridge between our different groups.

Machine learning is very much a team sport, and in most cases there are four teams corresponding to four core roles. There are the Data Scientists, who develop the machine learning techniques and write the code to implement them. Then there’s the Implementation team, which focuses on delivering meaningful results to clients who typically couldn’t care less about data science. Neither of these teams can do its job without someone handling the Infrastructure, tending the databases and backend. Finally, there’s UI, which has to make the other teams’ work as intuitive and understandable as possible.

A Metaphor for Machine Learning

It’s not easy to get all these teams working together, but they must for the company to succeed. If any of the above elements were missing, our customers would end up with results that would be basically useless to them.

Think of machine learning models as dogs in a dog show (seriously):

  • The Customers are the judges.
  • The Data Science Team is the breeder.
  • The Implementation Team is the trainer.
  • The Infrastructure Team is the caretaker.
  • The UI Team is the handler.

Now, imagine trying to run the Westminster Kennel Club Dog Show with any one of these elements missing. Think of a handler trying to present their dog to the judges without knowing what commands the trainer taught it, or a breeder trying to raise a new litter of puppies without food, water or shelter. Collaboration between these groups is the key to victory, which is why communication is so important.

The trouble is, teams that come from diverse backgrounds and have different objectives are invariably going to be at odds, at least occasionally. Your UI team wants things to look polished and flashy, but the Data Scientists aren’t concerned with that, so you need to somehow tweak the outputs from the latter to be useful to the former. Or you have the Infrastructure team saying, “The data needs to be uploaded in this particular way with this particular format,” while the Implementation team is saying, “No, the customer needs the results to be formulated in this other way.”

This is the true challenge for companies working with machine learning: negotiating between the goals and requirements of diverse groups of experts to deliver what the customer needs.

Soft Skills & Hard AI

Acerta faced these communication challenges head-on in one of our first deployments involving all four teams. Before that, we would get the data, play with it, write a report and ship it to the customer; UI was just a demo thing. Now, we have UI taking actual data from the backend, which has a deployed model built by the data side. We knew from the beginning that we needed this collaboration to work, but it wasn’t easy.

One thing we learned was the benefit of having checkpoints in our documentation. The idea is that if you meet all the checkpoints, then you know your work is deliverable to the other teams. Before that, we kept running into the problem of people saying, “Hey, this is working on my machine, so you can take it,” but then they’d find out that it’s not what the other team needs. These checkpoints and specs have made a big difference in improving interoffice communication.
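To make the checkpoint idea concrete, here is a minimal sketch of how a handoff checklist could be written down as code, so that "it works on my machine" is replaced by "it passes the checks we agreed on." This is only an illustration: the names (Checkpoint, check_handoff), the sample fields (part_id, anomaly_score) and the specific checks are all hypothetical and are not a description of Acerta's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical example: every name and field here is illustrative only.


@dataclass
class Checkpoint:
    name: str                      # short label, as it might appear in the docs
    check: Callable[[Dict], bool]  # returns True when the deliverable passes


def check_handoff(deliverable: Dict, checkpoints: List[Checkpoint]) -> List[str]:
    """Return the names of the checkpoints that the deliverable fails."""
    return [cp.name for cp in checkpoints if not cp.check(deliverable)]


# Checks a receiving team might agree on with the team handing work over.
CHECKPOINTS = [
    Checkpoint("has anomaly score", lambda d: "anomaly_score" in d),
    Checkpoint(
        "score is between 0 and 1",
        lambda d: isinstance(d.get("anomaly_score"), (int, float))
        and 0.0 <= d["anomaly_score"] <= 1.0,
    ),
    Checkpoint("has part id", lambda d: bool(d.get("part_id"))),
]

good = {"part_id": "A-17", "anomaly_score": 0.82}
bad = {"anomaly_score": 1.7}  # score out of range, part id missing

print(check_handoff(good, CHECKPOINTS))  # []
print(check_handoff(bad, CHECKPOINTS))   # ['score is between 0 and 1', 'has part id']
```

An empty list means the work is ready to hand over; anything else is a list of exactly which agreed expectations were missed, which is a much easier conversation to have between teams.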

The other lesson we learned was the importance of having the team leads come together to discuss what they’re working on and what they’re expecting from the other teams. Everyone has their own perspective but, at the end of the day, we must find a middle ground where we all meet. There’s a tendency for coders to think along the lines of, “I’m working on my thing. I don’t care what happens afterward,” but the teams in a machine learning company aren’t isolated like that.

At Acerta, the Data Science, Implementation, Infrastructure and UI teams are not islands — and even if they were, we’d need to build bridges between them.

The same is true when it comes to working with customers. If we need to contact the supplier that develops the customer’s data storage to install an API so we can actually access that data, that means the internal spec end team has to expose that API to the supplier. In that case, Acerta is more like an archipelago that needs to build a bridge to another island chain. Obviously, this happens a lot faster internally, but the process is essentially the same.

If I could offer one other piece of advice to companies with multiple teams collaborating on machine learning projects, it would be this: plan ahead as much as possible. It’s natural that you’ll have every team doing its own thing, but if you only give yourself a few days for integration, you may realize it’s not that easy. That’s why we typically add a week or more to our deadlines to account for internal integration.

If you can plan ahead and provide as much documentation and design as possible beforehand, the bumpy road on the way to delivering actionable results from machine learning will be much smoother.
