Machine Learning Challenges

Rustem Zakiev
hydrosphere.io
May 21, 2018

Machine Learning (ML) today is less of a mystery than ever, powering a diverse set of businesses, industries and public activities. Risk management, ad targeting, recommendation services, logistics and security are a few of the many areas where performance improves dramatically when augmented with ML.

Schematically, the process of implementing ML in a business looks clear and simple: data scientists prepare an initial data set to build and train a model, then that model is deployed to serve in an application, making inferences and predictions on the real data fed to its inputs. While the model works, its accuracy is monitored; once the discrepancy reaches a certain degree, the model is retrained and re-deployed into production. Train, deploy, monitor, over and again, and that is it.
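That loop can be sketched in a few lines of Python. This is a minimal illustration, not a real serving system: the "model" is a trivial mean predictor, and the helper names (`train`, `accuracy`, `serve`) and the accuracy threshold are assumptions made for the example.

```python
def train(data):
    """Fit a trivial model: predict the mean of the training targets."""
    mean = sum(y for _, y in data) / len(data)
    return lambda x: mean

def accuracy(model, data, tolerance=1.0):
    """Fraction of predictions within `tolerance` of the true target."""
    hits = sum(abs(model(x) - y) <= tolerance for x, y in data)
    return hits / len(data)

def serve(model, live_batches, training_data, threshold=0.5):
    """Monitor accuracy on each live batch; retrain when it degrades."""
    for batch in live_batches:
        if accuracy(model, batch) < threshold:
            training_data = training_data + batch   # accumulate new evidence
            model = train(training_data)            # retrain and "re-deploy"
        yield model

training = [(0, 1.0), (1, 1.2), (2, 0.9)]
model = train(training)
live = [[(3, 1.1), (4, 0.8)],    # behaves like the training data
        [(5, 5.0), (6, 6.0)]]    # drifted: accuracy drops, triggers retrain
models = list(serve(model, live, training))
```

The second live batch falls outside what the model learned, so monitoring catches the accuracy drop and the loop produces a retrained model. Everything that makes this hard in practice hides behind those three helpers.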

It is not, of course. Building a prediction pipeline can take years (depending on the complexity of the tasks to be solved): a model is built, trained, checked, rebuilt and retrained by data scientists many times before it is concluded ready for production, and that shallow mention covers quite a tree of scientific routines.

The industry develops dynamically, and there are plenty of achievements in automating model building, from data cleaning to parameter optimisation. But plenty of challenges remain as well.

Deploying ML models into production takes complex engineering skills beyond data science: maintaining the operational environment, plumbing data sources and destinations, and a couple more things need to be done. And that shallow mention covers... well, you get it.

Once the work of building and deploying a model is done, the journey of operational sustenance has just begun. The quality of a model is tied tightly to the quality of the data it processes. Putting aside the well-known matter of training/serving skew: no matter how realistic the data set used for training was, it was limited, and there is a strong possibility of meeting unexpected artifacts in the live data stream, leading to unpredictable inferences of questionable value, or even to errors spreading further into the business workflow.
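One crude but common guard against such artifacts is to profile the training data and flag live records whose features fall far outside that profile. The sketch below, with assumed helper names, uses a simple z-score check against per-feature training statistics; real drift detection is considerably more sophisticated.

```python
import statistics

def training_profile(rows):
    """Record per-feature mean and standard deviation from training data."""
    cols = list(zip(*rows))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in cols]

def is_suspicious(row, profile, z=3.0):
    """True if any feature lies more than `z` standard deviations from
    its training mean: a likely artifact the model never saw."""
    return any(sigma > 0 and abs(x - mu) > z * sigma
               for x, (mu, sigma) in zip(row, profile))

train_rows = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [1.1, 10.5]]
profile = training_profile(train_rows)

print(is_suspicious([1.05, 10.2], profile))  # within the training range
print(is_suspicious([9.0, 10.0], profile))   # first feature far outside it
```

A flagged record might be logged, routed to a fallback, or excluded from downstream decisions rather than fed blindly to the model.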

Take the self-driving car case: the AI is prone not only to deliberate adversarial attacks, such as vandalized road signs or a mirror placed in front of its camera; natural environmental edge cases, such as a pedestrian in a mascot costume or a non-standard type of traffic light, can make it deliver decisions we would call dumb and, in some cases, dangerous.

The space of possible outputs is huge and cannot be comprehended effectively by a human unarmed with special monitoring tools. In practice, those tools need to be ML-augmented themselves to be effective.

When an AI failure is detected, the model needs to be retrained and re-deployed; all the DevOps work has to be done again (in the general case, given the current state of the industry), and monitoring resumes. You can never be 100% sure of essential AI/ML applications yet.

That is the general scope of reasons why a human must remain in the loop of ML-empowered operations today. AI systems still have a long way to go toward perfection, and ML, while well understood, still delivers plenty of headaches to data scientists and engineers with its demands for serving, performance monitoring, and continuous retraining and re-deployment.
