Machine Learning platform — what is it and why do you need it? (Part 1)

Sunanda Garg
6 min read · Dec 8, 2021

The journey towards the future can be a long and arduous one. Businesses the world over see the benefits that data, Artificial Intelligence (AI) and Machine Learning can have. But getting there is another matter entirely.

In a crowded vendor landscape, making an informed ‘build vs. buy’ decision on a platform for scaling AI has never been more complicated. Some 84% of executives in a 2019 Accenture survey said that they would not achieve their growth objectives without scaling AI.

And yet, here we are with the problem still at hand.

This article is part of a series written collectively by the ML engineering team at Accenture. The series dives into the (new) age-old problems of scale, the ML platform and the ecosystem surrounding it, covering a range of topics across the ML platform landscape.

Our intention is to drive the conversation and set out our stall, so that we can move forward collectively and be better prepared to build the future.

Part 1:

What is an ML platform and why do you need it?

1. What is an ML platform?

In many areas of research and industry, Machine Learning (ML) and Artificial Intelligence (AI) are becoming increasingly dominant for solving problems. While a lot of effort still goes into improving the performance of ML models, the true bottleneck has shifted towards moving ML solutions out of the lab and into production. As ML gains a foothold across organizations, teams developing ML solutions are struggling with the complexities of the ML lifecycle. As a result, solutions that help manage the end-to-end ML lifecycle are in high demand. Not surprisingly, cloud providers offer native services to support different aspects of the ML lifecycle, and an increasing number of software providers are expanding their offerings in this direction. Additionally, more and more start-ups are bringing innovative solutions and ML services to the market.

So, what really is an ML platform?

We define an ML platform as a collection of services that covers the end-to-end ML lifecycle and helps organizations continuously develop, deploy, integrate, and monitor their AI and ML solutions.

2. Why do we need an ML platform?

A suitable environment to develop ML models in a governed way is essential. Why? Well, data scientists typically begin the ML lifecycle with different programming languages (e.g., Python and R) and tools (e.g., Jupyter Notebook, RStudio, PyCharm), often in different versions. They also typically use different libraries (e.g., TensorFlow, scikit-learn, XGBoost) to apply algorithms such as logistic regression, neural networks and tree-based classifiers, which increases the overall complexity of the solution. In many cases, data scientists create a prototype using a small sample of data that fits within the available resources (e.g., memory and CPUs) of their laptops or workstations. The engineered datasets (also known as features) created from raw data are used to train the models locally, and performance and insights are visualized with the help of suitable metrics (e.g., within Jupyter Notebooks). The tools, frameworks and data requirements of this model development phase may or may not be compatible with the production-grade solution required by the business.

Solutions that stay in this experimental state generate little to no value or business impact. To unleash their full potential and generate value continuously, they need to be taken out of the experimental phase and integrated deeply into the business processes of the organization. However, this is impossible without certain components, such as an efficient data pipeline, a continuous integration and deployment process, and artefact and metadata management. Only a comprehensive set of ML components enables the transition to the next phase of the ML lifecycle, in which the carefully trained and selected ML solution makes it out of the lab and into the real world. In the end, you do not want your models to belong to the astounding 90% that never see the light of day.
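To make "artefact and metadata management" a little more concrete, here is a minimal Python sketch of the kind of record such a component might keep for each trained model. It is an illustration under stated assumptions, not a description of how any particular platform works: the `ModelRecord` fields, the `register_model` helper, the in-memory `registry` dictionary, the model name and the metric values are all hypothetical placeholders. A real platform would persist this information in a dedicated service.

```python
import hashlib
import json
import platform
from dataclasses import dataclass, asdict


@dataclass
class ModelRecord:
    """Hypothetical metadata stored alongside a trained model artefact."""
    name: str
    version: int
    artefact_sha256: str   # fingerprint of the serialized model bytes
    params: dict           # hyperparameters used for training
    metrics: dict          # evaluation metrics from the experiment
    python_version: str    # runtime the model was trained with


def register_model(registry, name, artefact_bytes, params, metrics):
    """Append a new version of `name` to an in-memory registry (illustrative only)."""
    versions = registry.setdefault(name, [])
    record = ModelRecord(
        name=name,
        version=len(versions) + 1,
        artefact_sha256=hashlib.sha256(artefact_bytes).hexdigest(),
        params=params,
        metrics=metrics,
        python_version=platform.python_version(),
    )
    versions.append(record)
    return record


# Placeholder artefacts and metrics, purely for illustration.
registry = {}
first = register_model(registry, "churn-model", b"model-bytes-v1", {"C": 1.0}, {"auc": 0.81})
second = register_model(registry, "churn-model", b"model-bytes-v2", {"C": 0.5}, {"auc": 0.84})
print(json.dumps(asdict(second), indent=2))
```

Even a record this small answers the questions that tend to block a move to production: which exact artefact is deployed, how it was trained, and how it performed at the time.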

You may build an ML platform on your own, as many of the tech giants have done, e.g., Uber (Michelangelo), Airbnb (Bighead), Facebook (FBLearner), Netflix (Metaflow) and Apple (Overton), to name a few. While this might be possible for organizations with huge engineering teams, it is not a feasible approach for the majority of organizations. To overcome this hurdle and bridge the gap, enterprise ML platforms increasingly provide an answer.

3. Which ML platforms exist in the market?

To cope with the complexities of managing the ML lifecycle, more and more ML teams are looking towards PaaS solutions. Several vendors and cloud providers offer end-to-end ML platforms and/or services, including AWS (Amazon Web Services), Microsoft Azure, Google Cloud Platform, Databricks, Dataiku, H2O.ai, and several others.

The cloud providers offer several ways to support most, if not all, components of the ML lifecycle, providing flexibility at the cost of relatively high cloud engineering effort. Proprietary solutions like Databricks, Dataiku and Domino, on the other hand, offer out-of-the-box services encompassing the ML lifecycle, which may simplify your ML journey at a certain cost in flexibility.

4. The burning question: Which is “the best ML platform”?

Finding an ML platform is more complicated than constrained optimization

There is no shortage of ML platforms in the market, but the natural question any ML team or organization starting out may ask is: which one is the best? We can safely assume that there is no ‘one ML platform to rule them all’. No matter how good a product or service provider claims to be, there will be competitors with a similar portfolio. Instead, the right question to ask is: which ML platform is best suited to my organization?

The choice of an ML platform depends on three things:

First, the choice should be aligned with your team’s skill level. If your team comprises more citizen data scientists than engineers, then your choice of platform will be quite different from that of an engineering-heavy team. A platform is only well suited to your team’s needs if it simplifies their day-to-day tasks, which in turn speeds up your ML journey.

Second, the platform should support the types of ML use cases you are trying to solve now and the ones on your roadmap. Different use cases, such as computer vision, NLP and big data analytics, come with different requirements, and some platforms support one (or more) of them better than others.

Third, the selected ML platform should reduce the technical debt of your organization. Almost every platform offers different types of solutions, which address different components of the ML lifecycle (e.g., model registry, experiment tracking, feature store). Selecting an ML platform that helps you fill in the gaps will help expedite your ML journey.

In our experience, finding the most suitable ML platform for your organization requires an ML capability framework that lets you assess all of the above aspects consistently across several platforms. After all, long-term vendor lock-in can be costly if the wrong choice is made.
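As a rough illustration of what assessing these aspects "consistently across several platforms" could look like, here is a minimal Python sketch of a weighted scoring scheme. It is not the framework introduced later in this series, only a hedged example: the criterion names, the weights, the 0 to 5 scale, and the "Platform A" and "Platform B" scores are all hypothetical placeholders, not evaluations of real vendors.

```python
# Hypothetical criteria mirroring the three considerations above.
# The weights are assumptions; they sum to 1 and should reflect your own priorities.
WEIGHTS = {
    "team_skill_fit": 0.40,
    "use_case_coverage": 0.35,
    "lifecycle_gap_coverage": 0.25,
}


def weighted_score(scores, weights=WEIGHTS):
    """Return the weighted average of per-criterion scores (each on a 0-5 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_platforms(assessments, weights=WEIGHTS):
    """Return (platform, score) pairs sorted from highest to lowest score."""
    scored = [
        (name, round(weighted_score(scores, weights), 2))
        for name, scores in assessments.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder assessments, not real vendor evaluations.
assessments = {
    "Platform A": {"team_skill_fit": 4, "use_case_coverage": 3, "lifecycle_gap_coverage": 5},
    "Platform B": {"team_skill_fit": 3, "use_case_coverage": 5, "lifecycle_gap_coverage": 2},
}

ranking = rank_platforms(assessments)
for name, score in ranking:
    print(f"{name}: {score}")
```

The point of such a scheme is less the final number than the discipline it enforces: every platform is scored against the same criteria, and the weights make your organization's priorities explicit.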

5. What can you expect from this series?

The discussion on the most suitable ML platform for your organization has just begun. In the next article we will introduce you to our framework and show you how this framework can help you assess different types of ML platforms. This will be followed by some articles where we assess a few selected ML platforms, based on this framework. Hope to see you there!

P.S. We love sparring and are happy to discuss this fairly young topic and move it to the next level with the community. If you have any comments, questions or concerns, please reach out to us!
