An Open ML Platform for Scotland

How might an open machine learning platform benefit Scotland’s economy?

Matthew C. Higgs
The Data Lab
Oct 12, 2017

Many of the biggest technology companies have developed their own machine learning (ML) platforms. For example, Uber has Michelangelo, Facebook has FBLearner Flow, and Google recently published a paper on TFX. So what is an ML platform? And how might an open ML platform benefit Scotland’s economy?

What is an ML platform?

I’m going to assume you are aware of the value of combining machine learning with good data and applying it to a real problem. This combination and application is traditionally done using custom code developed by individual teams for specific problems. An ML platform standardises and systemises this process to enable machine learning models to be developed, deployed, and refreshed more efficiently for a wide range of applications. The bottom line is: an ML platform can potentially increase the speed, or reduce the cost, of solution development.
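To make "standardise and systemise" a little more concrete, here is a minimal sketch of the idea in Python. Everything in it is hypothetical: the `Platform` class, its `register`, `refresh`, and `predict` methods, and the toy `mean_trainer` are my own illustrative names, not the API of Michelangelo, FBLearner Flow, TFX, or any other real platform. The only point is that each team supplies training code in one agreed shape, and the platform handles deployment and refreshing the same way for everyone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence

# Hypothetical types for illustration only.
# A trained model maps one input value to one prediction.
Model = Callable[[float], float]
# Training code maps (inputs, targets) to a trained model.
Trainer = Callable[[Sequence[float], Sequence[float]], Model]


@dataclass
class Platform:
    """A toy 'platform': one shared way to register, deploy, and refresh models."""

    trainers: Dict[str, Trainer] = field(default_factory=dict)
    deployed: Dict[str, Model] = field(default_factory=dict)
    versions: Dict[str, int] = field(default_factory=dict)

    def register(self, name: str, trainer: Trainer) -> None:
        # Each team plugs in its own training code under a name.
        self.trainers[name] = trainer

    def refresh(self, name: str, xs: Sequence[float], ys: Sequence[float]) -> int:
        # Retrain on the latest data, replace the deployed model,
        # and return the new version number.
        self.deployed[name] = self.trainers[name](xs, ys)
        self.versions[name] = self.versions.get(name, 0) + 1
        return self.versions[name]

    def predict(self, name: str, x: float) -> float:
        # Serve a prediction from whichever version is currently deployed.
        return self.deployed[name](x)


def mean_trainer(xs: Sequence[float], ys: Sequence[float]) -> Model:
    # A deliberately trivial model: always predict the mean of the targets.
    mean = sum(ys) / len(ys)
    return lambda x: mean
```

A team would call `register` once with its trainer, then call `refresh` whenever new data arrives. A real platform would add data validation, monitoring, and much more, but the shape of the contract is the part that gets standardised.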

However, ML platforms are not easy or cheap to build and maintain. So:

How can we make an ML platform available to all data scientists, engineers, and developers working in Scotland?

A platform for Scotland

There are two extremes for how we move forward: (1) build something from scratch (platform creation), or (2) use something that already exists (platform consumption). Ideally, we would do something in between and repurpose existing data science tools to build a platform that is useful for our needs. How might we do this? Some ideas:

Build a community

Platforms, by definition, are built to support multiple applications, and therefore require input from many consumers, creators, and anyone else prepared to invest the time. Beyond this, attempting to build an ML platform is valuable for more than the platform itself. It requires a community to study what its members do, what they want to do, and how their processes can be standardised and systemised to be more efficient and effective.

Learn from others

It would be silly to start from scratch. There are many companies, research groups, and open-source projects working on ML platforms. Let’s study what they do and learn from their mistakes.

Be aware of Scotland’s needs

Let’s think about the existing companies in Scotland, how they use machine learning, and what types of companies Scotland will want to be known for in the future. Let’s build an ML platform that can support the application of machine learning to problems in industries we care about.

Be scientific

Let’s be scientific about how we develop an ML platform. Let’s engineer a platform for existing tasks. Let’s experiment with different standards in different contexts to understand what works where. Let’s experiment with novel ideas to support innovation.

Risks

I’m all for diving in and building an open ML platform that will benefit Scottish companies and data scientists working in Scotland, but I would be ignoring my own process if I didn’t attempt to capture some of the risks involved.

Cargo Cult

Just because everyone else is building an ML platform doesn’t necessarily mean we should. Let’s not become a data science cargo cult, where we copy the behaviours of others in the hope of seeing the same returns. A simple way to avoid this is to understand the needs of people in Scotland and build a platform that supports those needs.

ML/AI Hype

It is possible that ML/AI is at the peak of the Gartner hype cycle and it is all downhill from here.

That said, a trend’s position on the hype cycle often relates to the use of its common name, while the underlying technologies often go on to form the foundation for the next wave of trends. Additionally, a July 2017 report from Gartner states:

“The risk of ignoring potentially transformational AI exceeds the mitigated risk of fast, early failure.”

Ethics and Security

The way we currently use personal data is broken, and it’s very likely this will change in the future. If we copy what others are doing now, then we risk building a platform that will not function within the boundaries of future ethical constraints. Should we build what we can now and adapt over time? Or should we anticipate these changes and build something for the future? I don’t know exactly what this might be, but it could look something like OpenMined.
