Putting machine learning to practical business use is hard. Here’s hoping that these tips help make your journey a little easier.
1) Define the Project
Create a charter for the project so everyone can get aligned from the outset. Lay out the goals, risks, and opportunities of the project as specifically as possible. What is the team trying to solve? What is the current process that’s being altered? What is the team responsible for delivering? How long is it expected to take?
By its nature, an ML project can mutate over time. While it may be the case that the team will need to explicitly change its goals down the line, it’s good to at least have written agreement on what they were when the project began.
The machine learning mission statement below may be a useful framework for your project, helping you identify obvious gaps before you get started and setting the right expectations for everyone involved.
Machine Learning Mission Statement
[Company Name] is going to start a machine learning project that helps [define business objective]. This will help grow our business by [how the ML project is expected to generate ROI]. In order to complete this project, we will need [required data] from [group(s) name(s)]. This data is stored across [one or more system(s)]. Once we have accessed the data, we will need help from [group(s) name(s)] to verify that this data is true and accurate. Machine learning expertise will be applied by [group(s) name(s)]. After multiple iterations, the project will be successful when [success metrics]. The estimated length of this project is [time].
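One way to use the mission statement for gap-spotting is to treat each bracketed blank as a required field and list the ones nobody can fill in yet. The sketch below is only an illustration: the template is a condensed version of the statement above, and the slot names (`roi`, `data_owners`, and so on) and the draft answers are hypothetical.

```python
import re

# Condensed version of the mission statement; bracketed slots are the blanks to fill.
# Slot names are hypothetical shorthand for the blanks in the full statement.
TEMPLATE = (
    "[company] is going to start a machine learning project that helps "
    "[business_objective]. This will help grow our business by [roi]. "
    "We will need [required_data] from [data_owners]. "
    "The project will be successful when [success_metrics]. "
    "The estimated length of this project is [duration]."
)

SLOT = re.compile(r"\[([a-z_]+)\]")


def find_gaps(template: str, answers: dict) -> list:
    """Return the slot names that have no non-empty answer yet, in template order."""
    return [s for s in SLOT.findall(template) if not answers.get(s, "").strip()]


def fill(template: str, answers: dict) -> str:
    """Fill answered slots and leave unanswered ones visibly bracketed."""
    def replace(match):
        value = answers.get(match.group(1), "").strip()
        return value if value else match.group(0)
    return SLOT.sub(replace, template)


# Hypothetical first draft: some blanks are answered, others are empty or missing.
draft = {
    "company": "Example Corp",
    "business_objective": "screen incoming resumes faster",
    "required_data": "historical resumes and hiring outcomes",
    "data_owners": "",
}

print(find_gaps(TEMPLATE, draft))
print(fill(TEMPLATE, draft))
```

Any slot that shows up in the gap list is a question to settle with stakeholders before work begins.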
Although it’s essential to define the problem in business terms, this should happen in partnership with your technical team. While they may define success more precisely than you do in terms of models that meet or exceed mathematical thresholds, their definition must match up with your business framework. Business and technical people may speak different languages, but now is the time to make sure they’re both pointed in the same direction.
Deriving a business problem that can be expressed in technical machine learning terms may take some back and forth between business and technical experts. It’s worth the trouble: you can’t do machine learning without this step, and at any rate, the quality of your answers depends on the quality of your questions.
2) Build Broad Buy-in Among Stakeholders
Unless you need to operate under the radar, it’s wise to bring as many relevant stakeholders into the conversation as possible to help you vet and implement your initial mission. While it may seem that this could get the process off to a slow start, failure to account for certain perspectives could ultimately bring the project to a quick end. Involving all the right stakeholders can be the difference between a proof of concept and a practical implementation.
The most obvious stakeholder is the business unit for which the machine learning is taking place: the HR department, for instance, in the case of a resume analysis project. It’s clear that although they are not ML experts, their guidance is essential, as it is their team who will ultimately use the product and judge its utility. Leaders from this unit will likely be involved with the project from the kickoff stage. They’ll attend weekly meetings and work closely with data scientists and engineers.
The full roster of technical stakeholders, meanwhile, may not be as immediately obvious. To really work in practice, a machine learning solution needs to fit within an organization’s broader technical infrastructure and process. CIOs and other innovation leaders may need to be in the mix. They will bring more requirements and essential context to shape ML deliverables. These may not be requirements that can simply be tacked on at the end of a project; in some cases, they can fundamentally change the algorithm a team selects. Algorithms require balancing different factors. In some organizations, the need for speed is paramount; in others, using less expensive resources may be the priority. The degree of desired accuracy can vary as well.
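To make the trade-off concrete, here is a minimal sketch of how stakeholder requirements could change which model a team picks. The candidate names and all of the numbers are made up for illustration, and `pick` is a hypothetical helper, not a method from any library.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float      # fraction correct on a held-out set
    latency_ms: float    # time per prediction
    cost_per_1k: float   # compute cost per 1,000 predictions


def pick(candidates, min_accuracy, max_latency_ms, max_cost_per_1k):
    """Return the most accurate candidate that satisfies every constraint, or None."""
    feasible = [
        c for c in candidates
        if c.accuracy >= min_accuracy
        and c.latency_ms <= max_latency_ms
        and c.cost_per_1k <= max_cost_per_1k
    ]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Made-up numbers for illustration only.
candidates = [
    Candidate("small_linear", 0.88, 5, 0.02),
    Candidate("gradient_boosted", 0.93, 40, 0.10),
    Candidate("large_neural", 0.96, 400, 1.50),
]

# An organization that needs fast answers versus one that needs high accuracy.
speed_first = pick(candidates, min_accuracy=0.85, max_latency_ms=10, max_cost_per_1k=1.0)
accuracy_first = pick(candidates, min_accuracy=0.95, max_latency_ms=1000, max_cost_per_1k=5.0)
print(speed_first.name, accuracy_first.name)
```

The same candidates produce different winners depending on the constraints, which is why these requirements need to surface before an algorithm is chosen rather than after.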
When not all technical stakeholders are accounted for, it’s possible to achieve narrowly defined success with a result that’s ultimately not usable. On the other hand, involving more technical experts in an organization can reveal or clarify machine learning opportunities that business stakeholders might not have been aware of.
Much of the stakeholder discussion at this stage revolves around explicitly surfacing various assumptions about the project that can cause later turbulence if left unstated. For instance, in a project kickoff meeting, make integration and deployment central topics. What are the technical requirements for a solution? Who will actually be doing the integration? How will that work? Will existing infrastructure be able to handle it, or will someone need to do something different? If so, who will that person be?
Again, this may seem like a slow way to start a project. But the alternative may be to find yourself with a tool that works in theory and a list of previously unstated technical requirements that you should have known about from the beginning.
3) Get Data and System Access As Early as Possible
Two of the biggest risks to a machine learning project are poor data quality and the inability to integrate applications into production. To help mitigate these, obtain access as soon as possible to:
A) The data. The sooner data problems can be identified, the sooner they can be addressed.
B) Relevant engineering resources to enable integration, such as sandbox environments, APIs and other file/data transfer points, and architecture diagrams/documentation that highlight application integration points.
Your team’s ability to access these quickly at the start of a project can be a good early indicator of whether the project will ultimately succeed. Without easy data access, you’ll have trouble doing machine learning. If integration capability is hard to come by, you’ll be hard-pressed to put machine learning into practice.
4) Prepare to be Flexible
After a problem is selected and defined, all machine learning projects will go through the following stages:
Data engineers or data scientists analyze project data, identify gaps, and get the data ready for machine learning.
Data scientists design algorithms, build models, and test results to show what’s possible. This is the core “machine learning” part of the process.
Software engineers take the lead to integrate your machine learning solutions with your business systems, helping your organization achieve the true impact and scale of machine learning.
Results are measured and necessary adjustments are made. Since models may drift away from their optimal state when exposed to real world data, they may eventually need to be retrained.
While these steps seem to follow a logical progression forward, the reality is often far more complicated. You won’t necessarily do each step in order before moving on cleanly to the next. Real projects are messy.
For one thing, timelines in the field of data science are difficult to predict. Data might be in worse shape than originally thought. A data scientist might try a technique only to find that it doesn’t work, and they need to go back and try something else.
Moreover, the aforementioned project phases aren’t just interdependent. They can and do overlap with each other while flowing in both directions. Development, for instance, may actually begin during data preparation, as practitioners start to think about what algorithms they’re going to use. As a model develops, it may necessitate changes in the way the data is prepared, such that the data isn’t really ready until the model itself is almost done. Then, when a project actually begins to show results in and beyond the deployment stage, those results might necessitate further changes to the data, the model, or even all the way back to the problem definition itself.
5) Show Your Work
In a highly iterative environment, regular stakeholder meetings are crucial for checking in and showing work, even if that work didn’t lead to obvious progress. This can keep stakeholders engaged as timelines and project goals shift.
Ideally, engineers should help show the results of the team’s work with understandable visuals, interactive demos, and user interfaces. The result of a data scientist’s labors may be a slide with equations on it; engineers are needed to put these into a more relatable context. Easily understandable visuals are especially important when communicating an ML project to senior executives who may ultimately hold sway over a project but aren’t familiar with the underlying details. For machine learning to work in practice, business leaders need a way to understand that it actually works.
6) Prepare to do More Than Machine Learning
Putting machine learning into practice means doing a lot of activities that may appear far afield from cutting-edge data science. For example, imagine a company that wants to use machine learning to improve the efficiency of document analysis. The existing process involves people analyzing some text and then inputting information about it into purpose-built software.
Let’s say that during the development stage, a machine learning solution is able to eliminate the need for 50% of human input for each document.
Actually deploying this solution at scale, however, requires the following:
A) Software engineers to make sure the ML solution can process the documents at scale. This means making sure the data goes to and from the right place and is prepared when it gets there.
B) Software engineers (possibly different ones) to alter the human interface so that people aren’t posed with questions that the machine has sufficiently answered.
C) Retraining people on new software, a new workflow, and perhaps even a new understanding of their role.
In this case, only point A is an engineering/deployment task that’s explicitly about machine learning. Point B is a non-ML engineering task, and point C isn’t even a traditional engineering task at all. However, A, B, and C are all essential for successful deployment.
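Point B can be sketched as a simple routing rule: if the model is confident enough about a field, fill it in automatically; otherwise, ask a person. The example below is a minimal illustration under assumptions of my own: the field names, the confidence scores, and the 0.9 threshold are all hypothetical, and a real system would choose its threshold based on how costly a wrong answer is.

```python
def route_fields(predictions, threshold=0.9):
    """Split predicted fields into auto-filled values and questions for a person."""
    auto_filled, needs_review = {}, []
    for field, (value, confidence) in predictions.items():
        if confidence >= threshold:
            auto_filled[field] = value       # the machine has sufficiently answered
        else:
            needs_review.append(field)       # still posed to a human
    return auto_filled, needs_review


# Hypothetical model output for one document: field -> (value, confidence).
predictions = {
    "document_type": ("invoice", 0.98),
    "vendor": ("Acme Ltd", 0.95),
    "total_amount": ("1,240.00", 0.62),
    "due_date": ("2024-03-01", 0.71),
}

auto_filled, needs_review = route_fields(predictions)
print(auto_filled)
print(needs_review)
```

In this made-up document, two of the four fields clear the threshold, so the person reviewing it sees half as many questions. The interface work in point B is what turns that routing decision into something people actually experience.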
For another example, imagine an ML solution that identifies potential financial fraud in bank transactions. From the perspective of business impact, such a system isn’t fully deployed until the institution has a process for confirming the fraud and dealing with the ramifications. Without these non-ML policies, the ML solution won’t help the business even if it is a stellar fraud detector. It’s easy to get caught up in the pure machine learning potential while overlooking the practical realities of how a given solution would be deployed.
7) Offer Ongoing Support
Once you’ve got a machine learning project that works in practice and delivers business impact, there will still be a need to monitor both the quality of the data going into the system and the quality of the results coming out. Eventually, models may need to be retrained after they drift due to contact with real-world data.
You must structure a machine learning project with a plan for its ongoing maintenance.
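As a rough sketch of what such a maintenance plan might check, the example below watches both sides: the data going in (using the Population Stability Index, one common drift measure, swapped in here for illustration) and the results coming out (a drop in accuracy). The sample values, the 0.2 PSI limit (a widely used rule of thumb), and the 0.05 accuracy limit are assumptions, and measuring accuracy in production assumes you eventually learn the true outcomes.

```python
import math


def psi(baseline, recent, bins=5, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = shares(baseline), shares(recent)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def needs_attention(baseline, recent, baseline_accuracy, recent_accuracy,
                    psi_limit=0.2, accuracy_drop_limit=0.05):
    """Flag input drift and output degradation separately."""
    return {
        "input_drift": psi(baseline, recent) > psi_limit,
        "accuracy_drop": (baseline_accuracy - recent_accuracy) > accuracy_drop_limit,
    }


# Made-up feature values: training-time data versus recent production data.
baseline = [10, 12, 11, 13, 12, 11, 10, 14, 13, 12]
shifted = [18, 19, 20, 21, 19, 20, 22, 18, 21, 20]

report = needs_attention(baseline, shifted, baseline_accuracy=0.92, recent_accuracy=0.84)
print(report)
```

Either flag is a prompt to investigate, and possibly to retrain. Deciding who looks at these checks, and how often, is part of the plan.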
In addition, offer ongoing support to your ML team, empowering them to apply the lessons of one project to future efforts inside the organization. With their close connection to a company’s data, systems, challenges, and opportunities, they may be in a good position to suggest what comes next.
James Kotecki is the Director of Marketing & Communications at Infinia ML, a team of data scientists, engineers, and business experts putting machine learning to work. The company builds machine learning-powered applications that help businesses analyze their documents, manage their talent, and audit their AI systems.