AI In Practice: Part I of III

Bijay Gurung
Published in fuse.ai · 7 min read · Apr 9, 2019

Artificial Intelligence (AI) is revolutionizing a lot of industries and it is poised to make its way into more and more sectors.

As such, a lot of people are inevitably moving towards putting AI into practice to solve business, societal, and global problems.

However, going from learning about AI to being an AI practitioner requires a shift in mindset. In a way, it’s akin to how one needs to think differently when transitioning from programming as a hobby to programming professionally.

I have struggled with that. Quite a lot. I still do. This article, derived from experience working on in-house and client AI projects, along with an assortment of helpful resources and read-ups, touches upon some of the points to keep in mind (at both the individual and team levels) when putting AI into practice.

It’s also meant to act as a note-to-self, a reminder :)

The Main Guiding Principle

When putting AI into practice, our overall guiding principle should be to solve problems and add value. Nothing else.

It is somewhat related to a point from Patrick McKenzie’s Don’t Call Yourself a Programmer, and Other Career Advice:

Engineers are hired to create business value, not to program things.

AI in practice is primarily not about the AI; it’s about what problems AI solves, what value it adds.

All of what follows expounds on this principle or enables us to follow it.

Outline

Part I

  • Thinking about the Problem
  • Thinking about the Data
  • Thinking about the Solution

Part II

  • The Messiness and Uncertainty of AI Projects
  • The Bigger Picture
  • Human Learning

Part III: Useful Resources

1) Thinking about the Problem

Guiding Principle: Know your problem

Know the Context

To really understand a problem, we need to understand the context around the problem. It means asking and finding out the answers to questions such as:

- What is the business? What domain does it operate in?
- What is the problem and why does it need to be solved? What’s the pain point?
- What’s the most important business goal? What are the criteria for success?

The goal is to refrain from labeling a problem as “a text classification problem” and to really understand its unique attributes. Because problems, however similar, are always unique.

Robert Chang touches on this in Getting Better at Machine Learning, noting how defining the problem is hard and not always obvious:

More often than not, problem formulation requires deep domain knowledge, the ability to decompose problems, and a lot of patience.

Know and manage expectations

Source: xkcd 1425

Usually, the people who need the AI solution and those tasked with delivering it aren’t on the same page in terms of expectations. Stakeholders are either skeptical about the value the solution can deliver, or they expect more than what’s actually feasible. So it’s important to be clear about the possibilities and figure out what AI can and cannot do (what problems it can and cannot solve) for the business or organization.

Also, it’s hard to pin down exactly to what extent the AI system will be able to solve the problem without getting started. That means even when both parties are aware of the current potential and limitations of AI, it might be difficult to pin down the details of the expectations.

Furthermore, progress is usually non-linear. There might be some early gains followed by some periods of stagnation (at least in terms of the main KPI). It’s important to be on the same page on that.

At any rate, in all this, good communication is key to aligning expectations.

2) Thinking about the Data

Guiding Principle: Know and respect your data

Know the technical aspects

Where’s the data? What’s the availability of the data? The quality? What’s the data infrastructure like?

Where in the AI hierarchy of needs is the data infrastructure?


Knowing these helps inform, among other things, the feasibility of the project (Do we have enough data? Do we have enough quality data?) and the course of action to take (Should we look to acquire the necessary data or consolidate the data sources?), while also driving the conversation around the problem.
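To make that concrete, here’s a minimal sketch of the kind of quick profiling pass that can ground those conversations. It assumes a hypothetical orders.csv with an order_date column; the file and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical dataset; the file and column names are assumptions.
df = pd.read_csv("orders.csv")

print(f"Rows: {len(df)}, Columns: {len(df.columns)}")

# How complete is each column? Large gaps may make a feature unusable.
missing = df.isna().mean().sort_values(ascending=False)
print("Fraction missing per column:")
print(missing.head(10))

# Duplicates and time coverage hint at collection or infrastructure issues.
print(f"Duplicate rows: {df.duplicated().sum()}")
if "order_date" in df.columns:
    dates = pd.to_datetime(df["order_date"], errors="coerce")
    print(f"Date range: {dates.min()} to {dates.max()}")
```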

Know the non-technical aspects

This is about asking questions like: How is the data created? Who are the people responsible for the data?

Maybe some part of it is manually entered and so it might have some subtle (and not so subtle) errors and inconsistencies. Eg: “NYC”, “New York”, “nyc” as address values. Knowing about these aspects can help inform how we clean the data.
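As a minimal sketch of how such knowledge translates into cleaning code (the toy values and the canonical mapping below are made up; in practice the mapping comes from inspecting the data and talking to the people who enter it):

```python
import pandas as pd

# Toy data mirroring the manual-entry inconsistencies described above.
df = pd.DataFrame({"address": ["NYC", "New York", "nyc", " new york ", "Boston"]})

# Map known variants to a canonical form.
canonical = {"nyc": "New York", "new york": "New York", "boston": "Boston"}

df["address_clean"] = (
    df["address"]
    .str.strip()
    .str.lower()
    .map(canonical)
    .fillna(df["address"].str.strip())  # leave unrecognized values untouched
)
print(df)
```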

Respect the data

“Garbage in, garbage out” is a cliché, but it’s true. The success of an AI project does hinge on the data we use, so it is critical to respect the data. What does that mean? It means acknowledging that a lot of our time will be spent thinking about, cleaning, and preparing the data, and attending to other “operational” aspects such as data documentation.

3) Thinking about the Solution

Guiding Principle: Be open and flexible

Know what solving the problem means

It is paramount to have clarity on what solving the problem looks like. Primarily, it’s about fixing the KPI (Key Performance Indicator) for the system. Eg: At an e-commerce site, it could be revenue per visitor; for a customer support system, it could be the average turnaround time of service delivery.

Furthermore, there can be additional requirements around the operation of the system. Eg: Latency, scrutability, security, etc. A solution that does well on the KPI but doesn’t fulfill these satisficing metrics isn’t a solution.
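One way to keep both in view is to treat the KPI as the optimizing metric and the rest as hard constraints on any candidate solution. A rough sketch, with hypothetical models and numbers:

```python
# Hypothetical evaluation results for two candidate models.
candidates = [
    {"name": "model_a", "revenue_per_visitor": 1.32, "p95_latency_ms": 480},
    {"name": "model_b", "revenue_per_visitor": 1.25, "p95_latency_ms": 90},
]

# Satisficing metric: the system must respond within 200 ms at p95.
MAX_P95_LATENCY_MS = 200

feasible = [c for c in candidates if c["p95_latency_ms"] <= MAX_P95_LATENCY_MS]

# Among candidates that satisfy the constraints, optimize the KPI.
best = max(feasible, key=lambda c: c["revenue_per_visitor"], default=None)
print(best)  # model_b wins: model_a scores higher on the KPI but fails the constraint
```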

All in all, the main idea is to dispel all ambiguity about whether the solution is actually solving the problem or not.

Consider if AI is necessary

Source: xkcd; Replace with “AI system to intelligently pass condiments”

Arguably, this should be done at the start of the project. Still, after we have learned about the problem and considered possible solutions, it is worth taking the time to pause and assess what a “non-AI” solution might look like. And if it starts becoming apparent that the problem may be solved without AI, then so be it.

For instance, let’s say a relatively new company needs a character recognition system to digitize some forms that customers fill in. That looks like something AI is needed for. But could the customer-facing end be changed so that customers fill in digital forms instead of paper forms, eliminating the need for the proposed system? Of course, this is an overly simplified example, and it might be something that was already considered beforehand. But questions like that are still worth asking.

Consider how it’s solved manually

Another question worth asking is how the problem is being (or would be) solved manually.

Of course, for quite a lot of applications, it might not apply. For instance, for face recognition, it might not be too useful to try to figure out how humans “manually” recognize faces (a process that’s still not fully understood).

But for a lot of applications it can be a fruitful exercise to consider how, if we didn’t have AI as a tool, the problem could be solved manually.

Furthermore, if there is already some, perhaps manual, system to solve the problem, then it can be very helpful to know how it works.

For instance, a restaurant chain that wants demand forecasting might not have such a system in place, but they must have been using something (perhaps the manager’s intuition) up to that point. It can be fruitful to follow that lead, which brings us to…

Incorporate Domain Expertise

Source: Yes, you guessed it: xkcd

For AI to work well, having AI knowledge and skill alone isn’t enough. That is only one piece of the puzzle. The other piece is domain expertise around the problem space. Failing to leverage it can lead to anything from a suboptimal solution to a complete disaster.

For instance, for a forecasting problem, a local holiday might have a significant impact on sales, and a domain expert would know that. We’d be missing out on it if we didn’t work to weave domain expertise into the process.
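In code, that piece of domain knowledge might amount to nothing more than an extra feature the model can use. A small sketch with made-up sales figures and a hypothetical local-holiday list:

```python
import pandas as pd

# Made-up daily sales data for illustration.
sales = pd.DataFrame({
    "date": pd.date_range("2019-03-01", periods=7, freq="D"),
    "units_sold": [120, 118, 300, 125, 119, 122, 121],
})

# A domain expert tells us which local holidays drive unusual demand.
local_holidays = {pd.Timestamp("2019-03-03")}

# Encode that knowledge as a feature for the forecasting model.
sales["is_local_holiday"] = sales["date"].isin(local_holidays).astype(int)
print(sales)
```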

Balance Standard vs Custom Tooling

It’s easy to be tempted to write something from scratch. But for a lot of things, using what’s already out there should be the priority. Basically, don’t reinvent the wheel, unless you plan on learning about wheels. Eg: Do you really need to implement that paper yourself when there’s already an implementation out there?

However, it is also important to know when off-the-shelf tools might be hindering progress. Perhaps the library doesn’t have good support, or maybe you find yourself having to hack it to do what you want (eg: some custom loss function). In that case, it might be worth rolling your own implementation.
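Often there’s a middle ground: most frameworks let you plug in just the custom piece without abandoning the rest of the library. As a sketch, a custom loss in PyTorch can be a plain function; the asymmetric loss below is made up purely for illustration.

```python
import torch

def asymmetric_mse(pred: torch.Tensor, target: torch.Tensor,
                   under_penalty: float = 2.0) -> torch.Tensor:
    """MSE that penalizes under-prediction more heavily than over-prediction."""
    diff = target - pred
    weights = torch.where(diff > 0, torch.full_like(diff, under_penalty),
                          torch.ones_like(diff))
    return (weights * diff ** 2).mean()

# Drop it into any standard training loop in place of a built-in loss.
pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([1.5, 1.5, 3.0])
loss = asymmetric_mse(pred, target)
loss.backward()
print(loss.item())
```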

Prioritize Potential Approaches

Usually, there are myriad ways one could approach a problem. That can be overwhelming, which can lead to stagnation. So it’s critical to prioritize potential approaches by what’s promising and what can be tried quickly. It’s good to favor the latter, especially in the initial stages.

Essentially, get into the Machine Learning Engineering Loop as quickly as you can.

If you feel like you’re hand-wringing about what to try, just pick one. Trying to do too many things at once slows you down. You can sometimes come back to try another idea while your experiment is running.

After all, finding the right solution is an iterative process and getting through those iterations quickly and effectively is what leads to success.
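In practice, “what can be tried quickly” often means a trivial baseline first; it also gives every later experiment a yardstick. A minimal sketch using scikit-learn’s bundled breast-cancer dataset as a stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Start with the cheapest possible models; reach for fancier ones only
# if they beat these baselines by enough to matter for the KPI.
for model in (DummyClassifier(strategy="most_frequent"),
              LogisticRegression(max_iter=5000)):
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{type(model).__name__}: accuracy = {acc:.3f}")
```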

To be continued in Part II.

(Note: Adapted from a talk I gave at GDG DevFest 2018).

