Enterprise & Industrial AI Modeling

How it’s Done

Nov 4, 2018


by Matt Denesuk, Ph.D., Chief Data Science Officer, Noodle.AI

Introduction

My earlier paper¹ described how Enterprise & Industrial AI differs from more popularized applications of AI. The latter tend to be more “mature” (i.e., at a later stage of solution development) and also tend to target things that humans can already do very well, such as facial recognition, natural language understanding, or driving a car. Enterprise and Industrial AI problems (e.g., problems in supply chain planning or industrial operations management) tend to be much less mature and require different approaches, skills, expectations, and planning. This paper takes a high-level view of how this can be accomplished.

Enterprise & Industrial AI Philosophy & Approach

Why should enterprise and industrial firms care about AI? Fundamentally, it’s because AI will help them understand their business better and enable them to make far better operational and strategic decisions.

It’s simple:

  1. There are questions that enterprises need answered if they are to perform better, and AI can help answer them.
  2. By combining good data, mathematical operations, and business expertise, you can construct an AI workflow/solution that can answer these questions.
  3. Successful execution requires translating this AI workflow into a set of specific, sequenced, trained algorithms, expressed in a readily usable application, that in aggregate produce answers that move critical business KPIs by amounts that matter.

Philosophy

Philosophically, one can distinguish two very different ways of looking at a data set and, in a sense, at the AI-accessible world. At the risk of oversimplifying or creating idealized prototypes or even caricatures, we can label these as computer scientist and regular scientist¹.

Figure 1. Two ways of seeing a data set (and the world)

A prototypical computer scientist may see their job as needing to get to the knowledge that’s currently locked up in the data. They see the data essentially as a record of everything relevant that has happened — for example, all the customer transactions that occurred last month, or all friendship links between members in a social network — and the job at hand is to exhaustively probe the data to uncover any useful patterns, rules, or relationships manifest within the data set.

The prototypical regular scientist, on the other hand, sees their job more broadly as needing to get the knowledge, using any resources they can muster. Such a person sees the data set as a partial, often noisy, reflection of some underlying complex phenomenon. The job at hand is then to take that data, combine it with any other knowledge or information available, and use it all to obtain a better understanding of the phenomenon, typically through a mathematical model or some other abstract representation. An example might be understanding what is going on inside a star given the emission spectra reaching the earth. One would need to understand and incorporate knowledge of nuclear chemistry and gravity, which are themselves the product of earlier data analysis combined with other knowledge and information. That knowledge is not contained in the spectral emissions which constitute the data set at hand.

Some problems are relatively self-contained and can be adequately addressed using a predominantly computer scientist approach. But many problems are sufficiently messy, either intrinsically or simply in the sense that sufficiently good data are not available, that a more regular scientist approach and the incorporation of knowledge from other domains and other experiences is needed².

Modeling Approaches

There are three distinct and complementary modeling approaches that can be employed when addressing Enterprise AI & Industrial AI problems (Fig. 2).

Figure 2. Three types of modeling approaches that should be used together in Enterprise & Industrial AI.

By far the most commonly deployed in practice may be termed empirical, heuristic rules, and insights. This generally consists of rules determined or otherwise intuited by experts and deployed in some operational system. Example rules would be “Replenish inventory when the level is below x% of year-ago monthly demand” or “Stop the process if thermocouple 20 exceeds 450°C.” Rules can also be compounded, such as “Issue an alert if temperature is above 500°C while pressure is rising faster than 0.5 MPa/minute.” This kind of modeling approach is generally used and favored by operators of equipment: systems embodying it are easy to deploy, operators are comfortable with them because they are easy to understand, and they serve to capture the knowledge of an operation’s experts over time. Constituent rules are rarely updated or optimized, however, and as new rules accumulate, the result is often an overwhelming number of alerts and a corresponding loss of utility in the alerting system.
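As a concrete illustration, a rules-based alerting layer often amounts to little more than a library of named conditions evaluated against the latest sensor snapshot. The Python sketch below is a minimal, hypothetical version built around the example thresholds above; the tag names and the engine structure are illustrative, not taken from any particular deployment.

```python
# A minimal sketch of a heuristic rules engine for operational alerting.
# Threshold values and tag names (e.g., "thermocouple_20_c") are illustrative.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # True -> raise an alert


RULES: List[Rule] = [
    Rule(
        name="stop_process_high_temp",
        condition=lambda s: s["thermocouple_20_c"] > 450.0,
    ),
    Rule(
        name="temp_and_pressure_ramp",
        condition=lambda s: s["temperature_c"] > 500.0
        and s["pressure_rate_mpa_per_min"] > 0.5,
    ),
]


def evaluate(snapshot: Dict[str, float]) -> List[str]:
    """Return the names of all rules that fire on the current sensor snapshot."""
    return [r.name for r in RULES if r.condition(snapshot)]


if __name__ == "__main__":
    snapshot = {
        "thermocouple_20_c": 462.0,
        "temperature_c": 512.0,
        "pressure_rate_mpa_per_min": 0.7,
    }
    print(evaluate(snapshot))  # ['stop_process_high_temp', 'temp_and_pressure_ramp']
```

The simplicity is exactly why this approach is so widely deployed, and also why rule libraries tend to grow unchecked over time.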

Theory-based models tend to be developed by the designers or manufacturers of equipment, or by deep technical process experts. In industrial cases these are often called engineering-based or physics-based models, but they can be employed in more general enterprise cases as well, if one can obtain relevant theory or models, for example from the marketing science or supply chain systems literature. When available, they can be very powerful because they require less data: much of the system knowledge is already encoded in the structure of the model itself. Coefficients and parameters in theory-based models can also be used to drive more powerful domain-based feature engineering. For example, rather than using raw sensor signals as features in a predictive maintenance application, one could use a particular nonlinear combination of them, taken from an engineering model, that better represents a cause of damage such as corrosion or excess stress.
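To make that feature-engineering idea concrete, the sketch below computes a hypothetical Arrhenius-style “accumulated exposure” feature from a temperature signal, standing in for the kind of nonlinear, theory-derived combination described above. The functional form, activation energy, and readings are all illustrative assumptions, not parameters of any real asset.

```python
# A minimal sketch of theory-driven feature engineering: instead of feeding raw
# temperature readings to a model, compute a damage-proxy feature from a simple
# engineering relationship. The Arrhenius-style form and all constants below
# are illustrative assumptions.

import numpy as np

R = 8.314  # universal gas constant, J/(mol*K)


def corrosion_rate_proxy(temp_c: np.ndarray, activation_energy_j_mol: float = 6.0e4) -> np.ndarray:
    """Arrhenius-type proxy: a rate term that grows sharply with temperature."""
    temp_k = temp_c + 273.15
    return np.exp(-activation_energy_j_mol / (R * temp_k))


def cumulative_damage_feature(temp_c: np.ndarray, dt_hours: float = 1.0) -> float:
    """Integrate the proxy over time to get a single 'accumulated exposure' feature."""
    return float(np.sum(corrosion_rate_proxy(temp_c)) * dt_hours)


# Example: hourly temperature readings over one day, with a 6-hour excursion
temps = np.array([420.0] * 18 + [505.0] * 6)
print(cumulative_damage_feature(temps))
```

A feature like this carries domain knowledge the raw signal alone does not, which is what lets downstream models learn from far fewer labeled events.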

Pure data-driven techniques are what many consider exclusively when they talk about data science or AI today. These include all the techniques falling under the umbrellas of machine learning, deep learning, advanced visualization, statistics, and optimization. Using data-driven techniques, models are in essence learned automatically from the data, with little explicit programming and few built-in assumptions.

If one had an infinite amount of perfect data, and limitless compute and storage capability, this would be the only class of techniques one would need. And in many cases they do perform quite well on their own, with little help from other approaches. But in cases frequently encountered in Enterprise and Industrial AI, where the data are often scarce, heterogeneous, sparse, and of low quality, combining data-driven techniques with other approaches can be decisive in achieving enough business KPI movement to justify deployment.
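For contrast with the hybrid approach discussed next, here is a minimal sketch of a purely data-driven setup: a standard classifier trained directly on raw sensor features to predict rare fault labels. The data are synthetic and the feature/label layout is a placeholder; the point is the shape of the workflow, not a recommended implementation.

```python
# A minimal sketch of a purely data-driven setup: predict an upcoming fault
# label directly from raw sensor readings. The synthetic data are placeholders;
# in practice X would hold historical sensor feeds and y the recorded
# fault/maintenance events (the "labels").

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 30))            # 30 raw sensor features per time window
y = (rng.random(2000) < 0.05).astype(int)  # rare fault events (~5%), illustrating class imbalance

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)

model = GradientBoostingClassifier().fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te), zero_division=0))
```

With scarce labels and high-dimensional raw signals, a pipeline like this frequently struggles, which motivates the combinations described below.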

Example: Combining Data-driven with Theory-based approaches

An example of a data-driven solution is shown in Figure 3(a), which pertains to predictive management of an RH Degasser, used to degas and help remove or control impurities in molten steel. A machine-learning-based module (ML) takes as input the operational settings and factors, the relevant sensor data, and actual measured or recorded performance and maintenance data (the latter being the “labels” or the outputs that one wants to be able to manage better). Based on the historical relationships between these three streams of data, the ML-based module can then be trained semi-automatically to issue alerts or warnings when there is enhanced risk of performance degradation or a machine fault that may require unplanned maintenance.

Frequently, the challenge with such solutions relates to what is often termed the “High Dimension, Low Sample Size” (HDLSS) problem: there are a large number of potential cause-related factors (all the sensor feeds, operating conditions, etc.), yet the events of interest (such as faults and failures) are comparatively rare. Even when such events are frequent in aggregate, individual instances often have unique etiologies and thus, in effect, partly require their own training data. In addition, there is frequently a great deal of movement in the individual sensor feeds, as expected given the physics and engineering of the machinery. In using a purely data-driven approach, however, data scientists are essentially asking the ML system to re-create ~300 years of physics and engineering from these sensor data, and then to also identify when and how things are working differently than they should be. As might be expected, such solutions are frequently ineffective: they either are not deployed, or they issue so many alerts that they are of little value.

If the data scientist has access to engineering and/or physics-based models of the machinery in question (the RH Degasser, in this case), a solution such as that shown in Figure 3(b) can be constructed. In this case, the operational settings and factors are input to these pre-existing models, which are used to predict what the sensor values should be. One can then take the difference between these predicted sensor values and the actual sensor values and feed these so-called residuals into a data-driven ML system, along with the actual performance or maintenance data. The residual sensor values constitute a much cleaner set of signals, in that they represent the deviation from ideal machinery behavior rather than the raw behavior itself. The predictions, prescriptions, and optimizations produced by such a system typically far outperform those from a purely data-driven system.
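A minimal sketch of this residual pattern follows. The physics_model function below is a stand-in for a real engineering model of the equipment, and the data are synthetic; the structure, however, mirrors Figure 3(b): predict expected sensor values from operating settings, subtract them from the actual readings, and train the ML model on the residuals.

```python
# A minimal sketch of the residual approach: a stand-in theory-based model
# predicts expected sensor values from operating settings, and the ML model is
# trained on residuals (actual - predicted) rather than raw sensor signals.

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def physics_model(settings: np.ndarray) -> np.ndarray:
    """Placeholder engineering model: expected sensor readings given settings."""
    W = np.array([[0.8, 0.1], [0.2, 1.3], [0.5, 0.5]])  # illustrative coefficients
    return settings @ W.T


rng = np.random.default_rng(1)
settings = rng.normal(size=(1000, 2))      # operating settings and factors
expected = physics_model(settings)         # what the sensors *should* read
degraded = rng.random(1000) < 0.1          # hidden degradation state
actual = expected + rng.normal(scale=0.05, size=expected.shape)
actual[degraded] += 0.5                    # degradation shifts behavior away from ideal

residuals = actual - expected              # deviation from ideal machinery behavior
labels = degraded.astype(int)              # recorded performance/maintenance events

clf = RandomForestClassifier(random_state=0).fit(residuals, labels)
print("training accuracy on residual features:", clf.score(residuals, labels))
```

Because the residuals already factor out the behavior the physics explains, the ML model only has to learn the deviations, a far easier problem given scarce labels.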

In cases where the engineering and/or physics-based models are not sufficiently complete to predict the sensor values, and thereby produce the signal-rich residuals, we can still use them to create highly-engineered features for traditional data-driven ML models. These engineered features, rather than representing only trivial combinations of raw sensor features, can capture actual drivers of degradation or performance in the machinery.

Figure 3. Example using Theory-based Knowledge to supplement a pure Data-Driven Approach.

High-level Process View of the Approach

A systematic approach to doing this is summarized in Figure 4 below, and explained in more detail in the subsequent text.

Figure 4. Systematic approach to successfully creating Enterprise & Industrial AI Solutions.

Business Process & KPI Modeling

It is important to begin with an understanding of the business processes relevant to the problem area and the important KPIs that the business cares about. These KPIs may be financial (e.g., profit per hour on the production line, unit revenue) or operational (e.g., system uptime, yield loss, safety events) in nature, but there should be a clear and quantitative understanding of how and why these KPIs are important to the business. Ultimately, the solution’s value will be judged by how well it moves these KPIs, not by technical model metrics.

The business process should be understood mechanistically and quantitatively, to the extent that one can readily relate improvements in process steps (e.g., time reduction, yield improvement, variability reduction) to overall KPI impacts. Analysis using this business process model should uncover the key constraints or bottlenecks where an AI-based solution would have the greatest impact.
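As a simple illustration of this kind of KPI arithmetic, the sketch below sizes a hypothetical downtime-reduction solution in business terms before any modeling is attempted. Every number is an assumption chosen for illustration, not a benchmark; the point is that a candidate AI solution should be expressed in KPI terms up front.

```python
# A minimal sketch of relating a process-step improvement to a business KPI.
# All values are illustrative assumptions.

hours_per_year = 8000              # scheduled production hours
profit_per_hour = 12_000.0         # contribution margin while the line runs ($/hr)
unplanned_downtime_rate = 0.06     # fraction of scheduled hours lost today
assumed_downtime_reduction = 0.25  # hypothesized relative reduction from the solution

downtime_hours = hours_per_year * unplanned_downtime_rate
recovered_hours = downtime_hours * assumed_downtime_reduction
kpi_impact = recovered_hours * profit_per_hour

print(f"Baseline downtime: {downtime_hours:.0f} hrs/yr")
print(f"Estimated KPI impact: ${kpi_impact:,.0f}/yr from a {assumed_downtime_reduction:.0%} reduction")
```

If the resulting KPI movement is too small to justify the effort, that is worth knowing before problem framing begins.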

Problem Framing

With the business process at least partially understood, one should attempt to frame or define which part of the process should be targeted, and what specific kinds of recommendations and/or predictions should be produced, in order to move a specific KPI. The available data should be assessed at the same time and judged for the likelihood that they will support a model sufficient to move the KPI by an amount that justifies the solution³.

In a supply chain problem, for example, one might target causal predictions and corresponding recommendations to prevent out-of-stock events. This means we would need to confirm at the very least that relatively complete data are available covering stock levels and various other potentially causal factors. We would also need to baseline the current lost sales due to out-of-stock events, and confirm that a realistic reduction in such events has a sufficient economic/KPI impact.

Looking at the example of an industrial maintenance management problem, a highly desirable problem framing may be to prevent unplanned downtime by predicting failures within certain critical machinery sufficiently far in advance to schedule and perform preventative maintenance. But if it turns out that insufficient data are available, we might decide to re-frame the problem as predicting aggregate demand for specific machine components and maintenance skills, and optimally managing labor availability and replacement parts inventory to minimize the length of unplanned downtime (rather than its actual occurrence).

A Simple Taxonomy of Framings

Most problem framings ultimately take one of two forms: estimate or predict an unknown measure, or recommend or otherwise determine an optimized/preferred action. A simple breakdown with some examples is provided below:

Estimate some unknown measure. This measure often relates to the future, in which case it’s considered a forecast or a prediction. But it can also relate to the outcome of a suppositional case. Some examples:

  1. Forecast:
  • What is the week-by-week unit sales demand for this specific tire SKU over the next 3 months?
  • How many people need to fly JFK-SFO on March 12?

  2. Time-to-event (or event risk):
  • What is the probability that this gas turbine will fail in the next 2 weeks?
  • How likely is this business client to default on this loan in the next 2 years?

  3. Conditional prediction:
  • If I offer a 5% discount, how many more Nissan Sentras will I sell next month?
  • If I uniformly raise airline seat prices between DFW and SFO by 10%, what will be the effect on bookings and on revenue for that airport pair?

  4. Conditional estimation⁴:
  • What would be the hardness of a 0.21%C steel if annealed at 908˚C for 12.4 hrs?

Optimal Actions, Settings, or Recommendations (executed/prescribed)

  1. Discrete:
  • Offer promotion A to customer X today
  • Dispatch truck 21 to location B16–2 now
  • Schedule inspection on engine G1473 this Thursday

  2. Continuous:
  • Set price for this ticket to $378
  • Order 850 units of inventory to DC#21 by tomorrow
  • Set process T=1485˚C & ramp down to 785˚C @ 25˚C/hr

Successful problem framing is currently part art, part science, and is often the highest leverage point in producing a high-value solution for an Emerging AI Problem.

Problem Setup

Problem setup is the process of constructing a particular data and modeling workflow that generates an output validated in the problem framing as being positively impactful to the business problem in question. Problem framing and setup are both typically adjusted in an iterative manner until a workable solution that moves the chosen business KPI sufficiently is obtained.

Problem setup is typically done by highly-experienced data scientists, preferably supported by domain experts. They generally avoid considering specific algorithms and focus on using functional capabilities at a higher level of abstraction, e.g., clustering, classification, regression, pattern ID, rule inference, decomposition, statistics, optimization, simulation.

A problem setup is preferably represented as a workflow, or more technically as a “Directed Acyclic Graph” (DAG), which shows the flow of data, transformations, and other operations.
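For illustration, such a DAG can be captured as plainly as a mapping from each abstract step to its prerequisites. The sketch below uses Python’s standard-library graphlib; the node names are hypothetical and echo the degasser example above, and each node would later be bound to one or more concrete algorithms.

```python
# A minimal sketch of a problem setup expressed as a DAG of abstract functional
# steps (no specific algorithms yet). Node names are illustrative.

from graphlib import TopologicalSorter  # standard library, Python 3.9+

problem_setup_dag = {
    "ingest_sensor_data": set(),
    "ingest_operating_settings": set(),
    "theory_based_expected_values": {"ingest_operating_settings"},
    "compute_residuals": {"ingest_sensor_data", "theory_based_expected_values"},
    "anomaly_classification": {"compute_residuals"},
    "maintenance_recommendation": {"anomaly_classification"},
}

# One valid execution order for the workflow:
print(list(TopologicalSorter(problem_setup_dag).static_order()))
```

Representing the setup this abstractly keeps the discussion at the level of functional capabilities, which is exactly where experienced data scientists want it during setup.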

Algorithmic Implementation

It turns out that, given the available data, most problem framings and/or setups cannot produce output that meaningfully improves the business KPIs of interest. So the data scientist has to iterate through new framings and setups until a suitable solution is discovered. This can require considerable time and effort.

To evaluate a particular problem framing/setup, we need to develop an algorithmic implementation. Based on particulars of the data, and precisely what we are attempting to achieve with a particular element in the DAG, the data scientist would select one or more basic algorithms to employ for each element⁵. Each element must then be coded using one or more specific instantiations of the given algorithm (which could be open source, custom-coded, or from a proprietary vendor).⁶
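One lightweight way to carry out this selection for a single DAG element is to benchmark a handful of candidate algorithms under cross-validation, as in the hypothetical sketch below (synthetic data, scikit-learn implementations). Auto-ML tooling can automate essentially this loop at larger scale.

```python
# A minimal sketch of algorithmic implementation for one DAG element: try a few
# candidate algorithms for a classification node and compare cross-validated
# scores. The synthetic data and candidate list are illustrative.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean ROC-AUC = {scores.mean():.3f}")
```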

Evaluation & Iteration

Once an algorithmic implementation is complete, we must select and employ an appropriate technical evaluation approach to ensure the technical suitability of the model (in terms of factors such as error characterization, overfitting risk, drift potential, etc.). But the ultimate arbiter of success is whether and to what extent the model can move the business KPI of interest. One generally must keep iterating back through Framing, Setup, Implementation and Evaluation until a technically valid and business impactful solution workflow is obtained.
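To make the distinction between technical and business evaluation concrete, the sketch below converts a model’s confusion-matrix outcomes into an estimated net KPI impact. The per-event values are illustrative assumptions; in practice they would come from the business process and KPI model described earlier.

```python
# A minimal sketch of evaluating a candidate model in business-KPI terms rather
# than technical metrics alone. All costs/benefits are illustrative assumptions.

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]   # actual failure events
y_pred = [0, 1, 1, 0, 0, 0, 0, 1, 0, 0]   # model's alerts

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

avoided_downtime_value = 50_000.0   # value of catching a failure early ($/event)
false_alert_cost = 4_000.0          # wasted inspection per false alarm ($/event)
missed_failure_cost = 80_000.0      # unplanned downtime when a failure is missed ($/event)

net_kpi_impact = tp * avoided_downtime_value - fp * false_alert_cost - fn * missed_failure_cost
print(f"Estimated net KPI impact over this window: ${net_kpi_impact:,.0f}")
```

A model can look respectable on technical metrics and still be net negative in KPI terms, which is exactly why iteration continues until both tests are passed.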

Scaling

The approaches and processes described above constitute the key elements for identifying, framing, and developing AI-based solutions for Industrial & Enterprise firms. But once a solution addressing a given part of a business or industrial process is built, it is desirable to improve and scale it, and to put it on a path of continuous improvement. The ultimate aim is to reach a state where new implementations are done with a substantial amount of “asset reuse,” by which we mean the reuse of tools or granular high-value components to accelerate and/or remove risk from the delivery of a solution or application. This kind of asset reuse is common in general software development, where the associated assets and methodologies can be considered quite mature. Such reuse is generally quite immature in the context of AI-enabled applications, however, creating enormous value potential for organizations that can systematically and consistently bring substantial levels of reuse to such applications. A process for doing this may be generalized as follows (see Figure 5):

Figure 5. Pathway to scaling Emerging AI problems to become more like Mature AI Problems.

Once a solid problem framing has been achieved, and one or more well-performing problem setups have been constructed, it becomes possible to standardize how this is done across similar operations within the business, or in other related businesses, and to define a sequenced or iterating set of abstract modules that compose the solution. This in turn enables the assignment of skills and other resources specialized to each module, so deeper domain knowledge can be more readily acquired and incorporated, and skill particular to each module begins to accumulate.

This also permits us to begin moving away from basic algorithmic implementation, and to focus more on algorithmic enhancement, innovation, and optimization. At this point, we would be well on our way toward moving from what would originally be termed an “Emerging AI Problem” toward a more “Mature AI Problem.” This brings along with it the consequent benefits in performance enhancement, scalability, and continuous improvement. We can expect steady improvement in performance from adding more of the same kinds of data, tweaking and incrementally enhancing the algorithmic implementations, and riding the wave of enhanced computational power and architectures.

Over time, these can be hardened and more rigorously assetized, ultimately forming components of highly-scalable or productized deployments.

Summary

Enterprise and Industrial AI¹ are different from other kinds of AI. Failure to recognize this has resulted in tremendous resource misapplication and is largely to blame for the low success rates of related solution deployment (around 10–15% by many counts).

It’s important to understand that most Enterprise and Industrial AI problem formulations are not adequately solvable given the data that are actually available. This is why a solid understanding of the business processes being addressed, and of the corresponding important business KPIs, is essential. It permits the flexibility to adjust the problem formulation and problem setup until a combination is found which, given the data available, supports a solution with sufficient business value to justify full deployment. Ongoing business investment requires clarity that critical business KPIs will move sufficiently.

Increasing success rates frequently requires deep integration across typically disparate but complementary disciplines that are not accustomed to working together so intimately, each bringing its own distinct philosophy of, and approach to, problem solving. This includes integrating the newer data-driven modeling approaches with the more commercially established rule-based techniques and the powerful theory-based approaches. The power of pure data-driven approaches can be dramatically accelerated and increased by integration with prior human-generated knowledge expressed in rules and in systems-based or physics-based theory.

We expect that “Enterprise and Industrial AI” will ultimately be seen as a new field bridging mathematical sciences, business process, domain science, user experience, design technology, and infrastructure engineering. This field will begin to have its own conferences, journals, textbooks, and university departments and specializations. Its impact on the global economy, global resource management, and on “how the world works,” will be fundamental.

  1. See Noodle.ai White Paper, “What’s Different about Enterprise & Industrial AI, and Why Does it Matter?”

  2. This can be seen as a form of “Transfer Learning,” where knowledge obtained from related or constituent prior experience is employed to compensate for insufficient data.

  3. At this stage, the data understanding will often be thin, and one can only make a rough assessment of its suitability for a particular framing. Thus nearly every process discussed in this paper is highly iterative.

  4. The distinction between prediction and estimation can be blurry, and it relates philosophically to how fundamentally different a “time” feature is. Predictions or forecasts are often presumed to be intrinsically non-deterministic, since they pertain to a time which does not exist at the point the forecast or prediction is made. More practically, estimation is fundamentally time independent: it is based on physics and/or logic, and is thus deterministic (setting aside the philosophical complications of quantum theory, chaos theory, etc.).

  5. Example high-level algorithms are K-means, SBM, OLSR, LR, LASSO, SVM, GBM, RF, kNN, GMM, Apriori, EM, NN, CNN, RNN, RBM, AE, Word2Vec, Sent2Vec, Doc2Vec, LSTM, KF, NB, HMM, RL, GAN, FT, PCA.

  6. It should be noted that advances in “Auto-ML,” where a diverse set of algorithms are automatically tested and evaluated, can often speed up this process considerably.
