An ode to the analytics grease monkeys (analytics deployment = ROI)
Analytics has value only when it is actionable
Analytics provide a significant business (monetary) impact for organizations when analytical assets are deployed within business processes — that is, when analytical assets are consumed by people or processes.
One such example of deploying analytics in business processes is for smart buildings and elevators, as explained by IBM’s Harriet Green and KONE’s Larry Wash at InterConnect. Other examples include a call center operator making a recommendation in real-time to a customer (a recommendation promoted by an analytical propensity model), or a machine accepting or declining millions of credit-card transaction every minute (acceptance calculated in micro-seconds by an analytical fraud-detection model).
Embedding those models inside the appropriate business processes is the way to deliver that value. Making sure that those models (that have supported concrete decisions) can be legally traced, monitored, and secured requires discipline and management capabilities.
Deploying analytical assets within operational processes in a repeatable, manageable, secure, and traceable manner requires more than a set of APIs and a cloud service — a model that has been scored (executed) has not necessarily been managed.
Developing & deploying analytical assets: complementary cycles
In the mid-90’s we went through a phase of wild and creative programmable development. Today, aside from a salutary democratization of analytics in general and predictive analytics in particular, and an unprecedented development of analytical talents, the analytical open-source movement is also producing a wide number of analytical assets that will eventually have to be consumed — but current open-source real-time deployment techniques provide “dissemination means” not “managed deployment means”.
To make a timely analogy for this Summer: you do not win the Tour de France by building bikes, you win it by riding those bikes within organized teams.
In the last decade where the “industrial” deployment of analytics has started to boost the competitiveness of early adopters, we have learned, often the hard way, that releasing a model to production is not enough; that “release moment” is followed by many steps that make the deployment cycle as important as the development cycle (see Figure 1).
Figure 1. Complementary and interlinked analytical development & deployment cycles
Capabilities for the deployment of analytical assets
The development cycle has been battle tested for more than two decades, and originates from the open methodology CRISP-DM (Cross-Industry Standard Process for Data Mining) which was refined last year by IBM, through an extended version called Analytics Solutions Unified Method for Data Mining/Predictive Analytics (also known as ASUM-DM). Many development efforts at IBM enable the development cycle through technologies including our Data Science Experience (DSX) and our ML-as-a-Service expansive upcoming machine learning workflow capability.
The deployment cycle, on the other end, has not been as formalized; but successful organizations have adopted various technologies to provide the foundation to manage analytical assets in a reliable, repeatable, scalable and secured way. That foundation should provide:
- A “centralized” way to store and secure analytical assets to make it easier for analysts to collaborate and allow them to re-use models or other assets as needed (this could be a secured community or a collaboration space)
- The possibility to house the repository in an existing database management system (regardless of its provisioning) and establish security credentials through integration with existing security providers, while accommodating models from any analytical tool
- Protocols to ensure adherence to all internal and external procedures and regulations, not just for compliance reasons but, as an increasing amount of data gets aggregated, to address potential privacy issues
- Automated versioning, fine-grained traceable model scoring and change management capabilities (including champion/challenging features) to closely monitor and audit analytical asset lifecycles
- Bridges to eventually link both the development and deployment cycles to external information life cycle management systems (to optimize a wider and contextual re-use of assets) and to enhanced collaboration capabilities (to share experiences and information assets beyond the four walls of the organization).
The development cycle has always been the glamorous part of the full analytical process, and developers, especially coders (now fully enabled through robust sets of development tools), have thrived on that first cycle. However, neglecting the deployment cycle has often prevented organizations to fully realize the business value of their analytical efforts while generating distrustful attitudes towards analytics within those organizations. In the deployment cycle, the “industrialization” of analytics is less glamorous, but it is where the rubber meets the road.
Beyond fostering collaboration, delivering more effective analytical results, controlling and deploying those results to people and systems to support decision-making in a broad range of operational areas, automating analytical processes for greater consistency and auditing the development, and refining and using models and other assets throughout their lifecycle, the formalization of the deployment cycle provides a bigger picture advantage: more consistent decisions for the organization.
Next analytic market leader
The current hype around the development cycle is overshadowing the upcoming need for organized analytics consumption — and that need, through the advent of external data and the tsunami of Internet-of-Things data and processes will be overwhelming.
IBM has the benefit of having foundational capabilities through its Collaboration & Deployment Services technology, but the next wave will require some significant updates around that foundation — including the extension (the bridges) toward information life cycle management capabilities and wider collaboration capabilities (beyond the organization’s analytics community). We need a stronger focus and a formalized approach (with its accompanying technology) for the “deployment cycle” — let’s bring together the conversations on the analytical deployment theme.
I predict that the next analytic market leader will dominate (and strongly financially benefit from) the managed deployment of analytical assets; who will take that hill first?
Originally published at datascience.ibm.com on August 23, 2016.