How To Deploy Deep Learning In Enterprise (Pt. 1)
Update: Find Part II of our reflection on deep learning deployment here.
While most business leaders have started researching how to incorporate deep learning into the enterprise, knowing where to begin remains a challenge. Our co-founder Stephen Piron recently visited Silicon Valley to attend NVIDIA’s GPU Technology Conference (GTC), where he gave a talk on introducing deep learning into the enterprise, illustrated by DeepLearni.ng’s recent deployment of the first deep learning system for retail banking.
This post will revisit some of the talk’s key takeaways on the challenges associated with incorporating deep learning in enterprise.
1. Data Ready for Deep Learning is the Exception, Not the Rule
To successfully train deep learning models, vast amounts of data are needed. The good news: enterprises have a lot of data to work with, fuelled in particular by the rise of Big Data in recent years. In fact, one of the central promises for enterprises working with deep learning is that it offers a way of finally securing ROI on investments in this space. Because deep learning thrives on complex and diverse datasets, it elevates not only the ability to recognize patterns from data but also to generate value from data. Engineered to recognize non-linear patterns in data, deep learning opens up a lot of new questions to ask, leading to enhanced decision making and better operations in business.
The bad news? A great deal of enterprise data is either unstructured or improperly indexed. Because of this, extensive preparation is often needed before assessment of the data’s suitability for deep learning applications is viable. Finding the right data is also intrinsically tied with the need to locate the right questions to ask about your business. When first introducing deep learning to the enterprise, it’s critical to define a small but impactful use case that is also a good match for the way deep learning currently operates. After finding existing enterprise data that translates well to deep learning, it becomes substantially easier to tailor existing and incoming data assets to optimize deep learning’s organizational impact.
2. Deep Learning Impacts Existing Processes, Which Equals Resistance
Enterprise infrastructure, and banking infrastructure in particular, is rivalled in complexity only by the largest governments. On top of that, banks also have some of the most rigorous requirements for security and compliance. With these factors in mind, introducing new technologies into the enterprise necessitates caution and detailed planning to ensure the transformation is as seamless as possible. New technology also impacts existing processes, and the disruption that results from the next big thing is not usually accepted silently by the various business verticals. Skepticism about deep learning’s value can be accounted for by looking to the past. As most readers are probably aware, AI has gone through several ‘winters’ following periods of great excitement since the mid-twentieth century. For people who lived through these previous cooling-off periods, there is understandable concern that this time will be no different. But with recent and rapid advancements, most experts working in the field are confident that the technology is here to stay. At the same time, there’s a large amount of hype and misinformation about what AI and deep learning can and cannot do. Sorting out what’s real from what’s bullsh*t is time-consuming, and a task that doesn’t realistically fit the busy agendas of most business leaders.
Working with the bank, we definitely encountered resistance across verticals and between different employee groups. Here are some ways we navigated and ultimately pivoted from a point of struggle to one of widespread enthusiasm upon the project’s completion.
A Deep Learning Education
To foster curiosity about deep learning, we led weekly training sessions with employees from different business verticals, all while dispelling common myths and mitigating common concerns. We also shared some resources we found helpful for understanding the technology, for both technical and non-technical users. Taking employees’ common knowledge of logistic regression as a comparative example, we worked carefully to dissect what was shared and what was different between our deep learning model and logistic regression. The sessions’ attendance grew rapidly, from 3 employees in the first week to over 200 within a couple of months.
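The comparison is easy to make concrete: a logistic regression is effectively a neural network with no hidden layers, so the structural difference is the hidden layers a deep model adds. A minimal sketch of that side-by-side, on synthetic data with scikit-learn (the dataset and model sizes here are invented for illustration, not the bank’s):

```python
# Sketch: logistic regression vs. a small neural network on the same
# synthetic classification task. The only structural difference is the
# hidden layers in the MLP.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; any real comparison would use enterprise data.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Logistic regression: a single linear layer plus a sigmoid.
log_reg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A small neural network: the same idea, with two hidden layers added.
mlp = MLPClassifier(
    hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0
).fit(X_train, y_train)

print(f"logistic regression accuracy: {log_reg.score(X_test, y_test):.3f}")
print(f"small neural network accuracy: {mlp.score(X_test, y_test):.3f}")
```

Framing it this way lets non-technical attendees anchor the new model to one they already trust.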
Thinking About Deep Learning Results vs. Mechanics
To optimize the bank’s ability to find the right use case, we developed a framework to talk about deep learning not in terms of different neural network architectures or even supervised, unsupervised or reinforcement paradigms. Instead we discussed possible use cases for the bank in terms of Deep Learning Design Patterns. Focusing more on the outcome of each pattern rather than technical ‘nuts and bolts’ accelerated employees’ ability to understand how deep learning could be applied to enterprise data.
Here are the three patterns we talked about the most:
- Clustering: This pattern is used to differentiate a dataset into groups, or clusters, based on shared similarities. Using deep learning to organize information into clusters makes it possible to discover new connections or categories within data. Gaining an enhanced understanding of customers is one of the strongest use cases for Clustering.
- Reinforcement: This pattern learns by interacting with a training environment. Similar to a video game, the pattern is primed with objectives to pursue in a particular set of data or data environment, as well as things to avoid. For enterprise, this pattern can enhance decision-making processes by identifying the patterns in data that best support a desired outcome.
- Prediction: The last pattern is a scoring algorithm that predicts outcomes based on a ‘snapshot’ of hundreds of features from a dataset. Most simply, this design analyzes patterns from the past in order to predict the future. This was the design pattern we ultimately used at the bank, working with credit card data to better predict late and missed payments.
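The Prediction pattern in particular reduces to a familiar shape: a snapshot of features in, a risk score out. A hedged sketch on synthetic data (the bank’s real credit card features are confidential, so the feature names, label rule, and model size below are all invented for illustration):

```python
# Sketch of the Prediction design pattern: score a 'snapshot' of
# features to predict a future outcome (here, a late payment).
# All data below is synthetic and invented for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 5000

# Hypothetical snapshot features, e.g. utilization ratio, months on
# book, recent payment count, average balance (all synthetic here).
X = rng.normal(size=(n, 4))

# Synthetic label rule: late payment becomes likely when the first
# feature (say, utilization) is high, plus some noise.
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
model.fit(X_train, y_train)

print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
```

A production version would differ in scale and rigor, but the pattern’s shape, snapshot in and score out, is exactly this.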
By sharing our machine learning expertise with the bank’s business experts, we were able to develop a targeted and measurable use case that provided immediate value upon execution. Having positive, demonstrable, real world results has reinforced existing organizational confidence about machine learning’s impact, with a better understanding of what’s possible. Employees skeptical about the technology’s impact have also shifted from a point of resistance to one of excitement after seeing what the technology can do.
3. Working with Existing Infrastructure and Finding Machine Learning Experts
Deep learning projects in industry face drastically different technological realities than academic ones. While academics can tailor and even generate data for a project and design stacks optimized for deep learning results, enterprises introducing deep learning must work with the data they have already collected, while also taking the organization’s existing stack into close consideration. Building on top of these legacy systems requires significant effort, but it is possible when machine learning experts collaborate with employees who have expert knowledge of the organization’s complex infrastructure.
Enterprises typically also feature sophisticated protocols for security and compliance, which are essential to organizational success. This was definitely the case at the bank. Because we were building a model on real customer transaction data, the bank had to bar us from any direct access to the dataset. Tasks typically done by machine learning practitioners, such as data wrangling, model selection, and benchmarking, had to be fully automated, with no human intervention.
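Hands-off model selection of this kind can be sketched: candidate models and hyperparameters are declared up front, and cross-validation picks the winner without anyone inspecting the data. A minimal sketch with scikit-learn, where synthetic data stands in for the restricted dataset (the candidate models and grids are assumptions for illustration):

```python
# Sketch: automated model selection with no human in the loop.
# Candidates are declared up front; GridSearchCV chooses among them
# by cross-validation. Synthetic data stands in for restricted data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)

# Preprocessing and model live in one pipeline, so the whole procedure
# runs end to end without manual steps.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# The search space swaps entire model families in and out of the
# pipeline's "model" step.
search_space = [
    {"model": [LogisticRegression(max_iter=1000)],
     "model__C": [0.1, 1.0, 10.0]},
    {"model": [MLPClassifier(max_iter=1000, random_state=0)],
     "model__hidden_layer_sizes": [(16,), (32, 16)]},
]

search = GridSearchCV(pipeline, search_space, cv=3).fit(X, y)
print("selected model:", search.best_params_["model"].__class__.__name__)
print(f"cross-validated score: {search.best_score_:.3f}")
```

In the restricted setting, a script like this runs inside the secure environment and only the selection results come back out.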
GPUs were a key component of overcoming the challenges posed by these restrictions. We knew the hardware would be critical for the secure cloud we would build to match the project’s requirements. To be sure, GPUs are not standard bank hardware. When we started our engagement, the bank had no official procurement policy to source them. To persuade the bank of their importance, we presented a diverse range of use cases illustrating GPUs’ huge impact on successful deep learning projects. Ultimately our persuasion was a success, and we built a secure cloud networked across dozens of distributed GPUs. As a result, the bank also has a framework to develop an official policy for GPU procurement, which will prove critical to future deep learning projects.
On top of these infrastructural challenges, securing machine learning talent is no easy feat. There are still relatively few brilliant engineers who are also machine learning experts.
With applications that create opportunities to enhance customer understanding, streamline processes, and make better business predictions, it’s an exciting time to introduce deep learning into the enterprise. The returns on investment that deep learning offers are truly revolutionary in their impact. For the first time, we’re approaching a technology that might appreciate rather than depreciate in value over time. Without historical precedent, this shift in how ROI works will produce a sea change across industries.
In Part 2, we’ll showcase the results of our work with the bank, and how quickly we were able to begin measuring the project’s return on investment. Revisiting the importance of providing a deep learning education, we’ll also explore how our work with the bank fuelled our understanding of a need for greater access to deep learning knowledge and resources for non-technical users.