ML-powered Sales Pipeline Assessment

Verena Eitle and Tassilo Klein (ML Research Berlin)

Mar 29, 2018

The emergence of machine learning has led to a new level of automation across the majority of business lines such as finance, supply chain and sales. The latter, especially, has attracted great attention in academia and industry, where the technology is used to enhance sales processes and to assist professionals in making data-driven decisions.

Most companies have a formal sales process in place, including clearly specified milestones that are commonly understood by their salesforce. In particular, the effectiveness of managing the sales pipeline has a major impact on revenue growth, as shown in a Harvard Business Review study.

But what does the elusive sales pipeline really mean? In short, it is a visual representation of a company’s sales prospects, structured into different phases along the sales process, from initial contact to closing a sales deal. If customers express initial interest in buying a product, they are defined as leads, for whom the company has only limited information available. After gathering more insights and targeting them with marketing initiatives such as campaigns, the company decides whether these leads can be converted into opportunities. Once classified as such, the company’s salesforce uses sales activities such as demos to turn opportunities into actual customers.

However, although most companies have a systematic and digitized sales pipeline in place, they currently convert only around 10% of prospects into valuable customers. Given this low conversion rate, there is considerable untapped potential to improve the sales pipeline. In this context, the application of machine learning could offer new opportunities to positively influence the conversion rate by enabling a data-driven approach instead of an arbitrary one.

Machine learning on the sales pipeline of SAP

In the fast-paced software industry, high conversion rates, or in other words closing as many subscription and perpetual license deals as possible, are crucial to remaining competitive. However, since the qualification of leads and opportunities depends primarily on the know-how of our marketing and sales specialists, sales pipeline management faces a high degree of subjectivity.

Despite regional differences in lead and opportunity management, we maintain and digitize all information related to the sales pipeline in our Customer Relationship Management (CRM) system, which gives us unique access to a large data pool essential for developing our model. The global dataset can be divided into the areas of customer, campaign, lead/opportunity, sales and product, comprising 20–30 features in total. To provide a better understanding, consider the example of campaigns, for which there are three features: campaign type, campaign subtype and campaign period. In general, prospects are primarily addressed through market-specific campaigns, which are distinguished by their types (e.g. online events, paid media and corporate sponsorship). In addition to the chosen campaign period, the success of closing a sales deal also depends on the selection of the campaign subtype (e.g. blogs, webinars and emails).
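To illustrate how such categorical campaign features can be prepared for a classifier, here is a minimal sketch; the column names and values are hypothetical stand-ins, not our actual CRM schema:

```
import pandas as pd

# Illustrative only: column names and values are hypothetical stand-ins
# for the campaign-related features described above.
df = pd.DataFrame({
    "campaign_type":    ["online_event", "paid_media", "corporate_sponsorship"],
    "campaign_subtype": ["webinar", "blog", "email"],
    "campaign_period":  ["2017-Q3", "2017-Q4", "2018-Q1"],
})

# One-hot encode the categorical campaign features so that a standard
# classifier can consume them alongside the other feature groups.
encoded = pd.get_dummies(df, columns=["campaign_type", "campaign_subtype", "campaign_period"])
print(encoded.head())
```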

Limitations of classical approaches

To make sales pipeline qualification more data-driven, supervised classifiers can be used to assist our salesmen in identifying promising leads and opportunities. Initial results in predicting the likelihood of converting leads into opportunities, or even winning a sales deal, using classical machine learning approaches, however, yield a moderate accuracy of only 78% with high variance. This modest predictive performance is likely caused by the presence of biased data due to differing regional sales pipeline procedures, as well as a high degree of subjectivity, which results from different professional backgrounds and skills and from the tendency to under- or over-evaluate certain prospects in order to achieve sales targets.
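As a rough illustration of such a classical baseline, the sketch below trains a gradient boosting classifier with cross-validation; the data is randomly generated and all names are placeholders, so the numbers it prints are not our actual results:

```
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical stand-ins: X holds the encoded pipeline features,
# y the binary conversion outcome (1 = lead became an opportunity / deal won).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 25))          # ~20-30 features, as in our dataset
y = rng.integers(0, 2, size=1000)        # placeholder labels

clf = GradientBoostingClassifier(random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")

# The spread of the fold scores is a simple proxy for the "high variance"
# observed with classical approaches.
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```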

In our license-driven business, however, greater accuracy must be achieved, as data-driven decisions made in the sales pipeline significantly affect the success of the company. Therefore, in order to improve the performance of the supervised algorithms, we have decided to conduct and incorporate research on non-standard ML approaches for treating the subjectivity and noisiness in our dataset. In this regard, we aim to explore promising approaches such as learning with noisy labels as well as counterfactual inference.

Approaches and Trends

Noisy and subjective labels

The problem of noisy and/or subjective labels is generally commonplace in domains where inexpensive, manual approaches are used to collect labelled data. The presence of noisy labels, however, adversely influences the classification performance of the induced classifiers. Since the qualification process for leads and opportunities within the CRM system involves a high degree of human interaction, our model might be prone to label noise as well. Therefore, we are currently evaluating whether label noise techniques can be applied in our machine learning application.

Generally speaking, three types of noise are distinguished, where X represents the features, Y the true class, Ỹ the observed label, and E the binary variable indicating whether a labeling error has occurred.

The first noise type, label noise completely at random (a), occurs independently of the true label and the feature values. While label noise at random (b) depends only on the true class, the third type, label noise not at random (c), assumes that the probability of incorrect labeling also depends on the feature values. With respect to our machine learning application, the first noise type (a) may occur in our dataset when leads are randomly labeled as discontinued by a salesperson even though they show great interest in purchasing a solution. The focus of our research will lie on the third noise type (c), as it represents an even greater challenge for our prediction task. This type appears when, for example, a salesman systematically discontinues leads with a particular feature value, such as always rejecting a certain industry based on his or her personal opinion.
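The following sketch simulates the three noise types on synthetic data to make the distinction concrete; the feature interpretation (feature 0 as a stand-in for “industry”) and all noise rates are purely illustrative assumptions:

```
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))              # features (e.g. industry, campaign type, ...)
y = rng.integers(0, 2, size=n)           # true labels: 1 = converted, 0 = discontinued

def flip(labels, flip_mask):
    noisy = labels.copy()
    noisy[flip_mask] = 1 - noisy[flip_mask]
    return noisy

# (a) Label noise completely at random: P(E=1) is constant.
y_ncar = flip(y, rng.random(n) < 0.10)

# (b) Label noise at random: P(E=1 | Y) depends on the true class only,
#     e.g. converted leads are mislabeled more often than discontinued ones.
p_nar = np.where(y == 1, 0.20, 0.05)
y_nar = flip(y, rng.random(n) < p_nar)

# (c) Label noise not at random: P(E=1 | X, Y) also depends on the features,
#     e.g. a salesperson systematically discontinuing leads from one industry.
p_nnar = np.where(X[:, 0] > 1.0, 0.40, 0.05)
y_nnar = flip(y, rng.random(n) < p_nnar)
```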

A method that might be suitable for our machine learning use case is the label noise model presented by Sukhbaatar and Fergus. Instead of eliminating incorrect labels in the preprocessing stage, the authors propose modifying a deep learning model so that it can be trained directly on noisy data. The idea is to adjust the model’s output label probabilities to better match the noisy labels by adding an additional layer whose weights correspond to the noise distribution Q. Since the noise distribution is usually unknown, the study also presents methods for estimating these probabilities from either purely noisy data or additional clean data.
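A minimal sketch of this idea, assuming a simple PyTorch base model, could look as follows; the architecture, initialization and parametrization of Q are illustrative choices, not the exact setup from the paper:

```
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseAdaptationLayer(nn.Module):
    """Sketch of the extra layer idea from Sukhbaatar & Fergus: the base model's
    class probabilities are multiplied by a learned noise matrix Q so that the
    training objective matches the *noisy* labels. At prediction time the layer
    is dropped and the clean probabilities are used directly."""

    def __init__(self, num_classes):
        super().__init__()
        # Parametrize Q row-wise via softmax so each row is a distribution
        # P(noisy label | true label), initialized close to the identity.
        self.q_logits = nn.Parameter(torch.eye(num_classes) * 5.0)

    def forward(self, clean_probs):
        q = F.softmax(self.q_logits, dim=1)      # rows sum to 1
        return clean_probs @ q                   # P(noisy) = P(clean) * Q

num_classes = 2
base_model = nn.Sequential(nn.Linear(25, 64), nn.ReLU(), nn.Linear(64, num_classes))
noise_layer = NoiseAdaptationLayer(num_classes)

x = torch.randn(8, 25)                           # hypothetical feature batch
noisy_targets = torch.randint(0, num_classes, (8,))

clean_probs = F.softmax(base_model(x), dim=1)
noisy_probs = noise_layer(clean_probs)
loss = F.nll_loss(torch.log(noisy_probs + 1e-8), noisy_targets)
loss.backward()                                  # trains base model and Q jointly
```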

Counterfactual inference

Generally, standard machine learning techniques are not designed to address causal inference questions. In particular, supervised machine learning methods are built for predictive tasks rather than for estimating conditional differences or causal effects. A key reason for this is that we never observe the counterfactuals, meaning that, viewed as a supervised learning problem, a fraction of the labels would always be missing.

Causal inference can be leveraged as another method of analyzing the factors that influence the sales pipeline, more precisely by looking at the distinction between correlation and causation. The underlying idea is to understand the behavior of complex systems interacting with their environment in order to better predict the consequences of system changes. Specifically, counterfactual analysis is an interesting approach when dealing with questions like “how would a system have performed during the time of data collection, if a certain action had been carried out?”. However, assessing the consequence of an intervention from statistical data is generally challenging, as confounding variables may be present. These uncontrolled variables influence the outcome and obscure the effect of an intervention. Consequently, it is often difficult or impossible to determine whether the observed effect is a simple consequence of the intervention or has other uncontrolled causes.

In our sales pipeline analysis, one such counterfactual question arises in the context of marketing campaigns. To take the example from above, marketing campaigns in the form of online events such as webinars are selected according to certain criteria of the potential target group. A confounder could be the personal network of the salesperson who decides to address the potential customer with this campaign, or it might be related to monetary incentives not reflected in the data. In addition to the challenge that these criteria may not be accurate, the high costs make it impossible to conduct marketing campaigns as a randomized controlled trial. This naturally boils down to framing the analysis of the marketing campaign selection in a counterfactual context: “How will the customer react when the marketing campaign assignment model M is replaced by M’?” Given sufficient time and resources, we might be able to obtain the answer within the frame of a controlled experiment.

However, due to our complex sales pipeline processes across the globe, it is impossible to set up and carry out a new experiment. Therefore, we would like to obtain an answer using sales data that we’ve collected in the past. In particular, we are asking the hypothetical question: “How would the system have performed if, after the data was collected, we had replaced model M by model M’?” The answer to this counterfactual question is of course a counterfactual statement and therefore describes the system’s performance under a condition that has not occurred, so the question cannot be answered directly.
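One standard way to approach this offline is inverse propensity scoring, which reweights logged outcomes by how likely M’ would have been to take the logged action relative to M. The sketch below is purely illustrative, with synthetic data and a hypothetical alternative model M’; it is not our production estimator:

```
import numpy as np

# Hedged sketch of inverse propensity scoring (IPS): estimating how an
# alternative campaign-assignment model M' would have performed, using only
# data logged under the current model M. All arrays are hypothetical.
rng = np.random.default_rng(0)
n = 5000

actions = rng.integers(0, 3, size=n)        # campaign chosen by M (0, 1, 2)
rewards = rng.binomial(1, 0.1, size=n)      # 1 = deal won under the logged campaign
prop_m = np.full(n, 1.0 / 3.0)              # P(action | x) under M: propensities must be logged

def prob_under_m_prime(action):
    """Probability that the new model M' would pick the logged action.
    Here M' strongly prefers campaign 2 (purely illustrative)."""
    return np.where(action == 2, 0.8, 0.1)

# Importance-weight the logged rewards: outcomes of actions that M' would
# choose more often than M count more, and vice versa.
weights = prob_under_m_prime(actions) / prop_m
v_m = rewards.mean()                         # observed performance of M
v_m_prime = (weights * rewards).mean()       # counterfactual estimate for M'

print(f"logged value of M: {v_m:.3f}, IPS estimate for M': {v_m_prime:.3f}")
```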

All in all, in order to reach our primary goal of enhancing the performance of our prediction model, we will focus on the usability of non-standard ML approaches. We’ve found that the high degree of subjectivity and noisiness reflected in our dataset, mainly caused by the personal judgements and diverging expertise of salesmen along the sales pipeline, negatively affects the model’s accuracy. To tackle this issue, we’ll examine the impact of approaches like learning with noisy labels and counterfactual inference on the machine learning application of the sales pipeline in the future.
