Predicting Payment Behavior in PAYGo: Machine Learning Can Power Customer Retention

Customer churn is a major headache for most companies and threatens to put the brakes on the red-hot growth of the pay-as-you-go (PAYGo) solar sector. The sector has sold over 1 million units in the last 5 years and installs over 50,000 units each month. The PAYGo model makes solar affordable for end-users and provides sufficient margin for providers to scale last-mile distribution. However, for the model to succeed, PAYGo operators must retain customers and build a loyal, engaged base. Our project with Zola Electric (formerly Off Grid Electric) demonstrates that machine learning can help them do so.

PAYGo operators make money from installments and/or fees as end-consumers pay off solar assets over 1 to 3 years. Given these time horizons, maximizing customer retention is critical to driving sustainable growth for PAYGo companies. But it’s not always clear why a PAYGo solar customer stops paying or how a PAYGo operator should intervene to improve repayment and drive retention. FIBR’s work with ZOLA, a leading PAYGo solar operator active in four African markets (Tanzania, Rwanda, Ghana and Ivory Coast), demonstrates that smart predictions and machine learning can reveal opportunities for improving retention and deliver a powerful return on investment.

In the PAYGo sector, an account is considered “churned” when either: a) a customer has temporarily stopped paying for a number of consecutive days, causing the product to lock and disable the energy service, or b) the customer is completely lost and has stopped using the product entirely. People may stop paying for different reasons: dissatisfaction with the service, income instability, payment frictions around using mobile money, poor understanding of the product terms and conditions, the allure of better product offerings from competitors, access to the national grid, or simply the realization that they can no longer afford the ongoing payments.
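To make definition (a) concrete, here is a minimal sketch of how an account could be flagged from its daily credit history. The 30-day cutoff, the function names and the `paid_by_day` representation (one flag per day, true when the account had active credit) are our own assumptions for illustration, not ZOLA’s actual rule.

```python
from typing import List


def longest_unpaid_streak(paid_by_day: List[bool]) -> int:
    """Return the longest run of consecutive days without active credit."""
    longest = current = 0
    for paid in paid_by_day:
        # Reset the counter on a paid day, otherwise extend the streak.
        current = 0 if paid else current + 1
        longest = max(longest, current)
    return longest


def is_churned(paid_by_day: List[bool], max_unpaid_days: int = 30) -> bool:
    """Flag an account once its unpaid streak reaches the cutoff.

    max_unpaid_days is a hypothetical threshold; each operator sets its own.
    """
    return longest_unpaid_streak(paid_by_day) >= max_unpaid_days


# Hypothetical history: 20 paid days, 35 unpaid days, then 5 paid days.
history = [True] * 20 + [False] * 35 + [True] * 5
print(longest_unpaid_streak(history), is_churned(history))
```

Definition (b), a customer who is lost entirely, would need product usage data in addition to payments, so it is left out of this sketch.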

Watch how FIBR and Zola Electric used data to better serve PAYGo customers

For a PAYGo operator like ZOLA, customer churn has a big impact on the bottom line because the operator has already invested in the sales and onboarding costs, and it forgoes lease revenue when payments stop. The onboarding costs include the physical asset and the cost of deploying a field agent to install and wire the system in a customer’s home, often in rural, difficult-to-reach areas. In the case of PAYGo water, the fixed investment includes connecting a consumer’s household to piping. Repossessing and redeploying the solar kit or the water connection from a churned customer can wipe out any potential profits from the original sale, and churn often leads to a net loss on the customer if the operator is unable to repossess and refurbish the equipment. Reselling the unit to a new customer generates another round of acquisition costs that also eat away at potential profit. On the other hand, a retained customer not only delivers positive margin to the PAYGo operator, but can often lead to referrals and cross- or up-selling opportunities.

Predictive modeling of repayment behavior is common practice in the financial services sector and holds significant potential for the PAYGo sector. Such models predict which customers are most likely to reduce usage or stop paying, so that operators can decide on interventions that shore up usage and keep those customers satisfied. Steps in the retention journey typically involve: a) analytics and research to explore why churn happened in the past, with the goal of identifying root causes; b) classifying customers into segments based on payments and other relevant behavior; c) predicting who might stop paying and when; and d) targeting proactive interventions to address repayment obstacles and reduce the likelihood of churn.

FIBR’s work with ZOLA focused on churn prediction, which is known as a “classification” problem. The team looked at user activity, observed who churned, then created a model that could separate those who remained from those who did not. With enough data, predictive models can identify the best indicators of a PAYGo consumer’s likelihood to continue to pay or to churn.

Predictive models are powered by Machine Learning (ML), which is the ability of an algorithm to learn from existing data to produce a prediction. In the case of churn prediction, the team shows an algorithm a set of mobile payments from ZOLA’s customers along with the right answer for each customer (in the past, did the customer churn: yes or no). This is known as training a model. Once the model has been trained on a subset of the full payments dataset, the team feeds it new data to see whether it makes the right prediction for customers whose data it has never seen. Once the model reliably predicts who has churned in the past, the PAYGo operator can use it to decide how to proactively engage with individual customers.
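The train-then-test loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the model built for ZOLA: the data is synthetic, the single feature `days_since_last_payment` is hypothetical, and the "model" is a one-feature threshold classifier (a decision stump) chosen only because it fits in a short example.

```python
import random
from typing import List, Tuple

# One row per customer: (days_since_last_payment, churned as 1 or 0).
Row = Tuple[float, int]


def make_synthetic_rows(n: int, seed: int = 0) -> List[Row]:
    """Generate made-up customers; churned ones tend to have longer gaps."""
    rng = random.Random(seed)
    rows: List[Row] = []
    for _ in range(n):
        churned = 1 if rng.random() < 0.3 else 0
        mean_gap = 25.0 if churned else 8.0
        rows.append((max(0.0, rng.gauss(mean_gap, 6.0)), churned))
    return rows


def accuracy(rows: List[Row], threshold: float) -> float:
    """Share of rows where 'gap > threshold' matches the known answer."""
    correct = sum(1 for gap, churned in rows if int(gap > threshold) == churned)
    return correct / len(rows)


def train_stump(rows: List[Row]) -> float:
    """'Training': pick the threshold that best separates the training rows."""
    candidates = sorted({gap for gap, _ in rows})
    return max(candidates, key=lambda t: accuracy(rows, t))


rows = make_synthetic_rows(500)
random.Random(1).shuffle(rows)

# Hold out 20% of customers that the model never sees during training.
split = int(0.8 * len(rows))
train_rows, test_rows = rows[:split], rows[split:]

threshold = train_stump(train_rows)
test_accuracy = accuracy(test_rows, threshold)
print(f"learned threshold: {threshold:.1f} days, held-out accuracy: {test_accuracy:.2f}")
```

The point of the held-out rows is the same as in the text: accuracy only means something when it is measured on customers the model has not already seen.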

Sorted precision of confidence in churn risk probability. This graph shows the distribution of how confident the model is that each customer will churn. For some customers the model is very confident, and those are the ones where you really want to act; for many others it is less confident. For an intervention that’s particularly expensive, you probably only want to intervene with those in the upper left.
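Picking out the customers in the upper left amounts to sorting by predicted churn probability and taking the top of the list. A minimal sketch, assuming the model outputs one probability per customer; the customer IDs, scores, `budget` and `min_probability` parameters are all made up for illustration.

```python
from typing import Dict, List


def select_for_intervention(
    churn_probability: Dict[str, float],
    budget: int,
    min_probability: float = 0.0,
) -> List[str]:
    """Return up to `budget` customer IDs, highest churn probability first."""
    ranked = sorted(churn_probability.items(), key=lambda item: item[1], reverse=True)
    eligible = [customer for customer, p in ranked if p >= min_probability]
    return eligible[:budget]


# Hypothetical model output: customer ID -> predicted churn probability.
scores = {"c1": 0.92, "c2": 0.15, "c3": 0.78, "c4": 0.40, "c5": 0.88}

# An expensive intervention (say, an agent visit) with room for two customers.
print(select_for_intervention(scores, budget=2))  # ['c1', 'c5']
```

A cheaper intervention could use a larger budget or a lower `min_probability`, reaching further down the sorted list.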

Machine Learning can rapidly analyze massive, complex datasets that would be time- and cost-prohibitive for humans to work through, for example to recognize payment patterns and build payment profiles. For ZOLA, this could save the business intelligence team significant time and energy. PAYGo operators are often flush with data from tens of thousands of customers each making multiple mobile payments each month, as well as customer care and call center tickets, and product usage and performance data. All of this can easily be digested by an ML model to identify which variables are most strongly correlated with churn. The churn prediction model can also be updated or refreshed regularly, whereas re-running regression analysis can be unwieldy and time-consuming for teams.
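As a rough illustration of "identify which variables are most strongly correlated with churn", the sketch below ranks a few candidate variables by the absolute value of their Pearson correlation with the churn label. This is a deliberately simpler stand-in for what an ML model does, and the variable names and values are invented for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(
    features: Dict[str, List[float]], churned: List[int]
) -> List[Tuple[str, float]]:
    """Sort variables by absolute correlation with churn, strongest first."""
    scores = {name: pearson(values, churned) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data for six customers (1 = churned, 0 = retained).
features = {
    "days_since_last_payment": [2, 30, 5, 28, 3, 35],
    "support_tickets": [0, 1, 2, 1, 0, 2],
    "system_size_watts": [50, 50, 50, 50, 50, 50],
}
churned = [0, 1, 0, 1, 0, 1]

for name, score in rank_by_correlation(features, churned):
    print(f"{name}: {score:+.2f}")
```

Correlation looks at one variable at a time and only captures linear relationships, which is one reason to prefer a model that can weigh many variables together.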

Machine learning can quickly digest hundreds of variables. In our work with ZOLA, we were able to send 150+ variables through a random forest model to identify which are most influential to churn. This could be done on a regular basis in a few hours with ML, which can be much faster and more efficient than having humans run regressions.

How could this improve upon business as usual for PAYGo operators?

There are many efficiency gains to be had from automating data processes that identify and tag customers with a non-payment or churn probability. An effective churn prediction model can deliver cost savings by allowing the institution to target interventions at customers who need them, and to minimize communications with customers who don’t. Most PAYGo operators currently deploy post-sale interventions (SMS messages, phone calls, agent visits) in the same way to all customers. For example, SMS payment reminders are sent out days before a customer runs out of prepaid credit, the day they run out of credit, and the following days when the product is locked. The same message content is sent to all customers, regardless of their payment behavior, product level, and days remaining on the lease. A similar pattern is typically followed with PAYGo agent visits to non-paying customers: an agent is sent to visit any customer who has been more than x days without credit, regardless of past payment history or any analysis of actual churn risk. By treating everyone the same in these examples, PAYGo operators cannot target the right intervention to the right customer, and may even be spending money on interventions with customers who would pay on their own without a call, SMS or agent visit.
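The difference between treating everyone the same and targeting by risk can be sketched as follows. Every number here is an assumption made up for illustration: the per-intervention costs, the probability cutoffs, and the idea that the most expensive intervention goes to the highest-risk tier. The sketch compares spend only and says nothing about which intervention actually works best.

```python
from typing import Dict

# Hypothetical cost per intervention, in arbitrary currency units.
INTERVENTION_COST = {"none": 0.0, "sms": 0.05, "call": 1.0, "agent_visit": 10.0}


def choose_intervention(churn_probability: float) -> str:
    """Map a churn probability to an intervention tier (cutoffs are assumed)."""
    if churn_probability >= 0.8:
        return "agent_visit"
    if churn_probability >= 0.5:
        return "call"
    if churn_probability >= 0.2:
        return "sms"
    return "none"


def total_cost(plan: Dict[str, str]) -> float:
    """Sum the cost of every planned intervention."""
    return sum(INTERVENTION_COST[action] for action in plan.values())


# Hypothetical model output: customer ID -> predicted churn probability.
scores = {"c1": 0.92, "c2": 0.15, "c3": 0.78, "c4": 0.40, "c5": 0.05}

targeted = {customer: choose_intervention(p) for customer, p in scores.items()}
uniform = {customer: "agent_visit" for customer in scores}

print(targeted)
print(f"targeted cost: {total_cost(targeted):.2f}, uniform cost: {total_cost(uniform):.2f}")
```

In practice the cutoffs would be tuned against how well each intervention recovers payments, which this example does not model.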

The churn prediction model can also deliver actionable information that operational teams can use to develop new programs aimed at customer retention. Over time, PAYGo operators can use this intelligence to tailor their offerings to customers based on payment behavior and churn risk, for example offering a short-term top-up loan to a good paying customer who has a temporary setback, or prioritizing an agent visit to a customer with an increasing churn probability. Ultimately, better-managed churn can improve overall portfolio quality.

In the following blog posts we’ll describe our journey developing a churn prediction model for ZOLA, including the tools we used, steps in the process, lessons learned along the way, how far we were able to take the model with ZOLA and actionable lessons for the broader PAYGo and financial inclusion sectors.


FIBR stands for Financial Inclusion on Business Runways and aims to learn how to transform emerging business data about low-income individuals and link them to inclusive financial services to deepen financial inclusion and its impact. FIBR is a project of BFA in partnership with Mastercard Foundation.

