Is Your Behavioral Data Truly Behavioral?

Published in Behavioral Design Hub · 4 min read · Sep 25, 2021

*This blog post is based on my recently published book* *Behavioral Data Analysis with R and Python*.

In business analytics and data science, the goal is most often to predict and change customers’ behaviors. You want to know the probability that someone will repay their loan, that they’ll purchase a certain product or renew their subscription, and so on; and then you want to affect that probability, for example by sending them a reminder or a coupon.

Doing so requires data that adequately reflects behaviors. This means having or building variables for repaying a loan, purchasing a product or renewing a subscription. Beyond that, we’re often interested in understanding how certain behaviors affect other behaviors. Does having recently added a family member to your subscription increase the probability of renewal?

From that perspective, a very large share of data in business is indeed about behaviors. However, I’ll argue that it’s rarely truly behavioral, in that it doesn’t do a good job of reflecting behaviors. The main reason is that the way data is recorded is driven by business and financial rules, which makes it transaction-centric rather than customer-centric.

A telling example

For example, the variable “subscription renewed (Y/N)” can really mean a variety of different things:

The customer actually renewed their subscription (which is generally how we interpret that variable);
The customer checked a box without reading the fine print that mentioned that their subscription would be automatically renewed;
The customer didn’t cancel their subscription so we defaulted them to renewal;
(If we’re talking about a competitor’s service) The customer told us that they renewed their subscription but we can’t verify it.

Even if the customer actually renewed their subscription, we can’t assume their intent. They may have done it:

Because we sent them a reminder;
Four times in a row because the page was not refreshing;
Mistakenly, when they really wanted to get a different package;
A week ago, but due to regulatory constraints we recorded it only today.
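
To make one of these cases concrete, here is a minimal Python sketch of how repeated clicks from a page that wasn't refreshing could be collapsed into a single renewal event. The log layout, the customer IDs and the five-minute window are all assumptions made for illustration, not a recommended rule.

```python
from datetime import datetime, timedelta

# Hypothetical raw log: (customer_id, time of the click on "renew").
raw_clicks = [
    ("c1", datetime(2021, 9, 1, 10, 0, 0)),
    ("c1", datetime(2021, 9, 1, 10, 0, 20)),
    ("c1", datetime(2021, 9, 1, 10, 0, 45)),
    ("c1", datetime(2021, 9, 1, 10, 1, 5)),
    ("c2", datetime(2021, 9, 1, 11, 30, 0)),
]


def collapse_repeats(clicks, window=timedelta(minutes=5)):
    """Keep one event per customer per burst of clicks.

    A click starts a new event only if it comes more than `window`
    after the previous click by the same customer (assumed threshold).
    """
    last_seen = {}
    events = []
    for customer_id, ts in sorted(clicks, key=lambda c: (c[0], c[1])):
        previous = last_seen.get(customer_id)
        if previous is None or ts - previous > window:
            events.append((customer_id, ts))
        last_seen[customer_id] = ts
    return events


renewals = collapse_repeats(raw_clicks)
print(renewals)
```

Here the four clicks by customer `c1` count as one renewal, which is closer to what the customer actually did than four separate transactions would be.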

And that’s for a pretty straightforward behavior! Once you start getting into more complex ideas such as “customer experience” or “customer engagement”, it’s not even clear that there’s any recognizable behavior involved. Indeed, in behavioral analytics we often use variables that reflect personal characteristics (e.g., demographics), cognition and emotions, or intentions instead of behaviors per se. To understand how these variables impact behaviors, they also need to be well defined, but that’s a topic for another day. For now, I’ll just outline some characteristics of “good” behavioral variables.

“Good” behavioral variables

We want a variable to reveal all the behavior and nothing but the behavior, but that’s easier said than done. How can you know if a variable is up to the task or if you need to change it? A good behavioral variable is observable, individual, and atomic.

Observable ✅

For a variable to truly reflect an action or behavior, it must be observable, at least in principle. If you were in the room with the customer, could you see them do it? Abandoning a renewal in the middle of the process is observable; “changing one’s mind” is not.

Individual ✅

A good behavioral variable is individual. An aggregate variable such as the proportion of customers who renew their account in a given month can fall prey to confounding factors such as changes in the customer mix. If you ran a big marketing campaign towards younger customers exactly a year ago, the monthly renewal rate may fall alarmingly because the retention rate is lower for new customers. The solution would be to control for individual characteristics and tenure with the company, i.e., measure the propensity to renew at the cohort level instead of relying on broad snapshots.
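
The customer-mix effect can be illustrated with a small Python sketch. The data is made up: I assume first-year customers renew at 50% and multi-year customers at 90% in both months, and only the number of first-year customers coming up for renewal changes. The cohort labels, month values and function names are hypothetical.

```python
from collections import defaultdict


def make_records(month, n_first_year, n_multi_year):
    """Build toy renewal records with fixed per-cohort renewal rates
    (50% for first-year customers, 90% for multi-year ones)."""
    records = []
    for i in range(n_first_year):
        records.append({"month": month, "cohort": "first_year",
                        "renewed": i % 2 == 0})
    for i in range(n_multi_year):
        records.append({"month": month, "cohort": "multi_year",
                        "renewed": i % 10 != 0})
    return records


def renewal_rates(records, keys):
    """Share of renewals within each group defined by `keys`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [renewed, total]
    for r in records:
        group = tuple(r[k] for k in keys)
        totals[group][0] += r["renewed"]
        totals[group][1] += 1
    return {g: renewed / n for g, (renewed, n) in totals.items()}


# September has many more first-year customers up for renewal,
# e.g. because of a campaign run exactly a year earlier.
records = make_records("2021-08", 20, 100) + make_records("2021-09", 100, 100)

aggregate = renewal_rates(records, ["month"])
by_cohort = renewal_rates(records, ["month", "cohort"])

print(aggregate)   # the monthly rate falls from about 0.83 to 0.70
print(by_cohort)   # each cohort's rate is identical in both months
```

The aggregate rate drops sharply even though no individual cohort changed its behavior, which is why the cohort-level view is the safer one to monitor.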

Atomic ✅

Finally, a good behavioral variable is atomic: it reflects a specific behavior. For example, a customer may renew their subscription online, by mail, by email, on the phone or in a store. There are certainly situations where we care only about the implied intent, but at the very least we should be aware of the concrete ways of fulfilling that intent. And in some circumstances the concrete steps matter, such as when we want to measure the probability of upselling.
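
One way to keep a variable atomic is to record each renewal with its channel and derive the broader “renewed” flag from those events, rather than storing only the flag. The sketch below assumes a hypothetical `RenewalEvent` record and made-up customer IDs; the channel names come from the list above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    ONLINE = "online"
    MAIL = "mail"
    EMAIL = "email"
    PHONE = "phone"
    STORE = "store"


@dataclass(frozen=True)
class RenewalEvent:
    """One concrete renewal action (hypothetical record layout)."""
    customer_id: str
    channel: Channel


events = [
    RenewalEvent("c1", Channel.ONLINE),
    RenewalEvent("c2", Channel.PHONE),
    RenewalEvent("c3", Channel.ONLINE),
]


def renewed_customers(events):
    """The aggregated variable: who renewed, regardless of how."""
    return {e.customer_id for e in events}


def renewals_by_channel(events):
    """The atomic view: how many renewals happened through each channel."""
    return Counter(e.channel for e in events)


print(renewed_customers(events))
print(renewals_by_channel(events))
```

The aggregated flag can always be rebuilt from the atomic events, but not the other way around, so keeping the channel costs little and preserves the option of analyzing, say, upselling on the phone separately.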

Are your behavioral variables observable, individual and atomic? If not, this may bias your analyses and may partly explain why a variable doesn’t change as expected.

And if you want to learn more about behavioral data analysis, you should check out my book Behavioral Data Analysis with R and Python on Amazon (please note that this is an affiliate link and I may earn a commission if you make a purchase after clicking it).

About the author
Florent Buisson is a behavioral economist with 10 years of experience in business, analytics, and behavioral science. He most recently started the behavioral science team of Allstate Insurance Company and led it for four years. Florent has published academic articles in journals such as the peer-reviewed Journal of Real Estate Research. He holds a Master’s degree in econometrics as well as a Ph.D. in behavioral economics from the Sorbonne University in Paris.