Improving Employee Experience Through A/B Testing

Employee engagement is the number one problem for many businesses across the globe. According to Gallup, only 13% of employees worldwide are engaged at work. Businesses spend significant budgets to engage their employees, reduce absenteeism, and increase retention. But only a few of them have figured out how to calculate the ROI of their engagement programs, and even fewer know how to predict the efficiency of these programs. Is this really such a complicated issue?

Looking back 20 years, the same issue affected the marketing industry: very few companies were able to calculate the ROI of their marketing initiatives, so most were just guessing and relying on industry best practices. But the marketing world has evolved a lot since then. These days you can hardly find a marketing organization that is not using a method called A/B testing to increase the efficiency of its campaigns and maximize the ROI of its marketing budget.

Put simply, an A/B test is an experiment with two variants, A and B. In practice there can be more variants, and people often use the term to refer to any focus group testing; the key point is that all conditions are kept identical across the experiments except for one variable parameter.

An example from the marketing world would be selecting the right text or picture for a contextual ad. Before launching an ad campaign, you can pick two small focus groups (group A and group B) and run the campaign for these two groups, showing them two different versions of the ad (version A to group A and version B to group B). Comparing the results (click-through rates) tells you which version of the ad is more efficient. When you launch the full campaign, you run the ad that performed better, thus increasing the ROI.
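
To make this concrete, here is a minimal sketch in Python (with invented click counts) of how one could check that the difference in click-through rates between the two groups is large enough to trust, using a simple two-proportion z-test:

```python
# Minimal sketch: compare two ad variants by click-through rate (CTR).
# All counts below are hypothetical illustration values, not real data.
from math import sqrt, erf

def two_proportion_z_test(clicks_a, shown_a, clicks_b, shown_b):
    """Return (ctr_a, ctr_b, z, p) for the difference between two CTRs."""
    ctr_a, ctr_b = clicks_a / shown_a, clicks_b / shown_b
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (ctr_a - ctr_b) / se
    # Two-sided p-value from the standard normal distribution.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return ctr_a, ctr_b, z, p

# Group A saw version A of the ad, group B saw version B (hypothetical counts).
ctr_a, ctr_b, z, p = two_proportion_z_test(clicks_a=120, shown_a=2000,
                                           clicks_b=85, shown_b=2000)
print(f"CTR A: {ctr_a:.1%}, CTR B: {ctr_b:.1%}, z = {z:.2f}, p = {p:.3f}")
```

A small p-value (conventionally below 0.05) suggests the better-performing variant really is better, rather than a result of random noise.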

This approach works perfectly for marketing, but it also works in many other areas, including HR. Any change that a company plans to introduce to increase employee performance, drive engagement or reduce turnover can be very costly. Running a series of tests before rolling the change out across the organization ensures maximum ROI for the resources spent.

So how can this be applied in the employee engagement world? Here is an easy example from a company I happen to know. A few years back, the board of directors was disappointed with the high turnover rate among key employees, and the executive officers got a new KPI in their annual plans: decrease the employee turnover rate by 10%. The executive board used what they thought was "common sense" and, among other things, introduced a fitness benefit: every employee could visit nearby fitness centers for free. It was a costly and complicated setup, as the company had over 20 office locations and had to deal with multiple fitness vendors. At the end of the year they found out that fewer than 5% of employees had ever used the benefit, so the change had not really affected the turnover rate.

The next year they were smarter. Instead of cancelling the program, they launched two different variations of it in two different locations. In one location they offered 100% coverage of whatever fitness services an employee selects, while in the other they offered weekly yoga and pilates classes right in the office. After three months they measured the results: over 60% of employees used the first option, while the in-office classes were used by roughly 10% of the employees in that location. On top of that, the first variant did not require any setup from the company, so it ended up much cheaper than the original program. They have now scaled the first program across all of their offices, knowing that its efficiency is already tested and is much higher than that of the original offering.
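
The comparison the company ran boils down to a couple of ratios. Here is a minimal sketch with invented headcounts and costs (the actual figures are not public), just to show how usage rate and cost per actively engaged employee can be put side by side:

```python
# Hypothetical sketch: compare two benefit variants by usage rate and by
# cost per actively engaged employee. All numbers are invented for illustration.
variants = {
    "full fitness coverage":  {"employees": 150, "users": 92, "monthly_cost": 9200},
    "in-office yoga/pilates": {"employees": 140, "users": 15, "monthly_cost": 4800},
}

for name, v in variants.items():
    usage_rate = v["users"] / v["employees"]
    cost_per_active_user = v["monthly_cost"] / v["users"]
    print(f"{name}: usage {usage_rate:.0%}, "
          f"cost per active user ${cost_per_active_user:,.0f}/month")
```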

With all its benefits, A/B testing can only be used under certain conditions. First, there should be a way to select two small but representative focus groups and implement the change for these groups only. The groups should be small enough that implementing the change does not take too much effort and resources. At the same time, the groups should remain representative, meaning that if something works for a focus group, it will work the same way for the whole organization. So if the change is going to be applied to the whole company, do not test it using just the accounting department or just one office in Bangladesh; pick a focus group that includes employees from different locations, age groups and professions.
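
To illustrate what "representative" can mean in practice, here is a minimal sketch of stratified sampling using only the Python standard library; the employee records and field names are hypothetical, and the same idea extends to stratifying by age group or profession:

```python
# Minimal sketch: draw a focus group that mirrors the whole organization,
# stratified by location. Field names and data are hypothetical.
import random
from collections import defaultdict

def stratified_focus_group(employees, key, fraction, seed=42):
    """Sample roughly `fraction` of employees from every `key` stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for e in employees:
        strata[e[key]].append(e)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one person per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical employee records spread across three offices.
employees = [{"id": i, "location": loc}
             for i, loc in enumerate(["Berlin"] * 300 + ["Dhaka"] * 120 + ["Austin"] * 80)]
focus_group = stratified_focus_group(employees, key="location", fraction=0.05)
print(len(focus_group), "employees selected across",
      len({e["location"] for e in focus_group}), "locations")
```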

Secondly, there should be a way to reliably measure the impact of the change for each focus group as well as for the whole organization. This is difficult, because measuring people-related metrics requires some non-trivial approaches, especially when it comes to extrapolating short-term test results to the long-term, organization-wide impact. In some cases companies can monitor objective metrics (program usage rate, changes in employee KPIs, absenteeism data), but whenever that is not possible, focused pulse surveys are the tool for collecting data from relatively small focus groups in an agile way.

One issue that is more specific to HR applications of A/B testing is metrics. In the marketing world there are always some pretty straightforward metrics available for any test: CTR, page views, number of sales. In HR, the obvious metric often cannot be measured directly. For example, you cannot set a change in retention rate as a valid metric, because it would take at least a year to collect the numbers, and so many other factors would affect the metric that the test would not be representative. What can be measured, however, is a secondary metric that is indicative of the primary one. In the case of retention rate it could be NPS (Net Promoter Score).
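
NPS itself is straightforward to compute from a single 0-10 "how likely are you to recommend this company as a place to work?" question: the share of promoters (scores 9-10) minus the share of detractors (scores 0-6). A minimal sketch with hypothetical survey answers:

```python
# Minimal sketch: compute NPS from 0-10 survey scores (hypothetical data).
def nps(scores):
    """NPS = % promoters (9-10) minus % detractors (0-6), a number in [-100, 100]."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

group_a = [9, 10, 8, 7, 9, 6, 10, 9, 8, 10]   # focus group that got the change
group_b = [7, 6, 8, 9, 5, 7, 8, 6, 10, 7]     # control group
print(f"NPS A: {nps(group_a):+.0f}, NPS B: {nps(group_b):+.0f}")
```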

Running more than two experiments at a time is entirely possible, but it does not make sense to test too many options: the experiments themselves become too costly. The right approach here is continuous improvement: run a test with two or three options initially, scale the most efficient one, and then periodically run another experiment with just one alternative option, comparing it to the one that is already implemented at scale. This approach also ensures that the programs stay effective over time.
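
As a rough sketch of this cadence, the toy loop below picks a champion from the initial options and then keeps challenging it with a single alternative per cycle; the measure function is a placeholder for a real focus-group test (usage rate, NPS, etc.), and all effect sizes are invented:

```python
# Toy sketch of the champion/challenger cadence described above.
# measure() stands in for a real focus-group test; all numbers are invented.
import random

rng = random.Random(0)

def measure(option):
    """Placeholder: run the option on a focus group and return a noisy metric."""
    return rng.gauss(option["true_effect"], 0.05)

# Initial test with two or three options; the best one becomes the champion.
options = [{"name": name, "true_effect": effect}
           for name, effect in [("A", 0.30), ("B", 0.45), ("C", 0.40)]]
champion = max(options, key=measure)

# Later cycles: the scaled champion is periodically compared to one new challenger.
for cycle in range(3):
    challenger = {"name": f"challenger-{cycle}", "true_effect": rng.uniform(0.2, 0.6)}
    if measure(challenger) > measure(champion):
        champion = challenger
    print(f"cycle {cycle}: current champion is {champion['name']}")
```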
