An Experimentation Day at WBAA
The WB Digital Channels Experiment
At ING WBAA we know that experimentation lies at the heart of data science and data engineering. To come up with innovative machine learning solutions and creative use cases, you need to dedicate time to research and experimentation. Every month we organize an experimentation day: anyone on the team can propose experiment topics, and the team votes on which ones to run on the day.
Since research is also about collaboration and synergies between groups of people that can bring in valuable and diverse insights, we planned an experiment with ING’s WB Digital Channels team, aimed at clustering user profiles based on their click behaviour. The experiment was based on the Customer Experience project they have developed and their team was represented by Alexandros Batzios — Feature Engineer WB Digital Channels. Alexandros worked together with three WBAA members — Fabian Jansen and Lorraine D’Almeida — Data Scientists, and Wendell Kulling — Chief Product Owner.
Alexandros and Lorraine provided an interview to guide us through the experiment, the data they used, the results and the experiment’s potential.
What was your motivation to participate in this experiment?
Having a PhD in semantic data technologies as well as having helped take to market a data-focused Fintech in London, Alexandros had a personal interest in learning more about Machine Learning through this experiment:
“Coming from a data oriented background, I tend to think that most answers can be found in data, so I was very curious to see what applying machine learning to click data from InsideBusiness Portal would reveal”
Lorraine, an experienced data scientist with strong business acumen, was motivated to be part of this experiment not only because she saw potential in the applicability of the findings but also because she could work together with the actual users for whom the model is meant.
“I found this use-case quite interesting to work with because even though we performed segmentation specific to users in the online channels, the same technology can be leveraged for different types of customer segmentation. It also gave an opportunity to work closely with the key stakeholders. We collaborated and did something similar to pair programming to do the initial analysis and feature engineering. We split the work in order to run the different clustering algorithms with multiple settings in parallel”
This experiment was part of the Customer Experience project in Digital Channels, spearheaded by Andrei Ilchenko — Tribe IT Lead who started the initiative — and Martijn van den Ordel (Chapter Lead CJE), with Alexandros as the Feature Engineer.
It consists of tagging pages and click events in InsideBusiness Portal via the Webtrekk Analytics platform. These events are then imported into ING’s Datalake (WB GDIL), and user behavior and other metrics can be viewed through MIBI (Management Information/Business Intelligence). In one of the phases of this project, the team wanted to apply Machine Learning to this data in order to cluster user profiles based on their actual behavior. The purpose was to create a customizable user interface such as a smart dashboard, but also to gain insights into any non-obvious usage patterns that develop over time.
“A large part of the experiment was the data engineering effort, which consisted of transforming the data into a form that can be used as input for Machine Learning algorithms. We had to track and link click events to distinct (anonymized) users and the companies they represented. We also had to deal with data quality issues by identifying and removing invalid tags.
Finally, once the data was ready for processing, we had to run multiple experiments to determine the level of granularity that would make sense, as well as the characteristics (features) we wanted to include in the analysis. Machine Learning is science mixed with a bit of art, because the characteristics we pick from a dataset as input significantly affect the result of an algorithm. There is no cookie-cutter way of knowing what the “right” variables to use for analysis are, so this requires a bit of intuition as well as trial and error.
At this point I have to say how hugely important the contribution of the WBAA team was. In 2 days of pair programming, we were able to select the right features of the data for analysis, optimize run-time and memory usage so that we could actually run multiple experiments per day (versus for example, having to wait hours per experiment) and link the data in such a way as to get meaningful results” — Alexandros
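The preparation steps Alexandros describes can be pictured with a small Python sketch. It is illustrative only: the event fields (`user_id`, `tag`), the list of valid tags and the function name `build_click_features` are assumptions made for this example, not the actual Webtrekk export format or the team's code.

```python
from collections import Counter, defaultdict

# Hypothetical whitelist of tags; the real tag plan is not part of this article.
VALID_TAGS = {"payments", "knowledge_centre", "service_request", "admin"}


def build_click_features(events, valid_tags=VALID_TAGS):
    """Turn raw click events into one feature vector per anonymized user.

    Each event is a dict with an anonymized 'user_id' and a page/click 'tag'.
    Events with a missing or unknown tag are dropped (the data quality step).
    The remaining clicks are counted per user and converted to shares, so that
    heavy and light users become comparable.
    """
    counts = defaultdict(Counter)
    for event in events:
        tag = event.get("tag")
        if tag not in valid_tags:
            continue  # invalid or missing tag: skip the event
        counts[event["user_id"]][tag] += 1

    columns = sorted(valid_tags)
    features = {}
    for user_id, tag_counts in counts.items():
        total = sum(tag_counts.values())
        features[user_id] = [tag_counts[tag] / total for tag in columns]
    return columns, features


# Toy events, invented for illustration.
events = [
    {"user_id": "u1", "tag": "payments"},
    {"user_id": "u1", "tag": "payments"},
    {"user_id": "u1", "tag": "admin"},
    {"user_id": "u1", "tag": "???"},  # invalid tag, removed
    {"user_id": "u2", "tag": "knowledge_centre"},
    {"user_id": "u2", "tag": "service_request"},
]
columns, features = build_click_features(events)
```

Using shares rather than raw counts is one possible design choice; which features and which level of granularity work best is exactly the trial and error described above.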
The Data and the techniques
For this experiment, Webtrekk analytics data was used, enriched by custom tagging for specific click events and pages of InsideBusiness Portal. To simplify the process, the sample consisted of one month’s anonymized data from the Production environment.
“We merged this data with user data from InsideBusiness Portal, to be able to identify distinct user roles and the company each user belonged to. This allowed us to include the clients (companies paying a subscription for a certain number of users) in the analysis, without the need to carry personal customer information outside of InsideBusiness” — Alexandros
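As a rough illustration of this linking step, the sketch below joins click events to a user lookup and aggregates them per company, so that only anonymized identifiers are carried along. The record layout (`user_id`, `role`, `company_id`, `tag`) and the helper `clicks_per_company` are hypothetical and chosen only for the example.

```python
from collections import Counter, defaultdict

# Hypothetical lookup from anonymized user id to role and company id.
users = {
    "u1": {"role": "admin", "company_id": "c1"},
    "u2": {"role": "payments", "company_id": "c1"},
    "u3": {"role": "viewer", "company_id": "c2"},
}

# Toy click events, invented for illustration.
events = [
    {"user_id": "u1", "tag": "admin"},
    {"user_id": "u2", "tag": "payments"},
    {"user_id": "u3", "tag": "knowledge_centre"},
    {"user_id": "u9", "tag": "payments"},  # no matching user record
]


def clicks_per_company(events, users):
    """Inner-join events to users and count clicks per company and tag."""
    per_company = defaultdict(Counter)
    for event in events:
        user = users.get(event["user_id"])
        if user is None:
            continue  # events that cannot be linked to a user are left out
        per_company[user["company_id"]][event["tag"]] += 1
    return {company: dict(tags) for company, tags in per_company.items()}


company_clicks = clicks_per_company(events, users)
```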
“We did some initial analysis and ETL to link the log data with customer information, pre-processing of the data to come up with meaningful features and used some unsupervised learning algorithms like KNN and Hierarchical clustering for the segmentation” — Lorraine
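To give an idea of what the segmentation step involves, here is a minimal, pure-Python sketch of bottom-up hierarchical clustering with average linkage. It is not the team's implementation: the linkage choice, the toy profiles and the function names are assumptions, and in practice a library such as scikit-learn or SciPy would normally be used. Only hierarchical clustering is shown, because "KNN" usually refers to k-nearest neighbours, a supervised method, and the article does not say which variant was meant.

```python
import math


def average_linkage(cluster_a, cluster_b, points):
    """Mean Euclidean distance between all pairs drawn from two clusters."""
    total = sum(
        math.dist(points[i], points[j]) for i in cluster_a for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clustering(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Starts with every point in its own cluster and returns a list of
    clusters, each a sorted list of indices into points.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy usage shares per user (invented):
# [admin, knowledge_centre, payments, service_request]
profiles = [
    [0.5, 0.0, 0.5, 0.0],  # admin + payments
    [0.4, 0.0, 0.6, 0.0],  # admin + payments
    [0.0, 0.6, 0.0, 0.4],  # knowledge centre + service requests
    [0.0, 0.5, 0.0, 0.5],  # knowledge centre + service requests
]
clusters = agglomerative_clustering(profiles, n_clusters=2)
```

On this toy input the two "admin and payments" profiles end up in one cluster and the two "knowledge centre and service request" profiles in the other. The brute-force search is only suitable for small samples; it is meant to show the idea, not to scale.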
The experiment’s results were quite promising: the team managed to identify specific patterns in user behavior or corporate function. These insights can be used for further segmentation and, from there, for dashboards tailored to users’ preferences. Such segmentation can facilitate decision making and enhance the user experience.
“We were able to identify clusters of both individual users and corporates based on commonly recurring usage patterns.
There were, for example, clusters of users that seemed to have multiple roles, both acting as admins and executing payments. Or users across companies and functions that seemed to go to the knowledge centre a lot and then create service request tickets. It might be useful to investigate whether these are the same users over time, whether they all have the same corporate function or work in a particular industry, and what it is that they’re finding difficult to do. We also found clusters of companies whose users were combining certain products and channels, and specific functionality of those.
These were just high-level results, some of them probably worth drilling down into. But the main benefit would come from being able to consistently run this and see how usage patterns change over the months, and to also run these experiments on a more granular level such as per product / application rather than per Channel. Then each CJE would be able to see first-hand exactly what the common behaviours, patterns and maybe even sticking points of their users are.
One thing is for sure. Whether this is used to reveal new trends or to validate future working assumptions and strategies, the era of speculation is fast giving way to the era of data-driven decision making! Many thanks to WBAA for helping WB Digital Channels in this direction” — Alexandros
“Some of the clusters from the algorithm were in line with the intuition based on user-specific roles. Within the Digital Channels of WB, they have a segmentation which provides users with a specific view/ dashboard based upon their function/ roles. User behavior based segmentation will help segment customers based on their actual behavior and the dashboard can be tailored to their actual usage” — Lorraine
Experimentation is also essential when working agile. We get to develop our skills, share knowledge, be creative, test and evaluate different solutions and machine learning models, and from there capture ideas that can be useful for a future model. In this experiment, the synergy between Alexandros, who brought his experience in engineering, data technologies and digital channels, and our expertise in machine learning uncovered an interesting approach that can be investigated further and may lead to a valuable solution for the organisation.
If you are interested in learning more or have a creative idea and would like to join one of our experimentation days, please send an email to email@example.com