Optimize the Customer Experience Through Usage Analytics

Paul Lashmet
Published in Product AI
May 11, 2021

Execute statistical and AI models on hundreds of billions of records to better understand your client base and optimize their user experience.

Challenge:

Understand how clients are using a product based on billions of records of usage data, and then make recommendations for fine-tuning their experience.

Solution:

Ingesting billions of application usage messages in JSON format, all in a single nine-hour day, is no trivial task. Spikes in usage caused by any number of external events must be handled gracefully. These JSON messages need to be enriched, and then persisted in an MPP database for further enrichment, analysis, and querying. This all must happen in near real-time.

Fortunately, Azure provides a set of tools that facilitate building out this type of architecture. Messages are ingested from Kafka queues, and .NET and Python function apps are executed based on events such as a timer or a certain threshold of messages having arrived. Next, a .NET function app transforms the JSON to Parquet, and these files are bulk-loaded into the MPP database.
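As a rough illustration of that conversion step, the sketch below flattens a batch of JSON messages and writes them out as a Parquet file using pandas and pyarrow. The production path described above does this in a .NET function app, and the field names here are hypothetical.

```python
# Illustrative sketch of the JSON-to-Parquet step (the production pipeline
# does this in a .NET function app); assumes messages arrive as JSON strings.
import json

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

def messages_to_parquet(messages: list[str], out_path: str) -> None:
    """Flatten a batch of JSON usage messages and write one Parquet file."""
    records = [json.loads(m) for m in messages]
    df = pd.json_normalize(records)  # flatten nested JSON fields into columns
    table = pa.Table.from_pandas(df)
    pq.write_table(table, out_path, compression="snappy")

# Example: invoked once a message-count threshold or timer event fires
messages_to_parquet(
    ['{"user": "u1", "screen": "orders", "ts": "2021-05-11T09:00:00Z"}'],
    "usage_batch_0001.parquet",
)
```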

The reason for bulk-loading is that MPP databases are not traditional transactional databases; they perform much better when periodically batch-loading millions of records. Azure Synapse's MPP engine is a column-store database, and the larger the number of records in a load, the better the compression and performance.
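A minimal sketch of what such a bulk load could look like, issuing Synapse's T-SQL COPY INTO statement from Python via pyodbc. The server, database, table, and storage paths are placeholders, not values from this project.

```python
# Hedged sketch: bulk-load Parquet files into a Synapse dedicated SQL pool
# with the T-SQL COPY INTO statement. All names below are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myworkspace.sql.azuresynapse.net;"
    "DATABASE=usage_db;UID=loader;PWD=..."
)
# COPY INTO reads every Parquet file under the given path in one batch load
conn.execute("""
    COPY INTO dbo.usage_events
    FROM 'https://myaccount.blob.core.windows.net/usage/batches/*.parquet'
    WITH (FILE_TYPE = 'PARQUET')
""")
conn.commit()
```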

Executing statistical and AI models on hundreds of billions of records is complex. Fortunately, Azure Synapse Analytics integrates Apache Spark, Spark ML, Python notebooks, and TensorFlow into a computational grid and user experience that enables machine learning at this scale. The usual data science (e.g., scikit-learn) and visualization (e.g., Matplotlib) Python libraries are used to analyze the usage data within a Python notebook, and Spark auto-scaling grows or shrinks the cluster to match the computational complexity of the analysis.
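Inside a Synapse notebook, that analysis loop might look like the sketch below: aggregate the raw events with PySpark, then pull a small, already-aggregated slice to the driver for Matplotlib. The storage path and column names are assumptions, not the project's actual schema.

```python
# A minimal notebook-style sketch of analysis on the Synapse Spark pool.
# Auto-scaling is configured on the pool itself, so the code only expresses
# the computation; the path and column names are illustrative.
import matplotlib.pyplot as plt
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

usage = spark.read.parquet("abfss://usage@myaccount.dfs.core.windows.net/batches/")

# Reduce billions of raw events to per-user, per-screen counts on the cluster
per_screen = (usage.groupBy("user", "screen")
                   .agg(F.count("*").alias("events")))

# Bring only a small aggregated sample to the driver for plotting
sample = per_screen.limit(10_000).toPandas()
sample.groupby("screen")["events"].sum().plot(kind="bar")
plt.title("Events per screen (sample)")
plt.show()
```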

Employing these integrated tools, a series of traditional machine learning models is run to understand how a user navigates what is essentially a very complicated set of mini applications that function together as one large application.

Using this infrastructure, several types of analyses are performed. For example, the user experience is optimized by preloading context-sensitive data using predictive branching. Likewise, micro-market segmentation provides a deeper understanding of client usage based on similar personas, and time-series analysis and Bayesian modeling identify subtle interconnections. All of these approaches enabled us to better understand our client base and optimize their user experience.
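To give the micro-segmentation idea some shape, here is a hedged sketch using Spark ML's KMeans on per-user usage features. The feature names, storage path, and cluster count are illustrative assumptions, not the models actually used.

```python
# Hedged sketch of micro-market segmentation with Spark ML's KMeans on
# per-user usage features; feature names and k are assumptions.
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
features = spark.read.parquet("abfss://usage@myaccount.dfs.core.windows.net/features/")

# Combine the per-user feature columns into a single vector column
assembler = VectorAssembler(
    inputCols=["sessions_per_day", "screens_visited", "avg_session_minutes"],
    outputCol="features",
)
assembled = assembler.transform(features)

# Fit k clusters; each cluster is a candidate persona for segmentation
model = KMeans(k=8, seed=42, featuresCol="features").fit(assembled)

# Per-user cluster assignments land in the "prediction" column
scored = model.transform(assembled)
scored.groupBy("prediction").count().show()
```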

Technologies Utilized:

Python, Azure Synapse Analytics, TensorFlow, Spark on Azure


Paul Lashmet is a business integration architect and financial services subject matter expert.