Published in Starschema Blog

Learning to Stop Worrying and Augment Your Analytics

I don’t think about memes a lot, or think much of them, but lately this one has been on my mind:

Some wording may have been changed to avoid having to label this post NSFW.

We have billionaires taking joyrides to space, and yet we keep manually updating Excel spreadsheets to uphold what we know as “order” in our world. Seems like we’re a long way from spending Wednesdays gorging on figs.

But hey — baby steps, right?

This post looks at a maybe-not-so-baby step towards a working life for data analysts that’s just a bit more in tune with the basic human need for comfort and fulfilment: using ML-driven solutions that leverage natural language generation (NLG) to help eliminate mundanity from data analytics.

These Machines Are Not Out to Replace Data Analysts…

Most people don’t love to work, but pretty much nobody wants to lose their job. So, until we recalibrate the fundamental organizational logic of our societies, there will always be a justified wariness that technological innovation will cost people their livelihood. In such an environment, the best we can do is strive for a compromise where we’re still chasing ways to improve KPIs but do so using tools that get us to our destination faster, with optimized effort and a more rewarding experience overall.

Augmented analytics technologies and practices aim to optimize human effort by focusing it where it can make the greatest impact. Their endgame isn’t to completely remove human attention and input from the analytical process but to eliminate the steps that humans find repetitive and uninspiring, and which, as a result, we tend to fail at due to lapses of attention. Solutions that fall under the augmented analytics umbrella, including ones that take advantage of NLG, integrate into human workflows to enable results that would not be possible through processes driven purely by AI or purely by humans.

But that’s enough philosophy for now; let’s see what business problems prompted one of Starschema’s clients to augment their analytics with an NLG-enabled solution.

…But We Should Probably Stop Throwing Humans at Every Problem

Upper management at a major tech company wanted to better understand what was happening in the market, especially the factors driving sales. Every week, analysts reviewed the company’s product range to see what had changed over the previous week and to find the main contributors to those changes, with special regard to anomalies in the data. And every week, they needed to dig deep into several dashboards and connect the dots to find the underlying relationships. This tied analysts down in relatively low-level tasks that nevertheless left considerable room for human error and delays.

The solution consists of two main components: one dives into the data to find anomalies, while the other packages the findings in a way fit for human consumption. The first component applies ML to time series data to find anomalies, the exact nature of which is defined by the user, while the second uses NLG to create natural, easy-to-understand and focused sentences about the anomalies, which are delivered to users over their preferred channel (email, notification, DM, etc.).
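To make the two-component idea a bit more concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not the Boardwiser implementation: a simple rolling z-score stands in for the ML detector, and a string template stands in for NLG. The function names, the `window` and `threshold` parameters (playing the role of the user-defined notion of an anomaly) and the sample figures are all hypothetical.

```python
from statistics import mean, stdev

WINDOW = 4  # hypothetical: number of preceding weeks used as the baseline


def detect_anomalies(values, window=WINDOW, threshold=3.0):
    """Return (index, z_score) pairs for points that deviate from the mean of
    the preceding `window` points by more than `threshold` standard deviations.

    Stand-in for the ML component; `threshold` is the user-tunable part.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: the z-score is undefined, so skip it
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, z))
    return anomalies


def describe(kpi, period, value, baseline):
    """Turn one finding into a sentence. Stand-in for the NLG component."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percentage")
    change = (value - baseline) / baseline * 100
    direction = "rose" if change > 0 else "fell"
    return (
        f"{kpi} {direction} {abs(change):.0f}% in {period} "
        f"compared with the average of the previous {WINDOW} weeks."
    )


# Made-up weekly figures, for illustration only.
weekly_sales = [100, 102, 98, 101, 99, 140, 100]

for index, z_score in detect_anomalies(weekly_sales):
    baseline = mean(weekly_sales[index - WINDOW:index])
    print(describe("Weekly sales", f"week {index + 1}", weekly_sales[index], baseline))
```

In a real deployment, the printed sentence would instead be pushed to whichever channel the user prefers, and the detector would likely account for trend and seasonality, which this sketch ignores.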

Sifting through many charts to find an anomaly and understand its underlying cause takes a lot of attention. By contrast, reading a short, straightforward summary of the problem and its contributing factors helps analysts focus on the exact value immediately and spend more time devising the most effective course of action.

Demo results from Starschema’s Boardwiser NLG solution for anomaly detection and summation.

This method of automating the recognition of relevant changes in a KPI and the contributing factors behind them eliminates the need for an analyst at the lowest step of the process. That doesn’t mean the analyst is kept out of the loop. On the contrary, this solution gives them a head start in identifying and working with the most promising information, so they can apply the “authentic” intelligence and intuition that our machines have yet to replicate instead of burning time and energy trying to find a needle in a haystack of data with their bare hands.

Do Your Analytics Need Augmenting?

Does your organization have so much data that you have people staring at it all day? Then chances are your analytics pipeline includes a range of necessary but relatively low-level tasks that wear out analysts and hurt their performance on higher-value work. They might need to go through dozens of dashboards and iterate through many combinations to find the right aggregation level for the region/product/shop where something important happened. And when potentially critical data is unavailable in dashboards, the door opens to missed insights and to information gaps being filled in with personal bias.
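The manual hunt through aggregation levels described above is the kind of step that lends itself to automation. As a rough sketch only, with hypothetical field names (`region`, `product`, `previous`, `current`) and made-up numbers, the following code totals a KPI at every combination of dimensions and ranks the segments by how much they changed between two periods.

```python
from collections import defaultdict
from itertools import combinations


def top_contributors(rows, dimensions, top_n=3):
    """Rank segments, at every aggregation level, by the absolute change
    between the previous and the current period."""
    results = []
    for size in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(lambda: [0.0, 0.0])  # [previous, current]
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key][0] += row["previous"]
                totals[key][1] += row["current"]
            for key, (previous, current) in totals.items():
                results.append((current - previous, dict(zip(dims, key))))
    # Largest movers first, regardless of direction.
    results.sort(key=lambda item: abs(item[0]), reverse=True)
    return results[:top_n]


# Made-up figures, for illustration only.
rows = [
    {"region": "EMEA", "product": "Laptop", "previous": 100, "current": 60},
    {"region": "EMEA", "product": "Phone", "previous": 80, "current": 82},
    {"region": "APAC", "product": "Laptop", "previous": 90, "current": 91},
    {"region": "APAC", "product": "Phone", "previous": 70, "current": 69},
]

for delta, segment in top_contributors(rows, ["region", "product"]):
    print(f"{delta:+.0f} in segment {segment}")
```

Ranking by raw change is a deliberate simplification. A production system would probably normalize by segment size and check each segment against its own history before calling anything a contributor.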

The complex solution described above may be tailored to a specific organization’s specific needs, but its two main components — ML for automating anomaly detection and NLG for communicating the auto-generated insights — are highly adaptable to a variety of use cases. Think of them as pieces in the growing bucket of augmented analytics building blocks as you explore what kind of solutions you might put together to help data analysts at your organization feel more inspired at work.

Acknowledgement

This post refers to the white paper Automating BI Analytical Tasks with Anomaly Detection and NLG Summation by the Starschema data science team. If any of the ideas or technical details in this post got your gears turning, you’ll find additional food for thought in the full paper, including practical implementation tips — access it here:

Zsolt Palmai

Zsolt is Content Manager at data services firm Starschema, where he creates materials to help you learn about the company and enterprise-grade data solutions.