Husprey — Collaborative home for data analytics teams

Joffrey Villard · Published in Husprey · Mar 23, 2020

Welcome 👋

We’re very excited to start this journey with you!
In this first post, we explain why we decided to invest in this area and build the collaborative home for data analytics teams.

Feel free to clap, share and comment, and do not forget to join the waitlist at husprey.com!

The Data Analysis Process

Based on our experience and the many user interviews we have conducted over the past few months, here are our thoughts on the typical data analysis process and its common pitfalls.

❓Asking a data-related question

Business, Operations, Finance, Product and other teams all consume data to make relevant decisions in their area. If self-serve BI tools have been implemented and deployed, members of these teams can find answers to some of their data questions, but only when those questions are well defined and match the information available in such tools.

In any other case, they need a data analyst to investigate their specific question.

Let’s note here that such self-serve BI tools commonly claim to free data analysts from repetitive, low-value tasks. Indeed, most recurring queries can be automated, and data consumers can get results with some autonomy. However, these tools do not ease the overall data analysis workflow (they only handle the execution steps), and the more they are used, the more maintenance and support data analysts have to provide.

Most of the time, data consumers have a high-level business- or operations-related question that first needs to be analyzed, split and translated into data-related ones. During this process, the usual communication issues surface: differences in professional culture, context awareness, vocabulary, general understanding of the company’s challenges, etc. can make this discussion tricky and lead to critical misunderstandings.

We believe it’s crucial to clearly write down what has been decided and to operate afterwards in full transparency (in terms of prioritization, preparation tasks, findings, etc.). Keeping track of the preliminary discussions and their underlying rationales helps improve the traceability of key business decisions and share knowledge across teams.

🕵️‍♂️ Gathering relevant data

Once the requirements are clear, the analyst gathers all the needed data. The most frequent questions at this stage include:

  • What data would I ideally like to use?
  • What data can I use right now?
  • Are these data relevant, accurate and up to date?
  • What data did we use for similar analyses in the past?
  • What have we already used these data for?

To address these challenges, data catalogs and similar tools endeavor to map and document most of the data in the company. However, since they are global, tackle a governance topic and have to involve almost the whole company (from developers who write the code that stores data, to analysts who use them, to sales people who request and read analysis reports, to DPOs who supervise all data usage), their adoption is slow and they are of little help to analysts at this step.

We believe that keeping track of what data is actually used to perform a real, useful analysis is the best way to document the available data. A pragmatic approach to data management should be favored over a systematic (and unrealistic) one.
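To make this concrete, here is a minimal sketch (in Python, purely illustrative, not Husprey’s implementation) of the idea: extract the tables an analysis actually queries and record them alongside the analysis, so that documentation accumulates from real usage. The regex-based extraction and the analysis_log store are assumptions made for the example.

```python
# Minimal sketch (assumption-laden, not Husprey's implementation):
# document data by recording which tables each real analysis touches.
import re
from datetime import date

def tables_used(sql: str) -> set:
    """Naively extract table names that follow FROM/JOIN keywords."""
    return set(re.findall(r"\b(?:from|join)\s+([\w.]+)", sql, flags=re.IGNORECASE))

# In practice this would live in a shared, searchable store.
analysis_log = []

def register_analysis(title: str, sql: str) -> None:
    analysis_log.append({
        "title": title,
        "date": date.today().isoformat(),
        "tables": sorted(tables_used(sql)),
    })

register_analysis(
    "Churn by acquisition channel",
    "SELECT channel, AVG(churned) FROM marts.customers "
    "JOIN marts.subscriptions USING (customer_id) GROUP BY channel",
)
print(analysis_log[0]["tables"])  # ['marts.customers', 'marts.subscriptions']
```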

💻 Analyzing data

In a third step, the data analyst uses her expertise in statistics, data mining and machine learning to carry out the actual analyses. Frequently, similar analyses have already been performed in the past, and the analyst could leverage such past work if it were registered in a searchable repository. However, most archiving tasks are left to individual analysts, without clear guidance or planning (at best, we see dedicated folders of source files on a shared drive).

We believe that these problems can be addressed by providing data analysts with dedicated tools that embed best practices from both the statistics and programming worlds: the restrictions and assumptions of particular statistical methods and the usual techniques for a given problem on one side; peer review and code versioning on the other.
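As an illustration of what embedding statistical best practices could look like, here is a minimal sketch (the thresholds and the fallback test are assumptions for the example, not a prescription from our product): check the assumptions of a t-test before running it, and fall back to a non-parametric alternative otherwise.

```python
# Minimal sketch: verify a test's assumptions before using it,
# and fall back to a non-parametric alternative otherwise.
from scipy import stats

def compare_groups(a, b, alpha: float = 0.05) -> dict:
    # Shapiro-Wilk for normality of each group, Levene for equal variances.
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    equal_var = stats.levene(a, b).pvalue > alpha
    if normal:
        test, result = "t-test", stats.ttest_ind(a, b, equal_var=equal_var)
    else:
        test, result = "Mann-Whitney U", stats.mannwhitneyu(a, b)
    return {"test": test, "p_value": result.pvalue}

print(compare_groups([12, 15, 14, 10, 13, 16], [18, 21, 19, 17, 22, 20]))
```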

📊 Delivering results

Finally, the data analyst gathers all her results and delivers her findings in an appropriate way, depending on the purpose of the analyses, their audience and the nature of the underlying data: producing a one-off report, or creating or updating a recurring automated report or dashboard. Unfortunately, traceability is rarely taken into account when delivering these results: they eventually end up in slides on which important decisions will be made, with no simple means of going back to the analysis itself or the source data.

We believe that the deliverables should be clearly linked to the results, the analysis itself, and the underlying data, so that the traceability of each data-driven decision is guaranteed.

Our Proposition

At Husprey, we build the collaborative home for data analytics.

Based on a data-specific workflow, our SaaS application fosters collaboration with and within data teams.

We help business users ask their questions in the best way.

We help data analysts bring speed and reliability to their process: search through what has been done, reuse it and spread knowledge.

Would you like more details? Learn more (and register!) on our website: husprey.com

A glimpse at our design mock-ups

As a business user

As a data analyst

If you found this interesting, don’t hesitate to clap for us and share our first blog post ever. You can also email us or join our waitlist at husprey.com.

Brand new Husprey logo. Thank you so much Amélie & Zoé :D
