The Self-Service Analytics Initiative at Evermos

Embracing Self-Service Analytics in a Dynamic Business Environment

Muhammad Ammar Fauzan
evermos-tech
8 min read · Oct 20, 2023


By Muhammad Ammar Fauzan & Rendy Bambang Junior

Data Team at Evermos

Evermos

Evermos is a retail enabler startup in Indonesia that helps brands widen and deepen their market penetration. We have focused on building a reseller model as shared infrastructure and assets for brands to reach neighborhoods, especially in lower-tier cities. We empower individuals, 70% of them women, to join Evermos as resellers and become entrepreneurs, earning income for their families. We help local emerging and challenger brands scale their businesses.

The data team at Evermos strives to create value and impact for the business and end users through data and analytics: delivering impactful insight analysis for decision-making, enabling use cases with data and machine learning/AI, empowering stakeholders to utilize data themselves, and providing trustworthy data securely and reliably. We are a team of data scientists, data analysts, BI engineers, and data engineers. We believe in continuous learning and improvement.

Evermos Analytics Funnel, Aiming for Business Impact

The Classic Problem: Too Much to Do, Too Little Time

As Evermos pursues more business initiatives, the data team faces a new challenge: how can we keep providing relevant data and insights in a timely manner? Contrary to the hype, the fundamental needs of every business are not AI, GPT, or advanced analytics algorithms. Every new business initiative needs basic data reporting or dashboards as a starting point [1]. Not sexy, but that’s the reality.

Without solving the basic reporting needs, the data team will never have time to focus on deep insight analysis. This is a big problem for two reasons. First, the data team’s unique value proposition is impactful insight analysis; if we don’t have time to do that analysis, we might miss opportunities or overlook risks. Second, as the team grows in knowledge, we need more challenging tasks to prove that we can create value and deliver impact using analytics.

In addition, the ratio of the data team to the overall workforce is often highly imbalanced; most companies sit at 1–5% [2]. This means a long lead time from data request to delivery due to data team resource limitations. As the business becomes more and more agile, this is yet another bottleneck to solve.

The Self-Service Analytics Projects

Despite the high demand for data, once we eliminate requests that are not valid (e.g., those that do not align with business objectives), we can classify data needs by complexity into the categories below:

  1. Simple: reporting or aggregation on 1–2 tables; requests to add more filters.
  2. Intermediate: reporting requests with more complex needs, such as multiple joins and aggregations.
  3. Advanced: analysis requests, such as correlation or diagnostic analysis.

The bulk of requests fall into the simple and intermediate categories, especially when we launch new features or business initiatives. We believe that, given the right tooling, training, and process, we can enable our users to do the simple tasks themselves, and even the intermediate ones. Starting from that hypothesis, we kicked off the Self-Service Analytics Project.

The scope of the self-service analytics project is to prepare the technology and processes required to enable users to analyze data by themselves, and then train those users to use the technology and follow the process. Our target users are the business, product, and engineering teams. The business team, especially business operations, needs to monitor metrics from day to day. The product team envisions products and builds features to improve those metrics. The engineering team implements features, some of which require data and analytics.

The self-service analytics project execution can be divided into four aspects: 1) people, 2) technology, 3) process, and 4) data.

The People: Series of Workshops

There is a lot our stakeholders need to learn before they can do self-service analytics in our business intelligence tool, Metabase. So we designed a four-part workshop series to get them ready:

  1. Introduction to DataHub and Metabase
  2. Intermediate Query Builder
  3. Basic Query
  4. Intermediate Query

Here’s an overview of the topics in each session:

  • 1st Series: Introduction to DataHub and Metabase
Title slide of the first series workshop

We cover three topics. First, the what, why, and how of a data catalog, introducing DataHub as our data catalog platform. Second, an introduction to Metabase as our BI tool. Lastly, the basic query builder in Metabase. With the query builder, stakeholders can create their own reports in just a few clicks, without writing complex SQL 🙂

So in this series, our objective is to make sure our stakeholders understand why DataHub matters and how to use it, and that they can at least produce simple reports with the basic query builder, like the example sketched below.
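To make that concrete: behind the scenes, the query builder generates SQL (Metabase can show you the native query behind a query-builder question). A simple question like "orders per status this month" corresponds roughly to the following sketch, where the table and column names are hypothetical:

```sql
-- Roughly the SQL behind a simple query-builder question:
-- "count of orders per status, filtered to the current month".
-- Table and column names are hypothetical.
SELECT
    status,
    COUNT(*) AS total_orders
FROM orders
WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY status
ORDER BY total_orders DESC;
```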

  • 2nd Series: Intermediate Query Builder

In the second series of workshops, we show how to solve more complex cases that can still be handled with the query builder, such as joining tables, creating new columns with CASE WHEN logic, computing date differences, and so on. Our objective in this session is to make sure stakeholders can build these more complex cases using the query builder; a hypothetical example is sketched below.
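As a hedged illustration of an "intermediate" case (names are hypothetical, and date functions vary by database; DATEDIFF below is Snowflake-style), this is roughly the SQL such a query-builder question produces:

```sql
-- Hypothetical "intermediate" case: a join, a derived CASE WHEN column,
-- and a date difference. DATEDIFF is Snowflake-style syntax; the
-- function name varies by database.
SELECT
    o.order_id,
    r.reseller_name,
    CASE
        WHEN o.total_amount >= 500000 THEN 'large'
        ELSE 'small'
    END AS order_size,
    DATEDIFF('day', o.created_at, o.delivered_at) AS days_to_deliver
FROM orders    AS o
JOIN resellers AS r ON o.reseller_id = r.reseller_id;
```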

  • 3rd Series: Basic Query

In the third series, we share basic knowledge about queries. We recognize it is not easy to make non-technical teams comfortable with SQL, so to keep the workshop effective, we share reference links to read before joining the session. We explain the basics of SQL: SELECT statements, filters, sorting, and basic aggregate calculations. To help stakeholders follow along, we also prepare exercises modeled on the business team’s common reports, like the one sketched below.
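A typical exercise from this session might look like the following, covering SELECT, filtering, aggregation, and sorting in one small query (table and column names are made up for illustration):

```sql
-- Hypothetical exercise: SELECT, a filter, aggregation, and sorting.
SELECT
    province,
    COUNT(*)         AS total_resellers,
    AVG(monthly_gmv) AS avg_monthly_gmv
FROM resellers
WHERE joined_at >= '2023-01-01'  -- filter: resellers who joined in 2023
GROUP BY province                -- aggregate per province
ORDER BY total_resellers DESC;   -- sort: largest provinces first
```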

  • 4th Series: Intermediate Query

Last but not least, we held the final session of the series on more complex query knowledge: aggregate calculations, joining tables, common table expressions and subqueries, conditional logic with CASE WHEN, and syntax for modifying and handling dates. At the end of the session, as in the previous ones, we ran a hands-on exercise, a flavor of which is sketched below, to validate participants’ understanding.
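For a sense of the material, here is a hypothetical exercise combining a common table expression, CASE WHEN logic, and date handling (the names are illustrative, not our actual schema):

```sql
-- Hypothetical exercise: a CTE, CASE WHEN, and date handling combined.
-- Names are illustrative, not our actual schema.
WITH monthly_orders AS (
    SELECT
        reseller_id,
        DATE_TRUNC('month', created_at) AS order_month,
        COUNT(*) AS orders_in_month
    FROM orders
    GROUP BY reseller_id, DATE_TRUNC('month', created_at)
)
SELECT
    order_month,
    CASE
        WHEN orders_in_month >= 10 THEN 'active'
        ELSE 'occasional'
    END AS activity_level,
    COUNT(DISTINCT reseller_id) AS resellers
FROM monthly_orders
GROUP BY 1, 2
ORDER BY order_month;
```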

The Data: Easy to Discover, Easy to Use

To ensure our users keep using our data, we need to make the data easy to use. For example, if certain tables are frequently joined, we can pre-join them to make things simpler for our users (see the sketch after the figure below). To avoid duplicate data and visualizations, we also need to make existing data and visualizations easy to discover. To solve this, we use DataHub [3] as our data catalog: users search for data first and only build something new if it does not exist yet. In DataHub, we use the Glossary to track our key business metrics, with a dashboard for each (the data catalog project probably deserves a post of its own).

DataHub Search and Filter Illustration. Source: DataHub
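To sketch the pre-join idea (all table and column names here are hypothetical), rather than asking every user to repeat the same joins, the data team can publish a wide, ready-to-query view:

```sql
-- Sketch of pre-join processing (all names hypothetical):
-- publish a wide view so users don't repeat common joins themselves.
CREATE OR REPLACE VIEW analytics.orders_enriched AS
SELECT
    o.order_id,
    o.created_at,
    o.total_amount,
    r.reseller_name,
    r.province,
    p.product_name,
    p.category
FROM orders    AS o
JOIN resellers AS r ON o.reseller_id = r.reseller_id
JOIN products  AS p ON o.product_id  = p.product_id;
```

Users then query orders_enriched directly in Metabase instead of rebuilding the joins themselves.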

The Technology: Database and Visualization

From a database and visualization point of view, there are three key things to consider when doing self-service analytics. The first is security: we have to enforce the least-privilege principle, so only those who need access can reach the data, and we need to ensure personal data stays protected throughout the initiative. The second is cost: you have to monitor spend, especially on flexible-load infrastructure such as pay-per-query pricing or auto-scaling machines. The last is workload: you have to protect important workloads, such as daily data processing, from self-service queries, which tend to be less optimized than those written by the data team. You will need to monitor for outlier queries; a sketch of such a check follows the figure below.

Cost Monitoring Example on Snowflake Database. Source: https://medium.com/snowflake/snowflake-cost-management-overview-59c8b125e7df
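As an illustration of workload monitoring (not necessarily our exact setup), on Snowflake one could scan the ACCOUNT_USAGE.QUERY_HISTORY view for unusually long-running queries:

```sql
-- Illustrative outlier check on Snowflake (not our exact setup):
-- find yesterday's queries that ran far longer than typical.
-- TOTAL_ELAPSED_TIME is reported in milliseconds.
SELECT
    query_id,
    user_name,
    warehouse_name,
    total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP)
  AND total_elapsed_time > 300000  -- flag anything over 5 minutes
ORDER BY total_elapsed_time DESC;
```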

The Process: Reviews and Conventions

One crucial aspect of this initiative is ensuring the validity of reports generated through self-service. The challenge extends beyond the technical: it encompasses not only proficiency with Metabase’s query builder and SQL but also a deep understanding of our table schemas. Knowledge of our data structure is an essential requirement for effective self-service analytics.

To address validity, we created a user guide that stakeholders must follow when building their own reports. One of the steps is submitting a review request form. And to distinguish Metabase questions created by the data team from those created by non-data teams, we also have a naming convention: self-service question titles carry the prefix [SS].

Example of a question title using the [SS] naming convention
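The prefix also makes monitoring straightforward. As a minimal sketch, assuming read access to Metabase’s application database (where saved questions live in the report_card table), counting self-service questions per month could look like:

```sql
-- Minimal sketch, assuming read access to Metabase's application
-- database, where saved questions are stored in report_card:
-- count self-service questions per month via the [SS] title prefix.
SELECT
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) AS self_service_questions
FROM report_card
WHERE name LIKE '[SS]%'
  AND archived = FALSE
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;
```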

Then, to help stakeholders follow this guidance, we have documented the User Guide for Self-Service Analytics, as follows:

User Guide Documentation

Lessons Learned & What’s Next

What went well

  1. We can distinguish which questions come from this initiative and which do not, and we already have a proper monitoring dashboard for it.
  2. Every team had at least one representative using self-service analytics in Q3 2023; more than 10 team members created their own reporting.
  3. More than 70 questions were created and more than 200 edited independently by stakeholders in Q3 2023, removing many potential ad-hoc tasks that would otherwise land on the data team.
  4. We have a user guide for self-service that stakeholders can easily follow.
  5. Self-service works better when we have a business analyst counterpart on the business team, or a PM or business stakeholder who is fairly technical.

What can be better?

  1. Consistent “power users”: unfortunately, not many people do self-service frequently yet. We believe we can bring more stakeholders to consistent self-service in the future.
  2. Based on our monitoring, we still found cases where reports were released without properly following our guidance.

Next plan

  1. Include the documentation slides and video from our self-service workshops in the onboarding journey for new joiners at Evermos.
  2. Grow the number of power users. Instead of running workshops for large audiences, we prefer to empower selected candidates in personal (1-on-1) sessions, such as business analysts and stakeholders who request data frequently.

Join our Team

Thanks for reading to the final section. Yes, we are hiring! The data team at Evermos strives to make an impact on the business and help UMKM (Indonesian micro, small, and medium enterprises) grow using data analytics and machine learning. We are passionate about growing juniors through on-the-job coaching, drawing on our industry experience and the tremendous variety of data we have. Look for the openings here.

Data Team Outing, August 2023 at Dago, Bandung

References

[1] https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007

[2] https://www.synq.io/blog/data-team-size-at-100-scaleups

[3] https://datahubproject.io/
