API-driven Analytics in Data Science

How Data Science APIs help other teams to make data-driven decisions.

Erum
Better Practices
3 min read · May 20, 2019

Photo by Luke Chesser on Unsplash

Postman strives to use data in product decisions. In this article, I will cover our journey from traditional ways of dashboard development to API-driven dashboards for studying usage behavior.

The need for API-driven Analytics

The Data Science function, since its inception around two and a half years ago, has catered to the analytics needs of different functions at Postman. Dashboards and reports are the major need across functions. This is what the dashboard development workflow looked like when we started out.

Dashboard Development Workflow 2 Years Ago

It usually took 2 to 3 iterations to develop a production-ready dashboard, owing to back-and-forth discussions around evolving requirements. As the app grew more complex to cater to the diverse needs of the Postman community, more functions within the organization felt the need for richer analytics to drive product improvements. We needed a scalable way to fulfill this need.

Bringing the API way of thought to data science

The solution was to develop an internal Data API exposing a standardized set of product usage metrics for all functions within the organization (referred to as “business users” from here on). This would enable business users to choose the metrics of interest and create their own dashboards, eliminating back-and-forth discussions and making the dashboard development workflow more reproducible. Consumers of the API no longer have to worry about data sources, metric definitions, relationships between tables, or learning a scripting language.

Building an internal Data API

We built our internal Data API with the help of Looker’s Explores. A Looker Explore references the calculation logic of the metrics to be exposed, along with the relevant source tables and the relationships between them. We identified a comprehensive set of metrics regularly needed within the organization and exposed them through these Explores, which now act as the starting point for querying those metrics. Consumers can also add these metrics as graphs and other visuals to Looker dashboards. The dashboard development workflow now looks like this:

Existing Dashboard Development Workflow with Data API

With the Data APIs in place, the responsibility of the Data Science team reduces to building the fundamental APIs through Explores.
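To make the idea concrete, here is a minimal Python sketch of what "metrics defined once, queried by name" can look like. It is not LookML and does not use Looker’s API. The `Explore` class, the `users` and `events` tables, and the metric names are all hypothetical, and SQLite stands in for the real warehouse. The point is only that a consumer names a metric and a grouping, while the table names, joins, and calculation logic stay inside the definition.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Explore:
    """Hypothetical stand-in for an Explore: source tables, joins, metric logic."""
    base_table: str
    joins: list        # (table, join condition) pairs
    dimensions: dict   # public name -> column expression
    metrics: dict      # public name -> aggregate expression, defined once

    def build_query(self, metrics, group_by=()):
        """Turn metric and dimension names into SQL; reject anything not exposed."""
        if not metrics:
            raise ValueError("choose at least one metric")
        unknown = [m for m in metrics if m not in self.metrics]
        unknown += [d for d in group_by if d not in self.dimensions]
        if unknown:
            raise ValueError(f"not exposed by this explore: {unknown}")
        columns = [f"{self.dimensions[d]} AS {d}" for d in group_by]
        columns += [f"{self.metrics[m]} AS {m}" for m in metrics]
        sql = f"SELECT {', '.join(columns)} FROM {self.base_table}"
        for table, condition in self.joins:
            sql += f" JOIN {table} ON {condition}"
        if group_by:
            keys = ", ".join(self.dimensions[d] for d in group_by)
            sql += f" GROUP BY {keys} ORDER BY {keys}"
        return sql


# Assumed example definition; the real metrics and tables are not in the article.
USAGE_EXPLORE = Explore(
    base_table="events",
    joins=[("users", "users.id = events.user_id")],
    dimensions={"plan": "users.plan"},
    metrics={
        "active_users": "COUNT(DISTINCT events.user_id)",
        "requests_sent": "SUM(CASE WHEN events.event_name = 'request_sent' "
                         "THEN 1 ELSE 0 END)",
    },
)


def demo_connection():
    """In-memory SQLite database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, plan TEXT);
        CREATE TABLE events (user_id INTEGER, event_name TEXT);
        INSERT INTO users VALUES (1, 'free'), (2, 'free'), (3, 'team');
        INSERT INTO events VALUES
            (1, 'request_sent'), (1, 'request_sent'),
            (2, 'collection_created'), (3, 'request_sent');
    """)
    return conn


def run_query(conn, explore, metrics, group_by=()):
    """What a consumer calls: metric names in, rows out, no SQL written by hand."""
    return conn.execute(explore.build_query(metrics, group_by)).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(run_query(conn, USAGE_EXPLORE, ["active_users", "requests_sent"], ["plan"]))
    # prints [('free', 2, 2), ('team', 1, 1)]
```

Because every dashboard would go through the same definition, a change to how `active_users` is calculated happens in one place, which is the "single source of truth" property the Data API is meant to provide.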

API-first Data Science

Shifting the focus of Data Science from traditional dashboard development to a shared Data API that can be used throughout the organization has proven advantageous in several ways. These include:

  1. Organization-wide availability of readily consumable metrics
  2. Standardized metric definitions to avoid confusion
  3. Single source of truth
  4. No need to understand the complex relationships between the database tables to fetch data
  5. Elimination of inefficient back and forth between the Business Users and Data Science team
  6. No dependency on Data Scientists as a bottleneck for business users’ consumption of analytics
  7. More time for Data Scientists to spend on more interesting activities

The Data APIs are a major step towards API-first Data Science, where we develop our APIs first and then build our reports and dashboards on top of them. Since we launched these APIs internally, Postman’s product, customer success, and sales teams have begun using them to build higher-level dashboards. These dashboards help them identify trends and track the progress of their respective functions. Best of all, they can do this without raising a request with Data Science.
