Business Intelligence at TextNow
Here at TextNow, we take data very seriously. Data is used across the company to drive critical business, product, and engineering decisions: designing new user experiences, developing new features, improving reliability and quality, and forecasting growth. Streaming event logs are analyzed and made actionable using our custom Business Intelligence (BI) infrastructure.
“In God we trust. All others must bring data.” — W. Edwards Deming
Infrastructure
When building the BI infrastructure for TextNow, our goal was to build something that was self-managed, scalable, and easy to use for our internal customers. To that end, we focused on three components: the data pipeline, the data warehouse, and data visualization.
Data Pipeline
There are two main sources of the data: instrumentation in source code that sends events through AWS Kinesis Firehose, and production MySQL dumps via AWS Data Pipeline.
For streaming data we use AWS Kinesis Firehose, which is the easiest way to load streaming data into Amazon Web Services (AWS). It helps us capture, transform, and load streaming data into Amazon Redshift (with a hop through Amazon S3), enabling near real-time analytics and easy plug-and-play with our various internal visualization tools. Amazon Kinesis Firehose is a fully managed service that scales automatically with the volume of our data, including peak traffic. It provides a multi-day queue, batching, compression, and encryption, which gives us granular control over how streaming data is handled.
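As a rough illustration of the instrumentation side, the sketch below serializes an event as newline-terminated JSON and hands it to Firehose. The event fields, the function names, and the stream name are assumptions made up for this example, not our actual schema. The `put_record` call follows the boto3 Firehose client interface, and the client is passed in so the code can be exercised without AWS credentials.

```python
import json
import time
from typing import Any, Dict


def encode_event(name: str, user_id: str, properties: Dict[str, Any]) -> bytes:
    """Serialize one event as a newline-terminated JSON record.

    The field names here are hypothetical; a real schema will differ.
    """
    event = {
        "event": name,
        "user_id": user_id,
        "ts": int(time.time()),
        "properties": properties,
    }
    # Firehose concatenates records as-is, so the trailing newline keeps
    # one JSON object per line in the S3 objects that Redshift later loads.
    return (json.dumps(event, sort_keys=True) + "\n").encode("utf-8")


def send_event(firehose_client: Any, stream_name: str, payload: bytes) -> None:
    """Send one record to a delivery stream.

    firehose_client is assumed to be boto3.client("firehose") or any object
    exposing the same put_record(DeliveryStreamName=..., Record=...) method.
    """
    firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )
```

With boto3 installed and credentials configured, a call might look like `send_event(boto3.client("firehose"), "app-events", encode_event("message_sent", "u1", {"platform": "ios"}))`, where `app-events` is a placeholder stream name.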
For MySQL data dumps and for extracting data from third-party APIs, we use AWS Data Pipeline, which helps us create complex data processing workloads that are fault-tolerant, scheduled, and highly available.
Data Warehousing
Our main data warehouse is Amazon Redshift, which offers several benefits:
- Petabyte-scale data warehouse that is still performant
- Uses columnar data storage and massively parallel processing (MPP)
- Easy to unload/load various formats of data to/from S3
- SQL based
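To make the S3 load/unload point concrete, here is a minimal sketch that builds Redshift `COPY` and `UNLOAD` statements as strings. The helper names and every table, bucket, and role value passed to them are hypothetical. The statements use IAM role authorization with gzip-compressed JSON and CSV, which is one common combination and not necessarily the options we run in production.

```python
def copy_json_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a COPY statement that loads gzip-compressed JSON from S3.

    Inputs are interpolated directly, so they are assumed to be trusted,
    internally defined values and never user input.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS JSON 'auto'\n"
        "GZIP;"
    )


def unload_csv_sql(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build an UNLOAD statement that writes query results to S3 as CSV."""
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query have to be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "GZIP;"
    )
```

The resulting strings can be run through any PostgreSQL-compatible client connected to the cluster.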
Cluster Specs:
- Nodes: 8
- CPU: 13 EC2 Compute Units
- Memory: 31 GiB per node
- Storage: 2 TB HDD per node
We also have a faster MySQL database in RDS that we use for summary tables and faster reporting of Key Performance Indicators (KPIs).
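A summary table of this kind might be refreshed roughly as follows. This is only a sketch: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere, and the `events` and `daily_kpi` tables and their columns are hypothetical.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a raw events table and a small per-day summary table."""
    conn.executescript(
        """
        CREATE TABLE events (
            event_date TEXT NOT NULL,
            event_name TEXT NOT NULL,
            user_id    TEXT NOT NULL
        );
        CREATE TABLE daily_kpi (
            event_date   TEXT NOT NULL,
            event_name   TEXT NOT NULL,
            event_count  INTEGER NOT NULL,
            unique_users INTEGER NOT NULL,
            PRIMARY KEY (event_date, event_name)
        );
        """
    )


def refresh_daily_kpi(conn: sqlite3.Connection) -> None:
    """Rebuild the summary table from the raw events."""
    # Both statements run in one transaction, so dashboards never read
    # a half-built summary table.
    with conn:
        conn.execute("DELETE FROM daily_kpi")
        conn.execute(
            """
            INSERT INTO daily_kpi
            SELECT event_date, event_name, COUNT(*), COUNT(DISTINCT user_id)
            FROM events
            GROUP BY event_date, event_name
            """
        )
```

Dashboards then query the small `daily_kpi` table instead of scanning raw events, which is where the faster KPI reporting comes from.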
Data Visualization
Collecting high volumes of data is just the beginning. The real magic begins when this data is turned into insights via data visualizations. We evaluated a lot of existing tools in the market and finally decided to use Airbnb’s open-source data exploration tool called Superset.
We cloned their GitHub repository and brought up our own instance on AWS Elastic Beanstalk.
Superset provides:
- Easy-to-add databases using a SQLAlchemy URI
- Intuitive drag-and-drop interface to build a rich set of visualizations
- Built-in statistical functions
- Extensible, high granularity security model allowing intricate rules
- Fast loading dashboards with configurable caching
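For reference, a SQLAlchemy URI has the form `dialect+driver://user:password@host:port/database`. The helper below is a small sketch for assembling one; the function name and all connection values are placeholders. Percent-encoding the credentials matters because characters such as `@` or `/` in a password would otherwise break URI parsing.

```python
from urllib.parse import quote


def sqlalchemy_uri(
    dialect: str, user: str, password: str, host: str, port: int, database: str
) -> str:
    """Assemble a SQLAlchemy database URI with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/".
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"{dialect}://{user_enc}:{password_enc}@{host}:{port}/{database}"
```

For example, `sqlalchemy_uri("redshift+psycopg2", "bi_user", "p@ss/word", "example-host", 5439, "analytics")` yields `redshift+psycopg2://bi_user:p%40ss%2Fword@example-host:5439/analytics`. The `redshift+psycopg2` dialect assumes the `sqlalchemy-redshift` package is installed; a MySQL database would use a dialect such as `mysql+pymysql` instead.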
Measuring Success
Here are just a few of the results we saw after implementing our BI infrastructure:
- With detailed acquisition funnel analytics, we are able to stay on the bleeding edge of performance marketing and acquire users at scale: upwards of 4.5 million new registrations in the last 3 quarters.
- With automated alerts for certain categories of spam, our response time for specific types of abuse has dropped from a few days to just a few hours.
- With customized sales, we increased premium conversion by ~15%.
Business Intelligence empowers us to measure and improve business KPIs, net promoter score (NPS), spam detection, app performance, new feature development, and many other things so we can continuously improve the user experience.
BI is part of Growth at TextNow, and we’re actively looking for web developers, engagement managers and data scientists to join our talented team in San Francisco. Click here to learn more about our career opportunities.