Cheerz simple data stack

Technical insights about how we empower everyone at Cheerz to get access to business data.

Published in

Cheerz Engineering

4 min read · Apr 2, 2020

“Data”, “Data”, “Data”: you hear it all the time, everywhere. Everyone has realized how valuable the loads of data a company owns can be. This certainly applies to our company. At Cheerz, we run three applications that produce customizable photo products in factories all over Europe.

At Cheerz we want to create the best, most magical photo printing experience for our customers. We’ve been building up momentum over the past years. Like many others, in the early days our main focus was on meeting increasing demand by scaling Tech and ProdOps accordingly. That part is now working well, and marketing has also scaled up. Today we have reached the critical size at which we can target profitability and durability for the Cheerz brand. That’s what we aim for, and data is going to help us.

As the company grows, the amount of data surrounding it ramps up too. On the first day, it’s just your apps sitting alone on top of your DB. The next day, a shiny tracking plan is added. Then your custom Production Management System (PMS) is launched, plus “the” brand new marketing tool. Finally comes the time when your founders realize that there may be something in there that could help you grow healthier and stronger. It is not named yet, but you are already thinking about metrics for information, efficiency and performance.

OK, we can’t ignore these loads of information anymore, but how do we leverage them in the most fruitful way?

What ROI can I expect from the data?

To start, look at the scale of data usefulness in a company like Cheerz. At level 0, you do nothing with your data. At level 100, you have data/AI in your product (meaning an incredible UX), in your operations, marketing included (e.g. super-optimized processes), and in your business analytics (you know your trends, you anticipate, etc.).

From level 0, the first metrics you compute have a massive impact on your business: they put numbers on the strategy, and quite often they are an eye-opener. Then immediately comes the need for refinement: finer grain, better quality, higher frequency. In our context (we don’t have a data-based product), we can reach the 60–65 mark with an easy-to-maintain, not-too-pricey data stack run by a single person and a simple motto: deliver a few accurate(-enough) KPIs for all teams, every day.

This is what we intend to do. Below is how we make data flow from many data sources to the 60 data-hungry users of our BI tool.

Where does our data come from?

The sources we need to collect data from are (in order of appearance):

Postgres DBs: Cheerz Apps & our home-made PMS
APIs: CRM, Apps Analytics, PSP, Social Media
Lots of Google Sheets and Files generated by our tools.
FTPs
Amazon S3 buckets: filled by 3rd party tools
BigQuery: CRM and Tracking Events

How do we centralize & store data?

If you want to keep it simple, you need the right tool(s) to ensure smooth data collection and storage. They should be accessible to not-so-technical profiles, so that those profiles can stay focused on understanding the business. To do that we chose Panoply. It makes scheduled data collection and database management easy and reliable. In fact, you don’t do the management yourself, except for checking free storage space from time to time. Panoply runs (and optimizes) your Redshift data warehouse for you. In the end:

- We don’t have live data but hourly is good enough.

- It’s ELT instead of ETL, so we store raw data, which is better than no data.

- Panoply does not connect easily to everything, but no tool does.

- Managed Redshift is not as shiny (or as fast) as self-tuned, hyper-fast “other” tools, but it’s the right compromise at this stage of our story.

Panoply cares for our data and we concentrate on improving our business.
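To make the ELT point concrete, here is a minimal sketch of the idea, not our actual Panoply setup. It uses Python's built-in sqlite3 as a stand-in for the Redshift warehouse, and the `raw_orders` table, the `paid_orders` view and the sample rows are all hypothetical. Raw values are landed untouched as text, and typing and filtering only happen later, at query time.

```python
import sqlite3

# Hypothetical raw rows, kept exactly as a source might deliver them (all text).
RAW_ROWS = [
    {"order_id": "1", "amount_cents": "1990", "status": "paid"},
    {"order_id": "2", "amount_cents": "500", "status": "cancelled"},
    {"order_id": "3", "amount_cents": "4210", "status": "paid"},
]


def load_raw(conn, rows):
    """Extract + Load: land the rows as-is, with no cleaning or typing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id TEXT, amount_cents TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :amount_cents, :status)",
        rows,
    )


def create_paid_orders_view(conn):
    """Transform: cast and filter inside the warehouse, on top of raw data."""
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS paid_orders AS
        SELECT CAST(order_id AS INTEGER)     AS order_id,
               CAST(amount_cents AS INTEGER) AS amount_cents
        FROM raw_orders
        WHERE status = 'paid'
        """
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_raw(conn, RAW_ROWS)
    create_paid_orders_view(conn)
    print(conn.execute("SELECT SUM(amount_cents) FROM paid_orders").fetchone()[0])
```

Because the raw table is never modified, a business rule that changes later (say, which statuses count as revenue) only requires redefining the view, not re-collecting the data.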

How do we interact with our data?

The last tool in our two-segment data pipeline is Looker.

Looker alone handles all of our raw data. We use basic SQL and Looker’s proprietary LookML to render all the business insights we need. Looker materializes our core BI tables as Derived Tables and creates most views on request.
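In Looker, derived tables are declared in LookML and persisted by Looker itself. The sketch below only imitates the underlying idea in plain Python and SQL, with sqlite3 standing in for the warehouse. The `raw_orders` and `daily_revenue` tables, their columns and the sample rows are hypothetical and are not our real models.

```python
import sqlite3

# Hypothetical raw orders: (order_date, amount_cents, status).
SAMPLE_ORDERS = [
    ("2020-04-01", 1990, "paid"),
    ("2020-04-01", 500, "cancelled"),
    ("2020-04-01", 4210, "paid"),
    ("2020-04-02", 1000, "paid"),
]


def seed_raw_orders(conn, orders):
    """Create and fill the raw table the KPI table is derived from."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_date TEXT, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", orders)


def rebuild_daily_revenue(conn):
    """Materialize a small KPI table from raw data, replacing any old copy."""
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS daily_revenue")
        conn.execute(
            """
            CREATE TABLE daily_revenue AS
            SELECT order_date,
                   COUNT(*)          AS paid_orders,
                   SUM(amount_cents) AS revenue_cents
            FROM raw_orders
            WHERE status = 'paid'
            GROUP BY order_date
            """
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    seed_raw_orders(conn, SAMPLE_ORDERS)
    rebuild_daily_revenue(conn)
    for row in conn.execute("SELECT * FROM daily_revenue ORDER BY order_date"):
        print(row)
```

Rebuilding the whole table on a schedule is the simplest persistence strategy: it is idempotent and easy to reason about, at the cost of recomputing everything each time.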

This is where we add one key thing to our motto: empower your end-users. At Cheerz, each person with a named account can build their own Look to track business KPIs.

In that context, your data person’s main job is to prepare ready-made models and to focus on data quality and delivery. They can lend a hand with complex analyses, but business drives the roadmap, not technical issues.

Even if there is a long way to go to the 100 mark, every day we find new clues. They allow us to understand the business, improve our processes and grow the company, with data as part of it.

Thanks for getting to the bottom of this article! Do you want to see more than what we’ve shown here? Come join the team, or let’s just talk over a cup of coffee or a drink 🍻, virtually or in our office.

PS. Thanks to Ari Bajo for your wise comments.