Two years ago, I was listening to Dylan Baker — my future co-founder — speak on a panel at a meetup. Dylan was venting about a frustrating experience LookML developers encounter daily: the drudgery of manually testing dimensions and measures from the Explore page to make sure they don’t error.

Dylan sighed, “I would pay for a tool that would go through each explore, click on every dimension, and run it to make sure there are no SQL errors.”

After the meetup, I found Dylan and said, “That tool you described… I think we can build it. Should we…


Building a data-driven desserts business with Looker

This post is adapted from an article I wrote for Looker about increasing data literacy and usage at Milk Bar.

“Wait, why does a bakery need a data engineer?” I get that a lot. I’m a data team of one at Milk Bar, the popular dessert brand by chef Christina Tosi, of Chef’s Table and MasterChef fame. In addition to sampling literally every cake, cookie, or truffle that our R&D team sends over to our office, I’m responsible for wrangling information across our omni-channel business. …


Keep calm and use lots of data validation

I occasionally need to grant a non-technical colleague the ability to input information into our data warehouse on an ad-hoc basis. For example, our customer service team at Milk Bar maintains a list of special wedding cake orders in Google Sheets that we need to collect data from for downstream calculations.

This is a tricky problem for a data engineer — my colleagues don’t have the technical skill to interact directly with our data stack, and I don’t want to have to support my own web form or similarly involved infrastructure to collect this information. …


Implementing a probabilistic model for customer lifetime value

When it comes to customer lifetime value (CLV), most people are doing it wrong, according to Wharton marketing professor Peter Fader. At face value, CLV is an easy concept to understand: it’s a measurement of how much a business’s customers are worth over their lifetime. In practice, it’s deceptively hard to implement in a way that accurately captures the variation in customer behavior. CLV is so valuable to every business that it’s worth putting in the time and study to estimate it properly.

To help you estimate CLV the right way, we’ll walk through the formal definition, examine the pitfalls…


Using Looker’s API and Airflow to send feedback request emails to users

At Milk Bar, we use Looker to serve up business intelligence across our company. Looker is our data buffet, and I expect our department heads to be able to self-serve the majority of their data requests using Looker. It’s been immensely popular, but I’ve also noticed that some people are slower to adopt Looker than others. Why might this be?

Looker provides some helpful usage charts in Admin > Usage via the i__looker Explore. Using this Explore, I can monitor usage across the company, but it’s hard to know why someone with low usage is not using the tool. …


How to develop essential data skills by tackling interesting projects

There’s no time like the present to teach yourself data science, analytics, or engineering. A quick search on Udemy shows over 2,000 results for courses about “data.” People have even compiled their own Master’s degree programs in data science composed entirely of free online courses.

In my experience as a self-taught data engineer, taking dozens of massive open online courses (MOOCs) is not the best approach. It didn’t work for me.

I didn’t have hours every night and weekend to spend studying. The lectures didn’t feel practical enough to launch me from a non-technical field to a job in data…


How Pandora’s method and a $3 pack of sticky notes made stakeholder management a breeze.

Sticky notes: $3. Stakeholder buy-in? Priceless.

Army of one

Like many data professionals at small and mid-size companies, I’m a data team of one at Milk Bar. As the first data hire, I’ve had the rather terrifying privilege of building our data stack from the ground up. I’ve spent the first few months of my time here building data loaders, modeling our data in BigQuery and dbt, and deploying and training our teams on Looker. Now that our data stack is functional, it’s time to plan for next quarter.

The value of business intelligence and analytics is quickly becoming apparent at Milk Bar, and more people are coming to…


Toward a testing philosophy for the data warehouse

Over time, software engineers have developed a strong philosophy for testing applications. Concepts like unit testing, the test pyramid, code coverage, and continuous integration have made application testing robust and have established solid design patterns. Good testing practices are taught and practiced in most computer science programs.

In my experience, a unified testing philosophy is missing in the data world. As a data professional, I tell people that my goal is to provide accurate and timely information to enhance decision-making. However, if I supply our decision-makers with inaccurate data, they might make far-reaching, strategic mistakes. If our website goes down…


BuzzFeed’s data is awesome, but can we collect it in a format that makes it more useful?

Photo by Dane Tewari on Unsplash

tl;dr: BuzzFeed published an interesting collection of data today — disciplinary case files for about 1,800 New York Police Department (NYPD) employees who were “accused of misconduct.” I wrote a scraper to download the data in PDF and plain text format for large-scale analysis.

Unfortunately, the case files aren’t stored in a way that makes large-scale analysis very easy. Each case file is stored as a separate PDF, but there’s no clear way to download all of them. The raw text is stored behind a tab in a JavaScript interface.

Josh Temple

Co-founder of @SpectaclesCI, analytics engineering @Spotify
