Pleasures and Pitfalls of Looker

TL;DR: Looker is great at getting data into our internal users’ hands here at Earnest, but we’ve realized we need to be careful about how we use it.

Pleasures

Looker has helped us discover and share information at Earnest. Here’s how:

The need

We don’t have an army of analysts at Earnest — we’re still operating as a lean, young company. As a data-driven company, though, everyone in the business needs data to inform decisions, build models, or generate external reports. How do we fulfill our myriad needs for data? We could just throw more people at the problem, and this would help, but 1) it would slow us down — there’d be an extra layer between the data and the people who need it, and 2) it’s not a scalable solution that will fulfill our aspirations to grow exponentially. Instead, we primarily use Looker.

Looker democratizes data

Looker provides a graphical interface that lets non-technical business users access data, extend an existing analysis, modify a live dashboard, or perform an ad hoc investigation, all without needing a PM, engineer, or analyst to write raw SQL. Essentially, technical folks predefine the joins between tables once, and end users then pick the dimensions, measures, and filters they want to chart without writing any SQL themselves.
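To make the "predefine the joins, let users pick fields" idea concrete, here is a toy sketch in Python. It is not how Looker is implemented; the `Explore` class, the `build_sql` helper, and the `loans`/`applicants` tables are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Explore:
    """Hypothetical stand-in for a modeled data set (not a Looker API)."""
    base_table: str
    joins: dict[str, str]       # joined table -> ON clause, defined once
    dimensions: dict[str, str]  # friendly name -> SQL expression
    measures: dict[str, str]    # friendly name -> aggregate SQL expression


def build_sql(explore, dimensions, measures, filters=None):
    """Assemble a query from the field names a user picked."""
    select = [f"{explore.dimensions[d]} AS {d}" for d in dimensions]
    select += [f"{explore.measures[m]} AS {m}" for m in measures]
    lines = ["SELECT " + ", ".join(select), f"FROM {explore.base_table}"]
    for table, on_clause in explore.joins.items():
        lines.append(f"LEFT JOIN {table} ON {on_clause}")
    if filters:
        # Filters are raw SQL fragments here; a real tool would parameterize them.
        lines.append("WHERE " + " AND ".join(filters))
    if dimensions and measures:
        # Group by the dimension columns, referenced by position.
        positions = [str(i + 1) for i in range(len(dimensions))]
        lines.append("GROUP BY " + ", ".join(positions))
    return "\n".join(lines)


# The modeler writes this once...
loans_explore = Explore(
    base_table="loans",
    joins={"applicants": "applicants.id = loans.applicant_id"},
    dimensions={"state": "applicants.state"},
    measures={"loan_count": "COUNT(loans.id)"},
)

# ...and the end user only chooses names from a menu.
sql = build_sql(
    loans_explore,
    dimensions=["state"],
    measures=["loan_count"],
    filters=["loans.status = 'funded'"],
)
print(sql)
```

The point of the sketch is the division of labor: the join logic lives in one reviewed definition, and every query a user builds inherits it.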

Easy sharing

Productionalized “Looks” (essentially pre-written SQL queries) live in personal or group “Spaces”. Saved Looks can easily be shared (nice and clean like https://looker.example.com/looks/1095) or browsed in the public Spaces hierarchy, which is especially helpful for new employees. Ad hoc analyses can be shared just by copy/pasting the URL:

Everything is magically saved in the URL — copy/paste shares it all
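For the curious, the general technique of keeping an analysis in the URL can be sketched in a few lines of Python. This is only an illustration of the idea: the parameter names (`fields`, `filters`, `sort`) and the `/explore/loans` path are assumptions made up for the example, not Looker's actual URL format.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def state_to_url(base_url, state):
    """Serialize a (hypothetical) query state into a shareable URL."""
    query = urlencode({
        "fields": ",".join(state["fields"]),
        "filters": json.dumps(state["filters"], sort_keys=True),
        "sort": state["sort"],
    })
    return f"{base_url}?{query}"


def url_to_state(url):
    """Rebuild the query state from a URL produced by state_to_url."""
    params = parse_qs(urlparse(url).query)
    return {
        "fields": params["fields"][0].split(","),
        "filters": json.loads(params["filters"][0]),
        "sort": params["sort"][0],
    }


state = {
    "fields": ["applicants.state", "loans.count"],
    "filters": {"loans.status": "funded"},
    "sort": "loans.count desc",
}
url = state_to_url("https://looker.example.com/explore/loans", state)
restored = url_to_state(url)
```

Because the whole state round-trips through the URL, the recipient needs nothing else to see the same analysis.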

Extra bonus: any change to our LookML (the code we write that tells Looker what data we have and how to join it together) is automatically checked against all saved Looks by the LookML validator.

Quick and easy to update

The LookML editor makes it pretty straightforward and quick to update our queries to incorporate any additions or changes to our underlying data model. It also allows for a clean and resilient library: field definitions can reference other fields, so a change made in one place carries through to every field built on it, and the library is less prone to half-applied updates than hand-written SQL would be.
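Here is a rough Python sketch of why fields that reference other fields keep a library clean. It borrows the `${field}` reference style that LookML uses, but the resolver itself, the `resolve` function, and the loan columns are hypothetical and only illustrate the idea.

```python
import re

# Matches references written as ${field_name}.
REF = re.compile(r"\$\{(\w+)\}")


def resolve(name, fields, _seen=()):
    """Expand a field's definition, recursively inlining referenced fields."""
    if name in _seen:
        chain = " -> ".join(_seen + (name,))
        raise ValueError(f"circular reference: {chain}")

    def expand(match):
        inner = resolve(match.group(1), fields, _seen + (name,))
        return f"({inner})"

    return REF.sub(expand, fields[name])


fields = {
    "principal": "loans.principal_cents / 100.0",
    "interest": "loans.interest_cents / 100.0",
    "total_owed": "${principal} + ${interest}",
}
before = resolve("total_owed", fields)

# Suppose the underlying column is later stored in dollars: fix one definition...
fields["principal"] = "loans.principal_dollars"
# ...and every field built on it picks up the change.
after = resolve("total_owed", fields)
```

With pure SQL, the same fix would have to be repeated by hand in every query that computed the total.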

SQL Runner

SQL Runner is a great tool for quick ad hoc queries: technical users can write arbitrary SQL in the Looker environment, which is 1) in a browser, where we all spend most of our time anyway, and 2) lightweight. The latest version made it even better, with a query history tab and convenient buttons for table descriptions. It’s often the fastest way to get a quick answer to many questions and is quickly becoming a realistic alternative to dedicated SQL tools.

Pitfalls

Here are some gotchas we’ve run into — I hope this may help others avoid them.

It’s hard to stay in sync

We can’t synchronize data model changes in our underlying system with LookML changes. If the right people realize the need and prepare the LookML change beforehand, it’s possible to make it soon after the underlying data changes. Even then, we can’t deploy the two simultaneously, which is not a huge deal but hurts availability. It’s also pretty tough to test LookML changes before the code goes live.

Speaking of testing…

Our engineering team has a battery of continuous integration tests that builds must pass before being deployed. It’s easy to know if a change you made in one part of the application will break things in other parts.

Unfortunately, similar tests are not performed with respect to Looker, leading to some panicked occasions when things break. Even worse, these breakages can be silent. If you’re lucky, a Look breaks loudly because the data it used no longer exists: the user sees an error and can sound an alarm. If fate conspires against you, the meaning of an underlying piece of data changes but its name does not: Looker happily constructs its output but users see wrong information until the problem is discovered, usually by double-checking suspicious-looking data. This is obviously worse and leads to confusion and heartburn for all involved.

Both syncing and testing issues could be solved if we were able to include LookML changes in our product pull requests and have those changes automatically tested against the expected output of both our product and Looker.
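As a sketch of what such automated checks might look like, the Python below covers both failure modes against a small fixture database. It is an assumption-laden illustration, not something we run: the `loans` table, the helper names, and the use of SQLite are all invented for the example. A schema check catches the loud failure (a column disappears), and a known-answer check catches the silent one (a column keeps its name but changes meaning).

```python
import sqlite3


def missing_columns(conn, required):
    """Return {table: missing column names} for columns the BI layer references."""
    missing = {}
    for table, columns in required.items():
        # PRAGMA table_info yields one row per column; index 1 is the name.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        present = {row[1] for row in rows}
        gone = set(columns) - present
        if gone:
            missing[table] = gone
    return missing


def matches_known_answer(conn, sql, expected):
    """Run a query against fixture data and compare with a hand-checked value."""
    return conn.execute(sql).fetchone()[0] == expected


# Fixture data with a result we verified by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (id INTEGER, status TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [(1, "funded", 100000), (2, "funded", 200000), (3, "pending", 50000)],
)

REQUIRED = {"loans": {"id", "status", "amount_cents"}}
FUNDED_TOTAL_SQL = "SELECT SUM(amount_cents) FROM loans WHERE status = 'funded'"

schema_ok_before = missing_columns(conn, REQUIRED) == {}
answer_ok_before = matches_known_answer(conn, FUNDED_TOTAL_SQL, 300000)

# Simulate a silent change: the column keeps its name but now holds dollars.
conn.execute("UPDATE loans SET amount_cents = amount_cents / 100")

schema_ok_after = missing_columns(conn, REQUIRED) == {}
answer_ok_after = matches_known_answer(conn, FUNDED_TOTAL_SQL, 300000)
```

After the simulated change the schema check still passes, and only the known-answer check notices that the numbers are wrong, which is exactly the case that bites hardest in practice.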

It’s too good

We have many extremely bright, resourceful people at Earnest (and are looking for more!). Easy access to all our data empowers them to solve whatever problems they run into, which is great. The other (pernicious) side of this coin is that it becomes all too easy to rely on Looker for more than its intended function as a business intelligence tool.

We’ve had times when Looker fills a gap and normal business processes overly rely on it. In the short run, the job gets done, but this is dangerous territory: 1) Looker is subject to the risks mentioned above, in ways that are acceptable for a BI tool but unacceptable for production work, and 2) if business processes exist outside our core system, we don’t capture data in a structured way — as a data-driven company, this doesn’t work in the long run.

We’d often be better off if our product teams got the signal earlier on that XYZ data was needed, so that we could build it into the product. We have taken steps to make sure that this now happens, and we all now have a sixth sense that says, for certain tasks, “We *could* do this in Looker…but we probably shouldn’t.”

TL;DR again

Looker is a powerful tool that allows Earnest employees to make smarter, more informed decisions without a large in-house analytics team. We’ve realized, though, that “with great power comes great responsibility”, and we now make sure to only rely on it for the right things. We’ll continue to focus on how best to use it.


Keith is a Product Manager at Earnest and was previously a PM at PayPal, an Investment Associate at ValueAct Capital, and a Business Analyst at McKinsey & Company. He has a degree in Electrical Engineering from Princeton University and an MBA from the Kellogg School of Management.


P.S. If anyone from Looker is out there…

A couple little quibbles:

  1. The search box in an Explore gets slowww when there are many joined views with a lot of fields. Speedups here — or at least a spinner to amuse us while we wait (and let us know things are working) — would be great.
  2. Could we get a “copy to clipboard” button in SQL Runner? Things get a bit weird at the moment…
Argh whyyyy