After one year in business, I was excited that Fishtown Analytics was still alive and showing early signs of promise. After two years, I was excited that our way of thinking about analytics was starting to see some traction.

This week marks our third anniversary. And, wow…year three has really been something. Here’s a by-the-numbers:

  • There are 2,100 members of dbt Slack (~3.5x YoY growth).
  • dbt Cloud processed 140,000 jobs from 200 accounts in June (~4x and 3x YoY growth, respectively).
  • There are now 3 cities: New York, San Francisco, and London…

Holy crap! Within the past week we’ve seen the acquisitions of the two biggest players in the modern BI landscape, Looker (announcement) and Tableau (announcement). And if you broaden your view to the entire analytics tech stack, the picture gets even bigger. Here are the major acquisitions I’ve tracked over the past year:

  • Alooma. Acquired by Google on 2/19/19 for an undisclosed amount.
  • Periscope. Acquired by Sisense on 5/14/19 for an undisclosed amount.
  • Looker. Acquired by Google on 6/6/19 for $2.6b.
  • Tableau. Acquired by Salesforce on 6/10/19 for $15.7b.

Compare that to the list of…

Step 1: Switch to Snowflake or BigQuery.

I joke, I joke. While Snowflake and BigQuery do have much more sensible approaches to dealing with semi-structured data, my guess is that you probably don’t have the luxury of making that switch in your current frenetic search to figure out how to deal with that nasty JSON array living in the varchar(max) field you’re staring at. Sometimes we have to use the tools we have!

What makes Redshift’s lack of an unnest, or flatten, function all the more frustrating is that Amazon’s other columnar SQL products, Athena and Spectrum, both have…

You’ve heard of Product-Market Fit. PMF has become common parlance within startup culture. It has even spawned similar terms, my favorite of which is founder-market fit—you get the idea.

Your startup is full of lots of “fits”. This is called strategy—making lots of interlocking decisions that make your organization uniquely suited to pursue a particular market opportunity. Here are some other example “fits”:

  Product/location fit. Does your product development require access to the unique San Francisco talent market or are there other…

Sinter just became dbt Cloud. Why the re-brand?

Tonight we put a permanent redirect from to Simultaneously, we launched a new design of the product: both a new user interface and some additional functionality.

I’ll let Drew dig into the design & functionality changes in his post, but I wanted to take a minute to give some context on the rename. Why are we doing this? What are the implications for Sinter customers? How will this impact the broader dbt community?

To do this, I have to go back in time a bit. Bear with me.

Feeling around in the dark.

Drew and I started working on dbt and Fishtown Analytics…

The role of the data engineer in a startup data team is changing rapidly. Are you thinking about it the right way?

I find myself regularly having conversations with analytics leaders who are structuring the role of their team’s data engineers according to an outdated mental model. This mistake can significantly hinder your entire data team, and I’d like to see more companies avoid that outcome.

This post represents my beliefs about when, how, and why you should hire data engineers as a part of your team. It’s based on my experience at Fishtown Analytics working with over 100 VC-backed startups to build their data teams, and on conversations with hundreds of companies in the wider data community.

If you run a…

The hardest thing about scaling a company is communication.

The more people who work at a company, the more nodes in the network and the more links between them. In a completely decentralized network, the number of connections is proportional to the square of the number of nodes. In a hierarchical network, each node is connected to only X peers, which limits complexity but also significantly decreases the flow of information. The reason why messaging platforms like Slack are so important today is that they reduce communication friction, allowing companies greater flexibility and/or greater throughput in how they design their communications…
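To make the scaling argument concrete, here is a minimal sketch in Python that counts links under the two network shapes described above. The function names are hypothetical, and the value of 5 used for X is an arbitrary assumption chosen only for illustration.

```python
def full_mesh_links(n: int) -> int:
    """Links when every person is connected directly to every other person."""
    return n * (n - 1) // 2


def bounded_degree_links(n: int, peers: int) -> int:
    """Upper bound on links when each person is connected to at most `peers` others."""
    return n * min(peers, n - 1) // 2


# Compare the two shapes as headcount grows (X = 5 is an assumed value).
for headcount in (10, 50, 200):
    print(headcount, full_mesh_links(headcount), bounded_degree_links(headcount, 5))
```

The fully connected count grows roughly with the square of headcount, while the bounded count grows only linearly, which is the trade-off between information flow and complexity that the paragraph describes.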

This week marks the second anniversary of Fishtown Analytics. A year ago, I wrote One Year In: We’re Still in Business. I had pretty serious impostor syndrome when looking back on the first year of Fishtown Analytics—I remember just being shocked that people were actually paying us, that growing numbers of awesome companies were actually using dbt, that we actually continued to exist at all.

That voice in my head is still there (I don't know that it will ever really go away!), but it's quieter than it was a year ago.

I just went through the process of converting 25,000 lines of SQL from Redshift to Snowflake. Here are my notes.

At Fishtown Analytics, we provide analytics consulting for venture-funded startups. Our clients range from A-round funded companies finding product-market fit to late-stage companies in hyper-growth mode.

Because of this range of client profiles, we get exposure to data stacks of companies who are just starting out with analytics all the way up to some of the most sophisticated data organizations in the world. And the biggest change we’ve seen in this cohort over the past two years is a shift towards Snowflake and away from Redshift.

We are wholeheartedly in support of this shift. For a large number of reasons, which…

This is a very stupid problem. I am not writing this post because it’s a fascinating topic—rather, I’m writing it in the hopes that you avoid the headaches that I’ve gone through scouring the internet for the best answer to this question:

Let's say I have a Redshift table, users. This table gets loaded via some process I don't control. It contains a field, amount, that gets loaded as a varchar when it should really be an int. A very small number of records (< 0.01%) are not valid integers, and so a simple ::int cast fails.
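The shape of the problem can be sketched outside the warehouse. The Python below is only an illustration of the "validate first, then cast" idea, not the Redshift answer the post goes on to give. The helper name safe_int, the pattern, and the sample values are all assumptions.

```python
import re
from typing import Optional

# Assumed definition of a "valid integer": optional minus sign, digits,
# optional surrounding whitespace.
_INT_PATTERN = re.compile(r"\s*-?[0-9]+\s*")


def safe_int(value: Optional[str]) -> Optional[int]:
    """Return the value as an int, or None if it is not a clean integer string."""
    if value is None or not _INT_PATTERN.fullmatch(value):
        return None
    return int(value)


# Hypothetical sample of what the varchar column might contain.
amounts = ["42", "-7", " 15 ", "12.50", "N/A", "", None]
print([safe_int(a) for a in amounts])
```

Rows that fail the check become None instead of raising an error, so the small share of bad records no longer breaks the whole conversion.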

