Databricks Q1 Roadmap: W2W4

Matt Weingarten
3 min read · Feb 22, 2024


[Header image caption: Created by Kyle Shanahan's 2nd-half playcalls]

Introduction

Each quarter, Databricks has a product roadmap webinar to build the hype for some of the latest and greatest features we’ll soon have access to. Here are my top takeaways from today’s unveiling.

Unity Catalog

Databricks is all-in on Unity Catalog and the role it'll play in the quest to build a data intelligence platform. It probably doesn't come as much of a surprise, then, that a good portion of today's discussion was dedicated to what's coming our way in Unity.

For starters, Unity Catalog is now the default, meaning any new workspace comes with Unity Catalog enabled from the start instead of requiring additional setup work. One of the cooler features mentioned was that Unity will support external databases and warehouses. While having everything in Unity should suffice for most teams, it's good to know that external sources like Redshift and even Snowflake will be covered as well.
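
To make the federation piece concrete, here's a minimal sketch of what wiring in an external source looks like with Lakehouse Federation. Every name, host, and secret reference below is a placeholder I made up for illustration:

    # Run from a Databricks notebook; all names, hosts, and secrets are hypothetical.
    # A connection stores how to reach and authenticate to the external system.
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS snowflake_conn TYPE snowflake
        OPTIONS (
          host 'myorg.snowflakecomputing.com',
          port '443',
          sfWarehouse 'COMPUTE_WH',
          user 'svc_databricks',
          password secret('federation-scope', 'snowflake-password')
        )
    """)

    # A foreign catalog then mirrors one of that system's databases into Unity,
    # so it can be queried and governed like any other catalog.
    spark.sql("""
        CREATE FOREIGN CATALOG IF NOT EXISTS snowflake_cat
        USING CONNECTION snowflake_conn
        OPTIONS (database 'ANALYTICS')
    """)

    # Query the federated data alongside native tables.
    display(spark.sql("SELECT * FROM snowflake_cat.public.orders LIMIT 10"))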

In terms of performance, Parquet reads in Unity have been sped up significantly (up to 30x), which is all the more reason for me to run away from this post and get back to my team's ongoing Unity migration (with a move to Delta or Iceberg hopefully coming soon). Materialized views will also soon be supported, which is another win for those looking to speed up expensive queries.
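
For reference, a materialized view in Databricks SQL takes roughly this shape; the catalog, schema, and table names here are invented:

    # Sketch of a materialized view; catalog/schema/table names are hypothetical.
    # The aggregation is precomputed once, so dashboards hitting this view
    # avoid re-scanning the base table on every query.
    spark.sql("""
        CREATE MATERIALIZED VIEW main.analytics.daily_revenue AS
        SELECT order_date, SUM(amount) AS revenue
        FROM main.analytics.orders
        GROUP BY order_date
    """)

    # Refresh on demand (or put the refresh on a schedule).
    spark.sql("REFRESH MATERIALIZED VIEW main.analytics.daily_revenue")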

[Image: Unity embracing more and more data sources]

Serverless

It feels like I’ve been writing about Serverless compute within Databricks for some time now, but they just keep adding more to the stack. Currently, the following features are in private preview:

  • Serverless compute for Delta Live Tables
  • Serverless compute for jobs
  • Serverless compute for notebooks

This could be a huge cost win for us (and plenty of other teams as well) in the long run. Predicting what size cluster you need is more art than science; Serverless may be just the component necessary to really drive efficiency in bigger workloads.
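
To illustrate why that matters, here's a rough sketch using the Databricks Python SDK of what a job definition can look like once serverless jobs are available to your workspace (during the private preview, enrollment may be required). The job name and notebook path are placeholders; the interesting part is what's missing, namely any cluster configuration:

    # Requires the Databricks SDK (pip install databricks-sdk); auth is picked
    # up from the environment. Job name and notebook path are hypothetical.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import jobs

    w = WorkspaceClient()

    # Note what's absent: no new_cluster block, no node types, no autoscaling
    # policy to tune. With serverless jobs compute, Databricks sizes and
    # scales the compute for you.
    created = w.jobs.create(
        name="serverless-etl-sketch",
        tasks=[
            jobs.Task(
                task_key="transform",
                notebook_task=jobs.NotebookTask(
                    notebook_path="/Repos/team/etl/transform"
                ),
            )
        ],
    )
    print(f"Created job {created.job_id}")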

Developer Experience

If you’ve noticed the developer experience smoothing out in Databricks, that’s because they’ve been putting in work behind the scenes to make this a more seamless environment. Some features that will help with this include:

  • Notebook debugging
  • Gantt view for task executions
  • Databricks Asset Bundles, including support in VS Code

Debugging has certainly been a helpful addition (I've already seen it come up a few times when running cells in my notebooks). The Gantt view will be extremely helpful as well, as Databricks jobs look to rival orchestrators like Airflow. Asset Bundles are a wonderful way to have proper code behind all the workflows in your workspace.
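
If you haven't tried bundles yet, a minimal databricks.yml looks roughly like this; the bundle name, paths, and workspace host below are all placeholders:

    # databricks.yml: minimal bundle sketch; names, paths, and host are placeholders.
    bundle:
      name: team_workflows

    resources:
      jobs:
        nightly_etl:
          name: nightly-etl
          tasks:
            - task_key: transform
              notebook_task:
                notebook_path: ./notebooks/transform.py

    targets:
      dev:
        workspace:
          host: https://my-workspace.cloud.databricks.com

Running databricks bundle deploy -t dev then pushes the job definition into the workspace, so workflows live in version control instead of being hand-edited in the UI.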

Databricks is becoming the ultimate platform for every data persona, and engineers and developers are really starting to see the benefits of these features as adoption grows.

Conclusion

I will definitely report back with my results as I implement some of these features over the next few months. I always look forward to seeing what the Databricks team is coming out with; great stuff as usual!


Matt Weingarten

Currently a Data Engineer at Disney Streaming Services. Previously at Meta and Nielsen. Bridge player and sports fan. Thoughts are my own.