Unlock your Open Data Lakehouse — join us at PrestoCon 2023

Ali LeClerc
Presto Foundation
3 min read · Nov 14, 2023


We’re so excited for two days of all things Presto, the open-source SQL query engine for data analytics and the Open Data Lakehouse.

Over the last few weeks, people have reached out to ask about PrestoCon 2023, so I figured I’d write a blog post to share my thoughts. First, a quick overview:

When: December 5–6, 2023, at the Computer History Museum in Mountain View, CA

What: Hundreds of developers and software engineers will come together at PrestoCon 2023 to support and learn more about Presto, the open-source SQL query engine for data analytics and the Open Data Lakehouse.

Why: We’re excited for the future of Presto! For the first time, we’ll be highlighting Presto on Velox (code-named Prestissimo), the native C++ worker that brings a huge performance boost to Presto. Bytedance will be sharing some new benchmarks on this 👍

How much is it? $100…but since you’re reading my blog, here’s a code to get your pass for $50: PRESTO50 (register here)

So really, why should you consider coming? If you’re involved in data analytics or data infrastructure, PrestoCon is a must-attend event, and here are 3 reasons why:

1. Prestissimo. Did I mention we’ll have a full track on it? 😁

If you’re not familiar, Prestissimo is Presto on Velox (an open-source project out of Meta), a native C++ worker for Presto. A lot of work has gone into Prestissimo over the last year, and we’re really excited to show you the latest advancements. Leading contributors from Bytedance, IBM, and Meta will be sharing their work.

This is an opportunity to see first-hand some of the cutting-edge innovation going on in Prestissimo, along with initial benchmarking results that have not yet been shared publicly with the community! If you’re new to Prestissimo, IBM will present an introduction to it. Key sessions include:

  • Bytedance — A Journey of Evaluating Prestissimo against PrestoDB on TPC-DS
  • Meta — Prestissimo: A Year In, the Path to Veloxification
  • Bytedance — Batch-size in Velox aggregation

2. Free Hands-on Presto workshops

You read that right — when you purchase a pass to PrestoCon, you’ll get the chance to register for our Presto workshops for free (while availability lasts!).

This year we’ll be hosting 3 workshops:

  • Getting started with Presto: Get Presto running locally on your machine, connect data sources, and run queries. This is a good workshop if you’re new to the Presto project.
  • Building an Open Data Lakehouse with Presto and Apache Hudi: The data lakehouse brings the flexibility, scale, and cost management benefits of the data lake together with the data management capabilities of the data warehouse. Architect the building blocks of a data lakehouse with Presto, Apache Hudi, and AWS S3 in this workshop.
  • Getting started with Prestissimo on Docker: This is an intermediate-level workshop for developers and engineers who want to get their hands on Prestissimo. It’s recommended for folks who are already familiar with Presto and know how to run distributed systems.

We intentionally keep these workshops small so we can provide hands-on support, so I recommend registering soon if you’re interested.

3. Be the first to hear about new innovations in Presto and the Presto ecosystem

Engineers and data architects from Meta, Uber, Intel, IBM, Bytedance, Apache Hudi, Alluxio, and more will be presenting some really awesome sessions. A few highlights IMO:

  • Presto express — leveraging historical data to predict upcoming query execution times and optimizing cluster routing (Uber)
  • Statistics with sampling using Iceberg on Presto (IBM)
  • Optimizing Presto at Meta scale (Meta)
  • Learned query optimization in PrestoDB (Intel/UPenn/Technical University of Munich)
  • …there’s so much more, check out the full agenda

We put together a video to get you excited! Hope to see you there.


Ali LeClerc
Presto Foundation

Presto Community Chair, Product Manager at IBM. Chair of the #Presto Foundation Community team. Topics on #bigdata, #dataanalytics, #lakehouse, #opensource