Holden Karau presents Spark — Beyond Wordcount With Datasets & Scaling

by Thomas Lockney
Published in Nike Engineering, May 2, 2018

As we discussed in our previous post, the Nike Tech Talks are a forum hosted by Nike Digital Engineering. We are fortunate enough to have access to a large community of engineering thought leaders and technology experts. At Nike and many other organizations, working with data at scale presents many interesting challenges. These challenges are often addressed with tools like Apache Spark, but with new tools come new considerations.

This past January, Holden Karau, Developer Advocate at Google, joined us to speak about all things Apache Spark. She looked at the challenges that come with scaling Spark jobs, as well as Spark’s new(ish) Dataset/DataFrame API and how it’s evolving in Spark 2.3 with improved Python support. After the initial excitement, Spark’s shiny finish may start to wear off and have you wondering if you’ve accidentally deployed a Ford Pinto into production. If you’re already a Spark user, you will see why it’s not all your fault. If you aren’t already a Spark user, you will learn how to save yourself from some of the pitfalls once you move beyond the example code.

Holden Karau, Spark — Beyond Wordcount With Datasets & Scaling

Learn more about Holden Karau

Holden is a transgender Canadian open source developer advocate with a focus on Apache Spark, Beam and related “big data” tools. She is co-author of Learning Spark and High Performance Spark. She is a committer on the Apache Spark, SystemML and Mahout projects. Prior to joining Google as a Developer Advocate, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon.

When not in San Francisco, Holden speaks internationally about different big data technologies, concentrating mostly on Spark. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal. Outside of work, she enjoys playing with fire, riding scooters and dancing.

You can find Holden on Twitter, Medium, and GitHub. Also, you can check out her newest book, High Performance Spark, for more information!

Thomas Lockney
Nike Engineering

Dog lover and engineering leader @ Nike. Host of the Nike Tech Talks.