TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Spark vs Pandas, part 3 — Scala vs Python

11 min read · Oct 26, 2020

Photo by Timothy Dykes on Unsplash

In this third installment of the series “Pandas vs Spark” we will have a closer look at the programming languages and the implications of choosing one.

Originally I wanted to write a single article for a fair comparison of Pandas and Spark, but it kept growing until I decided to split it up. This is the third part of the small series.

What to Expect

This third part of the series focuses on the programming languages Scala and Python. Spark itself is written in Scala and provides bindings for Python, while Pandas is available only for Python.

Why Programming Languages Matter

Of course programming languages play an important role, although their relevance is often misunderstood. Having the right programming language in your CV may eventually be one of the deciding factors for getting a specific job or project. This is a good example where the relevance of programming languages might be…


Published in TDS Archive

Written by Kaya Kupferschmidt

Freelance Big Data and Machine Learning expert at dimajix.