Large-scale Data Processing

Experiences with, stories about, and insights from using frameworks for large-scale data processing and distributed applications, including but not limited to Hadoop, HBase, Storm, Drill, Spark, Mesos, YARN, and Myriad.

Latest

How to kick-start Spark development on IntelliJ IDEA in 4 steps

I’m gonna walk you through how to set up your environment to develop an Apache Spark application using Scala in…

Michael Hausenblas

Feb 8, 2015

Dataspaces have arrived

It took less than 10 years to deliver on Alon Halevy’s vision.

Michael Hausenblas

May 27, 2013

Serendipity

Serendipitous discovery in distributed real-time query engine development. The CERN team deserves attention and credit.

Michael Hausenblas

May 22, 2013

Ubi fumus, ibi ignis.

Thoughts on TCO of data-driven business decisions.

Michael Hausenblas

May 20, 2013

On infrastructures and applications

What do roads & cars, the electricity grid & TVs, the Internet & Facebook all have in common?

Michael Hausenblas

May 7, 2013
