Big Data: Spark, AWS & SQL

simple cloud computing with S3 & EMR

Mark Cleverley
The Startup


Few buzzwords garner as much attention as Big Data, and with good reason: the global economy runs on information, and we’re generating exponentially more of it every day. “Cloud computing” gets thrown around a lot as well.

But that’s not all, you say to the investors seated before you.
Our data — it’s Big. Someone raises an eyebrow. You hear a cough.
And it’s in the Cloud. The room immediately bursts into applause.

This is generally how Silicon Valley VC meetings go, or so I hear.

But Big things are intimidating, and I’m just a man with a Jupyter notebook and a dream. Yet I remembered:
“We are all in the gutter, but some of us are looking at the Clouds.” Oscar Wilde wrote that, probably.

So I set out on a journey upwards: googling “how to cloud” and then googling my many error messages until I arrived at something resembling a coherent setup process, which I will detail here — so that you may also stick your head in the Cloud.

What exactly is a cloud?


data scientist, machine learning engineer. passionate about ecology, biotech and AI. https://www.linkedin.com/in/mark-s-cleverley/