DS, CS, Big Data
Geolocated data exists across a wide range of industries: travel, telecom, finance, marketing, advertising, manufacturing, etc. Applying machine learning techniques to such data makes it possible to identify…
This post is a reference for, and a solution to, the issues I ran into while trying to get these technologies working on EMR or even on instances. By technologies, I mean Jupyter notebooks, Spark, Hadoop, Hive, etc.
Let’s build an end-to-end data pipeline with: