Tagged in Big Data

datamindedbe
Making Data Delightful

Organize your data lake using Lighthouse

Lighthouse is an open-source library, built on Apache Spark and Scala, that we developed at…


Joining Spark Datasets

Ever wanted to do better than joins on Apache Spark DataFrames? Now you can!

The new Dataset API takes a different approach to joins. Unlike a DataFrame join, it returns a tuple holding the typed object from the left Dataset and the typed object from the right Dataset. The function is defined as…


Connect to AWS Athena using DataGrip

DataGrip is a great database IDE, just like the other IDEs from JetBrains: PyCharm, IntelliJ, … In this post, I describe how to connect an AWS Athena database to DataGrip, both for my own reference and, hopefully, to help someone else as well. This assumes that you…