Tidy Data with Java & Jupyter

Gary Sharpe
6 min read · Nov 25, 2021

Exercising Tidy Data concepts and practices using Java on Google Colab (Jupyter)

Tidy Data for Efficiency, Reproducibility, and Collaboration, Julia Lowndes and Allison Horst, October 12, 2020

This article does not include a thorough explanation of Tidy Data. Countless blog posts, articles, and videos already do an exceptional job of that, including the original paper on Tidy Data written by Hadley Wickham. At the time of Wickham’s publication, it was fair to say that…

A huge amount of effort is spent cleaning data to get it ready for analysis, but there has been little research on how to make data cleaning as easy and effective as possible.

Tidy Data by Hadley Wickham

To the benefit of the data community at large, this is arguably no longer the case.

For a more thorough explanation of Tidy Data, though, I suggest reading the original paper or one of the many articles on the subject published on Medium.

What is this Article About?

This article hopes to add to the current body of work a set of examples of the most common Tidy Data concepts and practices, written in Java and run in Jupyter Notebooks.
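To give a flavor of what such an example can look like, here is a minimal sketch of one common tidying step: reshaping a "wide" table, whose column headers are values (years) rather than variable names, into a "long" table with one observation per row. It uses only the JDK so it should run in a Java Jupyter kernel without extra dependencies. The class name `TidyExample`, the `melt` helper, and the sample figures are all made up for illustration and are not taken from any particular library or dataset.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TidyExample {

    /** One tidy observation: a single (id, key, value) triple. */
    public static final class Observation {
        public final String id;
        public final String key;
        public final int value;

        public Observation(String id, String key, int value) {
            this.id = id;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return id + "," + key + "," + value;
        }
    }

    /**
     * Reshapes a wide table (row id -> column header -> cell value)
     * into a long list with one observation per cell.
     */
    public static List<Observation> melt(Map<String, Map<String, Integer>> wide) {
        List<Observation> tidy = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> row : wide.entrySet()) {
            for (Map.Entry<String, Integer> cell : row.getValue().entrySet()) {
                tidy.add(new Observation(row.getKey(), cell.getKey(), cell.getValue()));
            }
        }
        return tidy;
    }

    /** Hypothetical sample data: the year is stored in the column headers. */
    public static Map<String, Map<String, Integer>> sampleWide() {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, Map<String, Integer>> wide = new LinkedHashMap<>();

        Map<String, Integer> alice = new LinkedHashMap<>();
        alice.put("2019", 10);
        alice.put("2020", 12);
        wide.put("alice", alice);

        Map<String, Integer> bob = new LinkedHashMap<>();
        bob.put("2019", 7);
        bob.put("2020", 9);
        wide.put("bob", bob);

        return wide;
    }

    public static void main(String[] args) {
        // Print each tidy row as "id,key,value".
        for (Observation o : melt(sampleWide())) {
            System.out.println(o);
        }
    }
}
```

In a notebook cell, calling `TidyExample.melt(TidyExample.sampleWide())` returns four observations, one per (person, year) pair, which is the shape that grouping and filtering code generally expects.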

Gary Sharpe

I build back-end systems for moving, munging and synchronizing data from one end of the enterprise to the other.