New Data Is Like a Christmas Present*

Michael Elashoff
Jul 28, 2022

*some assembly required

I’ve been a data scientist for more than twenty years. One thing that never gets old for me is the excitement of a new dataset. The first data I worked on after graduation was at FDA, reviewing clinical trials for a shingles drug. Trying to quantify the risk/benefit ratio required a deep dive into the patient data, spread out over sixty different tables. There have been many datasets since then. And new data still feels like a Christmas present waiting to be unwrapped.

The process of opening it up and exploring the dataset, and ultimately using it to try to solve some clinical or scientific question, remains the most enjoyable part of my career. But it goes hand-in-hand with the most frustrating part: finding out that, once again, the data needs to be properly put together, standardized, and cleaned before I can play with it.

The first handful of times, that work doesn’t seem so bad. The process of fixing up a dataset has the beneficial byproduct of teaching you how all the parts fit together. But much of the work is just drudgery without insight. Dates are out of order again? Check. Units are not standardized again? Check. Missing values are coded haphazardly again? Check. It’s like the worst kind of Ikea furniture where the parts are mislabeled, some of the screws are the wrong size, and you’re pretty sure a key piece must have fallen out somewhere on the trip home.
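For readers who have not run into these chores, here is a minimal Python sketch of the three checks just mentioned. Everything in it is an assumption made for illustration: the list of missing-value codes, the unit conversion table, and the function names are hypothetical, and none of it describes how Cornerstone works.

```python
from datetime import date

# Hypothetical sentinel values that often stand in for "missing"
MISSING_CODES = {"", "NA", "N/A", "null", "-999", "."}

# Hypothetical conversion factors from a recorded unit to kilograms
TO_KG = {"kg": 1.0, "lb": 0.45359237, "g": 0.001}


def dates_in_order(dates):
    """Return True if each date is on or after the one before it."""
    return all(a <= b for a, b in zip(dates, dates[1:]))


def normalize_missing(value):
    """Map any of the assumed missing-value codes to None."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text in MISSING_CODES else text


def weight_in_kg(value, unit):
    """Convert a recorded weight to kilograms using the assumed unit table."""
    return float(value) * TO_KG[unit.strip().lower()]


# Made-up visit dates, with the last two out of order
visits = [date(2022, 1, 5), date(2022, 3, 1), date(2022, 2, 10)]
print(dates_in_order(visits))      # False
print(normalize_missing(" N/A "))  # None
print(weight_in_kg("1", "LB"))     # 0.45359237
```

Each check is only a few lines. The drudgery comes from writing variations of them again for every new dataset.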

This is the origin story of Cornerstone, its reason for being: to enable data scientists to unwrap their new data and start using it as quickly as possible, with confidence that it has been put together to very high standards, but without sacrificing the data insight part of the process along the way. It was important to us that the software show all its work, essentially creating a clear set of assembly instructions to arrive at the finished product.

If you want to try this out on your data, let us know. Our software can help turn it into an analysis-ready database. And my co-founder Andrew builds his own furniture, so we might be able to assist on your Ikea dresser as well.
