Introduction — Know Your Data (KYD)

Sumit
Published in SPARK
Jan 1, 2019

In this new era of artificial intelligence and machine learning, you need not be an expert to realise that data is the new diamond: the more, the merrier. But unless your diamonds are shiny, polished, multi-faceted and easily accessible with low maintenance, they are nothing more than pieces of rock.

In my previous organisation (referred to as "the organisation" from now on), we analysed millions of user events per day and produced shiny, multi-faceted analytics from them. Diamonds come in all shapes and sizes, so they should be maintained according to their potential to fetch money or, in simple business terms, their ROI.

In my current organisation, we handle billions of events per day, consisting mainly of communication logs between different microservices.

On another note, there is a nice must-read series of write-ups from Ankit Sobti (CTO, Postman) on how Postman implemented microservices. Here is the link to the first one.

Handling data at this scale and complexity results in a plethora of complications and problems. I will write about our approaches to solving these problems.

SPOILER ALERT

The solution always lies in a direction that becomes clear only after knowing our data.

Sumit, editor for SPARK

Leading the Runtime engineering team @ HackerRank, with ~8 years of experience in Data, Platform & Backend domains and a touch of Web JavaScript frameworks.