Working With Data

Surajeet Bhuinya
Published in DataPebbles
Jul 5, 2020

We no longer need to emphasise the importance of data; frankly, it would be a waste of time. We should now put our focus and energy into harnessing it.

Working with data is challenging and needs skill and patience, but that makes it more interesting. The potential of data is not yet fully discovered: we have done a lot, but there are endless opportunities.

It’s a new challenge for the industry, and that brings innovative ideas and breakthroughs. As with anything new, we can always start from a blank page, but sometimes starting with guidelines gives us the much-needed boost towards our goals.

To create insights from data, we must understand it and refine it. We will see that this is a circular process: every time we refine the data further, we understand it better.

These are a few key topics to keep in mind while working with data:

Data Credibility

Data needs to be examined for its validity and correctness. All data sources must go through refinement so that they are transparent and verified for future use. Any non-credible data source is a huge risk, and decisions based on such sources will probably bring business uncertainty.

Unless we verify sources from the beginning, doing it later can become a tiring, if not impossible, effort.

Data Interpretation

Human expertise is needed to gain insights and to apply the correct refinement processes. Generating useful metrics requires thorough profiling and classification.

Such profiling provides insights which we can apply to our business decisions, so it goes without saying that data interpretation is an important factor in realising the potential of data.
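As a rough illustration of what profiling can look like in practice, here is a minimal sketch in Python. The function name `profile_column`, the chosen metrics and the sample values are all assumptions made for this example; a real profile would be shaped by the data and the business question at hand.

```python
from collections import Counter


def profile_column(values):
    """Summarise one column: row count, missing values, distinct values, most common value."""
    # Treat None and empty strings as missing (an assumption for this sketch).
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    most_common = counts.most_common(1)[0][0] if counts else None
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": most_common,
    }


# Hypothetical sample column, for illustration only.
countries = ["NL", "NL", "IN", "", None, "DE", "NL"]
profile = profile_column(countries)
print(profile)
```

Numbers like these do not interpret themselves. Deciding whether two missing values out of seven is acceptable is exactly where human expertise comes in.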

Data Governance

We don’t hear many people talking about data governance, at least not early enough in the data process. This is a key point when looking at data at an organisation level.

While working with data, we must adhere to standards and a level of consistency that apply to the whole organisation. Creating a unified data layer across the organisation has huge benefits.

Data Quality

Credibility is mostly attached to the data sources, but how do we ensure that the data stays credible through the entire process of refinement?

One may say that once data is proven credible, it remains credible. Unfortunately, that is not always true. Once we start working with different data sources, we must ensure every newly created dataset keeps the credibility crown.

This requires us to apply various cleansing rules and quality checks at each step.
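To make the idea of cleansing rules and quality checks a little more concrete, here is a small sketch in Python. The field names (`customer_id`, `amount`) and the specific rules are hypothetical, chosen only for illustration; the point is that every newly created dataset passes through the same explicit checks.

```python
def clean_record(record):
    """Cleansing rule: trim whitespace from strings and treat empty strings as missing."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key] = value
    return cleaned


def quality_issues(record):
    """Quality checks: return a list of problems; an empty list means the record passes."""
    issues = []
    if record.get("customer_id") is None:
        issues.append("customer_id is missing")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        issues.append("amount must be a non-negative number")
    return issues


# Hypothetical raw records, for illustration only.
raw = [
    {"customer_id": " C001 ", "amount": 120.5},
    {"customer_id": "   ", "amount": 30},
    {"customer_id": "C003", "amount": -5},
]

cleaned = [clean_record(r) for r in raw]
accepted = [r for r in cleaned if not quality_issues(r)]
rejected = [(r, quality_issues(r)) for r in cleaned if quality_issues(r)]

print(accepted)
print(rejected)
```

Keeping the rejected records together with the reason for rejection, rather than silently dropping them, is one way to keep the refinement process transparent.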

Data Automation

Now, we can say it is challenging to work with data: there is a lot to keep in mind and a lot of steps to follow. But, hey, that’s what computers are good for. They do the heavy lifting for us, and we should just focus on what we can do better than any computer or other humans.

We should automate repetitive refinement processes and focus on data profiling and interpretation.

We cannot let ourselves be slowed down by a process that can be automated.
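One simple way to automate repetitive refinement is to express each step as a small function and run them in a fixed order. The sketch below assumes three hypothetical steps (`drop_empty`, `normalise_case`, `deduplicate`) on a list of strings; real pipelines would have their own steps and would usually run under a scheduler or workflow tool.

```python
from functools import reduce


def drop_empty(rows):
    """Remove rows that are empty or contain only whitespace."""
    return [r for r in rows if r.strip()]


def normalise_case(rows):
    """Trim whitespace and lower-case every row."""
    return [r.strip().lower() for r in rows]


def deduplicate(rows):
    """Remove duplicates while keeping the first-seen order."""
    return list(dict.fromkeys(rows))


# The order of the steps is an assumption for this sketch.
PIPELINE = [drop_empty, normalise_case, deduplicate]


def run_pipeline(rows, steps=PIPELINE):
    """Apply each refinement step in turn to the output of the previous one."""
    return reduce(lambda data, step: step(data), steps, rows)


# Hypothetical input, for illustration only.
refined = run_pipeline(["Amsterdam", " amsterdam ", "", "Pune", "PUNE"])
print(refined)
```

Once the steps are written down like this, adding a new rule means adding one function to the list, and the same refinement runs identically every time.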

This is the century of the information revolution. We cannot look away; we can only work with data.
