10 Rules for Data of Any Size (Big/Small)

John Thuma
Nov 6, 2018 · 4 min read

I have spent the past 30 years literally staring at data. I built my first analytics application on SQL Server 4.x, when big data was measured in megabytes. Today, big data is almost endless if you have the right tools and platforms. But tools alone are not enough to make a good data program. In fact, the tools and platforms are the easy part! It is the humans and processes that make it challenging. The list below is my top 10 rules for data. They come from experience.

  1. Data is one of your organization’s most valuable assets. If data is not treated as a core asset in your organization, you are missing out on an opportunity. Ask yourself a question: what is more important, the money you have in the bank or the data about the money you have in the bank? Data is one of the best raw materials you have, and it is a byproduct of your business or service.
  2. Do you know your legal obligations for data retention? Vertical industries have to keep data records around for a certain period of time. Some data, if kept longer than it is needed, can be a liability. I am talking about regulatory liability, but it can be a cost burden too. Keeping too much cold data around also competes with other, more useful data.
  3. Strive to have a single source of truth in data. If you can help it, do not make copies of data. Why do people make copies? Because the original system of record cannot keep up with other processing. Sure, move data from an OLTP system to an OLAP system for analytics. But if you then have to move it again to an OLAP cube or an in-memory solution because your current analytics systems can’t keep up, it is time to rethink those solutions.
  4. Data governance is everyone’s responsibility. First of all, don’t call it ‘Governance’! This word turns people off. Call it something else! Part of a good ‘Data Strong’ campaign is selling the campaign to others in the organization. If you believe data is one of your organization’s key assets, then having a ‘Data Strong’ set of systems and processes should also be critical to you!
  5. Data quality and data lineage are must-haves, not nice-to-haves. Once data is broken, it is always broken, no matter what you do to it. Data is an artifact, and the original can never truly be repaired. Sure, you can update it, but if you follow a good data lineage program, the original is still the original. Data quality should be a goal of every department and of every company.
  6. Data security and data privacy are must-haves too! There is nothing more embarrassing than a data breach that exposes customer or citizen data. Security is top of mind for all organizations. If you don’t have a Chief Information Security Officer (CISO) in your organization, you should consider hiring one.
  7. The Chief Data Officer is more important than the Chief Information Officer! Everything in computing revolves around one thing and only one thing: data. Without data, you would not need a network, servers, the cloud, or any of the software we purchase and produce today. So why has the Chief Data Officer only become popular over the past 5–10 years?
  8. Data is the pulse of your business. Data flow is cash flow. There is a popular saying: cash flow is more important than profit. That might be true. Are you measuring the real-time rate of cash flow as it happens? Can you? Why aren’t you? Cash flow can be measured in electronic transactions, and it is definitely the pulse of your organization. What is one of the first things a doctor does when examining you? They take your pulse!
  9. You should have data goals to reward good data behaviors. Do you have data goals for your organization? Do you financially incentivize departments to practice good behaviors? You can measure data quality, so you can set goals to encourage good practices. The better the quality of your data, the healthier your organization. It can also be gamified.
  10. Real-time is a must. The big gold rush in data these days is the run to real-time. Why not? Do you measure the time value of your data? If I have to wait 6 hours for a batch of records to get to my reporting systems, what is the financial burden? If I have data almost as it happens, what can I gain? Real-time is a really big deal!
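
Rule 9 says you can measure data quality and set goals against it. As a rough illustration, here is a minimal Python sketch of one possible measure: completeness, the share of required fields that are actually filled in. The function names, the sample customer records, and the 95% goal are all made up for this example and are not taken from any particular tool.

```python
from typing import Any, Iterable, Mapping, Sequence


def completeness_score(
    records: Iterable[Mapping[str, Any]],
    required_fields: Sequence[str],
) -> float:
    """Return the fraction of required field values that are present.

    A value counts as missing if the key is absent, the value is None,
    or it is blank after stripping whitespace.
    """
    total = 0
    filled = 0
    for record in records:
        for field in required_fields:
            total += 1
            value = record.get(field)
            if value is not None and str(value).strip() != "":
                filled += 1
    # With no records there is nothing missing, so treat it as complete.
    return filled / total if total else 1.0


def meets_goal(score: float, goal: float = 0.95) -> bool:
    """Check a score against a data goal (0.95 is an assumed target)."""
    return score >= goal


# Hypothetical sample data: two of the twelve required values are missing.
customers = [
    {"id": 1, "email": "a@example.com", "zip": "10001"},
    {"id": 2, "email": "", "zip": "94105"},
    {"id": 3, "email": "c@example.com", "zip": None},
    {"id": 4, "email": "d@example.com", "zip": "60601"},
]

score = completeness_score(customers, ["id", "email", "zip"])
print(f"Completeness: {score:.1%}, goal met: {meets_goal(score)}")
```

A score like this could be tracked per department over time, which is one way to turn a data goal into something you can reward. Completeness is only one dimension of quality; accuracy and timeliness would need their own measures.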

I am sure you have your own list, or perhaps you think these are common-sense items. However you feel, I would like to read your thoughts. Enjoy!

Data Driven Investor

from confusion to clarity, not insanity

Written by John Thuma

Data Nerd! Walking the Data wire for 30 years. If you are serious about data and analytics then I might be interesting to you!
