Build an effortless (and sexy!?) Data Governance Strategy?

Xavier de Boisredon
Published in CastorDoc
Sep 29, 2020 · 2 min read

Data governance sounds boring… No one ever wants to deal with it. Yet once you dive into it, you can make it painless and, against all odds… quite fun!

Let me take you through the crafty, yet sexy, things you can work on to make your Data Privacy Officer’s life easier:

  1. Auto-classification of Personally Identifiable Information (PII): your customers might give you their email, birthdate, credit card information, or, worse, access to their medical data. You can work on Machine Learning algorithms that automatically scan for and detect those PII fields. Of course, if you auto-classify, mark each tag as computer-generated so people know to review it and can correct the mistakes.
  2. Automate Data Access With Tags: data consumers are numerous, and designing a good access control policy for them is never straightforward. How about defining access based on custom tags such as business-related metadata, technical metadata, or security classifications? This lets you build a fine-grained access rights system without having to maintain a huge number of individual access rights.
  3. Automate lineage generation: if you are working with modern data warehouses and data visualization tools, chances are that you can automate lineage generation with a SQL parser. You might need to have proper ELT processes in place, though.
  4. Build a Propagation Algorithm: if you have completed the first steps, you can now classify PII, tag data assets, and build the lineage programmatically. This means you can automatically ensure that every table or column derived from a column tagged as sensitive inherits the same classification and security controls. In an ideal world, you could also propagate column definitions along the lineage.
  5. Make sure no one accesses sensitive data without authorization: now, in the past, or in the future. I reckon that maintaining a record of all data users by hand can be a pain. What if you could build an audit log of all users and match it to the roles defined in the warehouse? How about automatically flagging past accesses that are inconsistent with the new data policies?
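
To make steps 1 and 4 a bit more concrete, here is a minimal Python sketch, not a production design. It makes several assumptions that are not from the article: simple regex rules stand in for the Machine Learning classifier, lineage is given as a plain dictionary mapping each column to the columns derived from it, and every name in it (`PII_PATTERNS`, `classify_column`, `propagate_tags`, the column names) is hypothetical.

```python
import re
from collections import deque

# Assumption: regex rules as a simple stand-in for an ML-based PII classifier.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "birthdate": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def classify_column(sample_values, threshold=0.8):
    """Return the first PII tag matching at least `threshold` of the samples, else None."""
    values = [v for v in sample_values if v]
    if not values:
        return None
    for tag, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            return tag
    return None


def propagate_tags(seed_tags, lineage):
    """Push tags downstream along lineage edges (source column -> derived columns)."""
    tags = {col: set(col_tags) for col, col_tags in seed_tags.items()}
    queue = deque(seed_tags)
    while queue:
        col = queue.popleft()
        for child in lineage.get(col, []):
            child_tags = tags.setdefault(child, set())
            before = len(child_tags)
            child_tags |= tags[col]
            # Revisit a column only when it gained tags, so cycles terminate.
            if len(child_tags) > before:
                queue.append(child)
    return tags


# Hypothetical sample values and lineage, for illustration only.
samples = {
    "raw.users.email": ["a@example.com", "b@example.org"],
    "raw.users.plan": ["free", "pro"],
}
lineage = {
    "raw.users.email": ["staging.users.email"],
    "staging.users.email": ["marts.crm.contact_email"],
    "raw.users.plan": ["marts.crm.plan"],
}

# Step 1: auto-classify; the "auto:" prefix marks the tag as computer-generated.
seed = {}
for column, values in samples.items():
    tag = classify_column(values)
    if tag:
        seed[column] = {f"auto:pii:{tag}"}

# Step 4: every column derived from a tagged column inherits its tags.
tags = propagate_tags(seed, lineage)
for column in sorted(tags):
    print(column, sorted(tags[column]))
```

In this sketch the email column and both columns derived from it end up with the same auto-generated tag, while the `plan` columns stay untagged. In a real setup the lineage dictionary would come from the SQL parser of step 3, and the resulting tags could drive the tag-based access rules of step 2.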

Well, I agree that this is a lot of work for a few data engineers or data scientists. And often, Data Privacy Officers don’t have the budget, the skills, or the time to build all these complex automated processes. But, god, that would make data governance so much easier!

Luckily, we are working on Data Governance Automation for you at Castor. Join our waiting list to beta test the latest version of Castor.

If you are interested in working on such topics (internship, research project, full-time job, remote) or if you want to try out our product, you can reach out to me by email: xavier@castordoc.com
