What is data governance?
“…a data management concept concerning the capability that enables an organization to ensure that high data quality exists throughout the complete lifecycle of the data… The key focus areas of data governance include availability, usability, consistency” (Wikipedia)
In short, and this is how I put it to my clients: the goal is to have trustworthy and usable data.
I think we all know what trustworthy data is... or do we?
Let me tell you a short story from two years back, when I helped a company with some product analysis. They brought me in because I already had 7 years of Mixpanel usage as a PM, and my first task was to map out the main drop-offs and create a prioritized UX list based on my findings.
But when I started looking at the data, I saw events like “Sign up completed” and “Signed up”, and the great thing was that both were still receiving data. So how could I build a funnel?
So it doesn’t matter how good you are with ANY tool. If you don’t know what you’re looking at and what every event means, you’re bound to fail.
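If you want a quick way to spot pairs like “Sign up completed” and “Signed up” in your own event list, here is a minimal sketch. It assumes you have exported your event names to a plain list. The filler-word set and the crude stemmer are my own illustrative heuristics, not something built into Mixpanel or any other analytics tool, so treat the output as suspects to review by hand.

```python
import re
from collections import defaultdict

# Hypothetical filler words that rarely change what an event means.
FILLER = {"complete", "completed", "success", "successful", "done"}


def stem(word: str) -> str:
    """Very crude stemmer: strips a trailing 'ing' or 'ed' from longer words."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(event_name: str) -> frozenset:
    """Reduce an event name to a set of lowercased, stemmed keywords."""
    words = re.findall(r"[a-z0-9]+", event_name.lower())
    return frozenset(stem(w) for w in words if w not in FILLER)


def find_suspected_duplicates(event_names):
    """Group event names that normalize to the same keyword set."""
    groups = defaultdict(list)
    for name in event_names:
        groups[normalize(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]


events = ["Sign up completed", "Signed up", "Purchase", "Page viewed"]
duplicates = find_suspected_duplicates(events)
print(duplicates)
```

Each group it prints is only a candidate. You still need to check the trigger of each event before merging anything.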
Now trust me, I’ve seen at least 500 implementations. Here are a few things I can tell you to help you think differently:
- Most mid-size to large companies should have 50 to 100 events, tops
- The more events you have, the bigger the mess
- You only use about 5% of the events anyway
- Over time, your events will get messy
- What you think your events mean isn’t always what they actually track
So, what can you take out of this as to-do items?
- Create a process that keeps your events consistent and trustworthy over time.
- Aggregate events and make sure each one tracks a KPI.
- Delete events you don’t need, and use the tools your analytics software gives you, such as merge, hide, and drop.
- Always try to understand what triggers an event, to make sure you get what you expect.
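One way to turn the to-do items above into a repeatable process is to keep a small tracking plan and check your live events against it every so often. This is only a sketch under my own assumptions: the `EventSpec` fields, the example triggers, and the KPI names are hypothetical, and the observed event names would come from whatever export your analytics tool offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventSpec:
    name: str     # canonical event name
    trigger: str  # what exactly fires the event
    kpi: str      # KPI the event feeds; empty string means "none yet"


# Hypothetical tracking plan; names, triggers, and KPIs are illustrative.
TRACKING_PLAN = [
    EventSpec("Signed up", "Server confirms account creation", "Activation rate"),
    EventSpec("Purchase", "Payment provider returns success", "Revenue"),
    EventSpec("Page viewed", "Any page load", ""),
]


def audit(observed_names, plan):
    """Compare event names seen in the data against the documented plan."""
    planned = {spec.name for spec in plan}
    observed = set(observed_names)
    return {
        "undocumented": sorted(observed - planned),  # in the data, not in the plan
        "never_seen": sorted(planned - observed),    # in the plan, not in the data
        "no_kpi": sorted(spec.name for spec in plan if not spec.kpi),
    }


report = audit(["Signed up", "Sign up completed", "Page viewed"], TRACKING_PLAN)
for issue, names in report.items():
    print(f"{issue}: {names}")
```

Undocumented events are candidates to merge or drop, and events with no KPI are the ones to question first.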
I know all this can seem overwhelming, and the first thing that pops into mind is “I don’t have dev resources to allocate.” But if you do a lot of the maintenance yourself, the dev time will be a few hours at most once every few months, and the benefits are easy to show.
A few more tips
- I suggest keeping a Miro board that maps each event to the page where it is triggered (maybe even link it in the event description): https://miro.com/miroverse/event-mapping/
- Remember, you’re PMs. Your whole job is to validate and test things out, so do the same with your data. Start small: 5–10 aggregated events should cover the 80/20 rule. Then add what you actually need, not events based on “I might need this someday”.
- Make sure you know what your decisions mean. I’ve seen many teams decide to create a separate Mixpanel project per product, only to realize later that it limits their ability to query across products. Ask first. We have a great community.
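To put the 80/20 tip above into numbers, you can count how often each event is actually used and see how few events carry most of the load. The sketch below assumes you have counted by hand how many saved reports reference each event. The counts and the `events_covering` helper are hypothetical and only illustrate the idea.

```python
def events_covering(usage_counts, target=0.8):
    """Return the most-used events that together reach `target` share of usage."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected, covered = [], 0
    ranked = sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        if covered >= target * total:
            break
        selected.append(name)
        covered += count
    return selected


# Hypothetical numbers: how many saved reports reference each event.
report_usage = {
    "Signed up": 40,
    "Purchase": 30,
    "Page viewed": 15,
    "Button clicked": 10,
    "Tooltip shown": 5,
}

core = events_covering(report_usage)
rarely_used = [name for name in report_usage if name not in core]
print("core events:", core)
print("review for hide/drop:", rarely_used)
```

The rarely used events are not automatically wrong. They are just the first place to look when you clean up.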
Hope you start off 2023 with a clean data set :)
Alon
Don’t forget to like and subscribe on Medium. It motivates me a lot!