Intro to Data Bias — A Conversation with Christian Beedgen
In this episode of the Masters of Data podcast, I sat down with my colleague and long-time friend Christian Beedgen, Co-Founder and CTO (Chief Technology Officer) at Sumo Logic. Christian has a long history in the world of data, and he brings his years of experience and expertise to the discussion, which centers on bias, data and analytics in today’s world. All people live with biases, which shape how we see and interpret the world around us. The question for machine data analytics is how to acknowledge these biases and their impact on the data being collected, while still recognizing the inherent value in the data and the need to analyze it purposefully.
To start, Christian shares his personal journey into technology, from a young age through his current role with Sumo Logic. His fascination began when he became interested in computers by reading programming books, which engaged him and pushed him to discover more. From there his interest grew into tinkering with computers and programming as a hobby while in school, which in turn prepared him for his career. Early in his career Christian had an opportunity at Amazon, which led to a series of roles at different startups. Those roles centered his focus on data analytics, the experience that fueled his decision to co-found Sumo Logic. As he explains, he founded Sumo Logic because he saw the analysis of data not just as a tool but also as a sellable service that could do real good.
The conversation between Christian and me focused on how the data gathered in our world is affected by biases. While many people in the data field see data as simply cold, hard facts that can easily be analyzed, crunched and quantified, the reality is more complicated. As we see it, the data collected points back to more than just facts: it points to the people the data represents, something that is much harder to quantify through machines. The challenge is also that data is gathered by people, who have their own biases and preferences. When that data is compiled, the people gathering it inherently influence it in some way. This is where context comes into play, because there is context surrounding all data that is gathered.
While pointing to multiple sources that have helped shape our thinking, Christian and I proposed that we need to ask questions not just about the data itself, but about the context in which it was gathered, before it is analyzed. Pointing back to Weapons of Math Destruction, Christian refers to the idea that data isn’t necessarily truth in itself; you have to review the context of how the data was gathered in order to fully understand it, something we try to apply in our work at Sumo Logic. As the discussion continues, it’s clear that the challenge with biases is that we often talk ourselves into thinking a certain way about things. One example is anchoring bias, in which our initial exposure to something inevitably shapes the way we interpret everything related to it. Similarly, confirmation bias is the tendency to focus on information that supports our beliefs, pay less attention to information that contradicts them, and assume that ambiguous information supports our perspective.
To land the plane, Christian and I discussed the responsibility of companies that deal with data and analytics to address the issues raised. In our opinion, it comes down to having a clear set of ethics, reflecting openly on biases in dialogue and looking for a workable solution. One way of doing this is simply to assume that data represents people, rather than being impersonal or purely factual content. And in building products and features for machine data analytics, the reality of context needs to be embraced.
Learn more about Christian:
- Check out this episode on iTunes or on our website at mastersofdata.com
- Learn more about Sumo Logic
- Follow Christian on Twitter
- Learn more about Sensemaking: The Power of the Humanities in the Age of the Algorithm by Christian Madsbjerg
- Learn more about Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil
- Learn more about Ten Simple Rules for Responsible Big Data Research co-written by Kate Crawford