3 Disturbing Consequences of Using Data Unethically
Technology is at a crossroads.
We’ve seen it before: something new is developed, and there’s a rush to approve it and use it as soon as possible. For example, after DDT was used to fight typhus and malaria during WWII, there was a push to find more applications for it. At the end of the war, it was approved for use as a pesticide without robust testing or regulation… and it went on to cause cancer in thousands of people.
Similarly, as we race to develop applications for AI and machine learning, we have an ethical dilemma.
As the CTO of HG Insights, I’ve spent a lot of time thinking about how data is used in the industry. Security, especially of personal information, is one of the many concerns with the growing use of data. On the heels of HG achieving SOC 2 security compliance, I can’t help but think about how we use data on a more general level, and the ethical concerns that come with it.
So while we continue to develop novel technologies, I’d like to open a conversation around how we’re going to be mindful of these ethical considerations: not by getting bogged down in individual, case-by-case questions, but by exploring a few examples in a broader context.
1. Decisions Have Consequences: Safety in Self-Driving Cars
AI is only as good as the learning models and datasets it’s based on.
In situations where decision-making is left to machines, based on the datasets from which they learn, who has accountability if something goes wrong? And what about the ethical considerations behind a machine’s decisions? There are a lot of moving parts: the modeling and training data have to be analyzed, and the software that makes decisions and acts on them has to be reviewed. This gray area becomes a dilemma even for consumers, who in many cases are completely unaware that decisions are being made for them. This becomes a data ethics problem… for example:
The race for AI is a race for data… but we’re not all race car drivers. So, as AI changes the way we drive, what happens when there’s an accident in a self-driving car, a sort of 21st-century Trolley Problem? If a self-driving vehicle crashes, is it the programmer’s fault? Is the car a separate entity that can be held accountable?
Who’s making these types of decisions? And while the driver is technically liable, how could they have had any influence over the ML algorithm that actually made the decision?
As companies roll out “full self-driving” beta software on public roads, this is an increasingly important ethical dilemma for regulators. Passing liability to the driver only scratches the surface of a deeper issue about accountability.
And this is just one of the ethical crossroads we’re facing.
2. Avoiding Bias: We Must Learn From the Past, Not Repeat It
This is a problem we’ve grappled with for some time, as we’ve seen in facial recognition controversies.
Despite the fact that we will create more data in the next three years than in the preceding thirty, most of it is [rightly] protected and guarded closely. So we still need more data to train algorithms, and increasingly that data is created by AI, for AI, which can make it imperfect in a couple of different ways.
These algorithms are only as good as the data that informs them, and that data is created to represent the world as it is, the good and the bad. For example, if an algorithm is fed historical salary data that reflects the gender pay gap, it will learn from that history and hardcode the systemic issue into its predictions. As The Alan Turing Institute points out, “even data that never translates directly into information can be used to support actions or generate behaviors.”
Removing bias is an extremely important, complex, and controversial area of data science today. Synthetic data is a hot area here: it can supply more training data when real data is not readily available, and it can be generated deliberately to counteract bias.
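To make both mechanisms concrete, here is a minimal sketch in Python (with entirely hypothetical numbers) of how a model fitted on historically skewed salaries reproduces the pay gap, and how one simple synthetic-data technique, mirroring records with the protected attribute flipped, can counteract it:

```python
# Toy illustration with hypothetical numbers: a model trained on
# historically biased salaries learns to reproduce the pay gap.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1_000
experience = rng.uniform(0, 20, n)
is_female = rng.integers(0, 2, n)
# The historical data encodes a $7k gap unrelated to experience.
salary = 50_000 + 2_000 * experience - 7_000 * is_female + rng.normal(0, 3_000, n)

X = np.column_stack([experience, is_female])
model = LinearRegression().fit(X, salary)
# Two otherwise identical candidates: the model hardcodes the gap.
print(model.predict([[10, 0], [10, 1]]))   # roughly [70_000, 63_000]

# One crude synthetic-data mitigation: add mirrored records in which
# the same salaries appear with the protected attribute flipped, so
# the attribute no longer predicts salary.
X_mirror = X.copy()
X_mirror[:, 1] = 1 - X_mirror[:, 1]
X_balanced = np.vstack([X, X_mirror])
y_balanced = np.concatenate([salary, salary])

debiased = LinearRegression().fit(X_balanced, y_balanced)
print(debiased.predict([[10, 0], [10, 1]]))  # gap shrinks toward zero
```

Attribute-flipping is only one naive technique, and real mitigation work is far subtler; the point is simply that the bias lives in the data, and the data is where it has to be addressed.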
One of the biggest problems in our politically and socially charged world today is that biased predictions are knowingly and purposely proliferated when they back one’s position or point of view. In these cases, humans are actually amplifying the effects of bias, which feeds more models downstream and creates a compounding negative effect.
Work toward equitable and accountable AI, and organizations like the Algorithmic Justice League, are more important than ever.
3. Protecting Personal Data: A Balance Between Privacy and Innovation
While these are important decisions as we move deeper into the world of data, they remain outside of daily life for many of us.
When it comes to the use of our own personal data, these questions become more pressing. We’ve seen some progress in personal data protection, like the GDPR, the CCPA, and the EU Cookie Law.
This early-stage regulation is a call for more privacy laws worldwide — some are even calling for an AI “Bill of Rights.”
The argument for privacy around personal information is more pronounced now than ever. You see it every time you visit a website and see an ad, or use Spotify or Netflix: everything suggested to you is dictated by your past behavior.
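As a rough sketch of that mechanism, here is a toy item-based recommender in Python (hypothetical data, and not any particular service’s actual algorithm) in which the suggestions are computed purely from past behavior:

```python
# Toy item-based recommender: suggestions come entirely from past
# behavior. The data is hypothetical, not from any real service.
import numpy as np

# Rows = users, columns = items; 1 means the user watched/listened.
history = np.array([
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0],
])
items = ["thriller", "drama", "comedy", "documentary", "sci-fi"]

# Cosine similarity between items, based only on who consumed them.
norms = np.linalg.norm(history, axis=0)
similarity = history.T @ history / np.outer(norms, norms)

def recommend(user_row):
    scores = similarity @ user_row    # weight items by similarity to the user's history
    scores[user_row > 0] = -np.inf    # never re-suggest what they've already seen
    return items[int(np.argmax(scores))]

print(recommend(history[0]))  # driven entirely by user 0's past behavior
```

Every real recommender is vastly more sophisticated, but the principle is the same: your history is the input, and the suggestion is a function of it.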
I’m curious to see how we can keep up with this, protect ourselves, and limit the negative consequences while continuing to improve the world around us. How do we strike a balance between innovation and personal privacy? I want to work with the data science community to mitigate these adverse effects and dubious uses of data. As a curator of new and novel insights, I have a responsibility to act ethically and hold myself accountable as a model citizen in this data ecosystem.
In the era of machine learning, amid the rush to create novel technologies that make our lives better (and to capitalize accordingly), we need to always be thinking about the consequences of our innovations and breakthroughs as well as their benefits. Just as we’re already feeling the positive effects of AI in our daily lives, problems arise alongside them. The further our technologies advance, the farther-reaching the consequences of disregarding those problems become.
By addressing these ethical considerations, we can all use data to bring the most value to our societies. The ethics of data usage need to be considered from many perspectives, not just by entrepreneurs and data scientists. To have this conversation properly, we all have to work together.
Society has dealt with enormous ethical concerns before. And compared to the effects of DDT, a carcinogen introduced within miles of millions of Americans, the concerns around AI are significantly less severe.
About Robert Fox
Rob has over twenty-five years of commercial software and engineering experience, strong analytical skills, and a broad range of general industry and business knowledge. He has led engineering teams at industry-leading organizations like MuleSoft (now Salesforce) and Liaison Technologies (now OpenText).
With his entrepreneurial spirit, innovative vision, and industry-leading thought leadership, Rob specializes in all things data, including analytics, data science, integration, management, security, and API management, with domain expertise in and around B2B, EAI, Cloud, and Big Data.