Ethical and Responsible AI

Practical ways for your organization to get started

Dr. Nel
Responsible AI, Ethics, and Privacy
7 min read · Jul 2, 2023


Part I: Representation

Written by: Nelson A. Colón Vargas, Ph.D.

Pronouns: he/him

Generated with DALL·E. Prompt: “wide image of a robot wearing a tie teaching a lecture”

Preface

At the height of the pandemic, during that brief period when we collectively thought Clubhouse was going to be the next Facebook, I heard MC Hammer say in a Clubhouse room, “Smart people are good at finding difficult problems they would like to solve,” and I swear I think about this at least once a week, or every time I’m in a meeting related to AI, even more so if we are discussing the ethical aspects of AI. In fact, I bet that if you and I have been in a meeting about AI or innovation together, you’ve heard me quote Hammer before.

What Hammerman was referring to is that in business (and in life) we often jump into trying to solve the complicated problems because of the sense of reward and accomplishment we get from them, but put off the issues that have straightforward solutions because they aren’t as stimulating and tend to be dull, menial, or repetitive.

In line with Hammer’s astute observation, from an ethicist’s perspective the exciting questions about AI are precisely the ambiguous ones, the ones where it isn’t easy to tell right from wrong. While pursuing them is intellectually rewarding and necessary for the advancement of the field, I find that most organizations I interact with are trying to knock down all these big-ticket items instead of focusing on laying a good foundation so that AI can flourish ethically and responsibly.

In this first installment I want to share with you why Representation is crucial for AI Ethics, and how it fosters better performance and efficiency in teams and in the products they build. Increasing representation improves the quality of questions, answers, and data, which mitigates ethical issues such as framing and confirmation bias and, in turn, enhances the overall integrity and fairness of AI systems, leading to more ethical AI products.

Representation

And why it matters for AI and Ethics

Terms like “diversity”, “equity”, “inclusion”, and “accessibility” often sound like buzzwords that companies include in their statements to sound like they are doing what’s morally right, but my intent is to demonstrate that these “buzzwords” are intrinsically tied to the performance and efficiency of our teams and the products we build.

Let’s first start by defining what we mean by diversity, equity, inclusion, and accessibility, as these are the cornerstones of representation.

  • Diversity: The variety of perspectives, experiences, and backgrounds.
  • Equity: The fair and impartial treatment of all individuals, regardless of their background or identity.
  • Inclusion: The active engagement and integration of diverse perspectives, experiences, and backgrounds.
  • Accessibility: Designs that are usable and understandable by all individuals, regardless of their abilities or disabilities.

“DEI encompasses the symbiotic relationship, philosophy and culture of acknowledging, embracing, supporting, and accepting those of all racial, sexual, gender, religious and socioeconomic backgrounds, among other differentiators.” -Lisa Dunn

Bias:

  • In statistics: a systematic discrepancy between a sample statistic and the true value of the population parameter it estimates.
  • In cognitive science: a mode of reasoning that is likely to produce an incorrect or skewed result.
  • In social justice: a morally suspect discrepancy in the treatment of people.
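To make the statistical sense concrete, here is a minimal sketch in Python with NumPy (the distribution and sample size are made up for illustration) of a classic biased estimator: computing sample variance by dividing by n systematically underestimates the true population variance.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
true_var = 4.0  # population variance of N(0, 2^2)

# Draw many small samples and average two variance estimators.
biased, unbiased = [], []
for _ in range(100_000):
    sample = rng.normal(loc=0.0, scale=2.0, size=5)
    biased.append(sample.var(ddof=0))    # divides by n
    unbiased.append(sample.var(ddof=1))  # divides by n - 1

print(f"true variance:      {true_var}")
print(f"biased estimator:   {np.mean(biased):.2f}")    # ~3.2, systematically low
print(f"unbiased estimator: {np.mean(unbiased):.2f}")  # ~4.0
```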

As Data and AI Scientists we tend to be pretty good at recognizing bias in models. For example, we know that if we are evaluating a fraud-detection or network-intrusion model, accuracy might not be the right metric, because a high accuracy score can simply reflect how rare malicious events are, that is, how imbalanced the training dataset is. So we pay close attention to other metrics that provide more relevant information, like the false-negative rate, precision, recall, F1-score, AUC, etc.
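To see this in code, here is a toy sketch using scikit-learn’s metrics (the 1% fraud rate and the lazy “model” are invented for the example): a classifier that never flags fraud still scores ~99% accuracy while catching nothing.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy labels: roughly 1% of 10,000 transactions are fraudulent (class 1).
rng = np.random.default_rng(seed=42)
y_true = (rng.random(10_000) < 0.01).astype(int)

# A "model" that never flags anything as fraud.
y_pred = np.zeros_like(y_true)

print(f"accuracy:  {accuracy_score(y_true, y_pred):.3f}")                     # ~0.99, looks great
print(f"precision: {precision_score(y_true, y_pred, zero_division=0):.3f}")  # 0.0
print(f"recall:    {recall_score(y_true, y_pred, zero_division=0):.3f}")     # 0.0, catches no fraud
print(f"f1-score:  {f1_score(y_true, y_pred, zero_division=0):.3f}")         # 0.0
```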

When we have further suspicions of imbalanced classes we usually go a step back in the process and tweak parameters in the model. For example, we might add weights to the classes to compensate for the imbalance, or penalize wrong predictions on the least-represented class more heavily. Oftentimes we go further back and try to address this before it gets to the model, through undersampling, oversampling, or generating synthetic data for the underrepresented classes (e.g., SMOTE). This is usually as far back as we can go: as scientists we know that collecting specific data from another class just to “balance” our dataset invalidates our experiment, so there really isn’t much we can do but start over.
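Here is a minimal sketch of both interventions, assuming scikit-learn and the imbalanced-learn package, on a synthetic dataset (the class ratio and model choice are illustrative, not a recommendation):

```python
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE

# Hypothetical imbalanced dataset: roughly 2% positive class.
X, y = make_classification(
    n_samples=5_000, n_features=10, weights=[0.98, 0.02], random_state=0
)
print(Counter(y))  # heavily skewed toward class 0

# Option 1: keep the data as-is and weight classes inside the model,
# so misclassifying the rare class is penalized more heavily.
clf = LogisticRegression(class_weight="balanced", max_iter=1_000).fit(X, y)

# Option 2: step back and rebalance the data itself. SMOTE synthesizes
# new minority-class points by interpolating between nearest neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))  # classes now roughly equal
clf_res = LogisticRegression(max_iter=1_000).fit(X_res, y_res)
```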

These techniques for addressing bias in models and data are acceptable for Retrospective Data Analysis, but we shouldn’t go into Prospective Data Collection assuming these issues can be addressed later.

So why minimize bias prior to Data Collection?

And how does it relate to representation in our team’s composition?

There’s an activity I like to do with every team I work with to get everyone (not just Data or AI Scientists) to understand the “why”. It is based on the book The Wisdom of Crowds by James Surowiecki.

Experiment

The rules are: No googling, no sharing your answer with anyone else.

  1. Pose your teammates a question whose answer is a numerical value. Ex: What’s the area of the state of Nebraska?
  2. Have them write their answer on a piece of paper.
  3. Draw a number line on the whiteboard and place their estimates on it.
  4. Calculate the mean of the answers and place that on the line as well.
  5. Reveal the real answer and place it on the line.

Usually what happens is that some people do better than others, but the mean of all the answers tends to be a pretty good one.
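You can watch the mechanism at work in a quick simulation (the crowd size and noise level below are made up; the only real ingredient is that each person’s error is independent of everyone else’s):

```python
import numpy as np

rng = np.random.default_rng(seed=7)
true_answer = 77_348  # area of Nebraska in square miles, roughly

# Hypothetical crowd of 30 people: each guess is far off individually,
# but the errors point in different directions rather than all one way.
estimates = true_answer * rng.normal(loc=1.0, scale=0.4, size=30)

avg_individual_error = np.abs(estimates - true_answer).mean()
crowd_error = abs(estimates.mean() - true_answer)

print(f"average individual error: {avg_individual_error:,.0f} sq mi")
print(f"error of the crowd mean:  {crowd_error:,.0f} sq mi")  # much smaller
```

The catch is independence: if everyone reasons from the same mental model, the errors correlate and the crowd mean simply inherits the shared bias, which is exactly where diverse life experiences earn their keep.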

When I ask people how they got to their number, one person might say, “I remember it took me x hours to drive through Nebraska, so I approximated the length and figured the height was roughly half the length.” Another might be from South Dakota and know the area of their state by heart, and, given the two states’ similar size, use that as an estimate. Someone else might have seen a list of countries by size the day before, remember the area of the US, and divide that by 50 (states).

If you then follow the same experiment with a different question, “How many left-handed pitchers are there currently in the MLB?”, you might find that who gets closest to the real answer changes, but the mean is again a solid choice.

What I noticed by experimenting with this is that what brings us collectively close to the right answer is precisely our individual life experiences and our unique ways of seeing the world around us: for example, how travel, interests, or where someone is from can help them estimate the size of Nebraska. It is then no surprise that, by introducing variability in the way an answer is reasoned, we are able to reduce bias.

The reality is that we, as individuals, have biases. They might be unintentional, and we might not even be aware that we have them, but we do. Our primitive brain is obsessed with finding patterns, and the easiest ones to spot are the ones with low variability. The brain adapted to make certain decisions fast, based on intuition or impressions, as a survival mechanism. We are most likely not in any danger holding a cup of chamomile tea with two hands, leaning forward in our ergonomic chair while our artificial neural network trains in our 68°F air-conditioned office and the PC fan warms our feet. But the fact remains that our brain has far more data on how to survive than on how to rationalize (thinking slow), so we have to be proactive about mitigating implicit bias, and one of the best ways to do so is by actively engaging with people from different backgrounds.

For those of us who take trivia night at our local pub very seriously, this isn’t a foreign concept, right? When putting together a winning trivia team we always make sure to have, at a minimum, someone who knows sports, someone who knows history, and someone who knows pop culture. We want to cover as much ground as possible, so every addition to the team is evaluated not by what they know but by how their knowledge and experiences complement what the team already has. We are very self-aware of our limitations when we are looking for new additions to our trivia team; building DEIA into our Data Science and AI teams should be no different.

In essence, increasing representation leads to better questions, better answers, and better data, which leads to better AI products. In practice, what we are really doing by increasing representation is mitigating ethical issues like framing (where the reference point has a disproportionate effect on the outcome) and confirmation bias (the tendency to pay attention primarily to the evidence that supports the position you already hold).

“We must never let our moral intuitions escape scrutiny.” -Jonathan Wolff, Introduction to Moral Philosophy

Till next time,

-Dr. Nel

Source of DEIA definitions used in this piece: https://ethics-of-ai.mooc.fi/chapter-6/3-discrimination-and-biases
