Yes, a statistical model can detect workplace communication bias (and why it’s OK if that makes you uncomfortable)

What if we told you that an AI could help you to better understand how you’re communicating at work?

What if we told you it could do so by reading your written workplace communication?

What if we told you that it could even provide you with quantitative metrics about your communication biases and the engagement levels of your team members?

At this point in our conversation, you might start to ask questions:

“But, how do you measure bias?”
“How do you account for different personalities when you’re measuring engagement?”
“Does an algorithm really understand my team dynamics?”

In fact, we hear questions like these all the time about our work here at Cultivate.

In this post we want to address two underlying themes that we perceive in questions like these: (1) the doubt that workplace communication can be accurately analyzed and quantified with computation, and (2) the discomfort in quantifying something as subjective as a human relationship.

Recognizing bias and engagement signals in the workplace

As we will be discussing the analysis of workplace bias and engagement in this post, let’s first loosely define these two terms as they pertain to our research:

  • Bias = An individual’s variation in communicative behavior towards their colleagues
  • Engagement = An individual’s level of involvement in, enthusiasm towards, and commitment to their work and workplace

Before we discuss how a computer can score someone’s engagement or bias levels, let’s talk about how you might go about recognizing engaged or biased behaviors in your own workplace.

Recognizing your own bias is tricky, since a lot of these behaviors happen unconsciously (a difficulty unconscious bias trainings, like Google’s reWork program, are intended to alleviate). Thinking about engagement from one’s own perspective is a little easier. If you were asked to score the engagement level of one of your colleagues, you’d probably think about the many instances in which you interacted with this individual to come up with an answer.

You might think about her mood, her performance, or the amount of time she’s been spending at the office recently. You might also think about how her current behavior compares with that of her colleagues, and with her own behavior over the course of your working relationship.

From the outset, it is clear that bias and engagement levels are identifiable via observed (or not always observed, in the case of bias) behavior over time. Believe it or not, a similarly holistic analysis can also be achieved via computation.

By decomposing the high-level concepts of engagement and bias into algorithmically defined sub-parts, a model can learn not only to recognize the signals of engagement and bias, but also to quantify them.

It is in defining these sub-parts where the quantitative and the subjective blur. One must decide which calculable metrics are actually indicators of workplace engagement or bias.

For illustration, these are the extractable metrics we have associated with bias and engagement, respectively, at Cultivate:

Metrics related to engagement:

  • Mood — Positivity, politeness, and job satisfaction
  • Attention — Speed and thoroughness of response in both 1-on-1 and group conversations
  • Productivity — Work initiative and output
  • Motivation — Job commitment and effort level
  • Teamwork — Willingness to collaborate in a group setting

Metrics related to bias:

  • Tone — Politeness and sentiment
  • Attention — Initiative and responsiveness
  • Style — Similarity of vocabulary and sentence structure

Algorithms can actually be built around each of these metrics, drawing from a combination of well-established Natural Language Processing techniques and cutting-edge sociolinguistic research. We are going to cover some of these algorithms here and describe example use-cases, but before we proceed, let’s review a brief definition of Natural Language Processing (NLP) and the related process Sentiment Analysis.

Natural Language Processing for Dummies

Wikipedia defines Natural Language Processing as “a field of computer science, artificial intelligence and computational linguistics concerned with the interactions between computers and human (natural) languages.” This is a pretty broad definition, and indeed, NLP techniques are used to conduct language-related computational research across a wide range of disciplines (see Stanford Natural Language Processing Group’s publication list for some current examples).

Sentiment Analysis is an easy way to start thinking about the types of problems NLP can solve, and it is directly relevant to computing bias and engagement metrics such as mood, tone, and motivation.

Sentiment analysis is used to determine the attitude expressed in a string of text. For example, a very simple sentiment analysis algorithm might be responsible for reading a sentence and outputting whether the sentence is positive, negative, or neutral.

If presented with the sentence: “I had a good day today,” how might our algorithm correctly identify the sentence as positive?

Probably the easiest way would be extracting the word “good,” which has positive connotations in the English language.

Let’s go one step further. How might the same algorithm handle the sentence: “I did not have a good day today”?

In this case, we could not rely only upon the positivity of the word “good” to define sentiment. The algorithm would also need to handle the negation of the positive word “good” by the word “not” in order to correctly identify the sentence as negative.

Building upon this problem, we could continue to add more complex logic to our algorithm, for example, logic to quantify a sentiment’s direction (e.g., “I don’t like the color red”) or intensity (e.g., “I had a really good day today”).
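To make the walkthrough above concrete, here is a minimal sketch of such a rule-based scorer in Python. The word lists, the intensifier weights, and the function names are all assumptions made up for illustration. This is a toy, not Cultivate’s model and not how production sentiment systems are typically built.

```python
import re

# Tiny illustrative lexicons (assumptions); real systems use far larger,
# validated word lists or trained models.
POSITIVE = {"good", "great", "like", "love"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}
NEGATORS = {"not", "no", "never", "don't", "didn't"}
INTENSIFIERS = {"really": 1.5, "very": 1.5}


def sentiment_score(sentence):
    """Return a signed score: above 0 is positive, below 0 is negative."""
    # Normalize curly apostrophes so "don’t" matches the negator list.
    text = sentence.lower().replace("’", "'")
    tokens = re.findall(r"[a-z']+", text)

    score = 0.0
    negate = False  # set when a negator has been seen
    boost = 1.0     # multiplier set by the most recent intensifier
    for token in tokens:
        if token in NEGATORS:
            negate = True
        elif token in INTENSIFIERS:
            boost = INTENSIFIERS[token]
        elif token in POSITIVE or token in NEGATIVE:
            value = boost if token in POSITIVE else -boost
            # Simplification: a negator flips the next sentiment word,
            # however far away it is in the sentence.
            score += -value if negate else value
            negate, boost = False, 1.0
    return score


def sentiment_label(sentence):
    """Map the signed score onto positive / negative / neutral."""
    score = sentiment_score(sentence)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for example in [
    "I had a good day today",
    "I did not have a good day today",
    "I had a really good day today",
    "I don’t like the color red",
]:
    print(example, "->", sentiment_label(example), sentiment_score(example))
```

Even this toy version shows why the problem gets hard quickly: every new linguistic phenomenon (sarcasm, double negation, domain-specific vocabulary) needs either another rule or a model that learns the pattern from data.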

Natural Language Processing in Practice (in the Workplace)

From the example above, it is fairly evident how Sentiment Analysis can be used to help calculate metrics related to positivity and negativity, such as mood, tone, and motivation.

What might be less obvious is that NLP models can also be trained to recognize and measure other affective signals.

For example, you can actually train a model to score the politeness of a string of text, along with its directionality and intensity as you can in Sentiment Analysis. The model will do so by identifying signals of politeness (cultural norms will obviously come into play here), such as saying “please” or asking someone for a favor rather than telling them to perform a task.
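As a rough illustration of what “signals of politeness” can look like in code, the sketch below counts a few hand-picked cues. Note that it is a hand-written heuristic rather than a trained model, and the cue patterns and weights are assumptions chosen for this example (and for English-language workplace norms only).

```python
import re

# Hypothetical cue lists: (regular expression, weight).
POLITE_CUES = [
    (r"\bplease\b", 1.0),
    (r"\bthank(s| you)\b", 1.0),
    (r"\b(could|would|can) you\b", 1.0),  # request phrased as a question
]
IMPOLITE_CUES = [
    (r"\b(asap|immediately)\b", -1.0),
    (r"^(send|do|fix|get|give)\b", -1.0),  # message opens with a bare command
]


def politeness_score(text):
    """Sum the weights of every cue found in the text."""
    lowered = text.strip().lower()
    score = 0.0
    for pattern, weight in POLITE_CUES + IMPOLITE_CUES:
        score += weight * len(re.findall(pattern, lowered))
    return score


asking = politeness_score("Could you please send me the report?")
telling = politeness_score("Send me the report now.")
print(asking, telling)
```

The sign of the score gives a direction and its magnitude gives a crude intensity. A trained model would learn which cues matter, and how much, from labeled examples instead of relying on a fixed list.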

And NLP methods go far beyond affective signal recognition. Here are some additional methods that can be used (in part) to calculate the bias and engagement metrics listed above:

  • Style Detection — Used to analyze the similarity of text by extracting and analyzing vocabulary and structure
  • Dialog Act Classifiers — Used to assign a “high-level categorization of pragmatic meaning” to spoken or written language. For example, an utterance might be defined as a “Request for Information,” or “Request for Action” (see Omuya et al.).
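Both methods in the list above can be sketched in a few lines, with heavy simplifications. In the sketch below, style similarity is reduced to vocabulary overlap (Jaccard similarity) and ignores sentence structure entirely, and the dialog act “classifier” is a pair of hand-written rules rather than a trained model. The function names and the “Other” label are assumptions for illustration.

```python
import re


def vocabulary(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def style_similarity(text_a, text_b):
    """Jaccard similarity of the two vocabularies, between 0.0 and 1.0."""
    vocab_a, vocab_b = vocabulary(text_a), vocabulary(text_b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0
    return len(vocab_a & vocab_b) / len(union)


def dialog_act(utterance):
    """Assign a coarse dialog act label using two simple rules."""
    text = utterance.strip().lower()
    # Utterances that open with a request phrase are treated as asks for action.
    if re.match(r"(please|could you|can you|would you)\b", text):
        return "Request for Action"
    # Any other question is treated as an ask for information.
    if text.endswith("?"):
        return "Request for Information"
    return "Other"


print(style_similarity("the report is late", "the report is done"))
print(dialog_act("Could you send the file?"))
print(dialog_act("When is the meeting?"))
```

Comparing how closely one person’s vocabulary matches each colleague’s, or how often they make requests of each colleague, is one way such per-message outputs could feed into a bias metric.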

Critical to the use of these NLP processes to reliably identify signals of bias and engagement is their ability to extract reproducible, quantifiable patterns of behavior. It is even more important that these patterns be distinct enough to allow accurate prediction of social context from text alone.

Cultivate Chief Data Scientist Andy Horng put the replicability and predictive ability of the models we built to the test in his recent white paper. The white paper details a case study of the gender-identified Enron email corpus, analyzing variations in language use conditioned by social dynamics. The goal of the study was to reproduce findings from three sociolinguistic research papers using Cultivate’s existing conversational analysis pipeline.

The results? Cultivate’s models were able to reaffirm the results of each of the research studies.

So, yes, a statistical model can, in fact, quantify bias and engagement in the workplace. By identifying metrics associated with bias and engagement such as mood and politeness, and quantifying them using established NLP techniques, we can ascribe quantitative metrics to traditionally qualitative aspects of a workplace relationship.

Accepting the Qualitative with the Quantitative

Even if you are convinced by the techniques and scholarly support described above, you may still be uncomfortable with quantifying someone’s engagement or communication biases, as many people are.

And that’s a reasonable discomfort; quantification is so often associated with the notion of scoring or grading. Slapping a score on something as subjective (and indictable) as workplace bias or engagement gets tricky fast.

But it is possible to frame quantitative information about bias and engagement in a way that provokes conversation and introspection rather than imposing a stark judgement of quality.

One way to go about this is to use relational visuals rather than ascribing numbers or simple yes/no judgements to these metrics. Rather than informing your user that they are “polite” or “not polite” based on arbitrary cutoffs you have set, you can instead make it clear to them that they are more polite to certain individuals than to others.
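A relational view like this can be computed by comparing each recipient’s average with the sender’s own overall average, instead of with a fixed cutoff. The sketch below assumes each message has already been given a politeness score by some upstream model. The names and numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def relative_by_recipient(messages):
    """Rank recipients by how a sender's mean score for them differs
    from that sender's overall mean.

    messages: list of (recipient, score) pairs for a single sender.
    Returns (recipient, difference) pairs, highest difference first.
    """
    overall = mean(score for _, score in messages)

    scores_by_recipient = defaultdict(list)
    for recipient, score in messages:
        scores_by_recipient[recipient].append(score)

    differences = [
        (recipient, mean(scores) - overall)
        for recipient, scores in scores_by_recipient.items()
    ]
    return sorted(differences, key=lambda pair: pair[1], reverse=True)


# Hypothetical politeness scores for one sender's messages.
sent = [("Ana", 2.0), ("Ana", 1.0), ("Ben", 0.0), ("Ben", -1.0)]
print(relative_by_recipient(sent))
```

The output never says the sender is “polite” or “not polite”. It only says that, relative to their own typical behavior, they are more polite to Ana than to Ben.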

Another approach is to acknowledge that every individual is different and will exhibit different “normal” behavioral signals. You can do so quantitatively by baselining behavioral signals not only on a group level, but also on an individual level.

For example, one of your team members might generally communicate less positively than other members of your team. If you baseline this person’s mood metrics against the average of your team, you’ll probably see low mood signals for them.

This might be interesting information about the team, but it won’t actually tell you much about your colleague’s current mood. Baselining their mood against their own average will be much more informative in this respect.
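The difference between the two baselines is easy to see with a standard score (z-score), which expresses a value as the number of standard deviations it sits above or below a baseline’s mean. The weekly mood scores below are invented for illustration, and the z-score is just one reasonable choice of normalization, not necessarily the one Cultivate uses.

```python
from statistics import mean, pstdev


def z_score(value, baseline):
    """Standard deviations between value and the baseline's mean."""
    spread = pstdev(baseline)
    if spread == 0:
        return 0.0  # a flat baseline carries no scale information
    return (value - mean(baseline)) / spread


# Hypothetical weekly mood scores on a 0-to-1 scale.
team_history = [0.60, 0.70, 0.65, 0.70, 0.60]
colleague_history = [0.20, 0.30, 0.25, 0.30, 0.20]
this_week = 0.30  # the colleague's latest score

versus_team = z_score(this_week, team_history)
versus_self = z_score(this_week, colleague_history)

print("vs. team baseline:", round(versus_team, 2))
print("vs. own baseline: ", round(versus_self, 2))
```

Against the team baseline this colleague looks like they are having a very bad week. Against their own baseline, the same score is actually above their usual level.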

But presenting metrics in a reflexive way won’t satisfy everyone, or scrub the data of all edge cases. There will be situations where someone disagrees with, or is uncomfortable with, a metric that is calculated based on their communication.

However, we are of the mindset that these limitations do not undermine the value of quantitatively analyzing your workplace communication.

Why? Even if they are at times debatable or discomfiting, these metrics still provoke conversation and behavioral awareness.

Maybe you’re a manager, and you receive metrics about your email and chat communication indicating that you are least responsive and least positive to the female employees on your team.

Might there be a valid, situational reason for that? Of course, and that is for you to decide.

Will these metrics make you think more actively about how you’re communicating, and to whom, regardless? We think so.

About Cultivate

Cultivate’s AI-powered platform enables engagement and inclusion by helping you understand and improve your workplace communications.

For more information on what we are doing at Cultivate, check out our interview at TechCrunch Disrupt!