7 experiments in natural language understanding for CoachBot

Published in Saberr Blog · 6 min read · Nov 1, 2018

CoachBot is a digital coach that helps teams have the conversations they should be having but maybe aren’t. CoachBot’s primary value is in understanding what those conversations should be and guiding teams to have them (AKA “just ask smart questions”).

CoachBot can do that because we read a lot and speak to a lot of experts: lots of books, papers, articles, professors and coaches taught us the fundamentals of teamwork.

Actually asking the right questions (“what’s the worst thing your team could do?”) is simple logic for CoachBot. But understanding what people say back is much harder!

Our philosophy with CoachBot is that people should mostly be talking to each other rather than to a bot. Even so, there are lots of places where CoachBot can learn to be more useful.

We’re often asked about the intelligence of CoachBot. These are 7 experiments (all using machine learning in some way) that we’ve done over the last year, prototyping new concepts for CoachBot.

Picking an icon for every “agreed behaviour” a team writes

Agreed Behaviours are little rules teams write for themselves. “Get out of the office”, “Over-communicate everything”, “Be on time!”. Teams can end up with quite a few of these on their Canvas, which is great. But then they can look like a wall of text and go unread.

We used NLP to help. Pre-trained word embeddings* let us understand the meaning of each Behaviour. We compare those meanings to some pre-defined categories to select an appropriate illustration for each Behaviour. This makes the Behaviours a bit easier to scan over on the canvas, and so easier to live by!

Categorised and iconified team behaviours

How?

* Word embeddings turn a word into a set of numbers. Words with similar meanings end up with similar numbers.
Semantic similarity using spaCy. Small expert comparison dataset built through desk research. Model deployed on SageMaker.
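To make the idea concrete, here is a minimal sketch of matching a Behaviour to its closest category by cosine similarity. It is not our production code: the three-dimensional vectors, the category names and the `pick_icon` helper are all made up for illustration, standing in for real pre-trained embeddings.

```python
import math
import re

# Hypothetical toy embeddings; real ones would come from a pre-trained model.
TOY_VECTORS = {
    "time": [0.9, 0.1, 0.0],
    "late": [0.8, 0.2, 0.1],
    "communicate": [0.1, 0.9, 0.1],
    "talk": [0.0, 0.8, 0.2],
    "office": [0.1, 0.1, 0.9],
    "outside": [0.0, 0.2, 0.8],
}

# Hypothetical icon categories, each described by one vector.
CATEGORIES = {
    "clock": [1.0, 0.0, 0.0],
    "speech-bubble": [0.0, 1.0, 0.0],
    "outdoors": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(words, vectors):
    """Average the vectors of the words we know; None if we know none."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


def pick_icon(behaviour, categories, vectors):
    """Return the name of the category closest in meaning to the behaviour."""
    words = re.findall(r"[a-z]+", behaviour.lower())
    behaviour_vector = mean_vector(words, vectors)
    if behaviour_vector is None:
        return None  # nothing recognised, so no icon
    return max(categories, key=lambda name: cosine(behaviour_vector, categories[name]))
```

Calling `pick_icon("Be on time!", CATEGORIES, TOY_VECTORS)` averages the vectors of the words it recognises and returns the best-matching category name, or `None` when no word is known.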

Organising a team’s discussion topics into a semantically flowing order

There are a few places where CoachBot asks everybody in a team the same questions, individually. But the team development happens when the team discuss their answers together: they learn from each other and think through ways of improving how they work together. Our Retrospectives, Agreed Behaviours and Behaviour Review sessions all work like this.

The value is in the discussion, so CoachBot doesn’t shortcut you to an answer. But we can use word embeddings to measure the similarity in meaning between every pair of submitted discussion points. That lets us organise the topics into a flow with fewer context switches and less repetition.

An example:

Topics:
"We should use the phone more"
"Email isn't as good as talking"
"We should talk about things more before doing them"
"Why don't we stick to our decisions?"
"Buy some new computers"

Organised:
"We should talk about things more before doing them"
"Why don't we stick to our decisions?"
"We should use the phone more"
"Email isn't as good as talking"
"Buy some new computers"

How?

Semantic similarity using spaCy and nodal connections using NetworkX
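As a rough illustration of the ordering step, the sketch below chains topics together greedily: each topic is followed by whichever remaining topic is most similar to it. This is a simplification rather than our actual graph approach. The `word_overlap` measure is a crude stand-in for embedding similarity, and both function names are hypothetical.

```python
import re


def word_overlap(a, b):
    """Jaccard overlap of the words in two topics (a stand-in for embedding similarity)."""
    words_a = set(re.findall(r"[a-z']+", a.lower()))
    words_b = set(re.findall(r"[a-z']+", b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def order_topics(topics, similarity=word_overlap):
    """Greedy ordering: always move on to the topic most similar to the last one."""
    if not topics:
        return []
    remaining = list(topics)
    ordered = [remaining.pop(0)]  # assumption: start from the first submitted topic
    while remaining:
        last = ordered[-1]
        next_topic = max(remaining, key=lambda topic: similarity(last, topic))
        remaining.remove(next_topic)
        ordered.append(next_topic)
    return ordered
```

Any function that scores two strings can be passed as `similarity`, so a real embedding-based measure could be swapped in without changing the ordering logic.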

Suggesting how to make a team’s purpose statement more motivational

Aligning around a clear purpose is a foundation of great teams. But shouldn’t a purpose statement also be motivational? Some teams aren’t used to thinking aspirationally about their work, so this experiment looked at whether we could help them.

We combined a corpus of motivational quotes with a word usage dataset to find words that are used particularly frequently in motivational and inspirational writing. In CoachBot, every team member suggests their own ideas about the team’s purpose. Here, CoachBot picks out some of their keywords and suggests some related motivational words that might help them to craft a great purpose statement.

How?

Word frequency data from COCA
Categorised quotes from TheWebMiner
Noun Chunks and Key Terms extraction from spaCy and Textacy
Jupyter/ipywidgets prototype
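One simple way to find "particularly frequent" words is to compare how often a word appears in the quotes with how often it appears in everyday language. The sketch below does that with a plain frequency ratio. The scoring rule, the `motivational_words` helper and the frequency numbers used with it are assumptions for illustration, not figures from COCA.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def motivational_words(quotes, general_freq, top_n=5):
    """Rank words by (rate in the quotes) / (rate in general usage).

    general_freq maps a word to its relative frequency in everyday language.
    Words missing from general_freq are skipped rather than guessed at.
    """
    counts = Counter(word for quote in quotes for word in tokenize(quote))
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        general_rate = general_freq.get(word)
        if not general_rate:
            continue
        scores[word] = (count / total) / general_rate
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Common words like "the" score low because they are frequent everywhere, while words that are rare in general but common in quotes float to the top.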

Generating a progress tracking interface for any “measurable result” a team writes

Having clearly prioritised goals is important for any team, but just keeping track of them can be hard. In the “Objectives and Key Results” model of goals, Key Results are the leading indicators that an objective will be reached. In this experiment, we looked at making progress tracking a bit simpler by automatically creating a suitable progress interface for any Key Result. By dependency parsing the text, the prototype can create a slider with units for a numeric goal, or a simple checkbox for something that looks like a done-or-not task.

How?

Dependency parsing using TensorFlow’s SyntaxNet
Jupyter/ipywidgets prototype
Full details
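The slider-or-checkbox decision can be sketched without a parser at all. The example below swaps dependency parsing for a regular expression, which is much cruder than what the prototype did, just to show the shape of the output. `ProgressWidget` and `widget_for` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressWidget:
    kind: str                       # "slider" or "checkbox"
    target: Optional[float] = None  # numeric goal for a slider
    unit: str = ""                  # e.g. "%", "customers", "£"


# An optional currency sign, a number, then an optional unit word or "%".
NUMBER = re.compile(
    r"(?P<prefix>[£$€]?)(?P<value>\d[\d,]*(?:\.\d+)?)\s*(?P<suffix>%|[A-Za-z]+)?"
)


def widget_for(key_result):
    """Pick a slider for a numeric Key Result, otherwise a checkbox."""
    match = NUMBER.search(key_result)
    if match is None:
        return ProgressWidget(kind="checkbox")
    value = float(match.group("value").replace(",", ""))
    unit = match.group("prefix") or match.group("suffix") or ""
    return ProgressWidget(kind="slider", target=value, unit=unit)
```

For instance, `widget_for("Sign 20 customers")` gives a slider with a target of 20 and the unit "customers", while `widget_for("Launch the new website")` falls back to a checkbox. A dependency parse would do better on phrasing the regex misreads, such as a number that is not the goal itself.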

Converting hand-written Post-it notes into a digital Goals board

A Goals Board is an organised canvas of… goals — a bit like a digital whiteboard. We know that digital is best for remote teams, but when people are in the same room it’s sometimes more natural to just write ideas on sticky Post-it notes (we do!). In this prototype we used a pre-trained machine vision service to read handwritten sticky notes and upload their contents straight into CoachBot’s Goals Board for that team.

How?

IFTTT Camera Widget
Python API which calls the Google Cloud Vision API and does some geometry to regroup bits of text into notes
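The "geometry" part is the interesting bit: a vision service returns individual words with positions, and they need regrouping into one string per sticky note. Here is a minimal sketch of one way to do it, clustering words whose centres are close together. The input format, the pixel thresholds and the function name are assumptions, not the Vision API's response shape or our exact method.

```python
import math


def group_words_into_notes(words, max_gap=80.0, line_height=20.0):
    """Cluster OCR words into notes.

    words: list of (text, x, y), where x and y are the centre of the word in pixels.
    Two words belong to the same note if their centres are within max_gap pixels,
    directly or through a chain of other words.
    """
    parent = list(range(len(words)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            _, xi, yi = words[i]
            _, xj, yj = words[j]
            if math.hypot(xi - xj, yi - yj) <= max_gap:
                parent[find(i)] = find(j)

    clusters = {}
    for index, word in enumerate(words):
        clusters.setdefault(find(index), []).append(word)

    notes = []
    for members in clusters.values():
        # Approximate reading order: by line (top to bottom), then left to right.
        members.sort(key=lambda w: (w[2] // line_height, w[1]))
        notes.append(" ".join(text for text, _, _ in members))
    return notes
```

The thresholds would need tuning to the photo resolution and the size of the notes; handwriting that drifts across lines would also need a more forgiving reading-order rule.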

Generating a cartoon strip for a user’s “story of my values”

Values are our basic motivations at home and work. One of our users’ favourite exercises is to share a story about “where my values came from”. Sometimes using a drawing can be more meaningful than just writing a story. When we thought about how to do this digitally, we experimented with creating cartoon strips automatically for a given story.

To draw a cartoon, we used NLP to extract Noun Chunks (things) from the submitted story, and matched them up with the filenames of icons.

We tested it with short personal stories similar to the quirky stories people have about their values.

It works great for stories full of common things (and not so well for more niche stories!)

The most glaring problem in this was in fact the biases inherent in the language model itself. For instance, most cartoons that mentioned children ended up with an icon typically associated with women and motherhood, even when the story did not reveal the sex or gender of the author.

This kind of bias (that women are associated with children more than men) is learned from the text corpus that the NLP tool is trained on: the internet.

How?

Icons from Icons8 (using filenames like bananas.png to understand their content)
Word vectors and noun chunk extraction using spaCy
Short personal stories for testing from HappyDB
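The matching step can be sketched as a lookup from noun chunks to icon filenames. This version uses exact string matching with a naive singular/plural check, which is simpler than the word-vector matching we used. The noun chunks are passed in as plain strings (in practice they would come from an NLP library), and all function names are hypothetical.

```python
from pathlib import PurePath


def build_icon_index(filenames):
    """Map each icon's name (filename without extension) to its filename."""
    return {PurePath(name).stem.lower(): name for name in filenames}


def candidates(word):
    """Yield the word itself, then a naive singular or plural variant."""
    yield word
    if word.endswith("s"):
        yield word[:-1]
    else:
        yield word + "s"


def icons_for_story(noun_chunks, icon_index):
    """Return one icon per noun chunk that has a match, in story order."""
    strip = []
    for chunk in noun_chunks:
        tokens = chunk.lower().split()
        if not tokens:
            continue
        head = tokens[-1]  # assumption: the last word of the chunk is the "thing"
        for candidate in candidates(head):
            if candidate in icon_index:
                strip.append(icon_index[candidate])
                break
    return strip
```

Exact matching only finds icons whose filename is the word itself. Word vectors let a story about a "puppy" still land on dog.png, which is also where the learned biases creep in.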

Measuring biases in NLP tools

Motivated by the biases seen in the Cartoon experiment, we set out to assess the level of bias in the NLP models we use. A set of tests called the Word Embedding Association Test (WEAT) has been created to measure this learned bias. We implemented them for spaCy and found significant biases, as expected; if you’d like to run the tests on your own models, our code is available. We’re trying to get these bias tests included in the core NLP libraries as well.
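For the curious, the core of a WEAT is small. The sketch below computes the effect size for two sets of target word vectors (X, Y) and two sets of attribute word vectors (A, B). It is an illustrative version, not our released code: it leaves out the permutation test for significance, and using the sample standard deviation is a choice made here.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def association(w, A, B):
    """How much closer word vector w is to attribute set A than to attribute set B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference in mean association of the two target sets, in standard deviations.

    Positive values mean X leans towards A (and Y towards B); negative the reverse.
    Needs at least two target vectors in total with differing associations.
    """
    assoc_x = [association(x, A, B) for x in X]
    assoc_y = [association(y, A, B) for y in Y]
    return (mean(assoc_x) - mean(assoc_y)) / stdev(assoc_x + assoc_y)
```

With real embeddings, X and Y might be vectors for career and family words, and A and B vectors for male and female names. An effect size near zero would suggest the model treats the two target sets alike.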