Helping the ClimateAction.tech community with on-boarding

I’m spending a bit of time working with the ClimateAction.tech community — a warm, safe, super professional and action-oriented group of 1100+ tech professionals working together to help combat climate change.
I’m passionate about the impact that natural language processing technology has on how people work with technology, so I’m digging into this.
Problem: too much info in on-boarding. The ClimateAction.tech new community member on-boarding is fantastic: it’s really professional and structured. But my personal experience was that it was far too much to read at once; I didn’t read much of it at all, and so missed out on some important elements like the code of conduct and other valuable knowledge.
Hypothesis: people will engage better by asking questions. This might not be true, but it might make the material more accessible. (Hypotheses are assertions that are then scientifically tested.)
Experiment 1: using Allen NLP’s reading comprehension to pull answers from unstructured text.
One of the big revolutions over the last couple of years is technology that’s vastly improved machine processing of text in a way that can help navigate knowledge: specifically “general purpose language models”. (I specifically avoid the terms “understanding” or “intelligence” because this anthropomorphism of tech is, in the balance, unhelpful as it contributes to the myth of computers approaching “human-like” intellect.)
The Allen Institute for AI is a top AI research institute, and they have some pretty whizzy NLP tools that, as a bonus, have a great online demo site. The authors have underscored that the demo site is not state of the art; the demos are illustrations of what’s possible. We’ll look at the state of the art later.
Reading Comprehension (Q&A)
Reading comprehension is technology that takes a question and a body of ordinary text, and tries to find the passage of the text that best answers the question. Maybe this is useful for on-boarding docs?
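To make the input/output shape concrete, here’s a deliberately naive sketch: score each sentence of a passage by word overlap with the question and return the best one. This is not what models like BiDAF do (they predict an answer span with a neural network), but the interface — question plus passage in, text out — is the same. The passage and question are made up for illustration.

```python
# Naive extractive Q&A baseline: pick the sentence sharing the most
# words with the question. Real reading-comprehension models predict
# an answer *span* with a neural network, but the I/O shape matches.

def naive_answer(question: str, passage: str) -> str:
    q_words = set(question.lower().rstrip("?").split())
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    # Score each sentence by how many question words it contains.
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())))

passage = (
    "CAT is a community of tech professionals. "
    "Members work together to combat climate change. "
    "The community has a code of conduct."
)
print(naive_answer("What is CAT?", passage))
# -> CAT is a community of tech professionals
```

Even this toy version hints at the fragility we’ll see below: the answer hinges entirely on surface details of how the question is phrased.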
So in the spirit of ultra-small validated learning loops, let’s prototype a few things.
How does AllenNLP Reading Comprehension fare with a small section of the on-boarding doc?
Start small and increment. So I tried to ask a few questions from the first 6 pages: the CAT mission. There are three options for the Reading Comprehension demo and I tried them all: the ELMo-BiDAF (trained on SQuAD) worked best. (ELMo was arguably the first general-purpose language model, back in February 2018, preceding the one that made the big splash, BERT, in October 2018; so already we know there’s more potential regardless of these outcomes.)



Another observation: if you try it yourself you’ll see that it takes 30+ seconds to run.
But wait: are BiDAF and NAQANet really that bad?
Subjectively, yes. Let’s try “What is ClimateAction.tech” (no question mark).


So what have we learned?
The demo is just that: a simple demo. But it’s clear that the output is:
- Very sensitive to the nature of the input (that a question mark completely changed the result is very surprising)
- Misses spectacularly a lot of the time
- Very, very slow.
Welcome to probabilistic technology
Yes, this is technology that, in business-as-usual operation, produces a range of results. It’s not a “bug” per se that these results vary; it’s just how it is. This is a different kind of technology from normal rule-based software, where you can generally get predictable results for given kinds of inputs. With probabilistic technology, you get different results from even subtly different inputs.
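A toy illustration of why a tiny input change (like a missing question mark) can flip the answer entirely: extractive QA models assign scores to candidate spans and pick the argmax, so two inputs that nudge the scores only slightly can select different spans. The spans and scores below are invented for illustration.

```python
import math

def softmax(scores):
    # Convert raw scores into a probability distribution.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

spans = ["a global community", "Slack", "climate change"]

# Hypothetical span scores: the missing "?" shifts them only slightly...
with_qmark = softmax([2.10, 2.00, 0.5])
without_qmark = softmax([2.00, 2.10, 0.5])

# ...but the argmax (the answer shown to the user) flips completely.
print(spans[with_qmark.index(max(with_qmark))])        # -> a global community
print(spans[without_qmark.index(max(without_qmark))])  # -> Slack
```

Rule-based software has discontinuities too, but here the discontinuity is baked into the core mechanism: the output is always the winner of a close-run scoring contest.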
What next?
These are anecdotes to test things out. The next step is to look at higher-performance resources (in terms of both speed and accuracy). HuggingFace’s Transformers has a huge amount of attention (in-joke, sorry), so it probably merits exploring to see how its SQuAD performance compares.
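A sketch of what that next experiment might look like, assuming the `transformers` package is installed. Its question-answering pipeline downloads a SQuAD-fine-tuned model on first use, so the heavy part is shown as a comment; usefully, the pipeline also reports a confidence score, which could let an on-boarding bot stay quiet rather than show a spectacular miss. The threshold value below is an arbitrary placeholder.

```python
# The pipeline call itself (downloads a model on first use):
#
#   from transformers import pipeline
#   qa = pipeline("question-answering")
#   result = qa(question="What is ClimateAction.tech?",
#               context=onboarding_text)
#
# `result` is a dict like {"answer": ..., "score": ..., "start": ..., "end": ...}.

def confident_answer(result: dict, threshold: float = 0.5):
    # Only surface the answer when the model's confidence clears the bar;
    # otherwise return None so the caller can fall back to a human.
    return result["answer"] if result["score"] >= threshold else None
```

Whether 0.5 is a sensible bar is itself something to test against the on-boarding doc; that’s the next ultra-small learning loop.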
