Deep learning to intervene where it counts

How we built a feedback loop to optimize learning nudges

Marianne Sorba
Aug 14, 2018 · 5 min read

Learning isn’t easy. To make it a little easier, we launched In-Course Help, delivering behavioral and pedagogical nudges as learners move through course material. In this post, we cover our process and learnings in implementing a machine learning feedback loop for personalizing and optimizing these nudges.

In the first implementation of In-Course Help, all learners at a given point in a given course — for example, completing Lecture 9 of Course A, or failing Quiz 3 of Course B — received the same message. This allowed us to intervene in ways that were helpful on average, and moved the needle on course progress and retention.

But we also observed heterogeneity of impact across learners and messages. And because every learner at a given point in a given course received the same message, we were wary of rolling out too many messages.

For the next implementation, we created a smart feedback loop to control which learners received each message. The model is a neural network that takes as input a wide range of features.

Using these features, the model predicts how likely a specific learner is to find a specific type of pop-up message helpful at a particular point in her learning. If it predicts that the message will have a sufficiently positive impact, it triggers the message; otherwise it holds the message back. The model's weights and predictions update nightly while our data science team sleeps. This is a big improvement over the baseline of complex, long-running nested A/B tests, in which the team manually adjusted the interventions based on observed results. The feedback loop also extends naturally to choosing among multiple versions of a message that could be sent at the same point to the same learner: it triggers only the version predicted to have the most positive outcome for that learner.
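To make the trigger rule concrete, here is a minimal sketch of the decision step only. It assumes the model has already produced a predicted helpfulness score for each candidate version of a message; the function name `choose_message`, the dictionary input, and the `threshold` value are illustrative assumptions, not the production implementation.

```python
from typing import Dict, Optional


def choose_message(
    predicted_helpfulness: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff for "sufficiently positive"
) -> Optional[str]:
    """Return the message version to trigger, or None to hold the message back.

    `predicted_helpfulness` maps a (hypothetical) version id to the model's
    predicted probability that this learner finds that version helpful.
    """
    if not predicted_helpfulness:
        return None
    # Pick the version with the highest predicted helpfulness.
    best_version = max(predicted_helpfulness, key=predicted_helpfulness.get)
    # Trigger it only if the prediction clears the threshold.
    if predicted_helpfulness[best_version] >= threshold:
        return best_version
    return None
```

With a single version, this reduces to the simple trigger-or-hold-back rule; with several versions, only the best-scoring one can be sent.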

Today we have two levels of filtering: course-item-state level filtering, which decides which messages to keep around because they are sufficiently helpful, and user-course-item-state level filtering, which personalizes which messages go to which learners at any given learning moment.

In brief, for each possible nudge on every item state in every course, the course-item-state level model predicts the average probability that a learner finds the message helpful, based on past interactions with the message and course-level data. If the model predicts that the message is not sufficiently helpful, we hold it back at that trigger point altogether (provided the number of impressions is sufficiently large). This trigger-level filtering is especially useful as we expand our message inventory, because it automatically detects and filters out messages that are not helpful in general or not helpful for a particular class or trigger point.
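The trigger-level rule described above can be sketched as follows. The `TriggerStats` type, the function name, and both cutoff values are hypothetical and chosen only for illustration; the point is that a message is held back only when there is enough evidence and the predicted helpfulness is too low.

```python
from dataclasses import dataclass


@dataclass
class TriggerStats:
    """Hypothetical summary for one message at one course-item-state."""
    impressions: int               # how often the message has been shown here
    predicted_helpful_rate: float  # model's average predicted helpfulness


def keep_message_at_trigger(
    stats: TriggerStats,
    min_impressions: int = 1000,    # assumed value for "sufficiently large"
    min_helpful_rate: float = 0.5,  # assumed value for "sufficiently helpful"
) -> bool:
    """Return True to keep the message live at this trigger point."""
    # Too few impressions: not enough evidence to hold the message back yet.
    if stats.impressions < min_impressions:
        return True
    # Enough evidence: keep only if predicted helpfulness is high enough.
    return stats.predicted_helpful_rate >= min_helpful_rate
```

Keeping low-impression messages live by default lets a new message collect enough interactions before the filter passes judgment on it.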

The course-item-state level model is layered under a similar feedback loop that filters on the user-course-item-state level. Take a simple example: We want to know whether to send a particular message to Alan at a particular point in his learning journey. For exposition, consider a message for which we are directly collecting self-reported helpfulness from the learner. In the current implementation, there are three possibilities.

[Image: the three possibilities, labeled a), b), and c)]

We send the message if and only if a) sufficiently exceeds b) and c). Today, the feedback loop holds back about 30% of the messages and increases the ratio of helpful to non-helpful reports by 43%.
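The send rule can be written as a small comparison over the three predicted probabilities. This sketch assumes that "sufficiently exceeds" means a) must beat each of b) and c) by a fixed margin; the margin value and the function name are assumptions, and the post does not say how the comparison is actually parameterized.

```python
def should_send(p_a: float, p_b: float, p_c: float, margin: float = 0.1) -> bool:
    """Send only if outcome a) beats both b) and c) by at least `margin`.

    p_a, p_b, p_c are the model's predicted probabilities of the three
    possibilities for this learner at this trigger point. The margin is a
    hypothetical stand-in for "sufficiently exceeds".
    """
    return p_a >= max(p_b, p_c) + margin
```

For example, `should_send(0.6, 0.3, 0.1)` sends the message, while `should_send(0.4, 0.35, 0.25)` holds it back because a) does not lead b) by the required margin.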


So what’s next?

First, we’re iterating on the optimization function. The example above optimizes for positive uptake on the call to action (either reporting the message was helpful or clicking through on the recommendation). For some nudges, however, the optimization function can and should be further downstream. For example, if we invite the learner to review important material, her clicking through the link tells us only that she followed our recommendation, not whether the review material actually improved her learning outcomes. For these types of interventions, we’re extending the optimization function to incorporate downstream learning outcomes such as items completed.
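One way to picture a further-downstream optimization function is a training target that blends immediate uptake with a downstream outcome. The sketch below is purely illustrative: the linear blend, the `downstream_weight`, and the cap on items completed are assumptions, not a description of the objective we actually use.

```python
def training_target(
    clicked: bool,
    items_completed_after: int,
    downstream_weight: float = 0.5,  # hypothetical weight on the downstream signal
    max_items: int = 5,              # hypothetical cap so the target stays in [0, 1]
) -> float:
    """Blend call-to-action uptake with a downstream learning outcome."""
    uptake = 1.0 if clicked else 0.0
    # Clamp the item count to [0, max_items] and scale it to [0, 1].
    capped_items = max(0, min(items_completed_after, max_items))
    downstream = capped_items / max_items
    return (1.0 - downstream_weight) * uptake + downstream_weight * downstream
```

Under this blend, a click that is followed by no further progress earns only partial credit, so the model is pushed toward nudges that precede actual course progress.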

Second, with this fail-safe built in, we are brainstorming and launching new kinds of interventions. Since the model automatically chooses which nudges to keep running where and for whom, we can explore new ways to engage learners, confident that those that are not helpful will be efficiently held back.

Interested in applying data science to education? Coursera is hiring!

Coursera Engineering

We're changing the way the world learns!
