Storytelling in the Local Voices Network

Lily Xie
Published in Cortico
Feb 25, 2020

The Local Voices Network brings together diverse communities for in-person conversations. Storytelling is a big part of these conversations — the script that facilitators follow asks participants specifically to share personal stories, based on a theory of change that stories can elicit more understanding across difference than opinions.

As part of my fellowship with Cortico, I was interested in what stories and storytelling looked like in the setting of LVN. Where do stories appear in these conversations? What conditions make a conversation right for storytelling? I broke down my task into a few steps:

  • Define what we mean by “story”
  • Identify where stories are in LVN
  • Understand the conditions that bring about stories

What is a story?

To find stories at scale, across hundreds of conversations and hours of transcripts, I needed a concrete definition of what a story is. What aspects of a story make it different from an opinion or an anecdote?

When someone is telling a story, you feel it — the room gets quiet, you become more focused, you follow the narrator along their journey. But how do we quantify this in a systematic way?

Maggie, a student at the Laboratory for Social Machines, lent me a book called Narratology: Introduction to the Theory of Narrative, by Mieke Bal. Adapting Bal’s taxonomy of narrative for LVN, I came up with the following set of rules to define a specific type of personal story that I will focus on for my work. By these rules, stories should:

  • Describe a sequence of connected events, e.g. “When I was five, I saw the Lion King with my dad. Then we had lunch.”
  • Have happened to the speaker (i.e., uses “I/me” and is in first person)
  • Take place in the past

Building a classification model

Now that we have an idea of what we’re looking for, it’s time to find all the LVN stories! I framed this as a binary classification problem: given a new piece of a conversation, predict whether it is a story, based on a set of training data where stories have already been labelled.
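
To make that concrete, the training data is essentially a pile of speaker turns, each hand-labelled as story or not. The two rows below are invented for illustration (the first borrows the Lion King example from the definition above); they are not real LVN transcripts:

```python
# Illustrative shape of the labelled training data; invented examples,
# not real LVN transcripts. 1 = story, 0 = not a story.
labelled_turns = [
    ("When I was five, I saw the Lion King with my dad. Then we had lunch.", 1),
    ("I just think the city should invest more in public transit.", 0),
]
texts, labels = zip(*labelled_turns)
```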

I wasn’t sure when I started how easy or hard this would be. Through my research, I found a few papers that described story classification and related tasks (Gordon, Cao, and Swanson (2007); Gordon and Swanson (2009); Li and Nenkova (2015)). Based on this research and other conversations, I ended up computing the following features for each speaker turn (a rough sketch of the computation follows the list):

  • Length of a speaker turn, measured by the number of tokens
  • How often a speaker used past tense verbs
  • Pronouns: how often a speaker used first person singular (I, me), first person plural (we, us), and second person pronouns (you)
  • How often a speaker referenced named entities, real world objects such as a person’s name or the name of a town
  • Stopword ratio: the number of stopwords (the most common words of the language such as “a”, “the”, “and”), normalized by the total number of words in the speaker turn. Li and Nenkova use this feature in predicting sentence specificity, with the intuition that “specific sentences will have more details, introduced in prepositional phrases containing prepositions and determiners”.
  • Average number of tokens per sentence
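
Here’s a rough sketch of how features like these could be computed with spaCy; the pronoun lists and tag choices below are one reasonable setup, not the only one:

```python
# A rough sketch of the feature extraction, assuming spaCy; the pronoun
# lists and tag choices here are illustrative, not the only option.
import spacy

nlp = spacy.load("en_core_web_sm")

FIRST_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PLURAL = {"we", "us", "our", "ours", "ourselves"}
SECOND_PERSON = {"you", "your", "yours", "yourself"}

def turn_features(text: str) -> dict:
    doc = nlp(text)
    tokens = [t for t in doc if not t.is_space]
    n = max(len(tokens), 1)
    return {
        "token_count": len(tokens),
        # VBD is the Penn Treebank tag for past-tense verbs
        "past_tense_ratio": sum(t.tag_ == "VBD" for t in tokens) / n,
        "first_singular_ratio": sum(t.lower_ in FIRST_SINGULAR for t in tokens) / n,
        "first_plural_ratio": sum(t.lower_ in FIRST_PLURAL for t in tokens) / n,
        "second_person_ratio": sum(t.lower_ in SECOND_PERSON for t in tokens) / n,
        "named_entity_count": len(doc.ents),
        "stopword_ratio": sum(t.is_stop for t in tokens) / n,
        "avg_tokens_per_sentence": len(tokens) / max(len(list(doc.sents)), 1),
    }
```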

I built a logistic regression classifier and a random forest classifier with these features in order to predict whether a snippet contained a story. The two models had similar performance:

ROC curve and AUC score for logistic regression
Confusion matrix for logistic regression model. Each cell of the confusion matrix shows the fraction of examples with the true label (along the rows) that are assigned the predicted label (down the columns).
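
If you want to reproduce this kind of setup, here’s a minimal scikit-learn sketch. X (a matrix of the features above) and y (the hand-applied story labels) are assumed names, and the hyperparameters are illustrative rather than tuned:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardizing the features lets the logistic regression coefficients
# be read as comparable importances later on.
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
forest = RandomForestClassifier(n_estimators=200, random_state=0)

for name, model in [("logistic regression", logreg), ("random forest", forest)]:
    model.fit(X_train, y_train)
    print(name, "AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Row-normalized confusion matrix, as in the figure above: each row
# (true label) sums to 1 across the predicted-label columns.
print(confusion_matrix(y_test, logreg.predict(X_test), normalize="true"))
```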

The feature importances inferred by the model confirmed what we thought was true about stories, and provided some new insights as well.

Feature importances from logistic regression (standardized and scaled)

We knew already that, by definition, the stories we chose to look at happened in the past and happened to the speaker, so the positive importance of past tense and first person singular pronouns makes sense. This exercise also confirmed some of my hunches about stories typically containing more specific language, with a positive weight on named entities. I was surprised to see such a high weight on token count; it turns out that stories on LVN are long!
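
Reading these importances off the fitted model is a one-liner; in the sketch below, feature_names is an assumed list matching the columns of X:

```python
import pandas as pd

# Because the features were standardized, the logistic regression
# coefficients double as importances; positive values push toward "story".
coefs = logreg.named_steps["logisticregression"].coef_[0]
print(pd.Series(coefs, index=feature_names).sort_values())
```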

Here are all the stories — now what?

Zooming out a bit — we were initially interested in finding stories so we could learn about what conditions make a conversation right for storytelling. Hopefully this can help us learn something about how to foster more storytelling in LVN.

I approached this with a method similar to the story identification task, but framed differently. Again, I wanted to predict whether a snippet of conversation was a story, but instead of using features about the snippet itself, I used attributes of the window of time before the snippet. In other words, is there something about what happens leading up to a speaker turn that can help explain why that turn did or did not contain a story?

There was significantly less existing research about this type of task. I relied more on my intuition to create features about the window leading up to a snippet, which included:

  • Number of facilitator turns: how often the facilitator chimed in
  • Average number of tokens per turn: how long each speaker spoke for
  • Number of stories: how many stories were shared
  • Number of distinct speakers: how many different people spoke

I also included an “audio start time” feature, a value between 0 and 1 indicating the relative time in the conversation at which the turn occurred.
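
Here’s a sketch of how these window features might be assembled, assuming each conversation arrives as an ordered list of turn dicts; all of the key names, the five-turn window, and the duration argument are invented for illustration:

```python
def window_features(turns, i, window=5, duration_seconds=3600.0):
    """Features of the window of turns leading up to turn i."""
    prev = turns[max(i - window, 0):i]
    n = max(len(prev), 1)
    return {
        "facilitator_turns": sum(t["is_facilitator"] for t in prev),
        "avg_tokens_per_turn": sum(t["token_count"] for t in prev) / n,
        "num_stories": sum(t["is_story"] for t in prev),
        "distinct_speakers": len({t["speaker_id"] for t in prev}),
        # Relative position in the conversation, scaled to [0, 1].
        "audio_start_time": turns[i]["start_seconds"] / duration_seconds,
    }
```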

With this model, I was less interested in the accuracy of the prediction and more interested in whether we could learn anything about the window leading up to a story. This model was not as strong a predictor, but it still shed some light on the conditions that make storytelling possible.

AUC score and ROC curve
Feature importances (standardized and scaled)

What can we learn from this model? In logistic regression, a large positive weight means that a feature pushes the prediction toward the positive class (here, a story), and a large negative weight means it pushes toward the negative class.
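
Concretely, this is the same recipe as the first model, now fit on the window features; window_X, window_y, and window_feature_names are assumed to come from the sketch above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Fit the window model and print its coefficients from most negative
# to most positive; negative weights push toward "not a story".
window_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
window_model.fit(window_X, window_y)

coefs = window_model.named_steps["logisticregression"].coef_[0]
for name, c in sorted(zip(window_feature_names, coefs), key=lambda p: p[1]):
    print(f"{name:>20}: {c:+.2f}")
```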

  • Facilitator presence: the model has learned that snippets that follow a period where the facilitator is speaking frequently (e.g., the very start of a conversation, while the facilitator is introducing the project) are not as likely to contain stories.
  • # of stories: this model has learned that the presence of stories likely means that the next snippet is also a story — storytelling leads to more storytelling.
  • Audio start time: the model has learned that stories are more likely to happen near the beginning of the conversation. This makes sense, because often facilitators will ask participants to start off the conversation with a story.

Future work

Of course, there is a lot more that needs to be done before we take any action on these findings. This model is far from perfect, and future work can try to find more predictive features. Future models could also add lexical or syntactic features that can give information about how the content of a conversation itself might affect storytelling. Even with an improved model, there would need to be more work to prove a causal relationship between these features and the presence of stories in LVN.

Storytelling is such a critical part of LVN, and I am excited to have had the opportunity to work with stories as part of my fellowship. The stories that participants tell in LVN conversations are rich and full of life. By automatically identifying stories, we might choose to feature them in search results or highlight them in other ways. I hope that future work can do more to elevate and understand these stories.
