Investigating Attendee Reviews…with Data Science!

Tom Martin
Oct 15, 2018 · 8 min read

During Clojure eXchange last year, Skills Matter premiered a new mobile app created by our team of developers. This created an easier way for attendees to leave feedback, and allowed us to collect and engage with this feedback on an unprecedented scale.

As shown by the screenshot below, this is a relatively simple feedback form which asks for a score, rating both the enjoyment of the talk and how much the respondent learned. These two fields are both required. There is also the option of leaving written feedback about what they liked or disliked about the talk, as well as any other suggestions they might have.

Some ten months after the app was introduced, we’ve collected over 3,700 individual session reviews (a session is a specific talk or workshop an attendee might join), so now feels like a good time to dive into the data!

Key Figures

  • Reviews from 442 unique attendees
  • Reviews for 335 unique sessions
  • 912 reviews with written feedback
  • All figures current through 1st October 2018

Key Questions

  1. Do reviews tend to come from the same attendees?
  2. Do sessions get the same number of reviews?
  3. Do positive and negative reviews use the same words?
  4. Do attendees rate enjoyment and learning scores similarly?

Do reviews tend to come from the same attendees?

The majority of attendees left multiple reviews, although the single most common number of reviews to leave was one. As shown by the histogram of total reviews by attendee above, there is a significant positive skew with a gradual drop-off. The largest number of reviews left by any attendee was 26, a total reached by two attendees.
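As a rough illustration of how a reviews-per-attendee histogram can be built, here is a minimal sketch. The records and identifiers are made up for the example and are not the real review data.

```python
from collections import Counter

# Hypothetical review records: one (attendee_id, session_id) pair per review.
reviews = [
    ("a1", "s1"), ("a1", "s2"), ("a1", "s3"), ("a1", "s4"),
    ("a2", "s1"),
    ("a3", "s2"), ("a3", "s3"),
    ("a4", "s1"),
    ("a5", "s1"), ("a5", "s2"), ("a5", "s4"),
]

# Number of reviews left by each attendee.
reviews_per_attendee = Counter(attendee for attendee, _ in reviews)

# Histogram data: how many attendees left exactly n reviews.
histogram = Counter(reviews_per_attendee.values())

# The most common number of reviews per attendee (the mode).
most_common_count, _ = histogram.most_common(1)[0]

# Share of attendees who left more than one review.
share_multiple = sum(
    1 for n in reviews_per_attendee.values() if n > 1
) / len(reviews_per_attendee)
```

In this toy data the mode is one review, yet most attendees left more than one, which is the same shape as the pattern described above.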

Do sessions get the same number of reviews?

Can we infer anything else about sessions with more reviews than most? Perhaps these sessions elicit more of a response than most, e.g. because they are rated more favourably than average? Looking into this a bit more, there doesn’t appear to be any relationship between the average score given to a session and the number of reviews it receives.
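One way to set up this comparison is to reduce the reviews to a review count and an average score per session. The sketch below assumes a single numeric score per review on the app's 1 to 4 scale; the session names and scores are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (session_id, score), with scores on a 1-4 scale.
reviews = [
    ("keynote", 4), ("keynote", 3), ("keynote", 4), ("keynote", 2),
    ("workshop", 4), ("workshop", 4),
    ("talk", 3),
]

# Group the scores by session.
scores_by_session = defaultdict(list)
for session, score in reviews:
    scores_by_session[session].append(score)

# One (session, review count, average score) row per session,
# sorted so the most-reviewed sessions come first.
summary = sorted(
    ((s, len(v), mean(v)) for s, v in scores_by_session.items()),
    key=lambda row: row[1],
    reverse=True,
)
```

Plotting the count against the average score for each row, or computing a correlation between the two columns, would then show whether the more-reviewed sessions are also the better-rated ones.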

So what might be a factor in some sessions getting more reviews than others? Looking at the top 10 sessions by the total number of reviews, all of which had over 30 reviews, nine of those were keynotes so we could anticipate a large attendance. Perhaps the number of reviews for a given session could be taken as a proxy for the total attendance for that session. This is all the more useful as we currently can’t track exact attendance per session.

Sessions with most reviews

  1. Keynote: Architectural patterns in Building Modular Domain Models
  2. Opening Keynote: JavaScript: The Next Generation
  3. Keynote: The Magic Behind Spark
  4. Keynote: The Survival Kit of the Web and PWAs
  5. Keynote: Own the Future
  6. Keynote: Serverless Functions and Vue.js
  7. Free Monad or Tagless Final? How Not to Commit to a Monad Too Early
  8. Keynote: V8 Engine Internals For JavaScript Developers
  9. Keynote: Choose Your Animation Adventure

Do positive and negative reviews use the same words?

Of the 3,700+ reviews collected, only 1,079 (just over one quarter) have any written response, whether feedback or suggestions, and only 512 have both fields completed. Given these small numbers, it’s debatable how useful a session-by-session comparison based on written responses would be. In aggregate, however, it could still be interesting to see what positive and negative written reviews have in common and where they differ.

First things first, how to tell if a written response is negative or positive? This is a question of determining the sentiment of each review. To do this, I fed the feedback and suggestion text fields into the AWS Comprehend API, which returns a probability vector over four sentiment classes: positive, negative, mixed, and neutral. The sentiment of a given response is then the class with the highest probability. This approach can lead to a few edge cases, i.e. Comprehend will sometimes assign a response a different sentiment to the one I would have chosen, but in general it is quite robust.
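The "highest probability wins" step can be sketched as follows. The article does not show the actual client code, so this is an assumption about how it might look: the dictionary below is hand-written in the shape of a Comprehend DetectSentiment response (which could be obtained through boto3's `detect_sentiment(Text=..., LanguageCode=...)`), and the scores are made up.

```python
def top_sentiment(response: dict) -> str:
    """Return the sentiment class with the highest score, upper-cased."""
    scores = response["SentimentScore"]
    return max(scores, key=scores.get).upper()


# Hand-written example in the shape of a DetectSentiment response.
# The numbers are invented; no API call is made here.
example_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {
        "Positive": 0.91,
        "Negative": 0.02,
        "Neutral": 0.05,
        "Mixed": 0.02,
    },
}

label = top_sentiment(example_response)
```

The response already carries a `Sentiment` field with the winning class, so the helper mainly makes the selection rule explicit.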

It’s interesting to consider the reviews split into these categories as it already gives some idea of how attendees interact with the feedback app. If we look at the number of written responses by sentiment, we already see some important patterns in behaviour:

  • Written feedback responses are predominantly used to elaborate on what the attendee liked about the session.
  • Written suggestion responses are well divided between three of the four sentiment classes that Comprehend uses. Reviewers use this field to detail what they want to see improved.

Overall, I’d take this as some indication that reviewers tend to provide as much constructive and appropriate feedback as possible, and are generally well-meaning with their comments. In the case of the ‘feedback’ field, the overwhelmingly positive feedback may largely be due to the responses being primed by the text immediately above the text field: ‘One thing you liked about this talk’.

Taking this a bit further, I wanted to see what kind of language was used in responses of either a positive or negative sentiment. Firstly, I identified keywords in written reviews. To do this I used the keywords module of the Gensim library, which does a good job of ranking the words in a source text by how important each is to the meaning of the text (not just how frequently it occurs).

There is a very clear split between the kinds of keywords found in reviews with positive sentiment and those found in reviews with negative sentiment. The top-rated keywords for positive reviews were mostly adjectives such as ‘great’, ‘good’, ‘nice’, ‘interested’, and ‘useful’, whereas the top keywords for negative reviews were all nouns such as ‘talk’, ‘time’, ‘code’, and ‘examples’. A possible reason for this is that negative reviews tend to be more direct and less elaborate; as we saw above, the feedback field is predominantly used for comments of a positive sentiment.

Positive Feedback Top Keywords

  1. Good
  2. Nice
  3. Nicely
  4. Interested

Negative Feedback Top Keywords

  1. Talks
  2. Time
  3. Timing
  4. Coding

Nouns and noun phrases typically used in written reviews tended to be the same regardless of sentiment. Typically these were words related to ‘talk’, ‘presenter’, ‘speaker’, and ‘time’ i.e. general words related to presentations.

A different way of looking at the written feedback is to look at the responses as a word cloud. This is not directly comparable with the keywords discussed above as it only considers overall word frequency not whether they are keywords as per the Gensim keywords module. Nonetheless, it’s a good overview of the kind of language we can expect from different reviews.

Word Clouds by Sentiment

The word cloud for all written responses with positive sentiment …
… and the corresponding word cloud for all written responses with negative sentiment
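The word frequencies behind a word cloud like these can be sketched with a simple counter. This is only an illustration of frequency counting, not the tooling used for the images; the stop-word list and the example responses are made up.

```python
import re
from collections import Counter

# A small, hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "and", "was", "of", "to", "but", "it", "too"}


def word_frequencies(responses):
    """Count lower-cased words across responses, skipping stop words."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Made-up responses for illustration only.
positive = ["Great talk and great examples", "The speaker was great"]
frequencies = word_frequencies(positive)
```

A word cloud then sizes each word in proportion to its count, which is why it reflects raw frequency rather than keyword importance.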

Do attendees rate enjoyment and learning scores similarly?

Key for emojis used to rate learning from and enjoyment of talk

To illustrate the relationship between the two scores, the following heatmap shows the co-occurrence of learning and enjoyment scores. This follows the key above, where ‘1’ corresponds to the worst score a session can receive and ‘4’ to the best. The number in each square is the number of reviews with that combination of scores, e.g. 8 reviews gave a session a learning score of 1 and an enjoyment score of 4.
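The grid of counts behind such a heatmap is a 4×4 co-occurrence table. A minimal sketch, using invented score pairs rather than the real reviews:

```python
# Hypothetical (learning, enjoyment) score pairs, each on the 1-4 scale.
pairs = [(4, 4), (4, 4), (3, 4), (3, 3), (1, 4), (2, 2)]

# counts[l - 1][e - 1] is the number of reviews with
# learning score l and enjoyment score e.
counts = [[0] * 4 for _ in range(4)]
for learning, enjoyment in pairs:
    counts[learning - 1][enjoyment - 1] += 1

# Reviews on the diagonal gave both questions the same score.
on_diagonal = sum(counts[i][i] for i in range(4))
```

Each cell of `counts` corresponds to one square of the heatmap, and a heavy diagonal is the visual sign that the two scores tend to move together.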

The immediate takeaways from this chart are that sessions are often favourably rated for both learning and enjoyment, and there is some indication of correlation between the two scores, as the highest numbers are along the middle diagonal.

To put more numbers to this last point, I calculated the Spearman rank-order correlation coefficient between these two scores using the SciPy library. This returned a correlation coefficient of 0.69 and a p-value of approximately 0. This means that there is a moderate positive correlation between the two scores, which is statistically significant. In other words, reviews with a higher learning score tend to have a higher enjoyment score, and vice versa. This meshes well with my initial assumption. It is also interesting to note that sessions have generally received slightly higher enjoyment scores than learning scores, whatever the reason might be.
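To show what the Spearman coefficient measures, here is a dependency-free sketch: rank both score lists (sharing ranks between ties, which matters on a 1 to 4 scale) and take the Pearson correlation of the ranks. This is not the code used for the figure above, which came from SciPy (`scipy.stats.spearmanr` also returns a p-value); the helper names and the data are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for i in order[start:end + 1]:
            ranks[i] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores for five reviews.
learning = [1, 2, 3, 4, 4]
enjoyment = [2, 3, 3, 4, 4]
rho = spearman(learning, enjoyment)
```

The sketch does not handle the degenerate case where every score in a list is identical, since the rank variance is then zero.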

Overall, I’m really pleased with the results we’ve obtained so far through the reviews app. The figures seem to suggest that we’ve done a good job of capturing how attendees feel about sessions and already allow us to track some important patterns in attendee behaviour. This will only get more important, as we come to increasingly rely on data-driven approaches to improve the attendee experience.

For more on the why behind data-driven organizations, check out my post from earlier this year.

It is now critical to learn from this data and iterate on this initial feedback app. For instance, we’ve seen that a big challenge for us is the relatively small number of attendees leaving feedback, which probably means we’re missing out on some important feedback.

Looking forward, I’m most excited to see how we can begin to compare the same conference series across multiple years and see if we are making the changes our attendees want to see. Very soon we will be able to compare between Scala eXchange and Clojure eXchange.

