Visualizing Toxicity in Twitter Conversations

Peter Beshai
Aug 17, 2018 · 11 min read

In late June, Deb approached me to ask if I would be interested in doing some visualization work for a presentation he’d be giving at Twitter’s all-hands event. The talk was to be about toxicity in Twitter conversations, and would showcase the conversation analysis work being done in the lab, particularly around classifying replies as toxic or not.

At this point, any time Deb asks if I want to visua — I stop him right there and say yes. Yes please. When can I get started?

The project started with an initial design discussion in which we all agreed it would be cool to somehow visualize twitter conversations as natural looking trees, where replies form branches and the more toxic the reply, the more withered the branch would look. At this point, I had no idea how I’d even approach rendering a withered tree, but it sounded like a fun experiment so I said I’d look into it and do my best.

One of the benefits of working at Cortico in the MIT Media Lab is that a ton of interesting people are always coming through. As luck would have it, a few days after our meeting I was showing Eugénie, a seasoned visual FX artist, around the lab. She was intrigued by the idea of the conversation trees and thought I might enjoy using Houdini to try to model it, suggesting its procedural nature would appeal to my engineering background. A day later, she was even so kind as to send over a sample file demonstrating an approach that might work.

Eugénie’s sample render of a partially withered tree

It looked pretty sweet, and I was convinced it would be fun to learn Houdini for this project. My naive past self was exceptionally optimistic about this despite the deadline being just a few weeks away. Thankfully MIT offers access to online courses, and there were a couple on Houdini that I completed to get a crude, basic understanding of the application.

With Eugénie’s sample file, the online courses, and the Houdini help pages at the ready, I began my journey modeling the toxicity of conversations on Twitter. I broke it down into three steps to ease my anxiety:

  1. Figure out how to lay out the tree
  2. Figure out how to render a tree in Houdini based on real data
  3. Create a video showing multiple conversation trees growing

Lay out the Conversation Tree

At Cortico and our group in the Media Lab, it’s pretty common to spend time thinking about, looking at, or rendering graphs. Every time we’ve rendered graphs in 3D, however, we’ve used a force-directed layout. Given that this data had a bit more structure (it’s a tree), I thought I’d look around and see if there were any other interesting 3D layouts to try.

The first thing I found was a paper by an old professor of mine, Tamara! Sweet. It was about hyperbolic spaces, which was a bit beyond what I was interested in, but it included a great figure demonstrating a 3D cone layout that looked very promising. Thanks Tamara!

Tree Cone Layout (figure taken from Tamara’s paper)

Next I had to figure out how to apply the layout to our data. I found a very extensive Python graphing library with algorithms for laying out graphs in dozens of ways, including the cone layout. Jackpot. With a little finesse I was able to take our conversation tree data and output JSON files that included the nodes, links, 3D positions, and toxicity scores. With these pre-computed files, all I’d need to do was get Houdini to render objects at the positions specified in the data.
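The core idea of a cone layout is simple enough to sketch by hand: each node’s children fan out on a circle below it, with each subtree confined to its own angular wedge. Here’s a rough, hypothetical Python version of the layout-plus-JSON-export step; the node names, toxicity scores, and scaling constants are all made up for illustration, and the real project used a library’s implementation rather than this hand-rolled one:

```python
import json
import math

def cone_layout(tree, node, depth=0, angle0=0.0, angle1=2 * math.pi,
                level_height=1.0, positions=None):
    """Place each node on a cone: children fan out on a circle below
    their parent, each subtree confined to its own angular wedge."""
    if positions is None:
        positions = {}
    mid = (angle0 + angle1) / 2
    radius = depth * 0.5               # the cone widens as we descend
    positions[node] = (radius * math.cos(mid),
                       -depth * level_height,  # grow downward in y
                       radius * math.sin(mid))
    children = tree.get(node, [])
    if children:
        step = (angle1 - angle0) / len(children)
        for i, child in enumerate(children):
            cone_layout(tree, child, depth + 1,
                        angle0 + i * step, angle0 + (i + 1) * step,
                        level_height, positions)
    return positions

# Toy conversation: a root tweet, two replies, two replies-to-a-reply.
tree = {"root": ["a", "b"], "a": ["c", "d"]}
toxicity = {"root": 0.1, "a": 0.8, "b": 0.2, "c": 0.9, "d": 0.3}

positions = cone_layout(tree, "root")
payload = {
    "nodes": [{"id": n, "pos": positions[n], "toxicity": toxicity[n]}
              for n in positions],
    "links": [{"source": p, "target": c}
              for p, kids in tree.items() for c in kids],
}
as_json = json.dumps(payload)  # the pre-computed file Houdini reads back in
```

Pre-computing everything into JSON like this keeps the renderer dumb: Houdini never needs to know how the layout was produced, only where each node goes.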

My first approach was to bang my head on the wall several times, but it turned out that a more effective means of moving forward was reading documentation and trying things in Houdini. Houdini has great Python scripting support, so I was able to write a bit of code that generated geometry (points and lines) based on the data. Thanks to Eugénie’s example, I was able to figure out how to use the toxicity parameter in the data to color the nodes. I really owe her a beer… or five hundred.

I didn’t want to settle on the cone layout without first trying a few others, so I generated several different JSON files with the layouts and began to explore how they’d look. Here green corresponded to “healthy” and red to “toxic”. Each tweet was represented as a sphere, and a line was drawn between tweets to indicate that one had replied to the other.

Various layouts attempted for rendering the conversation trees

I brought these screenshots back to the team to get their input, and we decided that we should move forward with the cone layout turned upside-down so it looked like it was growing out of the ground.

Visually model toxicity

With a layout settled on, it was time to figure out how to actually make it look cool. The main goal was to represent toxicity as a withered, dead part of the plant. Knowing next to nothing about Houdini, I went blindly by the names of the operators. The first two I tried were “mountain” and “point jitter”.

Applying mountain to the edges and point jitter to the nodes

Unsurprisingly, something wasn’t quite right, so I began considering alternate possibilities: maybe we could try something less organic. What if the “trees” were made of metal and glowed? Sounded cool to me, but mostly I think I was just drunk on the idea of being able to easily add materials to my geometry.

One of the many bad ideas I tried.

Seemed a bit too much like Christmas tree ornaments for my liking. (What was I even thinking, a metal tree?) At this point, I went back to the basics and focused on showing a withered plant. Step 1 was to invert the cone, color the edges by toxicity using a more natural color, and try and use a small amount of jitter on the toxic edges to make them look a bit more degenerate.

Trust me, that trunk is structurally sound.

I felt like a real genius when I decided to add weird fruit to represent the tweets, but on closer inspection, I was still just myself.

An epiphany struck when I realized I was going to want to animate these plants growing into the scene. I had no idea how to do this, but I stumbled onto a tutorial that was similar enough. My current geometry, built from individual edges, wasn’t going to work with the approach shown there: it required longer lines, not disparate edges. In the Python code, I modified my geometry to consist of lines running from the root tweet to each of the leaves, and was very pleased with the results, which had a more organic look.
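The restructuring itself is a straightforward tree walk: instead of emitting one segment per reply, emit one polyline per root-to-leaf path. A small sketch (the dict-based tree encoding here is hypothetical, not the project’s actual data format):

```python
def root_to_leaf_paths(tree, root="root"):
    """Enumerate every path from the root tweet down to a leaf reply.
    Each path becomes one continuous polyline instead of disjoint edges."""
    children = tree.get(root, [])
    if not children:
        return [[root]]           # a leaf is a path of length one
    paths = []
    for child in children:
        for sub in root_to_leaf_paths(tree, child):
            paths.append([root] + sub)
    return paths

tree = {"root": ["a", "b"], "a": ["c"]}
print(root_to_leaf_paths(tree))  # → [['root', 'a', 'c'], ['root', 'b']]
```

Note that nodes shared by several paths (the root most of all) get duplicated across polylines, which is fine for rendering purposes.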

After I switched to using longer lines, the model looked a bit more natural
My personal favorite — the party tree! Toxicity encoded as a rainbow.

As I learned more about how the Houdini operators worked, I was able to make something a bit more respectable. I was feeling pretty good; it was starting to look a bit like a withered plant if you squinted. And I was even able to animate it growing!

Things finally started to come together!
Grow little plant, grow!

Not wanting to be strictly productive, I got a bit cheeky and thought I’d give a nod to my Canadian roots by adding maple leaves to represent the tweets themselves. (Hey, it’s better than weirdo fruit, right?) Unfortunately, I realized the manipulations I had applied to the edges to make them look more organic had caused them to drift away from their original node positions, so some leaves floated orphaned in midair. Oops! I’m sure there should be some way to solve this, but I couldn’t figure it out for the life of me. Instead, I adjusted the way I jittered the edges, tapering the jitter so the start and end points stayed fixed and the nodes remained connected.
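The tapering trick can be sketched outside Houdini too: scale the jitter amplitude by a factor that is exactly zero at both ends of the line, so the endpoints never move. A hypothetical version (the amplitude and taper curve are stand-ins, not the values I actually used):

```python
import random

def tapered_jitter(points, amplitude=0.2, seed=0):
    """Displace a polyline's interior points while pinning its endpoints.
    The parabolic taper 4*t*(1-t) is exactly zero at t=0 and t=1,
    so the start and end nodes stay where the data put them."""
    rng = random.Random(seed)
    out = []
    last = len(points) - 1
    for i, (x, y, z) in enumerate(points):
        t = i / last
        scale = amplitude * 4 * t * (1 - t)
        out.append((x + rng.uniform(-scale, scale),
                    y + rng.uniform(-scale, scale),
                    z + rng.uniform(-scale, scale)))
    return out

line = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
wobbly = tapered_jitter(line)  # endpoints untouched, middle displaced
```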

I tried using maple leaves to represent the nodes, but oops! They no longer connected.

To my complete surprise, the rest of the team was less enthusiastic about the leaves, so I returned to using simple spheres as the nodes. Around this time I decided I’d add in a little wavy trunk because it just felt right. The longer the reply chains were, the bigger the trunk was.

It was time to try to duplicate this approach for the other Twitter conversations we wanted to visualize. Given the procedural nature of Houdini, this turned out to be pretty straightforward (praise be, Houdini, praise be). However, upon reviewing the group of conversations together, we decided that the “natural” withered look wasn’t visually distinguishable enough from a distance.

Three different twitter conversations with varying toxicity and depth

I tried several different color variations, but settled on a bright orange as the signifier of toxicity. It turns out halfway between green and orange is vomit, so that worked out pretty well. The colors came out a bit subdued in the renders, but I was hoping I’d be able to brighten them up in post-production (something I had just learned existed).
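That vomit midpoint falls straight out of linear interpolation between the two endpoint colors. A quick illustration (the exact RGB values I used aren’t recorded here, so these are stand-ins):

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB colors, t in [0, 1]."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))

GREEN = (0.0, 1.0, 0.0)   # healthy
ORANGE = (1.0, 0.5, 0.0)  # toxic
halfway = lerp_color(GREEN, ORANGE, 0.5)  # (0.5, 0.75, 0.0): murky olive
```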

With less natural colors, the differences between the trees are more obvious… right? Ok, I know it’s pretty dark.

You’ll notice the trees also changed shape in the above screenshot. We decided to resample the data used in the trees so they were more representative of the relative sizes of the actual Twitter conversations. Sampling was necessary because of the extreme volume of single replies to the original tweets, which made comparing differences in the visualization more challenging.
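One way to do this kind of resampling, sketched hypothetically (this is not the team’s actual sampling code): cap the number of childless direct replies while keeping every reply that started a deeper thread.

```python
import random

def downsample_shallow_replies(tree, root="root", cap=10, seed=1):
    """Keep at most `cap` childless replies directly under the root,
    but preserve every reply that spawned a deeper thread."""
    rng = random.Random(seed)
    threaded, shallow = [], []
    for child in tree[root]:
        (threaded if tree.get(child) else shallow).append(child)
    if len(shallow) > cap:
        shallow = rng.sample(shallow, cap)
    pruned = dict(tree)            # leave the original tree untouched
    pruned[root] = threaded + shallow
    return pruned

# 50 one-off replies plus two threads; only 10 of the one-offs survive.
tree = {"root": ["t1", "t2"] + [f"leaf{i}" for i in range(50)],
        "t1": ["t1a"], "t2": ["t2a"]}
pruned = downsample_shallow_replies(tree)
```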

Lights, Camera, Action!

We were running out of time and the modeling was good enough, so it was time to try and create the video that followed the narrative of the presentation. The team had selected tweets and replies that we wanted to focus on and it was my job to navigate the camera around the scene in coordination with the growth of the plants to match what they wanted to talk about.

Having never really animated a camera or thought much about directing a movie beyond taking casual photographs, I found this pretty challenging. Kudos to directors everywhere. There were mystical comments on the internet about rigging objects up to the camera and animating them to make it work, but it was mostly gibberish to me. I persevered and did what I could, but boy, iterating on these things was really slow.

It turns out that rendering with a CPU-based ray tracer takes its sweet time, and I was doing just that. The final result had 528 frames in it, resulting in a 17-second-long video. (Yes, I did think to myself: all this work just for 17 seconds, have I lost my mind?) I had no idea it would take over 24 hours to render all the frames! This meant I settled a bit more than I would’ve liked, since iterating took so long and I didn’t want my poor laptop to melt.

I brightened up the colors and superimposed the tweets in After Effects, and voilà! The video was complete.

The End Result

Here’s the final video, brightened up and with tweets superimposed!

The 17-second raw video is also available.

Crowd Feedback

When the event finally took place, Twitter was tweeting about it with #OneTeam. I was able to find some tweets and see photographs people were taking from the audience. At least a few people seemed to enjoy the vis, so I was pleased!

Wrapping Up

In the end, it was a lot of work, but a ton of fun learning all the new technologies I needed to make this unique visualization. The team was pleased with the result, and it seemed to support our message conveying the variety of conversations taking place on Twitter in terms of depth and toxicity. Perhaps next time I’ll look into trying a GPU renderer to speed up the process!

Work at Cortico

If you like working on exciting, creative projects, enjoy things like machine learning, natural language processing and data visualization, and want to help make the public sphere a bit healthier, come work with us at Cortico! We’re hiring several positions and would love to hear from you.

Thanks for reading! If you have any questions or comments, feel free to reach out to me here or on Twitter.
