A text analysis of why planes crash

Hannah Yan Han
Apr 15, 2017 · 2 min read

Today I’ll explore text mining to examine reasons for plane crashes.

Most frequent words in plane crash descriptions

The frequent terms reveal useful keywords related to causes, like pilot, fire and fog. They also include another set of words describing flight phases: take-off, landing, en-route.
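As a rough illustration of the term-frequency step, here is a minimal Python sketch. It is not the code behind this post. The crash summaries and the stop-word list are made up for the example, and the names `SUMMARIES`, `STOP_WORDS`, `tokenize` and `term_frequencies` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical crash summaries, invented for illustration only
SUMMARIES = [
    "The pilot lost control during take-off in heavy fog.",
    "An engine fire broke out en route and the crew made an emergency landing.",
    "The aircraft crashed on landing in fog after the pilot descended too low.",
]

# A tiny hand-picked stop-word list (an assumption; a real analysis would use a fuller one)
STOP_WORDS = {"the", "a", "an", "and", "in", "on", "of", "to",
              "after", "during", "out", "too", "made"}


def tokenize(text):
    """Lowercase the text and split it into words, keeping hyphenated terms such as 'take-off'."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def term_frequencies(docs):
    """Count how often each non-stop-word appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(word for word in tokenize(doc) if word not in STOP_WORDS)
    return counts


freqs = term_frequencies(SUMMARIES)
print(freqs.most_common(5))
```

Even on three toy sentences, cause words (pilot, fog) and flight-phase words (landing) come out on top, which is the pattern described above.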

Looking at the words associated with the most frequent keywords gives a clearer picture. Since a word can be associated with several other words, the associations are best viewed as a network.
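One simple way to build such a network is to count how often two words appear in the same crash description. The sketch below does that in Python. It is an assumption-laden stand-in, not the method used for the graph in this post (correlation-based association is another common choice). The sample summaries, the stop-word list and the names `cooccurrence_network` and `neighbours` are all hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical crash summaries, invented for illustration only
SUMMARIES = [
    "Engine fire caused by overheating shortly after take-off.",
    "Crashed short of the runway in poor visibility and fog.",
    "Fuel starvation led to engine failure; fire followed the impact.",
    "Poor visibility and low altitude on approach.",
]

# A tiny hand-picked stop-word list (an assumption)
STOP_WORDS = {"the", "of", "in", "and", "to", "on", "by", "after",
              "led", "caused", "shortly", "followed", "short"}


def tokens(text):
    """Return the set of distinct non-stop-words in one description."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def cooccurrence_network(docs):
    """Build {word: {neighbour: number of descriptions containing both}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        for a, b in combinations(sorted(tokens(doc)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


network = cooccurrence_network(SUMMARIES)


def neighbours(word, min_count=1):
    """List a word's neighbours, strongest link first, dropping weak links."""
    links = [(n, c) for n, c in network[word].items() if c >= min_count]
    return sorted(links, key=lambda link: (-link[1], link[0]))


print(neighbours("fire"))
print(neighbours("poor", min_count=2))
```

Each word is a node and each co-occurrence count is an edge weight, so raising `min_count` thins the graph down to the strongest associations.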

Now we can make sense of the keywords with an interactive graph: fire from overheating, fuel starvation, poor visibility, poor judgement, low altitude, runway overruns, navigational errors, and so on.

What I learnt today:

Using the wordcloud2 package, one can turn term frequencies into a word cloud masked by custom images or letters. However, one needs to be mindful of font size, or the top words may not show up.
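To see why font size matters, here is a rough Python sketch of the arithmetic. It is not how wordcloud2 works internally. It assumes a linear mapping from counts to font sizes and a crude width estimate (characters × font size × 0.6). The counts and the names `font_sizes` and `words_too_wide` are hypothetical.

```python
# Hypothetical term counts, invented for illustration only
FREQS = {"aircraft": 120, "pilot": 60, "fog": 20}


def font_sizes(freqs, min_size=10, max_size=80):
    """Linearly map term counts onto font sizes between min_size and max_size."""
    lo, hi = min(freqs.values()), max(freqs.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {word: min_size + (count - lo) * (max_size - min_size) / span
            for word, count in freqs.items()}


def words_too_wide(sizes, canvas_width, char_ratio=0.6):
    """Return words whose estimated width exceeds the canvas.

    The width estimate len(word) * font_size * char_ratio is a rough assumption.
    """
    return [word for word, size in sizes.items()
            if len(word) * size * char_ratio > canvas_width]


sizes = font_sizes(FREQS)
print(words_too_wide(sizes, canvas_width=300))

smaller = font_sizes(FREQS, max_size=60)
print(words_too_wide(smaller, canvas_width=300))
```

With the larger maximum size, the most frequent word is the one that no longer fits on the canvas. Lowering the maximum brings it back.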

This post is a continuation of the previous post on the world’s most dangerous airlines.

This is #day14 of my #100dayprojects on data science and visual storytelling. The full code is on my GitHub. Thanks for reading, and feel free to send me ideas and suggestions.
