A text analysis of why planes crash

Hannah Yan Han
2 min read · Apr 15, 2017


Today I’ll explore text mining to examine reasons for plane crashes.

Most frequent words in plane crash descriptions

The frequent terms reveal useful keywords related to causes, such as pilot, fire and fog. They also include another set of words describing flight phases: take-off, landing, en route.
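To make the term-frequency step concrete, here is a minimal Python sketch. It is not the code behind this post: the sample summaries, the stop word list and the function name `term_frequencies` are all made up for illustration, and a real analysis would use a fuller stop word list and the actual crash descriptions.

```python
import re
from collections import Counter

# Hypothetical sample summaries; the real dataset is not reproduced here.
SUMMARIES = [
    "The pilot lost control in heavy fog during landing.",
    "Engine fire shortly after take-off; the pilot attempted to return.",
    "Crashed en route in fog after a navigational error.",
]

# A tiny illustrative stop word list (an assumption, not the list used in the post).
STOP_WORDS = {"the", "in", "a", "to", "after", "during", "and", "of"}


def term_frequencies(docs, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all documents."""
    counts = Counter()
    for doc in docs:
        # Lowercase, then keep alphabetic tokens; hyphenated words such as
        # "take-off" are kept as a single token.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", doc.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


freqs = term_frequencies(SUMMARIES)
print(freqs.most_common(5))
```

Keeping hyphenated tokens together is a deliberate choice here, so that flight phases like take-off are counted as one term rather than split into two.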

Looking at the words most strongly associated with the frequent keywords gives a clearer picture. Because one word can be associated with several others, the associations are best viewed as a network.
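One simple way to sketch this idea in Python is to count how often two words appear in the same description and treat each frequent pair as an edge in the network. This is a stand-in for illustration only: it uses plain co-occurrence counts rather than whatever association measure the original analysis used, and the sample data, stop words and the names `tokenize` and `cooccurrence_edges` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample summaries; the real dataset is not reproduced here.
SUMMARIES = [
    "Fire from overheating in the cargo hold.",
    "Fuel starvation after a navigational error.",
    "Poor visibility and low altitude on approach.",
    "Poor visibility; the aircraft overran the runway.",
    "Fuel starvation led to engine failure.",
]

# A tiny illustrative stop word list (an assumption).
STOP_WORDS = {"the", "in", "a", "to", "from", "after", "and", "on"}


def tokenize(doc):
    """Return the set of distinct non-stop words in one description."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", doc.lower())
    return set(tokens) - STOP_WORDS


def cooccurrence_edges(docs, min_count=2):
    """Build (word_a, word_b, count) edges for pairs sharing >= min_count documents."""
    pair_counts = Counter()
    for doc in docs:
        # Sort so each unordered pair is always stored in the same order.
        terms = sorted(tokenize(doc))
        pair_counts.update(combinations(terms, 2))
    return [(a, b, n) for (a, b), n in pair_counts.items() if n >= min_count]


edges = cooccurrence_edges(SUMMARIES)

# Adjacency view of the network: each word maps to the words it is linked to.
network = {}
for a, b, _ in edges:
    network.setdefault(a, set()).add(b)
    network.setdefault(b, set()).add(a)

print(sorted(edges))
```

The resulting edge list can be passed to any graph plotting tool; raising `min_count` prunes weak links so the network stays readable.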

With the interactive graph, the keywords start to make sense: fire from overheating, fuel starvation, poor visibility, poor judgement, low altitude, runway overruns, navigational errors, and so on.

What I learnt today:

Using the wordcloud2 package, one can turn term frequencies into a word cloud masked by a custom image or letters. However, one needs to be mindful of the font size, or the top words may not show up.

This is #day14 of my #100dayprojects on data science and visual storytelling, and a continuation of the previous post that listed the world’s most dangerous commercial carriers. The full code is on my GitHub. Thanks for reading; ideas and suggestions are welcome.



Hannah Yan Han

#100daysproject on data science and visual storytelling ✈️🗺️ https://www.hannahyan.com/