Harnessing Machine Learning for Disaster Tweet Identification

Nov 27, 2023

GitHub Link: https://github.com/omkararade

Web Application Link: https://tweeterdisasterprediction-cr9v76rgtymn62s9uptsh8.streamlit.app/

In today’s hyperconnected world, social media platforms like Twitter have emerged as invaluable sources of real-time information, particularly during times of crisis. In the face of natural disasters, Twitter serves as a lifeline for individuals seeking assistance and organizations coordinating relief efforts. However, the sheer volume of tweets can make it challenging to identify those that are genuinely relevant to a disaster. This is where machine learning steps in.

Machine Learning to the Rescue

Recognizing the potential of machine learning to address this challenge, I set out to build a model that classifies tweets as disaster or non-disaster. I compiled a dataset of over 10,000 tweets, each labeled as either disaster or non-disaster. The dataset covers a wide range of natural disasters, including hurricanes, earthquakes, and floods.

Exploring Diverse Algorithms

Through testing, I evaluated a variety of machine learning algorithms: GaussianNB, MultinomialNB, BernoulliNB, LogisticRegression, SVC, DecisionTreeClassifier, KNeighborsClassifier, RandomForestClassifier, and AdaBoostClassifier. I assessed each one for its suitability for the task.
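A comparison like this can be sketched with scikit-learn. The snippet below is a minimal illustration, not the project's actual code: the toy tweets, the binary bag-of-words features, the 3-fold cross-validation, and the three-classifier subset are all assumptions made to keep the example short.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy tweets (1 = disaster, 0 = non-disaster); not the real dataset.
tweets = [
    "Earthquake shakes the city, buildings collapsed",
    "Flood waters rising fast, families evacuated",
    "Hurricane makes landfall, thousands without power",
    "Wildfire spreads near homes, residents evacuated",
    "Massive earthquake damage reported downtown",
    "Flood warning issued as river overflows",
    "This new album is fire, love it",
    "My exam was a total disaster lol",
    "Flooded with emails after the holiday",
    "The concert last night blew me away",
    "Traffic is a nightmare this morning",
    "Cooking dinner and watching a movie tonight",
]
labels = [1] * 6 + [0] * 6

# A subset of the classifiers named above, chosen only for brevity.
candidates = {
    "BernoulliNB": BernoulliNB(),
    "MultinomialNB": MultinomialNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

results = {}
for name, clf in candidates.items():
    # Each candidate gets the same binary word-presence features.
    pipe = make_pipeline(CountVectorizer(binary=True), clf)
    scores = cross_val_score(pipe, tweets, labels, cv=3, scoring="accuracy")
    results[name] = scores.mean()

# Print the candidates from best to worst mean accuracy.
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

Wrapping the vectorizer and the classifier in one pipeline means the vocabulary is rebuilt inside each fold, so no information from the held-out tweets leaks into training.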

BernoulliNB Emerges as the Champion

Of the algorithms I tested, Bernoulli Naïve Bayes (BernoulliNB) performed best, achieving an accuracy of 81% on the test dataset. It handled the task well and delivered consistent results.

Refining Performance through Hyperparameter Tuning

To further enhance the model’s performance, I employed hyperparameter tuning. Hyperparameters are the settings of a machine learning algorithm that are chosen before training rather than learned from the data, and tuning means searching for the values that maximize the model’s effectiveness.

In this instance, I tuned the hyperparameters of the Bernoulli Naïve Bayes algorithm, including the alpha parameter. The alpha parameter controls how strongly the probability estimates are smoothed. By fine-tuning alpha, I raised the model’s accuracy to 82%.
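One common way to tune alpha is a cross-validated grid search. The sketch below assumes scikit-learn's GridSearchCV, a hypothetical grid of alpha values, and a toy set of tweets; the search method and values used in the actual project are not stated here. In scikit-learn's BernoulliNB, alpha is the additive (Laplace/Lidstone) smoothing parameter, which keeps a word never seen in a class from getting a probability of exactly zero.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

# Hypothetical toy tweets (1 = disaster, 0 = non-disaster); not the real dataset.
tweets = [
    "Earthquake shakes the city, buildings collapsed",
    "Flood waters rising fast, families evacuated",
    "Hurricane makes landfall, thousands without power",
    "Wildfire spreads near homes, residents evacuated",
    "Massive earthquake damage reported downtown",
    "Flood warning issued as river overflows",
    "This new album is fire, love it",
    "My exam was a total disaster lol",
    "Flooded with emails after the holiday",
    "The concert last night blew me away",
    "Traffic is a nightmare this morning",
    "Cooking dinner and watching a movie tonight",
]
labels = [1] * 6 + [0] * 6

pipe = Pipeline([
    ("vec", CountVectorizer(binary=True)),  # word present / absent features
    ("nb", BernoulliNB()),
])

# Assumed candidate values; "nb__alpha" targets the alpha of the "nb" step.
param_grid = {"nb__alpha": [0.01, 0.1, 0.5, 1.0, 2.0]}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(tweets, labels)

print("best alpha:", search.best_params_["nb__alpha"])
print(f"best cross-validated accuracy: {search.best_score_:.2f}")
```

On a real dataset, the tuned model would then be scored once on a held-out test set that played no part in the search.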

Conclusion: A Testament to Machine Learning’s Power

The findings of this project demonstrate the effectiveness of machine learning in classifying disaster tweets. My Bernoulli Naïve Bayes model achieved an accuracy of 82%, well above random chance. This indicates that the model classifies roughly four out of five tweets correctly.
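Accuracy is only one view of a classifier's quality. Precision (how many tweets flagged as disasters really are disasters) and recall (how many real disaster tweets are caught) can differ noticeably from it. The snippet below works through the arithmetic on made-up labels and predictions; the numbers are purely illustrative and are not results from this project.

```python
# Hypothetical labels and predictions (1 = disaster, 0 = non-disaster).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))

# Count the four cells of the confusion matrix.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # disasters caught
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # disasters missed
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly ignored

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# prints: accuracy=0.70 precision=0.67 recall=0.50
```

In this toy case the model is right 70% of the time yet misses half of the real disaster tweets, which is why it helps to report all three numbers. scikit-learn provides accuracy_score, precision_score, and recall_score in sklearn.metrics for the same calculations.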

This work has the potential to make a tangible impact by helping emergency responders coordinate their efforts during disasters. I hope it inspires others to develop even more sophisticated machine learning models for disaster classification.

Additional Notes

Beyond the work described in this blog post, I explored other approaches to disaster classification, including deep learning algorithms and sentiment analysis. I also considered incorporating additional data sources, such as news articles and other social media posts, alongside Twitter data.

I believe there is still considerable room for improvement in disaster classification. I plan to continue my research in this area and develop new approaches that identify disaster tweets with greater accuracy.

For more context

https://omkararade.github.io/Omkar_Portfolio/

Written by Omkar Arade