The DAP Journey: Predicting Oil Prices

Churning words into figures through Sentiment Analysis

In this Medium series, BIA shares the reflections of our Data Associates as they look back on their academic exploration. This post features an analytics project on sentiment analysis, led by Wayne and Zexel and supervised by Vikram and Randall.


“A billion hours ago, modern Homo sapiens emerged. A billion minutes ago, Christianity began. A billion seconds ago, the IBM PC was released. A billion Google searches ago… was this morning.”

This quote by Hal Varian, the chief economist at Google, sparked my interest in making sense of big data.

Under the guidance of our mentors, Vikram and Randall, the team, comprising Zexel and me (Wayne), explored how to predict oil prices from news data. As oil prices are largely influenced by perceptions of demand and supply, we wanted to gauge market sentiment through sentiment analysis of news articles.

We explored the use of NLTK and TextBlob to process the text data and classify each news item as positive, neutral or negative. We then used linear_model from the sklearn library to relate changes in sentiment to changes in oil prices.
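As a rough illustration of the labelling step, here is a minimal sketch, not our exact project code. It assumes TextBlob's polarity score (which ranges from -1 to 1) is the input, and the 0.05 cut-off and both function names are hypothetical choices made for this example.

```python
def polarity_to_label(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an assumed cut-off: scores within +/- threshold
    of zero are treated as neutral.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def label_text(text: str, threshold: float = 0.05) -> str:
    """Score one piece of text with TextBlob and return its label."""
    # Imported here so polarity_to_label can be reused without TextBlob installed.
    from textblob import TextBlob

    polarity = TextBlob(text).sentiment.polarity
    return polarity_to_label(polarity, threshold)
```

Calling label_text on each headline or article body would give one label per news item. Widening the threshold pushes more items into the neutral bucket, which is one simple way to dampen the effect of mildly worded articles.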

Positive, Negative and Neutral sentiments

We also used the standard numpy and pandas libraries for further statistical analysis, looking for the times of day with the greatest volatility in prices. The aim was to find the best time to enter the markets and so improve the accuracy of the model.
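One way to look for the most volatile time of day is sketched below. It assumes intraday prices held in a pandas Series with a DatetimeIndex, and it measures volatility as the standard deviation of log returns per hour. Both the measure and the function name are assumptions for illustration, not necessarily what we used.

```python
import numpy as np
import pandas as pd


def volatility_by_hour(prices: pd.Series) -> pd.Series:
    """Standard deviation of log returns, grouped by hour of day.

    `prices` is assumed to be indexed by timestamp (DatetimeIndex).
    """
    # Log returns between consecutive observations; the first value is NaN.
    log_returns = np.log(prices).diff().dropna()
    # Group each return by the hour in which it was observed.
    return log_returns.groupby(log_returns.index.hour).std()
```

The hour with the largest value, for example via volatility_by_hour(prices).idxmax(), is then a candidate for the time of day to focus on.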

linear_model
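The regression step could look roughly like the sketch below. It assumes a daily table with hypothetical columns named sentiment (mean daily polarity) and price (daily oil price), and fits an ordinary least squares model on day-to-day changes. Treat it as an illustration under those assumptions rather than the model we actually trained.

```python
import pandas as pd
from sklearn import linear_model


def fit_sentiment_model(daily: pd.DataFrame) -> linear_model.LinearRegression:
    """Regress the daily change in price on the daily change in sentiment.

    `daily` is assumed to have one row per day with 'sentiment' and
    'price' columns (hypothetical names).
    """
    # Work on day-to-day changes rather than levels.
    changes = daily[["sentiment", "price"]].diff().dropna()
    model = linear_model.LinearRegression()
    model.fit(changes[["sentiment"]], changes["price"])
    return model
```

After fitting, model.coef_[0] gives the estimated price change per unit change in sentiment, and model.score on the same inputs gives the R² of the fit.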

However, we soon ran into several challenges in our project. For one, the number of articles written on a given day is inconsistent, which left gaps in our analysis and made the model we built less reliable.
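To make the gap problem concrete, here is a small sketch that counts articles per calendar day and flags days with too few of them. It assumes a table with hypothetical columns published (a datetime) and polarity (a score per article); the min_articles parameter is likewise an assumption for this example.

```python
import pandas as pd


def daily_sentiment_with_gaps(articles: pd.DataFrame, min_articles: int = 1) -> pd.DataFrame:
    """Aggregate article polarity per calendar day and flag thin days.

    `articles` is assumed to have a datetime 'published' column and a
    numeric 'polarity' column (hypothetical names).
    """
    by_day = (
        articles.set_index("published")["polarity"]
        .resample("D")  # one row per calendar day, including days with no articles
        .agg(["count", "mean"])
        .rename(columns={"count": "n_articles", "mean": "mean_polarity"})
    )
    # Days with no articles have n_articles == 0 and a NaN mean_polarity.
    by_day["enough_data"] = by_day["n_articles"] >= min_articles
    return by_day
```

Days where enough_data is False are the gaps: they could be dropped, forward-filled or modelled separately, and each choice affects the model differently.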

While one would expect news sentiments to be largely neutral, we found that differences in writing style unfortunately gave us varying and inconsistent results. The model's training data comprised articles from many authors, who expressed different levels of opinion on the same subject matter, and this yielded extreme results.

Furthermore, because our Twitter API access gave us only a limited amount of data, we relied solely on news sentiments for this model. Had we had social media data, it could have improved our model, as tweets are generally more opinionated than news articles.

While this project might not have yielded the best model to predict oil prices, I was able to deepen my knowledge of Natural Language Processing tools in the form of various Python libraries, and learnt how to apply them to prices, which are a form of time series data. This project also served as a stepping stone for us, as we explored libraries to handle different types of data and perform financial and social media analysis, areas we are hugely passionate about and will continue to explore in the future.


Why DAP? Let’s hear it from our Data Associates

Wayne Lim, SIS

“The opportunity that the Data Associate Programme (DAP) offers, namely a structured curriculum that teaches key concepts in Data Analytics and Data Science and the chance to meet peers with similar interests, is what made me join DAP in an instant.”

Zexel Lew, SOE

“My journey in Data Science started when I attended a Python for Data Science course at Hackwagon Academy in 2018. Through the course, I picked up technical skills and grew my passion for Data Science. To my curious and investigative mind, it seemed a no-brainer to continue pursuing it after the course, to expand my knowledge by embarking on more projects and learning alongside others.

The Data Associate Programme offered this opportunity. Knowing that it was a safe co-learning environment made up of a humble and selfless community, I joined the DAP to learn alongside many others who shared the same passion. By exchanging ideas, projects and resources with Data Associates and mentors alike, it was a stepping stone that would thoroughly enrich my Analytics journey. Throughout the programme, it was characteristic to conquer challenging material alongside my peers, and to this date, I’m extremely grateful to have joined the DAP, to have met truly great people and to have explored the world of Data Science alongside them.”

SMUBIA

To be a community for people passionate about data to learn, grow and connect with one another.

SMU Business Intelligence & Analytics Club

