The DAP Journey: Predicting Oil Prices
Churning words into figures through Sentiment Analysis
In this Medium series, BIA shares the reflections of our Data Associates as they look back on their academic exploration. This post features an analytics project on sentiment analysis, directed by Wayne and Zexel and supervised by Vikram and Randall.
“A billion hours ago, modern homo sapiens emerged. A billion minutes ago, Christianity began. A billion seconds ago, the IBM PC was released. A billion Google searches ago… was this morning.”
This quote by Hal Varian, the chief economist at Google, sparked my interest in making sense of big data.
Under the guidance of our mentors, Vikram and Randall, the team, comprising Zexel and me (Wayne), explored how we could predict oil prices from news data. As oil prices are largely influenced by perceptions of demand and supply, we wanted to gauge market sentiment through sentiment analysis of news articles.
We used NLTK and TextBlob to process the text data, classifying the sentiment of each news article as positive, neutral or negative. We then used linear_model from the sklearn library to relate changes in sentiment to changes in oil prices.
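To make the approach above concrete, here is a minimal sketch in plain Python. It is not the project's actual code: a toy word list stands in for the polarity score that TextBlob provides, a hand-written least-squares fit stands in for sklearn's linear_model, and the headlines, price changes, word lists and threshold are all made up for illustration. For simplicity it regresses the daily price change on the sentiment score itself rather than on the change in sentiment.

```python
# Illustrative only: a toy word list stands in for TextBlob's polarity score,
# and a hand-rolled least-squares fit stands in for sklearn's linear_model.
POSITIVE = {"surge", "growth", "recovery", "strong", "rally"}
NEGATIVE = {"glut", "slump", "weak", "fears", "oversupply"}


def polarity(text):
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def label(score, threshold=0.1):
    """Bucket a polarity score into positive, neutral or negative."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical daily headlines and the same day's oil price change (in %).
headlines = [
    "Oil rally continues on strong demand growth",
    "Supply glut fears deepen oil slump",
    "OPEC meets in Vienna this week",
    "Weak demand fears weigh on crude",
]
price_changes = [1.8, -2.1, 0.2, -0.9]

scores = [polarity(h) for h in headlines]
labels = [label(s) for s in scores]
slope, intercept = fit_line(scores, price_changes)
```

A positive slope here would suggest that more positive headlines go with rising prices, but with four invented data points it only demonstrates the mechanics, not a real relationship.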
We also used the standard numpy and pandas libraries for further statistical analysis. By identifying the times of day with the greatest price volatility, we hoped to find the optimal time to enter the market and so improve the accuracy of the model.
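The time-of-day analysis described above can be sketched as follows. This is an assumption about how such an analysis might look, not the team's code: it uses only the Python standard library instead of numpy and pandas, the hourly prices are invented, and volatility is taken to mean the standard deviation of hour-on-hour percentage changes.

```python
from collections import defaultdict
from datetime import datetime
from statistics import pstdev

# Hypothetical hourly prices as (timestamp, price); real data would span many more days.
ticks = [
    ("2019-03-04 09:00", 56.10), ("2019-03-04 10:00", 56.40),
    ("2019-03-04 11:00", 56.35), ("2019-03-04 12:00", 57.20),
    ("2019-03-05 09:00", 57.00), ("2019-03-05 10:00", 56.80),
    ("2019-03-05 11:00", 56.85), ("2019-03-05 12:00", 55.90),
]

returns_by_hour = defaultdict(list)
previous = None  # (timestamp, price) of the preceding tick
for stamp, price in ticks:
    moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
    # Only compare ticks within the same day, so overnight moves are ignored.
    if previous is not None and previous[0].date() == moment.date():
        pct_change = (price - previous[1]) / previous[1] * 100
        returns_by_hour[moment.hour].append(pct_change)
    previous = (moment, price)

# Volatility per hour of day: standard deviation of that hour's percentage changes.
volatility = {hour: pstdev(changes) for hour, changes in returns_by_hour.items()}
most_volatile_hour = max(volatility, key=volatility.get)
```

With pandas, the same idea would typically be expressed with pct_change and a groupby on the hour, which scales far better to real intraday data.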
However, we soon ran into several challenges. For one, the number of articles published each day was inconsistent, which left gaps in our analysis and made the model we built less effective.
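A small sketch shows how such gaps appear once article-level sentiment is rolled up to one value per day. The dates and scores below are hypothetical, and averaging the scores per day is an assumption made for illustration rather than the method used in the project.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical (publication date, polarity score) pairs.
articles = [
    (date(2019, 3, 4), 0.4), (date(2019, 3, 4), -0.2), (date(2019, 3, 4), 0.1),
    (date(2019, 3, 5), 0.3),
    (date(2019, 3, 7), -0.5), (date(2019, 3, 7), -0.1),
]

scores_by_day = defaultdict(list)
for day, score in articles:
    scores_by_day[day].append(score)

# Every calendar day between the first and last article, including empty ones.
start, end = min(scores_by_day), max(scores_by_day)
calendar = [start + timedelta(days=i) for i in range((end - start).days + 1)]

article_counts = {day: len(scores_by_day.get(day, [])) for day in calendar}
daily_sentiment = {
    day: (sum(scores_by_day[day]) / len(scores_by_day[day]) if day in scores_by_day else None)
    for day in calendar
}
# Days with no articles have no sentiment value to pair with that day's price move.
gap_days = [day for day in calendar if article_counts[day] == 0]
```

Here one day rests on three articles, another on a single article, and one has none at all, so the daily series is both uneven in reliability and incomplete.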
While one would expect news sentiment to be largely neutral, we found that differences in writing style unfortunately gave us varying and inconsistent results. Because the model’s training data comprised articles from many authors, who expressed varying degrees of opinion on the same subject matter, the model yielded extreme results.
Furthermore, because our Twitter API access gave us only a limited amount of data, we relied solely on news sentiment for this model. Had we had social media data, it could have improved the model’s performance, as tweets are generally more opinionated than news articles.
While this project might not have yielded the best model to predict oil prices, I was able to deepen my knowledge of Natural Language Processing tools in the form of various Python libraries, and learnt how to relate their output to prices, which are a form of time series data. This project also served as a stepping stone for us, as we explored libraries to handle different types of data and perform financial and social media analysis, areas that we are hugely passionate about and will continue to explore in the future.
Why DAP? Let’s hear it from our Data Associates
Wayne Lim, SIS
“The opportunity that the Data Associate Programme (DAP) offers, namely a structured curriculum teaching key concepts in Data Analytics and Data Science and the chance to meet peers with similar interests, is what made me join DAP in an instant.”
Zexel Lew, SOE
“My journey in Data Science started when I attended a Python for Data Science course at Hackwagon Academy in 2018. Through the course, I picked up technical skills and grew my passion for Data Science. To my curious and investigative mind, it seemed a no-brainer to continue pursuing it after the course, to expand my knowledge by embarking on more projects and learning alongside others.
The Data Associate Programme offered this opportunity. Knowing that it was a safe co-learning environment made up of a humble and selfless community, I joined the DAP to learn alongside many others who shared the same passion. Exchanging ideas, projects and resources with Data Associates and mentors alike was a stepping stone that would thoroughly enrich my Analytics journey. Throughout the programme, I conquered challenging material alongside my peers, and to this day, I’m extremely grateful to have joined the DAP, to have met truly great people and to have explored the world of Data Science alongside them.”