My first Deep Learning Project

Mateus Nobre Santos
4 min read · Aug 17, 2020


Looking for patterns in the stock market using sentiment analysis

TL;DR:

  • It’s pretty difficult to notice patterns in the stock market. Correlating it with sentiment analysis of financial news can give us a glimpse of those patterns.
  • Transfer learning is an extremely useful technique when you have limited time and resources.
  • Interpreting the model’s results and translating them into a real-life context is an important and difficult part of the process.
  • Working alone is great, but on a team you can do bigger things and pick up a bit of knowledge from each person.

Before everything…

Huge SHOUTOUT to my teammates:

João Sarmento (https://www.linkedin.com/in/joaolrsarmento/)

Kenji Yamane (https://github.com/kenji-yamane)

Huge THANKS to my professor:

Marcos Ricardo Omena de Albuquerque Máximo (Escavador)

Problem:

The original idea of the project changed a lot, but the core idea was to identify patterns in the stock market and correlate them with sentiment analysis of news.

The success criterion was to generate valuable insights and get a real-life proof of concept (a $0.10 profit, for example).

Solution:

We used an Amazon Reviews dataset to fine-tune a BERT model for sentiment analysis, applying transfer learning to our problem.
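To show the core idea of transfer learning, here is a toy sketch (this is not our actual BERT code; the word-count "encoder" below is just a stand-in for a pretrained model): keep the pretrained encoder frozen and train only a small classifier head on the new task.

```python
import numpy as np

POSITIVE = {"love", "great"}
NEGATIVE = {"terrible", "awful"}

def frozen_encoder(texts):
    """Stand-in for a pretrained encoder (e.g. BERT). In the real project
    this would produce BERT embeddings; here it just counts sentiment words."""
    return np.array([[sum(w in POSITIVE for w in t.split()),
                      sum(w in NEGATIVE for w in t.split())] for t in texts],
                    dtype=float)

def train_head(X, y, lr=0.5, epochs=2000):
    """Train only a logistic-regression 'head'; the encoder stays frozen."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        grad = p - y                             # cross-entropy gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

texts = ["love this great product", "terrible awful purchase",
         "great value love it", "awful quality terrible fit"]
labels = np.array([1.0, 0.0, 1.0, 0.0])          # 1 = positive review

X = frozen_encoder(texts)                        # frozen: no gradients here
w, b = train_head(X, labels)                     # only the head is trained
preds = (X @ w + b > 0).astype(int)
print(preds.tolist())  # [1, 0, 1, 0]
```

The same split applies with BERT: the pretrained layers carry general language knowledge, and only a small task-specific layer needs to learn from your (possibly small) dataset.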

Financial news about Netflix, Amazon, Facebook, and other companies was collected using the Stock News API, and the historical data came from the Nasdaq website.

Using a “degree of positivity” metric, which we created to make the patterns visible, we merged the sentiment analysis of the financial news with the Nasdaq data.
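As a rough sketch of how such a metric can be built (the numbers and layout below are made up, not our real data): average the model’s sentiment scores per trading day, then join that series with the market data by date.

```python
from statistics import mean

# Each news item: (date, sentiment score in [0, 1] from the model).
news = [("2020-07-01", 0.9), ("2020-07-01", 0.4),
        ("2020-07-02", 0.2), ("2020-07-02", 0.3), ("2020-07-02", 0.7)]

# Nasdaq historical data: date -> (close price, traded volume).
market = {"2020-07-01": (455.0, 5_200_000),
          "2020-07-02": (448.1, 7_900_000)}

# "Degree of positivity": mean sentiment of that day's headlines.
by_day = {}
for date, score in news:
    by_day.setdefault(date, []).append(score)
positivity = {date: mean(scores) for date, scores in by_day.items()}

# Merge with the market data on the date, ready to plot as two time series.
merged = [(d, positivity[d], *market[d])
          for d in sorted(market) if d in positivity]
for row in merged:
    print(row)
```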

The graphs used to analyze the results are time series of our metric plotted against the fluctuations of the market.
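A minimal sketch of one of those graphs, with made-up numbers, using matplotlib’s twin-axis support to put the metric and the market data on the same time axis:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no window needed
import matplotlib.pyplot as plt

dates = ["07-01", "07-02", "07-03", "07-04"]   # hypothetical trading days
positivity = [0.65, 0.40, 0.55, 0.70]          # hypothetical metric values
volume = [5.2, 7.9, 6.1, 4.8]                  # traded volume (millions)

fig, ax1 = plt.subplots()
ax1.plot(dates, positivity, color="tab:blue", label="degree of positivity")
ax1.set_ylabel("degree of positivity")

ax2 = ax1.twinx()                              # second y-axis for volume
ax2.bar(dates, volume, alpha=0.3, color="tab:orange")
ax2.set_ylabel("traded volume (millions)")

fig.savefig("nflx_positivity.png")
```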

Model:

BERT (Bidirectional Encoder Representations from Transformers) is designed to learn deeply bidirectional representations by considering both left and right context. We applied transfer learning from the BERT base model (109M parameters) to our problem.

Transformer structure (I’m still trying to understand it)

Results:

Fine-tuning result after 2 epochs

The model improved a lot after 2 epochs, so our fear of getting stuck and not getting results with transfer learning was unfounded. We left everything to the last minute, so we had very little time to train the model.

Daily traded volume of NFLX (Netflix) compared to the ‘degree of positivity’ of the market on the day.
Value of NFLX shares (Netflix) compared to the ‘degree of positivity’ of the market on the day.

These two graphs show some of our results. The most promising one was the daily traded volume, but even that can’t tell us the full story of the market.

In the volume graph, we can notice that bigger fluctuations in the ‘degree of positivity’ are sometimes related to bigger traded volumes. But the data can’t explain the big spikes in traded volume.

What I Did:

I was in charge of:

  • Getting the model’s results on stock news.
  • Interpreting the results.
  • Crossing the Nasdaq data with the sentiment analysis in a meaningful way and providing insights.
  • Getting real-time predictions with our model using Gradio.
  • Orchestrating it all as a project, so anyone can use our work and reproduce our results (that part really takes time).
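For the Gradio part, the shape of that kind of demo looks like the sketch below (the `predict` function here is a dummy stand-in for the real fine-tuned model, and the title is made up):

```python
def predict(headline: str) -> dict:
    """Dummy stand-in for the fine-tuned BERT model: in the real project,
    this would tokenize the headline and run it through the model."""
    positive = 0.8 if "up" in headline.lower() else 0.2  # toy rule
    return {"positive": positive, "negative": 1.0 - positive}

def build_demo():
    """Wire the prediction function into a Gradio web UI."""
    import gradio as gr  # pip install gradio
    return gr.Interface(fn=predict, inputs="text", outputs="label",
                        title="Stock news sentiment")

# build_demo().launch() serves a local web page with a textbox where
# anyone can paste a headline and see the predicted sentiment live.
```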

I got the trained model (model.h5) from Kenji and the news data from João, and did my job on top of their work.

What we could’ve done better:

  • A decent statistical analysis of the stock market data, trying to understand days with large variation, minima and maxima, and seasonality.
  • Using a faster version of the model, such as TinyBERT, to be able to process more data.
  • Fine-tuning on financial news data to get a better transfer from BERT.
  • Analyzing the results more carefully, trying different visualizations and different ways to build the ‘degree of positivity’ metric.
  • Training more.

Context about the project:

I enrolled in CT-213 (Artificial Intelligence in Mobile Robotics) that semester. With no experience in Python, but a lot of willpower and help from other colleagues (like the ones who developed this project with me), I struggled and learned a lot!

From gradient descent to genetic algorithms and reinforcement learning, I learned about the possibilities and limitations of neural networks, the math behind them, and how to apply them to toy problems.

Source Code and Project Report:

See the source code on GitHub
