Sentiment Analysis for Stock Price Prediction Using Bloomberg, Python, and Machine Learning

Em Fhal · Published in CodePubCast · Jul 29, 2021 · 12 min read

In this tutorial, I will explain, in the simplest terms, why an investor in the capital markets with access to a Bloomberg account should understand Python and its machine learning libraries. The world is constantly adopting technology across all sectors, so people working in finance need to stay on top of it and help modernize the sector by employing programming tools such as Python.

Python is one of the easiest and most user-friendly languages available to a private or professional investor. It is built around importing and using different packages.

In this tutorial, I will go over everything an investor with no previous experience in Python or machine learning needs in order to gain a basic understanding of these programming tools.

Before we begin I will first introduce myself. My name is Emmanuel and I have been working as an Investment Portfolio Manager for the past 6 years. I graduated with an MBA in finance and a bachelor’s degree in banking and the capital market. In addition, I recently completed a BSc degree in computers and I am currently studying for an MSc degree in computer science.

The first part of the tutorial focuses on installing Python via Anaconda. In the second part, we will export data from Bloomberg to a CSV file and use it as a dataset. The third part works with Keras (TensorFlow) for machine and deep learning. Finally, in the fourth part, we will deal with visualizing the data.

So let's begin.

Part A: Installation

A.1. Anaconda Installation

First, go to the Anaconda website. In the main menu, select Products and then "Individual Edition", or go directly to https://www.anaconda.com/products/individual, and click the green "Download" button on the left, as seen in the image below.

Figure A.1.1.— Download the Anaconda Individual Edition according to your Operating System

Next, download the software. When the download has finished, double-click the installation file and click "Next >". Continue with the installation until you reach the screen below (Figure A.1.2.). It is important that the second option is checked (registering Anaconda as the default Python).

Figure A.1.2.

After clicking "Install" you will need to wait around 5 minutes for the installation to complete. It is important not to close the window before you see the next screen (Figure A.1.3.). Leave the two checkboxes ticked and click "Finish".

Figure A.1.3.

You have now successfully installed Anaconda. To start using Anaconda you can double click on the desktop icon (or just search “Anaconda” in your search bar) and run it.

Figure A.1.4. — Anaconda Navigator Home Screen

A.2. Working with Jupyter Notebook

Welcome to Anaconda. Now it's time to start working with Python in Jupyter Notebook; all you need to do is click the Launch button under Notebook (Jupyter).

Figure A.2.1. — "C:\Users\Administrator" Jupyter Notebook files screen

You're probably wondering what exactly Jupyter is and how it relates to working with Bloomberg and machine learning. Jupyter is in fact your gateway to using Python in the most user-friendly way. It works with .ipynb files built from code cells and output cells, which makes it very convenient and easy to use (more so than Excel, in my humble opinion).

Figure A.2.2.

Let's start by demonstrating how to work with a notebook. First, create a new notebook by clicking the "New" button on the right side of the screen (Figure A.2.2.), then select Python 3.

The image below illustrates what a Python notebook looks like (Figure A.2.3.). All you have to do is type a line of code into the cell and then click the Run button. Don't panic, I will explain next how to write a simple piece of code.

Figure A.2.3. — Jupyter Notebook screen

For convenience, I will demonstrate with the most common line of code, print("hello world"). First, type print("hello world") into the cell and press "Run". You will then get an output line below the cell. It is important to know that not every input has an output.

Figure A.2.4. — Print “Hello world”
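Since Figure A.2.4 is a screenshot, here is the same cell as plain text, with its output shown as a comment:

print("hello world")
# Output printed below the cell:
# hello world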

The order of the lines of code matters. Usually, the first line of code is used for importing the different libraries. I will explain how this works in the next section.

A.3. Installing Bloomberg Python add-on

If you've got this far, it means you have already survived the complicated part. All that is left is to install the Bloomberg plugin through the notebook (it must be installed on a computer with an active, working Bloomberg Terminal).

As I mentioned in the previous section, we usually start the notebook by importing libraries. In our case, we will work with the “xbbg” library, which is the Bloomberg Python library.

Upon trying to import the Bloomberg library you will get the following error.

Figure A.3.1 — import blp xbbg before the installation
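For readers who cannot see the screenshot, the failing cell is simply the import below; before the installation, Python cannot find the package (the exact error text may vary slightly between versions):

from xbbg import blp
# Before installing xbbg this fails with something like:
# ModuleNotFoundError: No module named 'xbbg'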

Please follow the below instructions to successfully install the Bloomberg xbbg library.

Figure A.3.2 — https://pypi.org/project/xbbg/ Requirements and Installation screen

Now, to install the package from your notebook, all you have to do is add an exclamation mark and copy the installation instructions from the website (https://pypi.org/project/xbbg/), as seen in the image below (Figure A.3.3).

Figure A.3.3 — xbbg installation screen from the Jupyter notebook
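As plain text, the installation cell amounts to running pip from inside the notebook; the exclamation mark tells Jupyter to execute the line as a shell command:

# Install the xbbg Bloomberg wrapper from PyPI (https://pypi.org/project/xbbg/)
!pip install xbbg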

Now, if you run the xbbg import cell again (Figure A.3.1), you will get a different error, this time about the missing blpapi package.

Figure A.3.4 — import xbbg error output before blpapi installation

Finally, to complete the installation you need to install the blpapi package from the API Library page on the Bloomberg Professional Services website (of course, from a computer with an active, working Bloomberg Terminal). You can find the installation instructions for the Python API library at the bottom of the page.

Figure A.3.5 —Installation instructions for the API Python

Copy the line highlighted in yellow in the image above and paste it into your notebook. You will need to make two changes: first, delete everything before the word pip; second, insert an exclamation mark at the beginning. As demonstrated in the following screen:

Figure A.3.6 — blpapi installation
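After those two changes, the cell looks roughly like the sketch below. The index URL here is an assumption copied from Bloomberg's published instructions and may change over time, so always paste the current line from the API Library page itself:

# Install blpapi from Bloomberg's own package index
# (verify the --index-url against the API Library page; it may have changed)
!pip install --index-url=https://blpapi.bloomberg.com/repository/releases/python/simple/ blpapi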

You now need to check that everything is in order and working properly. Type "from xbbg import blp" into a notebook cell; if you do not receive an error, you are ready to work and can move on to the next part.

Figure A.3.7

Part B: Creating a dataset

B.1. Working with Bloomberg Python add-on

Before creating a more advanced dataset in a CSV file, we will play around a bit with xbbg's options. As with the Bloomberg add-in in Excel, you have bdp and bdh functions here, where the bdh function is for historical data (the one we will work with most).

We will start with a new notebook (or just delete all the installation lines from before) and write the lines of code shown in the screenshot below (these lines appear as examples on the xbbg page linked earlier).

Figure B.1.1 — bdp and bdh functions

As you can see, the first line of code imports the libraries you will work with. The second line uses the bdp function, which is for reference (live) data: the first parameter is the ticker name (as we know it in Bloomberg) and the second parameter is the list of fields we want to display (in our example, the name of the company and its sector).

The third line uses the bdh function, which returns the historical data we will use in later parts. In this example we work with the SSE Composite Index, an index of stocks traded on the Shanghai Stock Exchange. The second parameter, as with bdp, is the list of fields. The third and fourth parameters are the start date and end date of the data we want to import (it is important to use the yyyy-mm-dd format). The fifth parameter is the resolution of the data ('W' means weekly; you can change it to a daily or monthly level, and so on). The sixth parameter, Fill='P', means that if there is no data available on a given day, the previous available value is used automatically.
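For readers who cannot see Figure B.1.1, here is a sketch of those three lines, adapted from the examples on the xbbg project page (the exact ticker, fields, and dates in the screenshot may differ):

from xbbg import blp   # first line: import the Bloomberg library

# bdp: reference ("live") data - company name and sector for a single ticker
blp.bdp(tickers='AAPL US Equity', flds=['Security_Name', 'GICS_Sector_Name'])

# bdh: historical data - SSE Composite Index, weekly bars, previous value carried forward
blp.bdh(
    tickers='SHCOMP Index', flds=['high', 'low', 'last_price'],
    start_date='2019-06-20', end_date='2021-06-20',
    Per='W', Fill='P', Days='A',
)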

Bonus: to get acquainted with the codes of the various fields in Bloomberg (and there are lots of them), I recommend working with the FLDS screen on the Bloomberg Terminal. There you can easily find your way around Bloomberg's particularly large field list.

Figure B.1.2 — FLDS <GO> (Data Field finder)

B.2. Export and import dataset

In the next screen, I will demonstrate the format you will use to import data from Bloomberg and export it to a CSV file, which we will later use for machine learning. After that, I will demonstrate how to forecast a particular stock price using certain fields (such as the daily high and low price, volume, etc.).

I will use the basic example of creating a dataset based on the euro/dollar exchange rate from 2019-06-20 to 2021-06-20.

The value of the first row is the ticker(s) of the securities we are interested in importing. If we want to work with several tickers, we separate them with a comma and a space (tickers = "EURUSD Curncy, USDMXN Curncy, USDCAD Curncy"). In this example, we will use only the ticker of the euro/dollar exchange rate.

In the second row, we enter all the fields we are interested in importing. In this example, we will simply import the last price and the daily low and high prices (in the previous part I explained how to find the field codes using the FLDS screen). In the next two lines, I simply entered the start and end dates.

Figure B.2.1
import os
import pandas as pd
from xbbg import blp

tickers = "EURUSD Curncy"
commands = "PX_LAST, PX_HIGH, PX_LOW"
start = '2019-06-20'
end = '2021-06-20'

# Build a file name from the request parameters so each query gets its own CSV
filename = ''.join((tickers, "+", commands, "+", start, "+", end))

if os.path.exists(filename + '.csv'):
    # Reuse the dataset we already saved
    data = pd.read_csv(filename + '.csv', header=[0, 1],
                       parse_dates=True, index_col=0)
else:
    # Otherwise pull the data from Bloomberg and save it as a CSV file
    data = blp.bdh(tickers=tickers.split(', '), flds=commands.split(', '),
                   start_date=start, end_date=end,
                   Per='D', Fill='P', Days='A', adjust='all')
    data.to_csv(filename + '.csv')

data.head()

Essentially, this is what the lines of code do: first, we define what we want to import; then we build a file name from those same parameters.

The second part is the conditional: it checks whether we have already created a dataset for the same request, and if not, it automatically imports the data from Bloomberg using the bdh function and saves it as a CSV file.

Figure B.2.2 — data.head() = Viewing the first 5 lines

To display the dataset, we use .head(), which returns only the first 5 rows. If we want to see the last 5 rows, we use .tail() instead.

In addition, we can open the CSV file (the dataset) we created directly from its folder (the path is usually C:\Users followed by the current username folder).

Figure B.2.3

B.3. Dataset stock price prediction

Everything we have done so far has been in preparation for this moment. Now it is time to work. The purpose of this exercise is to predict a specific stock price (in our case I randomly chose Apple Inc.*) from additional fields. I chose to use sentiment analysis, specifically Twitter sentiment data on the stock: is a tweet positive, negative, or just neutral? We will get this data through Bloomberg.

* It is important to note that nothing in this article is an investment recommendation; the examples are for demonstration purposes only.

Figure B.3.1
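Figure B.3.1 shows how the dataset request for this exercise looks. As a rough sketch of such a request (the Twitter sentiment field mnemonics below are my own illustrative assumptions, not necessarily the ones in the figure; look up the exact codes on the FLDS screen before using them):

# Field mnemonics are illustrative - verify each one on the FLDS screen
tickers = "AAPL US Equity"
commands = "PX_LAST, PX_HIGH, PX_LOW, PX_VOLUME, TWITTER_SENTIMENT_DAILY_AVG, TWITTER_PUBLICATION_COUNT"
start = '2019-06-20'
end = '2021-06-20'

data = blp.bdh(tickers=tickers.split(', '), flds=commands.split(', '),
               start_date=start, end_date=end,
               Per='D', Fill='P', Days='A', adjust='all')
data.to_csv(''.join((tickers, "+", commands, "+", start, "+", end)) + '.csv')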

Part C: Machine Learning

I have added all the following code to my GitHub account, where it is available for everyone to copy.

In machine learning, we use the dataset we defined in the previous section. In it, we define a target variable (called y) and separate it from the rest of the variables, called X.

In the second stage, after separating the target variable from the other variables, we divide the rows into training data and test data. In our case, I decided to give 90% of the data to training and the rest to testing.

C.1. Keras TensorFlow installation

In this project we will work with Keras, which is embedded within Google's TensorFlow. Before you start, you will need to install it. In most cases, the following line of code will suffice; if not, there are detailed instructions on the TensorFlow website.

!pip install tensorflow

C.2. Time Series Prediction with LSTM and GRU

First, I will explain what an LSTM (Long Short-Term Memory network) actually is. LSTM is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. LSTMs were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs, and they can process not only single data points (such as images) but entire sequences of data.

In this example, I will use LSTM and GRU models to predict the Apple stock price from different parameters (Twitter sentiment data and more).

Figure C.2.1 — Splitting the dataset into X and y and into training and test data
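Figure C.2.1 shows the split. A minimal sketch, assuming the dataset from part B with PX_LAST as the target column and keeping the chronological order (this is a time series, so the rows are not shuffled):

# Assumes `data` is the DataFrame loaded in part B and `ticker` is its single ticker
ticker = 'AAPL US Equity'
y = data[ticker]['PX_LAST']                    # target variable: last price
X = data[ticker].drop(columns=['PX_LAST'])     # features: all remaining columns

split = int(len(X) * 0.9)                      # 90% of the rows for training
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]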

After we split the dataset into training and test sets, it's time to create the LSTM model:

Figure C.2.2 — LSTM model
Figure C.2.3 — LSTM model summary (layers, output shape, and params)
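The screenshots show the author's exact model; since they are images, here is a minimal Keras sketch of an LSTM regressor of the same kind. The layer sizes, epochs, and the decision to feed one timestep per sample are my own assumptions, not necessarily what Figure C.2.2 uses, and feature scaling (e.g. MinMaxScaler) is omitted for brevity although it usually helps recurrent models:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Keras recurrent layers expect 3-D input: (samples, timesteps, features); here timesteps = 1
X_train_3d = np.expand_dims(X_train.values, axis=1)
X_test_3d = np.expand_dims(X_test.values, axis=1)

lstm_model = Sequential([
    LSTM(64, input_shape=(X_train_3d.shape[1], X_train_3d.shape[2])),
    Dense(1),                                  # single output: the predicted PX_LAST
])
lstm_model.compile(optimizer='adam', loss='mse')
lstm_model.fit(X_train_3d, y_train.values, epochs=50, batch_size=16, verbose=0)
lstm_model.summary()                           # compare with the summary in Figure C.2.3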

Next, we create a GRU model.

Figure C.2.4 — GRU model
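A sketch of the GRU model under the same assumptions; the only change from the LSTM sketch is the type of recurrent layer:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

# Reuses X_train_3d and y_train from the LSTM sketch above
gru_model = Sequential([
    GRU(64, input_shape=(X_train_3d.shape[1], X_train_3d.shape[2])),
    Dense(1),
])
gru_model.compile(optimizer='adam', loss='mse')
gru_model.fit(X_train_3d, y_train.values, epochs=50, batch_size=16, verbose=0)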

After creating our two machine learning models, it's time to visualize and understand the different results.

Part D: Visualization

In this part, I will present only the results of this guide, followed by a short summary in which we compare the performance of technical analysis versus machine learning.


D.1. Technical Indicators

Figure D.1.1 — Technical Indicators
Figure D.1.2 — Last price, Upper and Lower band
Figure D.1.3. — Moving Average
Figure D.1.4. — Summary of Technical indicators
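The indicators in these figures (a moving average and upper/lower bands around the last price) can be reproduced with plain pandas. A sketch, assuming the DataFrame from part B and a 7-day window; the exact band definitions used in the screenshots may differ:

window = 7
last = data[ticker]['PX_LAST']                 # the target price series

moving_avg = last.rolling(window).mean()       # 7-day simple moving average
rolling_std = last.rolling(window).std()
upper_band = moving_avg + 2 * rolling_std      # Bollinger-style upper band
lower_band = moving_avg - 2 * rolling_std      # Bollinger-style lower band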

D.2. Machine learning Forecasting

LSTM Forecast

Figure D.2.1. — LSTM Forecast

GRU Forecast

Figure D.2.2. — GRU Forecast
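As a sketch of how such forecast charts can be drawn with matplotlib, assuming the models and test split from the part C sketches:

import matplotlib.pyplot as plt

lstm_pred = lstm_model.predict(X_test_3d).flatten()
gru_pred = gru_model.predict(X_test_3d).flatten()

plt.figure(figsize=(10, 5))
plt.plot(y_test.index, y_test.values, label='Actual PX_LAST')
plt.plot(y_test.index, lstm_pred, label='LSTM forecast')
plt.plot(y_test.index, gru_pred, label='GRU forecast')
plt.legend()
plt.title('Apple Inc. - actual price vs. model forecasts')
plt.show()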

D.3. Summary

Figure D.3. — Comparison of the forecasts

First, in order to understand what is presented in the above pictures, I will explain what MSE and MAE are.

  • Mean Squared Error (MSE)* represents the average of the squared differences between the original and predicted values in the dataset. It measures the variance of the residuals.
  • Mean Absolute Error (MAE) represents the average of the absolute differences between the actual and predicted values in the dataset. It measures the average magnitude of the residuals.

*Credit: https://medium.com/analytics-vidhya/mae-mse-rmse-coefficient-of-determination-adjusted-r-squared-which-metric-is-better-cd0326a5697e
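Both metrics can be computed in one line each with scikit-learn; a sketch, assuming the predictions from the part D.2 sketch:

from sklearn.metrics import mean_squared_error, mean_absolute_error

print('LSTM MSE:', mean_squared_error(y_test, lstm_pred),
      'MAE:', mean_absolute_error(y_test, lstm_pred))
print('GRU  MSE:', mean_squared_error(y_test, gru_pred),
      'MAE:', mean_absolute_error(y_test, gru_pred))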

Comparing the different forecasts by MSE and MAE, the best result is the 7-day moving average, and the second best is the GRU prediction.

Keep in mind, however, that the 7-day moving average is a technical indicator that works directly on our target column (PX_LAST), whereas the machine learning method (the GRU forecast) keeps the target column separate and predicts it from the other columns.

Once you feel confident forecasting stock prices in relation to their sentiment, as derived from Twitter tweets, my job here is done.

Was it helpful? I would love to hear your feedback: e@fhal.org / linkedin.com

https://github.com/emfhal
