A Probabilistic Relative Valuation for the Financial Sector Using Deep Learning

Jonathan Legrand
LSEG Developer Community
3 min readAug 17, 2022

The full article can be found on the Developer Portal, written by Martin Zornoza Auñon, the winner of 2021 Q.2 Refinitiv Academic Competition.

This article studies the implementation of deep learning to predict probability distributions using Python. These describe the daily returns of stocks of the companies in the financial sector, listed in the S&P500, using the data obtained from the Refinitiv Eikon API from 2002 to 2020. In addition, it proposes an approach to obtain a relative probabilistic value defined as the predicted mean divided by the predicted standard deviation of the returns of a stock.

Using six variables, obtained from the financial statements of the companies, as input to predict the parameters of four chosen distributions (normal, JohnsonSU, a mixture of three normal distributions, a mixture of two Johnson distributions), the results obtained range from 7 to 518 BPS each year in the test set.

The proposed neural networks obtain lower losses in the test data (including the period of the Covid-19 of 2020) than in the training data, even when they have more than 1400 trainable parameters without regularization nor dropout layers. Although it requires further study, the prediction of a probability distribution of returns allows for more informed decisions, showing an alternative use of deep learning for predicting stock returns.

The proposed approach offers two advantages compared with other deep learning structures. On the one hand, it allows to train more parameters without overfitting to the data as it can be trained on higher frequencies. On the other hand, it allows for more informed decisions as there is a predicted probability of returns. The method of RPV shows an example of using returns distributions for selecting stocks:

However, this method also has some drawbacks. The subjective choice of the distribution influences the predicted results. Moreover, the initialization of the weights acquires greater importance compared with other models.

In conclusion, this model produces more informative outputs, although it comes with the price of added complexity for training. Further investigation could lead to improved results using other frequencies of returns or even tick data. In addition, a smarter initialization of the weights could achieve convergence in a more consistent manner.

The full article, including the python code, can be found on the Developer Portal.

References

RELATED APIS

SOURCE CODE

RELATED ARTICLES

--

--