Reinforcement Learning in Finance

Jul 24, 2022

Reinforcement learning (RL) has many applications in finance. The recent availability of large datasets has made it practical to apply machine learning and deep learning to financial problems. Many of these problems also fit the sequential decision-making paradigm, which means RL can be applied in one form or another. We will look at some of the papers applying RL to various problems in finance. This list is by no means exhaustive, as the field is continuously evolving, but it should be a good starting point for anybody looking to get started. Some of the areas in which RL can be applied are as follows:

Market Making
Portfolio Management
Options Pricing
Trading
Optimal Execution

Market Making

Market makers are traders (individuals or institutions) who provide liquidity by placing buy and sell orders in the limit order book (LOB). When both sides of a quote are filled, they earn the bid-ask spread, and in doing so they give the market liquidity and depth. The spread compensates them for the risk of holding inventory: the value of a security may decline after it has been purchased from a seller and before it is sold to a buyer.
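To make the spread-versus-inventory-risk trade-off concrete, here is a minimal sketch. The function name `round_trip_pnl` and all prices are hypothetical and chosen only for illustration; the model assumes the market maker buys at its bid, and that the quotes shift by `mid_move` before the ask is filled.

```python
def round_trip_pnl(bid, ask, size, mid_move=0.0):
    """P&L of buying `size` units at the bid and later selling at the ask.

    `mid_move` is the (assumed) shift in the quotes between the two fills;
    a negative value means the price fell while the inventory was held.
    """
    buy_price = bid
    sell_price = ask + mid_move
    return (sell_price - buy_price) * size

# Quiet market: the full spread of 0.10 per unit is captured.
calm_pnl = round_trip_pnl(bid=99.95, ask=100.05, size=100)

# Adverse move: the price drops by 0.30 before the inventory is sold,
# which more than wipes out the spread.
adverse_pnl = round_trip_pnl(bid=99.95, ask=100.05, size=100, mid_move=-0.30)

print(round(calm_pnl, 2), round(adverse_pnl, 2))
```

An RL agent for market making would typically choose where to place the quotes at each step, balancing the spread earned against this inventory risk.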

A few papers discussing RL in market making:

Trading

Trading is the buying and selling of securities or other financial instruments with the objective of earning a profit. It is one of the fields where machine learning has immense potential. Traders usually look at price information and estimate whether a stock is going to go up or down. Trading can be done at different horizons, from intraday (holding periods of a few minutes to hours) to multi-day or multi-month holding periods. Different trading strategies use different datasets to improve accuracy. Some of the datasets that can be used are price (OHLCV data), news, fundamental information and economic data. ML models are built on these datasets with the objective of predicting the future direction or return of one or more stocks.

Trading is also a sequential decision making problem and fits naturally in the framework of RL.
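As a rough illustration of how trading maps onto the RL loop of state, action and reward, here is a toy sketch. `ToyTradingEnv` and `run_episode` are hypothetical names, not part of any library, and the setup makes strong simplifying assumptions: a single asset, positions limited to short, flat or long, and no transaction costs.

```python
class ToyTradingEnv:
    """Hypothetical single-asset environment for illustration only."""

    def __init__(self, prices):
        self.prices = list(prices)
        self.t = 0
        self.position = 0  # -1 short, 0 flat, +1 long

    def reset(self):
        self.t = 0
        self.position = 0
        return (self.prices[0], self.position)

    def step(self, action):
        """`action` is the target position in {-1, 0, 1} for the next period."""
        self.position = action
        price_change = self.prices[self.t + 1] - self.prices[self.t]
        reward = self.position * price_change  # P&L earned over one step
        self.t += 1
        done = self.t == len(self.prices) - 1
        return (self.prices[self.t], self.position), reward, done


def run_episode(env, policy):
    """Roll a policy (a function from state to action) through one episode."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
    return total_reward


prices = [100.0, 101.0, 100.5, 102.0]
always_long = run_episode(ToyTradingEnv(prices), lambda state: 1)
always_flat = run_episode(ToyTradingEnv(prices), lambda state: 0)
print(always_long, always_flat)
```

In practice the state would carry richer features (for example recent OHLCV bars or news signals), and the policy would be learned rather than hard-coded.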

A few papers discussing RL in trading:

Portfolio Management

Portfolio management is the process of selecting assets (sometimes from different classes, e.g. stocks, bonds or cash) to maximise a certain objective. This objective is usually to maximise profit subject to some notion of risk management. Portfolio management involves the following main steps:

Portfolio Allocation : Allocation refers to the task of selecting which assets are part of the portfolio and how much weight each one receives. Changing the allocation can affect the performance of the portfolio quite drastically.
Diversification : Diversification refers to the selection of ‘diverse’ or uncorrelated assets. Picking assets that behave differently reduces our risk and helps protect the portfolio against market movements or cycles. This is essentially a form of risk management.
Rebalancing Portfolio : Market conditions are always changing due to evolving economic or political news or situations. Our portfolio management system should therefore be able to respond to these changes and reallocate and diversify over time. This process of changing the allocation periodically or in response to events is called rebalancing. It keeps the portfolio aligned with the current market regime.
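The diversification and rebalancing steps above can be sketched numerically. The helper names `portfolio_volatility` and `rebalance_trades`, the two-asset setup, and all numbers are assumptions made for illustration. The volatility formula is the standard two-asset portfolio variance.

```python
import math


def portfolio_volatility(weight_a, vol_a, vol_b, corr):
    """Volatility of a two-asset portfolio with weights (weight_a, 1 - weight_a)."""
    weight_b = 1.0 - weight_a
    variance = (
        (weight_a * vol_a) ** 2
        + (weight_b * vol_b) ** 2
        + 2.0 * weight_a * weight_b * vol_a * vol_b * corr
    )
    return math.sqrt(variance)


def rebalance_trades(holdings_value, target_weights):
    """Value to buy (+) or sell (-) per asset to return to the target weights."""
    total = sum(holdings_value.values())
    return {
        asset: target_weights[asset] * total - value
        for asset, value in holdings_value.items()
    }


# Diversification: same assets and weights, different correlation.
vol_correlated = portfolio_volatility(0.5, 0.20, 0.20, corr=1.0)
vol_uncorrelated = portfolio_volatility(0.5, 0.20, 0.20, corr=0.0)

# Rebalancing: stocks have drifted to 70% of the portfolio; the target is 60/40.
trades = rebalance_trades(
    {"stocks": 70.0, "bonds": 30.0},
    {"stocks": 0.6, "bonds": 0.4},
)
print(vol_correlated, vol_uncorrelated, trades)
```

With perfectly correlated assets there is no risk reduction, while uncorrelated assets give a lower portfolio volatility for the same weights. An RL agent for portfolio management would typically output the target weights at each rebalancing step.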

A few papers discussing RL in portfolio management:

Options Pricing

Options are derivatives, i.e. the option's price is derived from the price of an underlying stock or asset. An options contract gives the holder the right to buy or sell the underlying asset at a particular price, called the strike price. The holder is not obligated to buy or sell the asset; it is an option. Each contract has an expiration date by which the holder must exercise the option.

Options are commonly priced using the well-known Black-Scholes-Merton pricing equation. The idea behind it is that an option can be priced as a portfolio of other tradable assets. This is known as dynamic option replication. The option is replicated using a portfolio of shares of the underlying asset and a cash component. The portfolio is continuously rebalanced to match the option price, and it is self-financing: no cash may be added to or withdrawn from it.
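As a minimal sketch of the replication idea, the snippet below computes the Black-Scholes-Merton price and delta of a European call on a non-dividend-paying stock, then builds the matching portfolio of shares and cash at a single point in time. The function names and input values are illustrative assumptions. Continuous rebalancing is not simulated here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(spot, strike, rate, vol, maturity):
    """Return (price, delta) of a European call with no dividends."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
    delta = norm_cdf(d1)
    return price, delta


spot = 100.0
price, delta = bsm_call(spot, strike=100.0, rate=0.05, vol=0.20, maturity=1.0)

# Replicating portfolio at this instant: `delta` shares plus a cash position.
shares_held = delta
cash_held = price - shares_held * spot  # negative means the cash is borrowed
replication_value = shares_held * spot + cash_held
print(round(price, 4), round(delta, 4), round(cash_held, 4))
```

With these inputs the call is worth roughly 10.45 and the hedge holds about 0.64 shares, financed by borrowing. RL approaches to options pricing typically learn the hedging policy from data instead of assuming the model behind this closed-form delta.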

A few papers discussing RL in options pricing:

Optimal Execution

Optimal execution is a fundamental problem in financial modelling. The simplest version is the case of a trader who wishes to buy or sell a given amount of a single asset within a given time period. This is a challenge because any activity in the market (buying or selling) has a market impact. For example, if you are buying a large number of shares, the average price you pay is higher because the act of buying pushes prices up. Similarly, if you are selling, prices fall and you receive a lower average price. Market impact always works against the trader, so it has to be taken into account when planning the transaction; otherwise it can noticeably raise the cost of the trade.
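The effect of market impact on the average price can be illustrated with a deliberately simple model. The sketch below assumes a linear, purely temporary impact: each child order moves its own fill price in proportion to its size and the price then recovers fully. The function name `average_fill_price` and the impact coefficient are hypothetical.

```python
def average_fill_price(mid_price, total_shares, n_slices, impact_per_share, side=1):
    """Average price when an order is split into equal slices.

    side = +1 for a buy, -1 for a sell. Impact is assumed linear in the
    slice size and fully temporary (no carry-over between slices).
    """
    slice_size = total_shares / n_slices
    fill_price = mid_price + side * impact_per_share * slice_size
    total_cash = n_slices * slice_size * fill_price
    return total_cash / total_shares


# Buying 10,000 shares in one block versus ten equal slices.
buy_block = average_fill_price(100.0, 10_000, n_slices=1, impact_per_share=1e-4)
buy_sliced = average_fill_price(100.0, 10_000, n_slices=10, impact_per_share=1e-4)

# Selling the same amount: impact pushes the price received below the mid price.
sell_block = average_fill_price(100.0, 10_000, n_slices=1, impact_per_share=1e-4, side=-1)
print(buy_block, buy_sliced, sell_block)
```

Under these assumptions, splitting the order lowers the impact cost. The sketch ignores the risk that the price moves while the trader waits. Balancing impact against that waiting risk is the trade-off an execution policy has to manage.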

A few papers discussing RL in optimal execution: