Hedging with Machine Learning

Todd Moses
Fintech with Todd
May 25, 2018

There are many ways to reduce trading risk through hedging. Funds typically use futures and options to hedge each trade. Like insurance, this safety net comes at a price. Perhaps an AI-based strategy can protect a position at a much lower cost.

Before McDonald's introduced Chicken McNuggets, the company had to hedge against the cost of chicken. If chicken prices rose dramatically, it would no longer be able to offer the product. After some time, a financial consultant determined that there were two costs associated with raising chickens: grain and water.

To ensure that the restaurant could continue providing chicken at a fixed cost, the company purchased an option on grain. This gave it the right, but not the obligation, to purchase grain at a set price. Even if grain prices rose, McDonald's was protected against the risk by this option.

This was a very smart move on the part of the McDonald's corporate team. The only problem is that the cost of the option meant a larger overall cost for the chicken they served. This is not unique. A hedging strategy always cuts into profit while reducing risk.

Now that Machine Learning is a reality, is there a better method to reduce risk at less cost? Before diving in, a little background is needed.

Volatility represents uncertainty. A hedge addresses the "what if" of a near-worst-case scenario. It is not all-encompassing protection but a general safety net. In finance, risk is the possibility that the actual return on an investment will be lower than the expected return. It is commonly measured with Value at Risk (VaR): the loss that is not expected to be exceeded over a given period at a given confidence level, which leaves out the rarer, worse outcomes beyond that level.
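To make the VaR idea concrete, here is a minimal sketch of one common way to estimate it, historical simulation. The function name `historical_var`, the integer-percent parameter, and the sample returns are all assumptions made up for illustration, and other quantile conventions exist.

```python
def historical_var(returns, confidence_pct=95):
    """Estimate VaR by historical simulation.

    Returns the smallest observed loss such that at least
    `confidence_pct` percent of observed losses are no larger.
    The result is a positive fraction (0.03 means a 3% loss).
    """
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted(-r for r in returns)  # losses in ascending order
    n = len(losses)
    # Integer ceiling of confidence_pct * n / 100, converted to a 0-based index.
    index = -(-confidence_pct * n // 100) - 1
    return losses[max(0, min(index, n - 1))]


# Hypothetical periodic returns, invented for this example.
sample_returns = [0.02, -0.01, 0.015, -0.03, 0.005,
                  -0.07, 0.01, -0.02, 0.03, -0.005]

var_90 = historical_var(sample_returns, confidence_pct=90)
print(f"90% VaR: {var_90:.1%} of the position")
```

Note that the single worst outcome in the sample (a 7% loss) sits beyond the 90% level, which is exactly the kind of result VaR excludes.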

Consider a person purchasing a building for $200,000 and expecting to rent it for $1000 per month. Before making the purchase, they will want some method to reduce the risk of only getting $800 per month. Protection against the building burning down and destroying the asset is handled by insurance. Therefore, the only hedge concern is the $200 difference between the expected rent and a lower market rate.

The first thing to do is to look at the data and determine the likelihood that rents will fall below $1000 for the property. Many online sources list rental data by geography for this purpose. If the rent is likely to fall below the amount needed for the investment to be profitable, then a hedge should be secured.
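As a rough sketch of that first step, the likelihood can be estimated as the share of comparable rents observed below the target. The rent figures, the function name `probability_below`, and the 25% tolerance are hypothetical values chosen only to illustrate the idea.

```python
def probability_below(rents, target):
    """Share of observed rents that fall strictly below the target."""
    if not rents:
        raise ValueError("rents must not be empty")
    return sum(1 for rent in rents if rent < target) / len(rents)


# Hypothetical monthly rents for comparable properties in the same area.
comparable_rents = [950, 1020, 1100, 980, 1050, 1000, 890, 1075, 1010, 940]

target_rent = 1000
p_below = probability_below(comparable_rents, target_rent)

# Assumed risk tolerance: hedge if more than 25% of comparables rent below target.
needs_hedge = p_below > 0.25
print(f"Share of rents below ${target_rent}: {p_below:.0%}, hedge: {needs_hedge}")
```

A real analysis would use far more data and account for trends over time, but the decision rule has the same shape.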

For example, if the loan on the property is 15 years, one can obtain $200 per month for 15 years with an annuity. However, this annuity will cost $20,000. This will greatly reduce the risk of generating less than $1000 per month, but it increases the overall investment. That is the problem with hedging risk.
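The arithmetic behind this example can be sketched as follows. The purchase price, annuity cost, and payment come from the example above, while the 6% annual discount rate and the helper `annuity_present_value` are assumptions added only to show how one might judge whether the annuity is fairly priced.

```python
def annuity_present_value(payment, monthly_rate, months):
    """Present value of a level payment received at the end of each month."""
    if monthly_rate == 0:
        return payment * months
    return payment * (1 - (1 + monthly_rate) ** -months) / monthly_rate


purchase_price = 200_000
annuity_cost = 20_000
monthly_payment = 200
months = 15 * 12

total_investment = purchase_price + annuity_cost  # 220000
total_payout = monthly_payment * months           # 36000, undiscounted

# Assumed discount rate of 6% per year; pick one that fits your own situation.
assumed_monthly_rate = 0.06 / 12
fair_value = annuity_present_value(monthly_payment, assumed_monthly_rate, months)

print(f"Total investment with hedge: ${total_investment:,}")
print(f"Undiscounted annuity payout: ${total_payout:,}")
print(f"Present value at assumed rate: ${fair_value:,.0f}")
```

The hedge raises the capital at risk by 10% in exchange for a guaranteed income stream, which is the trade-off the example is meant to highlight.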

If this were a real real-estate transaction, there would be several options for such an issue. For instance, one may try to negotiate the cost of the property down from $200,000 to a lesser amount so that the rent required is in line with the market. In addition, one may decide not to make the purchase if the rental market for that building is volatile. However, this is just an example to illustrate how hedging works.

First, consider Natural Language Processing or NLP. This is the technology where machines read written language and formulate rudimentary conclusions. Currently, many funds use NLP for a number of quantitative strategies.

If the investment requires processing documents for research, then NLP could help formulate an improved hedging strategy. For example, an NLP-based system can process millions of online profiles of people who meet the criteria of good renters. Using this information, a set of parameters can be formulated for persons with the highest likelihood of paying on time and remaining in the property for 2 or more years. Securing such tenants would reduce the risk of purchasing rental property.

Second, one can use a K-Nearest Neighbor algorithm or KNN. This is a basic machine learning technique designed to work with a database of classified data. KNN is non-parametric, meaning it does not make any assumption about the data distribution. This makes it work well when there is little or no knowledge of how the underlying data is distributed. The system simply stores labeled examples of each category from the database.

Using feature similarity, a KNN system determines which class a new item belongs in. It does this by assigning the class that is most common among the new item's nearest neighbors. Currently, KNN is used a lot for fraud detection.

By taking both profitable and unprofitable investments, one can classify each type within a database. Each past investment is then placed into one of the following categories: high, average, and low. For a real estate transaction, each record may include such parameters as volatility, number of rooms, square feet, zip code, special features, condition of property, shape of lot, number of garages, crime rate, etc.

Taking the potential property, a KNN system would classify it as having high, average, or low profit potential. For example, the system may determine that properties in good overall condition, with a certain number of rooms and a square-shaped lot, make the most money. Therefore, when a property meets these criteria it is classified as having high profit potential, reducing the risk of purchase by n percent.
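A minimal sketch of this classification step is shown below, written in plain Python so the mechanics are visible. The three features (condition, rooms, lot squareness), their pre-scaled values between 0 and 1, and the labeled examples are all hypothetical. Real data would need to be scaled first, since KNN distances are sensitive to units.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    `training` is a list of (features, label) pairs, where features are
    numeric tuples of the same length as `query`.
    """
    nearest = sorted(
        (math.dist(features, query), label) for features, label in training
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical past investments: (condition, rooms, lot squareness), scaled to 0..1.
past_investments = [
    ((0.90, 0.80, 1.00), "high"),
    ((0.80, 0.90, 0.90), "high"),
    ((0.85, 0.70, 1.00), "high"),
    ((0.50, 0.50, 0.50), "average"),
    ((0.60, 0.40, 0.60), "average"),
    ((0.55, 0.60, 0.40), "average"),
    ((0.20, 0.30, 0.10), "low"),
    ((0.10, 0.20, 0.30), "low"),
    ((0.30, 0.10, 0.20), "low"),
]

candidate_property = (0.88, 0.75, 0.95)
profit_potential = knn_classify(past_investments, candidate_property, k=3)
print(f"Profit potential: {profit_potential}")
```

The choice of k matters: a small k reacts to individual neighbors, while a larger k smooths the vote across more past investments.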

This scenario is not a traditional hedge but a potential improvement for both buying and selling of assets. The goal is to eliminate the need for a hedge, since that reduces the cost most substantially.

Last, one can use Random Forests. This is a machine learning strategy that takes training data and, during training, creates a multitude of decision trees. These are similar to flow charts where each path of decisions leads to a specific conclusion. The more data used for training, the deeper the trees can become and the more obscure the patterns they can discover.

Random Forests can be used to rank the importance of parameters. They have similarities with K-Nearest Neighbor algorithms but differ in how they use training data and in overall use case. Each tree groups samples with similar parameter values together, and the forest combines the votes of its trees into a single prediction.

In determining the best hedge, one can train the system on the best and worst hedge examples for a specific investment type. The result is a system that can take an investment as input and return the best hedging strategy.
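To illustrate the mechanics, here is a small from-scratch sketch of a random forest: bootstrap samples, a random subset of features at each split, and a majority vote across trees. The features (volatility, leverage), the labels `no_hedge` and `put_option`, and the training rows are hypothetical. A production system would more likely rely on an established library than on this toy version.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Find the (score, feature, threshold) with the lowest weighted Gini."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, labels, rng, max_depth=3):
    """Grow one tree; leaves are labels, inner nodes are 4-tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # random feature subset per split
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    _, f, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > threshold]
    return (
        f,
        threshold,
        build_tree([r for r, _ in left], [l for _, l in left], rng, max_depth - 1),
        build_tree([r for r, _ in right], [l for _, l in right], rng, max_depth - 1),
    )


def predict_tree(tree, row):
    """Follow the decisions down to a leaf label."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Hypothetical past investments: (annualised volatility, leverage ratio).
rows = [(0.10, 1.0), (0.12, 1.2), (0.15, 1.1), (0.18, 1.3),
        (0.35, 2.0), (0.40, 2.5), (0.45, 2.2), (0.50, 3.0)]
labels = ["no_hedge"] * 4 + ["put_option"] * 4

forest = train_forest(rows, labels)
calm_choice = predict_forest(forest, (0.08, 0.9))
risky_choice = predict_forest(forest, (0.42, 2.4))
print(calm_choice, risky_choice)
```

With only eight rows this is purely illustrative. The quality of the recommended hedge depends entirely on how well the historical examples were labeled.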

Conclusion

All three of the Machine Learning strategies discussed may hold promise in determining an improved hedging strategy for specific investments, either alone or as part of a combined effort. However, they are only as good as the data used. Determining the correct dataset(s) to use is the most difficult aspect of Machine Learning.

The focus on data is the paradigm shift in creating Machine Learning systems. Instead of logic-based algorithms, developers working on AI software will need to focus on model selection and training data. However, those able to accomplish such feats will be able to unlock new strategies never before realized.
