AI Asset Management Report (Part 2)

Qraft AI

Published in

Qraft AI ETFs

11 min read · Dec 30, 2020

A three-part series that details how AI can be a viable innovation driver for the asset management industry.

— — — — — — — — — — — — — — —

ETF: From Active to Passive

While hedge funds have made high returns by applying statistical arbitrage to portfolios rather than picking out single stocks, innovation has also occurred in the way investment products are delivered to customers. Exchange-traded funds, otherwise known as ETFs, first appeared in 1993, slightly after Strategy C was introduced. Although they were not popular at first, ETFs slowly gained a following as more funds started to abandon the selection of individual stocks. Investors used ETFs to allocate assets through benchmark indices. This new approach fueled the explosive growth of ETFs. By the end of 2019, ETFs’ total assets amounted to nearly $5 trillion.

CAGR: compound annual growth rate, one way to measure the return on anything that can rise or fall in value over time.
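As a minimal sketch of how CAGR is computed, the function below applies the standard definition, (end value / start value) ^ (1 / years) - 1. The function name and the sample figures are illustrative assumptions, not numbers from this report.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical figures for illustration only (not taken from the report):
# an asset base that doubles over 5 years.
growth = cagr(100.0, 200.0, 5)
print(f"{growth:.1%}")  # prints 14.9%
```

Doubling in five years therefore corresponds to a little under 15% growth per year, compounded.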

The key feature of ETFs is the convenience of real-time trading. Before ETFs were introduced, you had just two options for mirroring the movement of the S&P 500 index: purchase an index fund or buy every one of the 500 stocks in the index. Buying 500 stocks individually was extremely cumbersome. Index funds provided an easier alternative, but their downside was that investors could buy or redeem only once per day, which made it impossible to respond during trading hours.

The popularity of ETFs continues to grow to this day, absorbing outflows from high-cost active mutual funds that no longer deliver excess returns. Since the early 2000s, actively managed mutual funds have continuously experienced capital outflows, and most of the withdrawn money has been redirected to index funds and ETFs.

The outlook for ETFs has improved tremendously thanks to the explosive growth of techfin, financial platform, and robo-advisory companies. Techfin companies pursue real-time personalized recommendations and process real-time customer data to build contextual portfolios for sales and management. For this real-time service model to work, we think portfolios should consist of ETFs, which can be traded in real time.

Other financial products such as mutual funds are subject to a NAV produced only once a day. They cannot be used for real-time asset management services.

Active Index ETF

Active Index, Strategic Beta, Active Beta, and Smart Beta all mean the same thing in a broad sense. They refer to an investment strategy that tracks an index but also seeks to enhance the return relative to that underlying index through strategic management. Currently, ETFs that claim to be active index products make up only 10% of the total ETF market, but smart beta is growing twice as fast as the rest of the ETF market. When we consider the ultimate goal of investing (i.e. higher return), there is little reason not to choose an ETF that seeks to move with the index and potentially generate higher returns relative to it.

According to a global investor survey at etf.com in 2019, 92% of respondents said they invested in at least one active index ETF. Of those respondents, 26% said they chose the ETF as an alternative to an active fund. In addition, 66% of funds with AUM exceeding $10 billion are also adopting active index funds, and this adoption rate is growing rapidly.

The biggest problem with the Active Index business is that, as noted earlier with quant hedge funds, finding alpha is no longer a cheap process. ETFs, however, can reach a larger market at low cost (many charge less than 1% per annum), and they do not charge a performance fee because of their real-time, daily-traded nature. In addition, demand keeps growing, as there are thousands of indices that need to be enhanced.

The labor-intensive approach and high-cost structure cannot be the ultimate solution. While most active index ETFs are managed cost-efficiently with a maximum of 10 factors in consideration, the resulting strategies reflect only linear relationships to those factors. Thus, we think their performance can be mediocre at best rather than outstanding.

To solve this issue, innovation is necessary to achieve two important goals: low cost and high returns.

Automated Quant Research

As explained earlier, quant hedge funds cannot avoid the expensive approach of hiring numerous analysts to seek alpha. Elite graduates of Ivy League schools join quant funds to find excess return strategies by organizing data, preprocessing it, and back testing ideas. If they feel that “momentum strategy works well after company disclosures,” they back test and forward test that idea several times in order to find alpha. Several iterations are required from different angles: which universe works better, which disclosures work better, which momentum measure works better, how long after the disclosure works better, and so on. However, those trials end up with little to no useful results most of the time.

IF

1. We can expedite the speed of research on excess return strategies, and

2. It is possible to automatically generate the portfolio management strategy without employing an army of highly paid researchers,

THEN

It is potentially possible to offer alpha at a lower cost to a wider investor base through active index ETFs. This is a sure way to advance and secure a superior market position in a rapidly growing active index market.

To see if such innovation is even possible, it is necessary to define the main issue more clearly.

The Problem of Automatic Generation of Investment Strategies and AI

Simply put, finding a portfolio management strategy is like finding a function f that is expected to perform well in the future given an investment universe (U) and input data (X), so that the portfolio is P = f(U, X). For example, if you break down the S&P 500 Index (“P”), the investment universe (“U”) is [US large-cap stocks], the input data (“X”) is [market cap], and the function (“f”) is [invest in proportion to market cap, with quarterly rebalancing].
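The S&P 500 example can be read as a function call. The sketch below is a simplified illustration: the tickers, market caps, and function name are hypothetical, and the quarterly rebalancing is left out (it would amount to calling the function again each quarter with fresh data).

```python
from typing import Dict, List


def market_cap_weights(universe: List[str], market_cap: Dict[str, float]) -> Dict[str, float]:
    """f(U, X): weight each stock in U in proportion to its market cap in X."""
    total = sum(market_cap[ticker] for ticker in universe)
    return {ticker: market_cap[ticker] / total for ticker in universe}


# Hypothetical tickers and market caps (in $ billions), not real data.
U = ["AAA", "BBB", "CCC"]
X = {"AAA": 600.0, "BBB": 300.0, "CCC": 100.0, "ZZZ": 50.0}  # ZZZ is outside U

P = market_cap_weights(U, X)
for ticker, weight in P.items():
    print(f"{ticker}: {weight:.0%}")
```

Changing any of the three ingredients (the universe, the data, or the function) produces a different portfolio, which is exactly the search space described here.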

To simplify what quant funds do with the above formula, they try to find a desirable function (“f”) by experimenting with various candidates of “U” and “X”.

For “X”, the candidates were much simpler in the past. They included price data, financial data of individual stocks, and macro data (e.g., interest rates, exchange rates, indices, and economic indicators). Many researchers still use only these three data types as candidates for “X”. For example, a simple function “f” that buys the 10% of stocks with the lowest PBR, with annual rebalancing, requires just the stock price and net asset value as input data “X”.
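A minimal sketch of such a low-PBR function follows. It assumes PBR is computed as price divided by book value (net asset value) per share and that the selected stocks are equally weighted; the weighting scheme, the function name, and the sample data are assumptions, since the text does not specify them.

```python
from typing import Dict


def lowest_pbr_portfolio(price: Dict[str, float],
                         book_value_per_share: Dict[str, float]) -> Dict[str, float]:
    """Pick the decile of stocks with the lowest PBR and weight them equally."""
    # PBR = price / book value per share; skip stocks without positive book value.
    pbr = {
        ticker: price[ticker] / book_value_per_share[ticker]
        for ticker in price
        if book_value_per_share.get(ticker, 0.0) > 0.0
    }
    ranked = sorted(pbr, key=pbr.get)      # lowest PBR first
    n_picks = max(1, len(ranked) // 10)    # bottom 10% of the universe
    picks = ranked[:n_picks]
    return {ticker: 1.0 / n_picks for ticker in picks}  # equal weights (assumed)


# Hypothetical universe of 20 stocks with the same price and rising book value.
price = {f"S{i:02d}": 100.0 for i in range(20)}
book = {f"S{i:02d}": 10.0 * (i + 1) for i in range(20)}

portfolio = lowest_pbr_portfolio(price, book)
print(portfolio)
```

Annual rebalancing would mean rerunning the function once a year on updated prices and book values.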

However, a simple function like the one above (along with Strategy C) no longer performs well, because too many investors are already using it. It is necessary to find an investment strategy “f” that’s easy to find but can also bring excess returns in the zero-sum market.

More specifically:

[Data Differentiation]: the strategy uses data “X” that others do not consider as possible inputs.
[Investment Universe Differentiation]: the investment universe is defined dynamically.
[Function Differentiation]: “f” is complex and captures non-linear relationships.

When any of the above conditions is met, other researchers will not be able to detect your winning strategy for a considerable amount of time.

1. Data Differentiation

Attempts to access proprietary and differentiated data sources may sound fancy, but surprisingly there have been only a few successful cases so far. This is primarily because, no matter how differentiated (often private and unstructured) the data is, data that is not related to the movement of the actual portfolio is of little use. Few private data sets contain rich alpha sources. The following examples explain this well:

A big quant fund launched its own satellites to measure the size of the Earth’s glaciers in order to optimize its natural gas futures trading. Eventually, however, it abandoned the project because of poor fund performance, and all the satellites had to be sold.
Attempts to use satellite images of Walmart’s parking lots for trading eventually failed.
A hedge fund that traded on the news by measuring the sentiment around each stock with natural language processing announced that it had to change its strategy because of poor performance.
Another hedge fund, founded by a group of professors, used Twitter mention data for trading. However, the media attention quickly faded when its performance was lacking.

Of course, we cannot simply generalize the above cases. Using good and effective data will obviously give you an edge over the market. However, looking at a few attempts, we can already see that it’s far better to gain an advantage by leveraging public data than to craft a strategy using hidden data.

The reasons:

many unstructured data sets lag the stock price
there is a high probability of overfitting due to insufficient data samples or the difficulty of back testing over a long period of time
the alpha contained in the data was presumably not very large

When you search through a vast amount of data (most of it unstructured), such as Google users’ search data, you may feel that this new information might be useful for stock trading. However, upon closer inspection, the true alpha you can extract from such new data is, for the most part, smaller than what you can extract from mere price data.

2. Function Differentiation and Investment Universe Differentiation

If you can derive very complex patterns that others have not yet seen from the same data set, your chance of finding alpha will increase. The problem is that human cognitive ability is not really designed to recognize or understand non-linear patterns.

Linear patterns like “if you invest in low PBR stocks, they will likely rise in the future” or “if you invest in underperforming stocks, you’ll get higher returns” fit the human cognitive structure better. However, if the stock price follows some arbitrary, complicated formula with considerable probability, it is difficult to spot such a pattern (or formula), especially since real data always contains some level of noise.

Even at a lower level of complexity, the phenomenon that the predictive power of PBR varies with the size of the company, a simple non-linear pattern that is a source of alpha, is not easy to find.
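To illustrate why a size-dependent PBR effect is easy to miss, here is a toy simulation on purely synthetic data. The rule that low PBR predicts higher returns only for small companies, and every number in it, is an assumption made for the example and not a finding from this report.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


random.seed(0)
rows = []
for _ in range(10_000):
    pbr = random.uniform(0.5, 3.0)
    is_small = random.random() < 0.5
    # Assumed toy rule: PBR matters only for small companies.
    sensitivity = -0.05 if is_small else 0.0
    ret = sensitivity * pbr + random.gauss(0.0, 0.02)  # add noise
    rows.append((pbr, is_small, ret))

pooled = ols_slope([r[0] for r in rows], [r[2] for r in rows])
small = ols_slope([r[0] for r in rows if r[1]], [r[2] for r in rows if r[1]])
large = ols_slope([r[0] for r in rows if not r[1]], [r[2] for r in rows if not r[1]])
print(f"pooled: {pooled:.3f}, small caps: {small:.3f}, large caps: {large:.3f}")
```

In this setup a single linear fit across all stocks reports only about half of the true small-cap effect, while the large-cap slope is close to zero. A researcher who never thinks to split by size sees a weak signal and may discard it.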

In other words, to succeed in function differentiation, we need tools that can easily help us find non-linear patterns, just as we needed organized data sets and computers to discover Strategy C back in the 1980s.

The same goes for investment universe differentiation. Qraft Technologies (“Qraft”) published the results of a study showing that momentum and value factor investing works well for US large-cap stocks, especially in the month or two after companies post filings/disclosures. For humans to find the same results easily (even when the search is automated), an efficient tool is required to back test what happens to a specific pattern within the one-to-two-month period after the public disclosure of individual US stocks. Of course, without this tool, you can theoretically back test the strategy via multiple layers of complex coding.
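A dynamic investment universe can be sketched as a function of the date rather than a fixed list. The example below is an illustration, not Qraft's tool: the 30-to-60-day window is one assumed reading of "a month or two," and the tickers and filing dates are hypothetical.

```python
from datetime import date
from typing import Dict, List


def post_disclosure_universe(as_of: date,
                             last_disclosure: Dict[str, date],
                             min_days: int = 30,
                             max_days: int = 60) -> List[str]:
    """Stocks whose latest disclosure falls min_days..max_days before as_of."""
    return sorted(
        ticker
        for ticker, filed in last_disclosure.items()
        if min_days <= (as_of - filed).days <= max_days
    )


# Hypothetical filing dates, for illustration only.
filings = {
    "AAA": date(2020, 10, 30),
    "BBB": date(2020, 11, 20),
    "CCC": date(2020, 8, 1),
}

print(post_disclosure_universe(date(2020, 12, 15), filings))  # ['AAA']
print(post_disclosure_universe(date(2021, 1, 15), filings))   # ['BBB']
```

Because the universe changes with the evaluation date, a back test has to recompute it at every rebalancing step, which is what makes such strategies tedious to explore by hand.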

However, there is a big difference between finding such a successful strategy from scratch and back testing an already discovered strategy. In other words, what you “could” find is different from what you have found, just as people were not able to notice the simple Strategy C when data and computers were not easily accessible. It is nearly impossible to find a winning strategy without an efficient tool that can handle a dynamic investment universe.

Qraft was able to find a strategy for the dynamic investment universe because we have an efficient tool that handles the investment universe as a “variable” and not a “constant.”

A well-designed deep learning model works best for finding a function “f” that reflects both non-linear relationships and dynamic investment universes.

3. Dimensions

The combinations of functions to back test are endless. There are literally thousands of data fields available to quant researchers. Considering the degrees of freedom that could be examined in an investment universe, the number of functions that can combine these data fields is effectively infinite. Think of the game of Go. The brute-force method, which tests and rules out every possible case to discover a function (strategy), is nearly impossible.
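A rough count shows how quickly the search space grows. The sketch below only counts which data fields a strategy could use, assuming 1,000 candidate fields and at most 10 fields per strategy, and it assumes a hypothetical machine that runs one million back tests per second. All three figures are illustrative, and the count ignores the many ways the chosen fields could be combined.

```python
import math

n_fields = 1_000     # assumed number of candidate data fields
max_factors = 10     # assumed maximum number of fields per strategy

# Number of ways to choose between 1 and 10 fields out of 1,000.
field_subsets = sum(math.comb(n_fields, k) for k in range(1, max_factors + 1))

backtests_per_second = 1_000_000            # hypothetical throughput
seconds_per_year = 365 * 24 * 60 * 60
years_needed = field_subsets / backtests_per_second / seconds_per_year

print(f"field subsets: {field_subsets:.2e}")
print(f"years to test them all: {years_needed:.2e}")
```

Even this heavily simplified count is on the order of 10^23 candidates, which is why the search has to be narrowed down rather than enumerated.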

An experienced quant researcher, like a professional Go player, can narrow down the number of cases to be tested by intuitively figuring out the relationship between stock performance and its data. The shortlisted functions are then tested to produce candidates for a winning strategy. This naturally increases the probability of finding an investment strategy. To automatically extract a good strategy without an experienced quant researcher, you would need to come up with another way to narrow down the vast search space. AlphaGo solved this problem by applying several techniques, including deep learning, and went on to outstrip human abilities.

Deep learning technology can solve the problem of finding an optimal function (investment strategy) in a huge search space, as revealed by the AlphaGo case.

4. Overfitting

Overfitting must be handled to ensure the quality of a strategy. It is dangerous to fit the model using all the available data: the back test results would look fine, but the same results cannot be achieved in practice, which is the true out-of-sample test. With financial data, it is not easy to cope with overfitting because the time series are short and the characteristics of the market change frequently.

It takes a lot of time for human researchers to find a model that fits well with all the previous data sets available. Hence, some human researchers create a model using the entire dataset (ignoring overfitting) and try to reduce the risk of overfitting by assessing the reasoning (e.g. the rationality of the strategy).

If you build a system that automatically finds the function f by applying deep learning, you can make the system learn using only the data released before the specific time of each prediction (inference). This greatly reduces the probability of overfitting.
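One common way to enforce this rule is a walk-forward split, in which every prediction period is preceded by a training window made only of earlier periods. The sketch below is a generic illustration with hypothetical names and sizes, not a description of Qraft's system.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(n_periods: int,
                        min_train: int,
                        test_size: int = 1) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) period indices where train always precedes test."""
    start = min_train
    while start + test_size <= n_periods:
        train = list(range(0, start))                  # everything known so far
        test = list(range(start, start + test_size))   # the next unseen period(s)
        yield train, test
        start += test_size


for train, test in walk_forward_splits(n_periods=6, min_train=3):
    print(train, "->", test)
# [0, 1, 2] -> [3]
# [0, 1, 2, 3] -> [4]
# [0, 1, 2, 3, 4] -> [5]
```

Each model is evaluated only on periods it has never seen, so the back test behaves more like live trading than a fit over the full history would.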

5. Rusty Strategies

Strategies built by traditional quant research methods are basically static, which means that newly arriving data is not reflected in the strategy. If the strategy deviates from the market and no longer fits well, you should either discard it or update it with new data. Investment strategies created through a deep learning model, however, learn from new data every day, and the weights produced by the neural network change with each new data set. This, in turn, extends the life of the investment strategy considerably. (Even deep learning models are subject to new model engineering whenever a new type of data set is introduced.)
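The idea of weights that move with each new observation can be sketched with a much simpler stand-in for a neural network: a two-input linear model updated by one gradient step per day. The data is synthetic, the regime change at day 500 is an assumption made for the example, and the function name and learning rate are hypothetical.

```python
import random


def online_update(weights, features, target, learning_rate=0.05):
    """One gradient step on squared error for a linear model."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [w - learning_rate * error * x for w, x in zip(weights, features)]


random.seed(1)
weights = [0.0, 0.0]
snapshots = {}

for day in range(1000):
    # Synthetic market whose true relationship flips halfway through.
    true_weights = [1.0, -1.0] if day < 500 else [-1.0, 1.0]
    x = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    y = sum(w * xi for w, xi in zip(true_weights, x)) + random.gauss(0.0, 0.01)
    weights = online_update(weights, x, y)   # learn from today's data
    if day in (499, 999):
        snapshots[day] = list(weights)

print(snapshots)
```

The weights recorded on day 499 sit close to the first regime and those on day 999 close to the second, whereas a static strategy fitted once on the first 500 days would keep trading the old relationship.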

The series continues in Part 3.