Transform ETF Rebalancing: Proven Strategies to Optimize Costs and Enhance Fund Efficiency 📊

Takahiro Aiki
Published in Blockhouse
Jun 25, 2024

When managing ETFs, maintaining the intended asset allocation and risk exposure through systematic rebalancing is crucial to align with a fund's strategic objectives. However, traditional rebalancing processes incur substantial transaction costs due to market impact and timing inefficiencies. Blockhouse harnesses state-of-the-art forecasting models and precise slippage calculations to deliver real-time, actionable insights that reduce these costs, execute risk management strategies, and enhance ETF performance. By predicting market conditions and optimizing trade execution timing, we help traders capitalize on the most cost-effective trading windows for rebalancing, lowering overall transaction costs and thereby maximizing potential returns.

How to Save Millions on ETF Rebalancing 📈

In this article, we tune various advanced algorithms, including CNNs, GBDTs, LSTMs, and ARIMA-GARCH models, to accurately forecast slippage and optimize trading windows. We then benchmark these implementations against each other, as well as against conventional execution algorithms, to demonstrate effective strategies for reducing transaction costs. Asset managers can glean deep insights from our analytics, which illuminate cost-effective trading strategies to minimize costs, enhance overall fund performance, and adhere to planned investment strategies, giving them an edge in the market.

Uncover the Hidden Costs of ETF Rebalancing and Save Big 💰

On June 21, the widely followed tech ETF XLK was significantly rebalanced. This ETF, representative of the technology sector within the S&P 500, includes giants like NVDA and MSFT, which have recently seen explosive growth, with NVDA ascending to become the world's largest company by market capitalization. This rapid growth has skewed the ETF's composition, raising concerns that it might overrepresent AI-driven companies rather than the tech sector as a whole. With such companies now comprising over 47% of XLK's value, alongside traditional stalwarts like AAPL, the need for rebalancing has become pressing.

Institutional managers like State Street, which oversees the ETF's operations, are preparing to deploy over $20 billion in trades to realign the fund's holdings with its intended strategy. At Blockhouse, we take a critical approach to traditional rebalancing strategies by benchmarking them against various advanced execution algorithms. Our aim is to identify alternative methods that most effectively minimize costs and optimize execution, offering a valuable perspective for asset managers who are looking to rebalance ETFs internally within their own funds.

We conducted a thorough analysis of the existing transaction costs for rebalancing XLK (S&P 500 Tech Sector ETF), and extended our research to ITB (iShares US Home Construction ETF) and SMH (VanEck Semiconductor ETF). We focused specifically on the key areas for improvement: slippage, market impact, bid-ask spread costs, and commissions. Our findings reveal potential cost reduction and performance improvement from executing trades based on our models, compared to traditional execution benchmarks such as Time Weighted Average Price (TWAP).

How Advanced Forecasting Models Can Transform Your Trading Execution 🔍

In this section, we examine how well Convolutional Neural Networks (CNNs), Gradient Boosted Decision Trees (GBDTs), Long Short-Term Memory (LSTM) networks, and AutoRegressive Integrated Moving Average with Generalized AutoRegressive Conditional Heteroskedasticity (ARIMA-GARCH) models perform when predicting market movements in various scenarios. We highlight the efficacy of these models in forecasting transaction costs, identifying optimal trading windows, and executing effective trades to reduce market impact and slippage.

Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a deep learning model that processes structured grid data, such as images, through convolutional layers to detect features, pooling layers to reduce dimensions, and fully connected layers for classification. When analyzing historical price data, CNNs can detect patterns and distinguish between periods of high volatility (high transaction costs) and low volatility (low transaction costs). This helps a trader identify these intraday regimes and trade in the window when the price is most stable and costs are lowest.
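As a concrete illustration, the sketch below applies the basic building blocks of a 1-D CNN, convolution and pooling, to a synthetic return series to separate calm and volatile regimes. The kernel is hand-set rather than learned and all data is simulated; a production model would train many filters on real intraday prices.

```python
import numpy as np

def conv1d(x, kernel):
    """Valid-mode 1-D convolution: slides the kernel across the series."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool(x, width):
    """Non-overlapping max pooling to downsample the feature map."""
    n = len(x) // width
    return x[:n * width].reshape(n, width).max(axis=1)

rng = np.random.default_rng(0)
# Synthetic intraday returns: a calm regime followed by a volatile one.
calm = rng.normal(0, 0.0005, 200)
volatile = rng.normal(0, 0.005, 200)
returns = np.concatenate([calm, volatile])

# A hand-set "roughness" kernel; a trained CNN would learn such filters.
kernel = np.array([-1.0, 2.0, -1.0])
features = max_pool(np.abs(conv1d(returns, kernel)), width=20)

# Classify each pooled window: feature above the median -> high-volatility regime.
labels = features > np.median(features)
print(labels)  # later windows (the volatile half) should mostly be True
```

The same pipeline, with learned filters and a classification head, is what separates the high- and low-cost regimes described above.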

In this graph, we use a confusion matrix to compare the actual vs. predicted values (in bps) of the transaction costs for the CNN model, with the heat map intensity representing the accuracy of the model for a given prediction. The model is clearly very accurate at small transaction cost levels and becomes increasingly noisy when predicting large transaction cost values. Thus, the model is most accurate when seeking out low transaction costs, which is precisely our goal.
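A confusion matrix like this can be computed in a few lines. The sketch below bins hypothetical actual and predicted costs and counts each (actual, predicted) pair; the synthetic "model" is simply the true bin plus noise, so the numbers are illustrative only.

```python
import numpy as np

def confusion_matrix(actual_bins, pred_bins, n_bins):
    """Counts of (actual, predicted) bin pairs, as shown in the heat maps."""
    m = np.zeros((n_bins, n_bins), dtype=int)
    for a, p in zip(actual_bins, pred_bins):
        m[a, p] += 1
    return m

rng = np.random.default_rng(4)
actual = rng.integers(0, 5, 200)      # hypothetical cost buckets (bps)
noise = rng.integers(-1, 2, 200)      # imperfect model: off by at most one bucket
pred = np.clip(actual + noise, 0, 4)

m = confusion_matrix(actual, pred, 5)
accuracy = np.trace(m) / m.sum()      # fraction of exact-bin hits (diagonal)
print(m)
print(f"exact-bin accuracy: {accuracy:.2f}")
```

A strong diagonal with thin off-diagonal bands is exactly the pattern the CNN shows for low-cost buckets.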

Gradient Boosted Decision Trees (GBDTs)

Another class of forecasting models is decision trees, which use networks of nodes and branches to segment data in a way that best differentiates between the features in a dataset. They are useful for traders because of their versatility: they can be used for regression in portfolio optimization and for classification in trading decisions. Our decision tree splits on time of day, volatility, level of bid-ask spread, and other factors to build a model that best predicts the next day's transaction costs.
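To make the boosting mechanism concrete, here is a minimal gradient-boosting loop built from one-feature decision stumps, fit to an invented volatility-to-cost relationship. The feature, data, and coefficients are all illustrative; a production GBDT would use a library such as XGBoost or LightGBM with many features.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the single split on one feature that best fits the residuals."""
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        sse = ((residual - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

def boost(x, y, rounds=50, lr=0.1):
    """Gradient boosting on squared loss: each stump fits current residuals."""
    pred = np.full_like(y, y.mean())
    for _ in range(rounds):
        t, lv, rv = fit_stump(x, y - pred)
        pred = pred + lr * np.where(x <= t, lv, rv)
    return pred

rng = np.random.default_rng(1)
# Hypothetical feature: intraday volatility; cost (bps) jumps when it is high.
vol = rng.uniform(0, 1, 300)
cost_bps = 2.0 + 6.0 * (vol > 0.6) + rng.normal(0, 0.3, 300)

pred = boost(vol, cost_bps)
rmse = np.sqrt(((pred - cost_bps) ** 2).mean())
print(f"fit RMSE: {rmse:.2f} bps")
```

Each round corrects what the previous trees missed, which is why GBDTs are fast and accurate inside the range of their training data.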

This graph is the confusion matrix for the GBDT model. Though its results are broadly similar to the CNN's, GBDTs are very accurate at transaction cost predictions up to 10 bps per share, whereas the CNNs are only accurate up to 6-7 bps per share. However, the drop-off in accuracy is much larger for GBDTs when outsized transaction costs per share are expected. Thus, when larger share counts are being traded and larger transaction costs are expected due to market impact, the CNNs are more useful thanks to their greater parameter stability. For retail shareholders and small trades, the GBDTs are much more effective and much easier to train.

Long Short-Term Memory (LSTM)

LSTM is a machine learning model that is particularly useful for identifying longer-term trends in ordered data, such as a time series. By introducing a distinctive structure of a memory cell along with an input gate, an output gate, and a forget gate, LSTMs mitigate the vanishing and exploding gradient problems. This allows the model to keep improving as new data is fed to it, while also grasping overarching trends that can be overlooked when only a few data points are visible.
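The gating structure is easiest to see in code. Below is a single forward step of an LSTM cell in NumPy with random, untrained weights; it shows how the input, forget, and output gates update the memory cell, without any of the training machinery a real model would need.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step: input, forget, and output gates plus a candidate cell."""
    n = len(h)
    z = W @ np.concatenate([x, h]) + b   # all four gate pre-activations at once
    i = sigmoid(z[:n])                   # input gate: how much new info to admit
    f = sigmoid(z[n:2 * n])              # forget gate: how much old state to keep
    o = sigmoid(z[2 * n:3 * n])          # output gate: how much state to expose
    g = np.tanh(z[3 * n:])               # candidate cell update
    c = f * c + i * g                    # memory cell carries long-run state
    h = o * np.tanh(c)                   # hidden state passed downstream
    return h, c

rng = np.random.default_rng(2)
n_in, n_hid = 1, 4
W = rng.normal(0, 0.5, (4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
series = np.sin(np.linspace(0, 6, 50))   # stand-in for a cost time series
for x_t in series:
    h, c = lstm_step(np.array([x_t]), h, c, W, b)
print("final hidden state:", h)
```

The additive cell update `c = f * c + i * g` is what keeps gradients from vanishing or exploding over long sequences.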

The graph shows how the LSTM predicted the transaction costs of Broadcom Inc. (AVGO), one of the largest holdings of SMH, an ETF we analyzed. The confusion matrix below suggests that the model's overall accuracy tends to be lower than that of the two models above, but it outperforms the others when actual transaction costs are high. This allows the model to avoid times when trading implies very high costs, improving performance.

ARIMA-GARCH

AutoRegressive Integrated Moving Average with Generalized AutoRegressive Conditional Heteroskedasticity, more commonly known as ARIMA-GARCH, combines two models: ARIMA and GARCH. ARIMA runs regressions on a differenced time series, predicting the mean and a confidence interval for values a given number of time steps ahead. GARCH predicts the variance of future values, tightening the confidence interval obtained from the ARIMA part of the model. Combining the two, we see in the graph below that the model predicts transaction costs for Microsoft (MSFT) very well.
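A stripped-down sketch of the two halves: the snippet below simulates an AR(1) series with GARCH(1,1) errors, estimates the AR coefficient by OLS (standing in for the ARIMA part), and runs the GARCH recursion one step ahead for the variance. All parameters are illustrative, not fitted to MSFT data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) mean process with GARCH(1,1) errors
# (illustrative parameters, not estimates from real data).
phi, omega, alpha, beta = 0.6, 0.05, 0.1, 0.85
n = 2000
y = np.zeros(n)
eps = np.zeros(n)
sig2 = np.full(n, omega / (1 - alpha - beta))   # start at unconditional variance
for t in range(1, n):
    sig2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sig2[t - 1]
    eps[t] = np.sqrt(sig2[t]) * rng.standard_normal()
    y[t] = phi * y[t - 1] + eps[t]

# ARIMA part (here just AR(1)): OLS estimate of the mean dynamics.
phi_hat = (y[:-1] @ y[1:]) / (y[:-1] @ y[:-1])

# GARCH part: one-step-ahead conditional variance from the recursion.
sig2_next = omega + alpha * eps[-1] ** 2 + beta * sig2[-1]

mean_fcst = phi_hat * y[-1]
band = 1.96 * np.sqrt(sig2_next)
print(f"phi_hat={phi_hat:.2f}, forecast={mean_fcst:.3f} +/- {band:.3f}")
```

The GARCH variance makes the interval widen in turbulent stretches and tighten in calm ones, which is exactly the over-caution in volatile regimes noted below.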

From the confusion matrix, we can observe that the ARIMA-GARCH model has a similar level of accuracy to the LSTM, which is lower than that of the GBDT and CNN models. As a general trend, ARIMA-GARCH tends to overestimate transaction costs over our period, but this can be read as over-caution towards adverse trading environments, which is not necessarily a bad thing, as tail risks are greatly mitigated.

We looked at the most recent rebalancing for each of the three ETFs: XLK, ITB, and SMH. Since these ETFs all rebalance quarterly, this points to late March 2024. For each ETF and its constituents, we forecast future bid-ask spreads and market liquidity using our machine learning models and top-of-book order data to better understand the expected transaction costs. Then, through our formula that accounts for bid-ask spread, market impact, and order book depth, we estimate the transaction costs of rebalancing.
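The exact cost formula is not published here, so the function below is only a stylized stand-in: half the quoted spread plus a market-impact term that grows quadratically with order size relative to visible book depth. The impact coefficient is an arbitrary illustrative constant, not a calibrated value.

```python
def estimated_cost_bps(spread_bps, order_shares, depth_shares, impact_coef=10.0):
    """Stylized per-share cost: half the quoted spread plus a market-impact
    term that grows with order size relative to visible book depth.
    (Illustrative form only; the article's actual formula is proprietary.)"""
    participation = order_shares / depth_shares
    return spread_bps / 2.0 + impact_coef * participation ** 2

# Small order in a deep book vs. a large order in the same book.
small = estimated_cost_bps(spread_bps=2.0, order_shares=1_000, depth_shares=50_000)
large = estimated_cost_bps(spread_bps=2.0, order_shares=25_000, depth_shares=50_000)
print(f"small order: {small:.2f} bps, large order: {large:.2f} bps")
```

Note how the spread term dominates for small orders while the quadratic impact term dominates for large ones; this asymmetry drives the XLK/SMH vs. ITB results discussed below.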

How Our Models Slash ETF Rebalancing Costs 📊

Our models estimate that transaction costs are much lower when trades are executed on their recommendations than when simply using TWAP. Some snapshots of the results are attached below.

It is clear that rebalancing costs are higher for XLK and SMH than for ITB. This can partly be attributed to ITB's smaller size, but it is also worth noting how much these ETFs grew in the period in which the rebalancing we are analyzing occurred (roughly six to three months ago) and in the three months leading up to it. Some of the largest constituents of XLK and SMH outperformed their ETFs by a significant margin, most notably Nvidia and Taiwan Semiconductor Manufacturing, while most components of ITB grew at roughly the same pace as ITB itself. As a result, more shares need to be bought or sold for XLK and SMH to maintain the desired weightings, and fewer for ITB, which explains the difference in transaction costs.

Given this background, we see that our models perform closer to TWAP when there are many shares to trade (XLK and SMH) and significantly outperform TWAP when there are fewer shares to trade (ITB). This can be attributed to the quadratic nature of market impact, where each additional share increases the transaction cost at an increasing rate. When market impact becomes a more significant factor than the initial bid-ask spread, timing a cheap entry matters less, leading to the trends in the graphs.

Running our four models on the three ETFs revealed some general trends. For example, the LSTM always outperformed ARIMA-GARCH, and this tendency was stronger the fewer shares needed to be traded. This can be explained by the LSTM's ability to incorporate longer-trend data into its predictions, meaning it could better time entry points for trades. This matters more when there are fewer shares to trade, as finding an optimal entry point and sizing as large as possible becomes a more effective strategy. Also, the GBDTs performed much better when low share counts were traded and market prices were relatively calm, while the CNNs had slightly lower overall accuracy but were much better at predicting transaction costs for high share counts.

Actionable Recommendations: Cut Your Transaction Costs by 60% 💡

Below is the average transaction cost for rebalancing XLK under a threshold rule for entering trades: we trade only if the predicted transaction cost is lower than a certain percentile of historical transaction costs. In general, setting a lower percentile of historical transaction costs as the threshold yields better results.
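The threshold rule can be sketched as follows, with synthetic lognormal costs standing in for real model output: compute a percentile of historical costs and trade only in windows whose predicted cost falls below it.

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic stand-ins for real data: historical realized costs and
# model-predicted costs for upcoming windows, both in bps.
historical_costs = rng.lognormal(mean=1.0, sigma=0.5, size=500)
predicted_costs = rng.lognormal(mean=1.0, sigma=0.5, size=60)

def entry_threshold(history, pct):
    """Trade only when the predicted cost beats this historical percentile."""
    return np.percentile(history, pct)

for pct in (10, 30, 50):
    thr = entry_threshold(historical_costs, pct)
    tradable = predicted_costs < thr
    avg = predicted_costs[tradable].mean() if tradable.any() else float("nan")
    print(f"{pct}th pct threshold: {tradable.sum():2d} windows, avg {avg:.2f} bps")
```

A lower percentile yields cheaper entries but fewer of them, which is the trade-off behind the recommendation that follows.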

Our models perform particularly well when the number of shares to be traded is low, delivering a roughly 60% reduction in transaction costs when rebalancing ITB, and they still produce meaningful savings for the other ETFs, with decreases of around 15%.

Given that all shares must be transacted by rebalancing day, we recommend that ETF managers set the threshold around the 30th historical percentile to ensure that enough entry opportunities appear before the deadline.

Now, returning to our CNN and GBDT models, we must determine an optimal threshold for acting on model recommendations so as to minimize transaction costs. The 2-D density plot below shows the lowest transaction costs per share per tick, where the x-axis represents the percentile threshold above which model recommendations are accepted and the y-axis represents order size. The optimal execution strategy accepts predictions at the 82nd percentile or better and places orders of 1,068 shares on average.

In other words, when the model outputs a prediction, we rank it against all previous predictions; if the latest prediction is in the 82nd percentile or higher, we trade on the following day. This specific recommendation beats the naive strategy of always following the model with random order sizes, saving 21% in transaction costs, and beats the TWAP trading algorithm with a 13% reduction.
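The ranking rule itself is simple to state in code. The sketch below treats model output as a generic score and trades only when the latest score reaches the 82nd percentile of all previous scores; the score history here is random stand-in data, not real model output.

```python
import numpy as np

def should_trade(latest_score, past_scores, pct=82):
    """Trade only if the latest model score ranks at or above the given
    percentile of all previous scores (82 is the optimum reported above)."""
    return latest_score >= np.percentile(past_scores, pct)

rng = np.random.default_rng(6)
past = rng.normal(0, 1, 1000)      # hypothetical history of model scores
print(should_trade(2.5, past))     # far in the right tail -> trade
print(should_trade(-2.5, past))    # far in the left tail  -> skip
```

In live use, the history grows with each new prediction, so the bar for trading adapts as market conditions shift.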

Conclusion: Uncover the Power of Advanced ML Models for Trading Execution 🚀

Machine learning has made significant strides, evolving beyond simple regression and classification techniques to encompass a wide array of models useful for time series forecasting. It is imperative to leverage these advanced models to gain insights that inform your trading strategies, lest you fall behind competitors who do. More importantly, understanding which model to use for specific scenarios is crucial, as improper application can be more detrimental than not using machine learning at all.

Our findings indicate that Convolutional Neural Networks (CNNs) are highly flexible and adaptable across various transaction cost scenarios. They excel in volatile markets or for institutions with significant market impact due to their robust structure. Gradient Boosted Decision Trees (GBDTs), on the other hand, are lightweight, fast, and accurate but only within a narrow range of conditions similar to their training data. These models underperform in unstable conditions and should not be relied upon for predicting transaction costs during market volatility.

Long Short-Term Memory (LSTM) models offer a blend of the flexibility of CNNs and the accuracy of GBDTs, thanks to their gated memory structure. However, their intensive training requirements make them less suitable for real-time trading by retail traders. With ample data and computational resources, LSTMs can provide significant predictive power, facilitating efficient trade execution.

Lastly, ARIMA-GARCH models are the easiest to train and understand. Despite their structural limitations, such as not accounting for external factors affecting transaction costs, they serve as a reliable benchmark and outperform naive strategies like Time Weighted Average Price (TWAP), which remains a surprisingly common industry standard.

In summary, effectively leveraging machine learning models requires a nuanced understanding of their strengths and limitations. By doing so, asset managers can optimize trade execution, reduce transaction costs, and maintain a competitive edge in the market.

Unlock Personalized Alpha: Optimize Your ETF Rebalancing Strategies with Expert Analysis 📈

If you want us to analyze specific ETFs or conduct a personal analysis of your portfolio and trade history, please email us at accounts@blockhouse.capital.

Credits

I'd also like to thank my co-collaborator on this piece, Balaji M, aka Sribalaji Marella, who was responsible for creating multiple sections and graphs in this article!

Disclosure

The content of this blog is intended for informational and educational purposes only and should not be construed as financial advice. The strategies and insights discussed are meant to provide a deeper understanding of ETF rebalancing and should not be interpreted as specific investment recommendations. Readers are encouraged to consult with a professional financial advisor before making any investment decisions.
