Return Forecasting Via Moments, Part 4: Vectorization of Input Streams

NTTP
17 min read · Dec 8, 2023


Two supporting columns are sometimes better than one. Photo by Danielle Rice on Unsplash
Figure 1: High level conceptual diagram of what we are doing. Distributions are “to scale” horizontally. TSLA is much more volatile than the market as a whole (roughly represented by SPY). No kidding, right? These plots were made with our macOS app MCarloRisk3D, the portfolio half of the app.

In the prior articles related to this study, we think that we have demonstrated that prior return moments have some predictive power for at least one interesting stock market asset (if we didn’t make any mistakes in the formulas, and we didn’t do too much “data snooping” into “future-boy” data). If anyone sees any errors in our formulas or sheets, please let us know!

Prior articles start here:

https://medium.com/@nttp/n-day-ahead-stock-return-forecasting-with-moment-like-predictors-5d73793c4f15

For the next set of model enhancements described in the present article, please download this Excel file:

https://github.com/diffent/excel/blob/articlePart4/forecasttestRoll9TSLAztolBaddSPYmacroModsRun3.xlsm

Now we expand our model to see if we might get even more predictive power by including moment-like quantities from another asset. For the first test, we use SPY as the other asset (SPY = S&P 500 market-wide ETF). In a sense, this is analogous to going from an autoregressive model to a vector autoregressive model, though we are only trying to forecast one output at a time. Vector auto-regressions and their various flavors are one of our favorite model types in concept, because all they say is: Let's see if we can take all this data from many past time series and try to forecast the future, without any a priori knowledge of how that data relates to itself and to the target. Seems like a pretty cool thing to do. Our older app [Note 1] does some of this, though seeing the decent results from these moment models makes us think that we should augment that app with moments. So much code, so little time…

Of course there is the issue of spurious correlations, but good econometricians know how to sniff those out. Even in our own Part 3 of this series, we demonstrate how randomly generated data columns can show up as “statistically significant predictors” with respect to our target sometimes. Yikes, right?

Examining the original sheet that we set up for a single asset predictor, we sigh and think, “Man… shouldda wrote this in Python…” because adding data columns causes existing columns to get pushed over, and… ugh. But all is not lost: Excel is good at adjusting its own on-sheet formulas when columns are inserted, and we won’t be inserting any new rows, so we only have column adjustments to worry about. The LINEST formula has to be adjusted to include the new data columns, and some other formulas need adjusting to handle the additional candidate predictor variables and related quantities; but overall, it was not too bad to do.

Sheet adjustments for another asset as input data

Let’s first look at our new candidate predictor variable setup in the left columns of the sheet:

Figure 2: Predictor column setup. New columns C, D, and M thru R related to the SPY asset.

We add [date, SPY] closing price columns to the right of the similar TSLA data (SPY in columns C and D), then we add column F (SPYclosertn) to compute the return for SPY just like we did for TSLA. The 2nd date column is just for an inspection check to make sure that our dates are aligned properly (you can scroll down a ways and make sure that dates match in columns A and C, or you could add a formula to double-check this date matching using your “wicked-awesome Excel skillz”). Then we skip over columns G thru L (previously computed TSLA moment-like metrics described in our earlier articles) and add columns M to R to compute similar moments and tail ratios for SPY. The formulas are the same as for TSLA as noted in Part 1 of this article series, but referring to SPY return (column F) instead of TSLA return. Easy.
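For instance, a hedged version of such a date check (assuming the two date columns are A and C starting in row 2, and adjusting the row range to however many rows of data you have) could be dropped into any spare column and copied down:

=IF(A2=C2, "OK", "DATE MISMATCH")

Or, as a single summary cell:

=IF(SUMPRODUCT(--(A2:A253<>C2:C253))=0, "all dates aligned", "check date alignment")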

Model solve

Next, we examine updates to the LINEST formula.

Figure 3: Array formula outputs in yellow.

This array formula is now over in cell Z2 (upper left of the now-yellow “rectangular Excel array range” in which it is stored), due to the new columns added:

Note: the comments starting with # below are not valid Excel syntax; they are for explanation only.

=LINEST(
OFFSET(OFFSET(U2,0,0,$S$12,1),$S$9,0), # Y value target to fit now U
OFFSET(OFFSET(OFFSET(G2:R2,0,0,$S$12),$S$6,0),$S$9,0), # X columns now G thru R
TRUE,
TRUE)
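For reference, here is the same formula with the # comments stripped out; this is what actually goes into the formula bar (entered as an array formula via ctrl-shift-enter):

=LINEST(
OFFSET(OFFSET(U2,0,0,$S$12,1),$S$9,0),
OFFSET(OFFSET(OFFSET(G2:R2,0,0,$S$12),$S$6,0),$S$9,0),
TRUE,
TRUE)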

Critical model setup params got auto-adjusted in the formula to column S when we added the new columns (these are now S6, S9, S12); they were previously in column J, same row numbers (in the single asset setup from Part 1). And the LINEST entry “knows” that the output Y (target data to fit) moved to column U. So we “merely” had to add more variables into the X column range (now G2:R2, the first row of the X data block). Easier said than done when modifying array formulas. If anyone from Microsoft is reading: “Excel is groovy and all, but it needs better array formula UX.”

We also had to modify the output array rectangular range of LINEST, which required us to delete the array formula, then re-add it at the larger cell count (width) to account for more outputs now (more coefficients being output, since we have more input X variables). And we cleaned up the height of this array output block (set to yellow) so that no extra rows or columns are output with #N/A values (versus the prior sheet where we had specified the array formula in too large of a region and just ignored the excess #N/A values). There still are #N/A values within the now-minimum rectangle of the LINEST array formula, by Excel’s convention.

A good trick we found for doing this array formula entry is:

1. select more cells (width and height) than you think you need for LINEST output

2. type or paste the formula into the formula bar

3. press ctrl-shift-enter to enter it (standard Excel array formula entry method)

4. observe which cells hold real outputs and which hold only excess #N/A values, then manually colorize the cell backgrounds (set the cell format color) of the rectangular range where the real outputs should be

5. then clear the array formula by “cell contents” (leaving the cell format background color you just added)

6. select only the cells you just colorized, paste the same array formula into the formula bar, and enter it again via ctrl-shift-enter

Now you have a nice compact array formula output without extra #N/A values reported out in your sheet.

We think that you have to select the whole cell range that you originally had entered the array formula into when you delete it, or else Excel reports the error “cannot modify part of an array” or some similar message. They really should do this error feedback better, like maybe highlighting or outlining what the actual array range is at the time of the error… or… I don’t know. Let’s just bin it as: Excel array formula UX improvements.

Figure 4: Your albatross, when sailing the rough but navigable seas of array formulas in Excel

And again a reminder that because “LINEST reverses its outputs,” we put some annotations in the first row Z1:AL1 to indicate which outputs are from TSLA moments and which are from SPY moments (the model constant output is all the way to the right, as before).

Then because we have more variables, we need to expand the T score formulas below the LINEST formula “to the right” and add more variable mask flags. This is easy, but if you don’t understand, re-read Part 1 [section Model outputs — Standard Errors] of this article series regarding this topic.
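As a hedged refresher (exact cell placement in the downloadable sheet may differ slightly): LINEST puts the coefficients in the first row of its output block and their standard errors in the second row, so with the block anchored at Z2, a T score for the left-most output column is just the ratio of those two cells, filled across through column AK:

=Z2/Z3

Use =ABS(Z2/Z3) if you prefer to compare magnitudes directly against the T cutoff.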

All right, so LINEST now solves for double the count of X variables as it was solving for before! Cell S12 is still at 100 (100 points per regression), so we still have plenty of data for our expanded X variable set: 12 variables plus a constant, so 13 coefficients solved for, using 100 points. Not a bad starting setup.

Model evaluate

Now we have to add the new terms into the evaluation of the linear model in column V, starting at the top at cell V2. This is just a linear model A*x1 + B*x2 + … + const. As a reminder, the OFFSET business is just so we can push this model down (back in time) for our rolling window backtest. Note how the OFFSET function refers to G thru R (the X data for the regression), and $AK thru $Z (then $AL) are the reversed-order coefficients from the LINEST outputs.

Count up from $Z$2 at the bottom of the following formula and you can see that it goes Z, AA, AB, and so on in sequence up to AK, then AL at the end for the constant.

Linear model evaluation. Simple weighted sum of moments and ratios. Boom. Done.

As we mentioned in Part 1 of this article series, there are ways to fix this coefficient reversal with a formula modification, but this might start to make the weighted sum seem more complicated than it really is, so… we leave that for later.

This final model evaluation formula below is from V2, and it then gets copied down that column V:

=OFFSET(G2,$S$6,0) * $AK$2+
OFFSET(H2,$S$6,0) * $AJ$2 +
OFFSET(I2,$S$6,0) * $AI$2 +
OFFSET(J2,$S$6,0) * $AH$2 +
OFFSET(K2,$S$6,0) * $AG$2 +
OFFSET(L2,$S$6,0) * $AF$2 +
OFFSET(M2,$S$6,0) * $AE$2 +
OFFSET(N2,$S$6,0) * $AD$2 +
OFFSET(O2,$S$6,0) * $AC$2 +
OFFSET(P2,$S$6,0) * $AB$2 +
OFFSET(Q2,$S$6,0) * $AA$2 +
OFFSET(R2,$S$6,0) * $Z$2 + AL$2

It may seem a little unwieldy, but it is just an ordinary sum of products with a user-settable (and hence, macro-settable) offset down the rows.
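As a hedged aside (not what the downloadable sheet uses): on a recent Excel with dynamic array functions, where SEQUENCE is available, the same weighted sum, coefficient reversal included, can be collapsed into one SUMPRODUCT:

=SUMPRODUCT(OFFSET(G2:R2,$S$6,0), INDEX($Z$2:$AK$2,1,SEQUENCE(1,12,12,-1))) + $AL$2

The SEQUENCE term just reads the coefficient row right-to-left. That said, the term-by-term layout above is easier to audit cell by cell, which is why we keep it.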

More variable mask flags

And we have to be sure to add the variable mask flags from Z25 to AK25 to the X variable column formulas in columns G thru R. Note again the mask flags are associated with T scores, which are in turn computed from LINEST outputs, which are in reverse order as the X variables. So the right most X variable in column R (tail7525SPY) gets mask flag $Z$25, and the left most X variable in column G (meanTSLA) gets mask flag $AK$25. Interpolate cell refs in your mind between those two endpoints.

Figure 5: Mask for left-most data column G uses the value from $AK$25 (right-most mask, due to reversed LINEST outputs)
Figure 6: Mask flag for right-most data column R uses the value from $Z$25 (left-most mask, due to reversed LINEST outputs)

Recall that you can test the mask flags manually by changing each one to zero and watching the associated X variable data in columns G thru R get set to 0.
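If you would like a programmatic sanity check of this reversed mapping, here is a hedged VBA sketch (it is not part of the downloadable workbook) that simply prints each X data column, G thru R, next to the mask cell it should be using, AK25 back down to Z25, in the Immediate window:

Sub CheckMaskFlagMapping()
    ' Data columns G (7) thru R (18) should map to mask cells AK25 (column 37)
    ' down to Z25 (column 26), reversed per the LINEST output ordering.
    Dim k As Integer, dataCol As Integer, maskCol As Integer
    For k = 0 To 11
        dataCol = 7 + k   ' column G = 7 ... column R = 18
        maskCol = 37 - k  ' column AK = 37 ... column Z = 26
        Debug.Print "data column " & Split(Cells(1, dataCol).Address(True, False), "$")(0) & _
            " -> mask cell " & Cells(25, maskCol).Address(False, False) & _
            " = " & Cells(25, maskCol).Value
    Next k
End Sub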

With these changes, the sheet itself should be pretty good now. If we forgot to mention any of the changes we made, please let us know, as “diff” does not work so well (or at all?) on Excel files. Maybe if we saved the sheet in a plain-text XML style format instead of the usual Excel format, we could diff it? Hmm…

Macro adjustments

Next, we have to adjust the cell references in the macros, because when you add columns, the macros don’t know about these new columns, and cell references have to be adjusted manually in the macro code. Now there is something that Microsoft could do with some of their $10B A.I. investment! Fix the macros when you add or delete rows and columns in Excel! After all, “they” (the ubiquitous “they”) claim that A.I. is going to replace thousands of programming jobs. Might as well start with this “simple” project, right? It is just decades-old Visual Basic; how hard can it be? (Ha!) Adjust the macros when rows, columns, or cells are added or deleted! We are kidding; it is probably very difficult, especially if numeric cell references are used in the macros instead of “A1” style references. It would be a good test of this new coding-focused A.I. coming online, though.

While we are waiting for A.I. to be ready to do this for us, we will make the macro adjustments manually and run our sheet to see what happens. We won’t re-print the macro code here, since it is similar to what we described in Part 2 of this article series. You can open the macro code in the Excel sheet by searching for “view macros” in the Excel command search bar.

Figure 7: They keep moving the menus around in Excel, but you can find features by typing keywords in the Edisonian lightbulb search field. The search behaves a bit strangely, though, unlike most search tools. You’ll see if you try it.

We did make a small modification to the backtest macro so that the display gets updated every 10th backtest trial. This is so we can monitor backtest progress and still run the macro faster than if we have the code update the screen with every sheet modification. This is done via an Application property setting in the backtest macro:

' Refresh the display only on every 10th backtest trial (nback is the backtest trial counter)
If ((nback Mod 10) = 0) Then
    Application.ScreenUpdating = True
Else
    Application.ScreenUpdating = False
End If

Baseline test

[The results presented in-line from these trials are summarized in a table at the end of this article, Figure 12.]

Okay so, first: a baseline backtest with T cutoff = 0 (all variables kept) and all other model parameters the same as in the prior Parts of this article, so that we can compare “apples to apples” and see whether our new candidate predictor columns from SPY improve our results.

From this backtest, we get OOS backtest R2 = 0.282, and percent directionally correct of 185/252 = 0.734 = 73.4%.

Figure 8A: First 1 year backtest with SPY added as a predictor, T cutoff = 0 (keep all variables)

Compare to our baseline run from Part 2 of this study [section First Trials Of Term Elimination]:

T cutoff = 0
R2 of backtest 0.256
169/252 = 0.670 directionally correct during backtest

→ Both backtest R2 and percent directionally correct were boosted by the addition of these SPY moments and ratios to the model. Not a lot, but: Good! ✅

This is more impressive than it may initially seem, because the Efficient Market Hypothesis seems to say that the R2 of a backtest like this should be close to 0.0, only wandering from 0.0 due to random chance. So when we boost our R2 by a couple of percent (and if this boost is real of course — that is, statistically significant — and not accidental), this is something worth looking into more. We will not yet address the “statsig” of these R2 improvements; we merely look at nominal values to start as we are building up this model. Similarly, with 67% vs. 73% directionally correct: Absolute 6% better in directional forecast? Relative improvement of about 10%? We’ll take it.

Variable elimination trials

If we now set the T cutoff to 2 (cell S18), variables whose coefficients have larger uncertainty are eliminated from the models via backwards stepwise regression in our custom macro (the built-in LINEST does not offer stepwise regression). OOS backtest R2 is then boosted to 0.354, and OOS percent directionally correct drops to 71.4% (but is still larger than in the single asset case). This is without any zero tolerancing applied yet to the directional forecasts, as discussed in the Part 2 article.

Figure 8B: T cutoff = 2 backtest for two asset model

Doing a quick test of our zero tolerancing method, we can set the forecast zero tol to, say, 0.05 (now in S21, same row but new column vs. Part 2 article sheet references).

Figure 8C: T cutoff = 2 and set zero tolerance to 0.05

We now get 101/125 correct (125 outside the +- 0.05 tolerance and 101 of these in the correct direction) or 80.8% directionally correct. Another 9% or so gain (absolute percent). Pretty good for a small model modification.

If we bump up the zero tol to 0.1 like we did in our prior study in Part 2, we get 42/46 = 91.3% directionally correct. Pretty good, right? Careful, though… when we tune the zero tolerance while looking at the backtest, we are peeking into the future, the withheld one step forward data… “Here be dragons.”

Figure 8D: Bump up zero tolerance to 0.1, percent directionally correct goes up to 91.3%
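If you want to reproduce counts like the 42/46 above outside the macro, a hedged pair of formulas would do it, assuming (hypothetically; the real sheet’s layout may differ) that the backtest logs each 1-step-ahead forecast in BA2:BA253 and the matching actual return in BB2:BB253, with the zero tolerance still in S21:

=SUMPRODUCT((ABS(BA2:BA253)>$S$21)*1)
=SUMPRODUCT((ABS(BA2:BA253)>$S$21)*(SIGN(BA2:BA253)=SIGN(BB2:BB253)))

The first counts forecasts outside the tolerance band; the second counts how many of those were also directionally correct.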

Once we get a stable model (and this one seems to be close), we could do a design of experiments parameter scan to efficiently study the effect of simultaneous parameter changes on this model, with the experiments defined by methods such as Taguchi, Latin hypercube, or Sobol design space sampling (which may be a topic for a future article). And of course, a numeric optimizer (Excel Solver, anyone?) could be used to boost the OOS R2. That is, if you can get the solver to solve. Sometimes it seems to get locked up and gives you back the same answer that you started with.

But an optimiz-er then requires something to be optimiz-ed, so we have to be careful not to tune too much to a specific backtest, or trouble could result. Hence, it may be better to work on our model selection criteria before model tuning: criteria that can be computed from the fitting process alone and not from the nominally OOS data (the 1 step ahead data points that we are checking). Simple Adjusted R2 comes to mind first (now that we are eliminating variables to good effect, it becomes relevant), then AIC, BIC, and so on.
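As a concrete (and hedged) example of the first of these: LINEST reports R2 in the third row, first column of its output block, so with our block anchored at Z2 that R2 should land in Z4, and a quick Adjusted R2 cell, assuming all 12 candidate variables are retained (reduce the 12 if the macro has eliminated some), would be:

=1-(1-$Z$4)*($S$12-1)/($S$12-12-1)

With S12 = 100 points, this simply penalizes R2 for the number of predictors being fit.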

But for initial trials, it is often better to use the ceteris paribus approach (all else constant) to test that everything is first working okay. Doing a parameter study on a bad model is contraindicated, as they say in pharmaceutical ads.

In any case, the zero tolerancing seems to improve results for the dual asset model as well as for the original single asset model. After all, we are merely discarding weak forecasts near a critical data value edge.

From the above trials, we can say that for this one year backtest, adding SPY moments seems to have boosted our backtest prediction quality noticeably. Whether this boost is statistically significant, we do not know yet. That will take more tests. But adding these extra SPY-derived variables does seem to be moving the nominal results in the right direction.

We can zoom out on our sheet to get a bird’s-eye view of the T scores that were saved during one of the above backtests, knowing that our 3 digit code of 999 means that a particular variable was eliminated during a given backtest step.

Figure 9: Bird’s eye view of T scores during backtest, one row per backtest step. Here we don’t care about the particular values, we just want to notice short versus long strings (999 short string = variable was eliminated).

At a glance and ignoring specific T scores, we can see that all of the variables we gave to the regressor and term eliminator are being used at some point or another during the one year backtest. We might think to color highlight cells that are “real” T values instead of the 999 code if we decide to make the sheet fancier as we go along; but for now we rely on visual inspection of short strings (999) versus long strings (actual T scores).

We do not see a full column of 999 (the short cell entries) which would indicate that a given variable had no significance in any of the 252 backtest days.
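A hedged formula version of that visual check: assuming the saved T scores for one variable occupy a 252-row column block, say (hypothetically) AO2:AO253, you can count the elimination codes and flag a never-used variable:

=COUNTIF(AO2:AO253,999)
=IF(COUNTIF(AO2:AO253,999)=252,"variable never used","variable used at least once")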

Keeping in mind our findings in Part 3 of this article series (where purposefully random variables were showing up as significant occasionally), we probably should re-run this backtest with T cutoff of 3. This, we do next.

T score cutoff of 3 … a cutoff too far?

By increasing our T score cutoff to 3 (to throw out more variables), backtest R2 goes down a bit to 0.333 and OOS directionally correct is now a lower 67.1%. Are we throwing away too much information when we set T cutoff way up to 3? Possibly…

Figure 10: T cutoff = 3 results in fewer correct OOS than T cutoff = 2 case

Now if we set the zero tol to 0.05 for this case, we get 103/134 = 76.9% directionally correct OOS… also a degradation from the T cutoff = 2 case above. Hence, it seems that more study needs to be done on “model selection criteria” for this type of model, as we discussed in Part 2. T cutoff = 2 seems to be giving better OOS results than T cutoff = 3; but are those accidentally better results or really better results? We hope for the latter, but these types of models need to be tested and prodded thoroughly, since hope, while an admirable state of mind, is a poor modeling practice.

Figure 11: T cutoff = 3 and use ztol of 0.05
Figure 12: Table of results of an informal parameter study to test this new dual asset model (not yet involving all model parameters)
Figure 13: T cutoff = 2 case from table (no zero tolerance) 1 step ahead OOS backtest results. Actual versus predicted. Not bad for a linear model. The dotted line on this chart is from Excel Charts for visualization purposes; it is not our model.

Hints toward more enhancements and checks

As a look-ahead to the next installment(s?) in this article series, you can replace the SPY closing price column with any data series you like and re-run these tests! For advanced Excel users, the Excel Scenario Manager may be helpful here. Be sure to align the dates properly when adding more data or swapping out data, or trouble will be had. We suggest trying an ETF for a sector such as consumer products or technology. VGT Vanguard tech ETF or VDC Consumer staples ETF seem like interesting candidates:

https://finance.yahoo.com/quote/VGT?p=VGT&.tsrc=fin-srch

https://finance.yahoo.com/quote/VDC?p=VDC&.tsrc=fin-srch

[Update Dec 9 2023: We should have written Vanguard Consumer Discretionary VCR instead of VDC. That perhaps is a little symbol joke about buying VCRs in the 1980’s being the acme of discretionary spending? At any rate, Tesla cars seem to be more discretionary than staple, unless you are a Tesla Stan.]

https://investor.vanguard.com/investment-products/list/etfs

The above Yahoo links have a “download historical data” option if you poke around a bit. Or maybe you have your own favorite data source. We typically get our data from iexcloud.io. Take care also to delete rows that contain only a dividend indication if you get your data from Yahoo; those rows will disrupt the sequential nature of the time series. So, as always: data prep and examination before modeling is essential.

When adding price data to our Excel sheet, make sure you first sort it so the most recent data is in row 2, as that is the formula convention in this sheet. If you put the data in “upside down,” you will probably get huge R2 values out-of-sample and think you have found a gold mine. But alas…
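A quick hedged sanity check for this, assuming the dates are true Excel dates in column A with a header in row 1, is a spare-cell formula like:

=IF(A2>A3,"newest first: OK","re-sort descending by date")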

Perhaps you might want to try an industrial ETF also, TSLA being a manufacturing concern? Maybe try another auto concern’s stock price? Toyota, Honda, et cetera.

Of course you can also or instead replace the TSLA target with any asset of your choice and try to solve for a different target return. There is nothing TSLA specific about the formulas in this sheet.

Additionally (and with a lot more work) an Excel wizard could vectorize the inputs further in this sheet to allow yet another related asset’s moments into the model as candidate predictors. However, it may be better to first recode this method into Python or another traditional language to make the process of adding more input assets less labor intensive and less prone to error in the formulas.

Finally, though we are reporting OOS percent directionally correct, once we get our model to improve more, we should probably start checking the time-series aspects of this correctness. That is, are the correct result days bunched up anywhere during the backtest, or are they more evenly distributed through the backtest time window?
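One simple hedged way to eyeball this would be to add a hypothetical helper column of 1/0 directionally-correct flags (say BC2:BC253, one row per backtest day, newest at the top) and compute a rolling 21-day hit rate alongside it, copied down:

=AVERAGE(BC2:BC22)

If that rolling average swings between near 1.0 and near 0.5, the correct days are bunching; if it hovers around the overall 73% or so, they are spread fairly evenly.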

Summary

Adding SPY moments and tail ratios as candidate predictors of N-step ahead TSLA returns seems to boost prediction quality somewhat (though not spectacularly), and without any additional model tuning. We expect that a bit of model tuning might be in order, but we do not want to over-tune to backtests, or bad things may happen with the model when we run it in the future. So we defer this tuning until we investigate assets other than SPY; these other assets might serve to boost forecast quality even more than SPY does. We note that our backwards stepwise approach to variable elimination boosted backtest quality now that we have a larger set of candidate predictors in the model. This OOS forecast improvement upon variable elimination was not observed when we tested term elimination for the single asset TSLA model in Part 2 of this study. Hence, the effort we put into writing a variable elimination macro now seems to have finally paid off.

Updates

Dec 9 2023: Add new cover photo. Add note about other Vanguard ETFs to try.

Notes

[Note 1] The MVAR Market Vector Auto Regression app can be found on macOS and iOS app stores, the Google Play Store, and store.microsoft.com (the latter for the Windows version).
