Backtest…to the Future!

Theory alone can yield poor asset price forecasts

NTTP
25 min read · Oct 3, 2023
Photo by Elena Theodoridou on Unsplash

“I had not so much as heard the numbers upon which the previous coup had fallen, and so took no bearings when I began to play, as, in my place, any systematic gambler would have done.”

-Fyodor Dostoyevsky, The Gambler

There is an old Simpsons (cartoon) episode in which Homer himself is being convinced to invest in a dot com company. The animated financial professional reminds him to consider risk and that he may lose all of his principal (or some-such boilerplate): "Do you understand, Mister Simpson?" "Sure, I understand…" Then a thought bubble appears above Homer's head containing a vision of him wearing a tuxedo and black top hat, smoking a cigar possibly? His eyes may have turned into dollar signs, and a melody from the old song from, ohh about '33 or so if memory serves (nineteen, that is, if you are reading this in the future; officially called "The Gold Digger's Song"(!) which has the lyrics "we're in the money!") plays in the background. Was it fancy car horns — from a Pierce-Arrow maybe? — making this melody? Perhaps.

The point, driven to hyperbolic levels — as cartoon writers are wont to drive — is that many retail investors don’t fully consider risk when throwing around their digitally tracked assets. A few button clicks or taps on a PC, tablet, or smartphone, and a safe but low return asset — such as cash [Note 1] — can be quickly traded for a volatile, potentially high return asset… with a lot of risk.

On this day in October 2023, the popular consumer trading app Robinhood lists volatility as one of its “stock stats,” but only as a categorical variable [low medium high]. Nonetheless, that is something.

From robinhood.com

If we search in the popular book Models.Behaving.Badly. by quant expert Emanuel Derman, we see that the term “volatility” shows up 42 times. This is an unscientific method of discovering the importance of volatility, but if it makes you want to read the book, all the better. Reading Derman’s book is a much more scientific method of learning this importance, and we recommend such reading; in fact, if you are going to read one book about the stock market this year, M.B.B. should be it, if you haven’t read it already. Even then, maybe you should read it again? We recall that Derman mentions in this book the difficulty of looking up the volatility of stocks on retail investor web sites; yet volatility is one of the most important metrics of an asset to a professional, as Derman writes… ah ha, we found the line:

“I recently tried to use Bloomberg, Yahoo, and Google … to my astonishment there was no easy and direct way to obtain a stock’s volatility σ.”
— Emanuel Derman,
Models.Behaving.Badly.

This book was published in 2012 according to amazon.com, so apparently the RH people in charge have not read it yet? Perhaps there are, as we say in the software business, issues: “Well, if we show numeric volatility, we have to show over what time it was computed and how it was computed etc etc for it to have meaning etc etc.” That is: “seems complicated.” Yet some of these same trading apps have no problem providing highly detailed, precise drawing tools that let a user draw complicated geometric figures on top of historical time series data; to what end? Well, to try to predict the future, of course. If the users want to draw, let them draw!

Some tools for drawing on charts, from finance.yahoo.com

However, it’s a little concerning sometimes as to where some of the priorities lie in software development.

“Being well-known to the attendants, she always had a seat provided for her; and, taking some gold and a few thousand-franc notes out of her pocket — would begin quietly, coldly, and after much calculation, to stake, and mark down the figures in pencil on a paper, as though striving to work out a system according to which, at given moments, the odds might group themselves.”

—F. D., The Gambler [italics added]

In our app’s models, an asset’s volatility automatically comes out as an emergent effect when we resample from the past [Note 10]. We show it on the return histogram graphs (as stdev = standard deviation), but we do not use this numeric value in computations, except when we allow the user to set the computation mode to “assume normality.” [Note 2]

If you want to do any of this math yourself in Excel or even Lotus123 (anyone for VisiCalc?), notice that on many web sites or apps (if you can find it), volatility is annualized; in our app, we merely list it as the standard deviation of daily returns over a user-specified time period, and we make a note of its annualized value in case you want to compare to other sites. Also remember to note (when pulling volatility from other data sources) how it is computed (over how long a time period, with what return sample interval… daily, weekly, monthly, etc).
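
For the spreadsheet-inclined (or the Python-inclined), here is a minimal sketch of that daily-to-annualized volatility bookkeeping. The closing prices below are made-up placeholders, and the sqrt(252) factor is the usual trading-day annualization convention for equities (crypto, which trades every calendar day, would use sqrt(365) instead). This is not our app's internal code, just the arithmetic:

```python
import numpy as np

# Hypothetical daily closing prices, oldest first (placeholder values only).
closes = np.array([448.0, 450.2, 447.9, 452.1, 451.3, 455.0, 453.8])

# Simple daily returns: r_t = P_t / P_(t-1) - 1
daily_returns = closes[1:] / closes[:-1] - 1.0

# Daily volatility = sample standard deviation of daily returns.
daily_vol = np.std(daily_returns, ddof=1)

# Annualized volatility, using the usual sqrt(252 trading days) convention
# for stock market assets; sqrt(365) would be the analogous factor for
# crypto, which trades every calendar day.
annual_vol = daily_vol * np.sqrt(252)

print(f"daily vol = {daily_vol:.4f}, annualized vol = {annual_vol:.4f}")
```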

Volatility being easily available or not, it is generally known among practitioners that daily stock (and now cryptocurrency) returns are typically not normally distributed [Note 3]. This means that we need more than a mean return and a standard deviation of returns (volatility) to describe the return distribution accurately. Even the purveyors of famous analytic forecasting methods such as the Black-Scholes method for put/call option pricing (and similar) — though the method assumes normality of returns — admit that reality is more complicated than the model assumes it to be, and adjustments need to be made to those methods for more accurate real-world application [Note 4]. Derman notes that the Black-Scholes model was so "out there" (our words) when it was first released that the authors involved had difficulty getting it published. Now, it is de rigueur.

Getting back to the present, in our price / probability forecasting app MCarloRisk3D [Note 6], we skip entirely the question of theoretical return distribution shape by instead using an empirical distribution of daily returns for forecasting; that is, we just take the histogram of historical returns data as-is for a given asset, and resample from that distribution for forward random walk path generation.
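
In code terms, this is about as simple as it sounds. A minimal sketch in Python/NumPy, with placeholder return values standing in for a year of real history (not our app's internal code):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# The "empirical distribution" is nothing more than the observed daily
# returns themselves; no distribution shape is identified or fitted.
# (Placeholder values; real use: a year or more of an asset's return history.)
historical_returns = np.array([0.004, -0.011, 0.007, 0.013, -0.021,
                               0.002, -0.003, 0.009, -0.006, 0.015])

# Resampling from it means drawing past returns uniformly, with replacement:
# equal probability 1/T for each of the T historical observations.
resampled = rng.choice(historical_returns, size=10, replace=True)
print(resampled)
```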

In the present case, this empirical resampling from the past is advantageous from a “better model of reality” perspective (since it references historical reality more precisely, without the “assume normally distributed returns” approximation involved), and also from an easier coding perspective — not that end users have to be concerned with this, as our app is “no-code” from the user’s point of view — because we can skip entirely the rather complicated step of distribution shape identification and fitting.

Which step are we talking about here? From which process? Why, we are glad you asked! This is step P2, "Estimation," from the world-renowned "'The Prayer' Ten-Step Checklist for Advanced Risk and Portfolio Management" by Prof. Attilio Meucci:

Verbatim from A.Meucci: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788

The alert reader will notice that we have skipped step P1. We will get back to that shortly.

If our reader wants to set a priority order for further reading, we recommend reading Derman’s book first, which has a bit of math… but more the “back of the envelope” type of math; then try Meucci, if you dare. Meucci… well, that’s one reason we made this app, to try to make some of that high end risk math more accessible.

But if an asset’s returns are not normally distributed, then how are they distributed? There is a veritable zoo of potential ways to analytically describe a probability distribution:

https://en.wikipedia.org/wiki/List_of_probability_distributions

Let’s continue forward in Meucci’s step P2 description:

Verbatim from A.Meucci: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788

Note the first sentence in the above: “non parametric empirical distribution” (NPED here). In our app, this is what we use by default (though we do have an option for switching to an ordinary normal distribution for comparison with analytic methods). NPED is a fine way of saying “just re-sample from the return data you already have, from history.” In our documentation and app, we typically just refer to this as an “empirical distribution,” implying the non parametric part of it. Even though donuts have no parameters (after they are made, that is; there are plenty of parameters during the making of them: sprinkles or not, what kind of sprinkles, hole or no hole, {jelly, kreme, or not}, chocolate covered or not, glazed or not, ad crustula), we typically don’t request “non parametric donuts” when we visit Timmies; the non parametric is implied.

Photo by Rod Long on Unsplash

We also have options for tweaking this empirical distribution for model tuning (tuning a model so that it backtests better) by methods or variables such as: How far back in time should we pull data to generate this empirical distribution? Should we maybe weight more recent days higher when re-sampling from the past, or should we give equal probability to all days in the past when we resample that past, as Meucci suggests above ("equal probability 1/T to each of the past observations")? Should we "put our thumb on the scale" and tilt the observed historical distribution of returns to be more bullish or bearish than it was? This makes the return data less historical, but if a model doesn't backtest well, what should one do? Should we just say: Yeah… this model didn't work so well in the past, but it surely will work in the future (leading to audience eyerolls)?

This is actually a step beyond what some market analysts do when they merely 1) make forecasts, 2) do not backtest those forecasts, and 3) expect that anyone will believe those forecasts. As Taleb points out, the stochastic nature of the market is such that some of these predictions will come true, just because of the sheer number of predictions being made. This can create a feedback loop where some analysts double down on their predictions because they got lucky randomly (mis-attributing their luck to skill), even though there was no cause-effect relationship between their analysis and the outcome. The result, over time, is failed predictions and monetary loss. On the other hand, as some researchers note (and which is obvious in retrospect), there is always someone on the other side of a trade.

“But in the aggregate we* must own the market; and if someone holds a portfolio of value stocks, then someone else must hold a portfolio of expensive stocks. If someone is contrarian rebalancing into recent underperformers, someone must take the other side of the trade and chase into recent winners.”
“If regularly rebalancing into value and low beta stocks are such good investment propositions, who is investing in expensive stocks and high beta stocks? Who is on the other side of the trade?”

* [editor’s note:] By “we,” he means all investors across the whole market; the general “we,” not any particular “we.”

— Jason Hsu, Part 6 — Who Is On The Other Side Of the Trade?

If someone is losing, there is someone else winning. Once again: “seems complicated.”

But back to our own less directional models:

Using an empirical distribution is even more of a key feature when we start getting into portfolios of assets, where returns can be partially correlated among assets. Meucci covers this in his ARPM class; if you are interested in further reading, a good topic to Google search is "fitting elliptical distributions," or start with the Wikipedia article on elliptical distributions: https://en.wikipedia.org/wiki/Elliptical_distribution

Fitting a multi dimensional probability distribution is more involved than doing so for one asset at a time. Maybe one asset is more risky than another in a portfolio and has more extreme return movement events in its history. So then one dimension of the distribution needs to model fatter tails than the other; one asset may have positive skew, another negative skew… a multi dimensional distribution formula can get complicated, fast.
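
One simple way to preserve that historical cross-asset co-movement, without fitting any multi dimensional (elliptical or otherwise) distribution, is to resample entire historical days at once: pick a random past date and take that day's returns for every asset in the portfolio together. A rough sketch of that idea with placeholder numbers and only two assets (our app's internals may differ):

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Placeholder daily returns for a two-asset portfolio, aligned by date
# (rows = days, columns = assets).  Real data would span a year or more.
returns = np.array([
    [ 0.004, -0.002],
    [-0.011,  0.006],
    [ 0.007,  0.004],
    [ 0.013, -0.001],
    [-0.021, -0.015],
    [ 0.002,  0.000],
])

# Resample whole rows (whole historical days) with replacement, so whatever
# co-movement the assets showed historically is preserved in the sample,
# with no parametric correlation model needed.
day_indices = rng.integers(0, returns.shape[0], size=1000)
joint_sample = returns[day_indices, :]

print("resampled cross-asset correlation:")
print(np.corrcoef(joint_sample, rowvar=False))
```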

But by resampling from reality rather than making too many assumptions about that (historical) reality, we may be able to get more accurate forecasts. Our understanding is that the assumption of normality (in analytic methods such as Black-Scholes) is made so that the symbol-pushing (stochastic calculus / differential equations) remains tractable to solve. This was much more important back when the Black-Scholes method was discovered (or invented, if you will). It is less important now, when our smartphones have more compute power than supercomputers of prior decades:

https://www.pcmag.com/news/space-wars-the-cray-2-supercomputer-vs-the-iphone-12

Hence, we can run a monte carlo forecast for millions of iterations in a reasonable time frame on any available laptop, tablet, or phone, and we don’t necessarily need the pencil-and-paper solvable analytic equations directly.

But now, let's say, as we do in MCarloRisk3D, that we do allow this empirical distribution, with everything it includes beyond the mean return and volatility of returns:

1. fat tails, or more outlier events, or "excess kurtosis," to use what we call a statistician's job-security technical term; but the excess is merely with respect to the normal distribution, not with respect to reality. With respect to reality it is spot-on, and observed extreme events in the past may even under-estimate future extreme events, if we tend to believe Prof. Taleb.

2. skew (weighting the distribution more toward bullish or bearish behavior) [https://en.wikipedia.org/wiki/Skewness]

3. higher order behavior that gets complicated (there’s that “c” word again) [Note: higher order “moment” behavior is included in the empirical distribution, the raw return data] https://en.wikipedia.org/wiki/Moment_(mathematics)

How well does a model predict when built from these assumptions?

There are monte carlo type forecasters available for individual assets and portfolios (say, for estimating your future retirement income), but the problem with many of them is that they do not allow you to backtest your forecasts. Luckily, this is changing.

Sure, some systems like finviz.com and others have backtesters for "trading ideas" and "trading systems" based on computations from historical data. But how many of these retail backtesters allow you to backtest models that predict price/probability ranges, rather than backtesting buy/hold/sell trading systems? The monte carlo forecasters in our app make more modest forecasts of prices at different probabilities and time horizons, and these can be loosely translated into estimated ranges of prices… as opposed to the less modest models that try to predict "bullish or bearish" directionality, which is a much more difficult forecast to make (and some say an impossible, quixotic quest).

Figure 1: Daily returns distribution for SPY, snapshot from MCarloRisk3D app

I SPY an example nigh

We can start with data from a popular S&P 500 ETF, trading symbol SPY, to illustrate this concept. The histogram above shows one year (252 trading days) of daily SPY returns, with data through end of day Sept 29, 2023 (going back 252 trading days from there). To start, we will get one year of historical data from SPY and use data from that year to generate forecasts. This one-year lookback may not be long enough, as we will demonstrate. Conversely, it may be too long if we are concerned with econometric regime switching. From the statistics noted on Figure 1, and maybe even by visual inspection, we can see that these returns are not normally distributed. Skew is >> 0 (the distribution has more weight on the bullish side, to the right on the above diagram), and excess kurtosis (tail fatness) is high — this latter indicating more extreme events than one would expect from a normal distribution — and the interesting statistical property "tail ratio" [Note 7] is somewhat greater than 1.0 (it would be 1.0 for a symmetric distribution like the normal).
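
For readers who want to reproduce these summary statistics themselves, here is a hedged sketch using scipy. The returns below are synthetic placeholders rather than actual SPY history, and the tail ratio shown uses one common definition (the 95th percentile return divided by the absolute 5th percentile), which may not match every data vendor's formula:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for 252 daily SPY returns, drawn from a skewed
# distribution purely for illustration; real use would load actual history.
returns = stats.skewnorm.rvs(a=4, loc=-0.008, scale=0.011, size=252,
                             random_state=1)

mean_ret = np.mean(returns)
stdev = np.std(returns, ddof=1)        # the "volatility" of daily returns
skew = stats.skew(returns)             # > 0: more weight on the bullish side
ex_kurt = stats.kurtosis(returns)      # Fisher definition: excess over normal

# One common definition of tail ratio: the 95th percentile return divided by
# the absolute 5th percentile (equal to 1.0 for a symmetric distribution).
tail_ratio = np.percentile(returns, 95) / abs(np.percentile(returns, 5))

print(f"mean={mean_ret:.5f}  stdev={stdev:.5f}  skew={skew:.3f}  "
      f"excess kurtosis={ex_kurt:.3f}  tail ratio={tail_ratio:.3f}")
```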

Resample And Remix To Generate Random Walks

Once we have an empirical distribution like above, we can resample from it randomly and remix those samples in different (random) orders to generate random walk paths in the price domain, projecting prices into the future… but probabilistically, not absolutely. This is a “monte carlo simulation” version of what Meucci describes as his P3 Projection step.
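
A bare-bones sketch of this resample-and-remix step, with placeholder returns and a hypothetical starting price (real use would feed in the actual SPY history and last close):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Placeholder daily returns (real use: ~252 values of actual SPY history).
historical_returns = np.array([0.004, -0.011, 0.007, 0.013, -0.021,
                               0.002, -0.003, 0.009, -0.006, 0.015])

last_close = 427.48      # hypothetical most recent SPY closing price
n_paths = 50_000         # number of monte carlo random walk paths
n_days = 100             # forward investment horizon, in trading days

# "Resample and remix": draw past returns with replacement in random order,
# then compound them into price paths:  P_t = P_0 * prod(1 + r_i)
sampled = rng.choice(historical_returns, size=(n_paths, n_days), replace=True)
paths = last_close * np.cumprod(1.0 + sampled, axis=1)

print(paths.shape)       # each row is one possible future price path
```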

Verbatim from A.Meucci: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788
Figure 2: Screenshot from MCarloRisk3D showing a small sample of monte carlo generated random walk price paths for SPY; note that they diffuse up and down (bullish and bearish) over time. This is not programmed, it is emergent from the resample and random walk generation.

The alert reader will realize that in our code, we assume that our historical distribution of daily SPY returns is the putative “invariant” here, as Meucci refers to it. We are assuming that SPY returns have the same distribution in the past as they will in the future. Econometricians will note that we are assuming that the SPY returns distribution is “stationary”; that is, all of its statistical properties such as mean, standard deviation, skewness, kurtosis, and higher “moments” — if they vary at all over time — vary only in a non-statistically significant manner. What constitutes stationarity (or not) is a fairly involved discussion, but you can visualize it approximately by thinking: Okay, if we took a sample of 1 year’s worth of daily data from SPY from any time period in the past, the distribution would look about the same as in Figure 1. The particular values of skew and so on may change a little, but these small changes would be attributed to noise or otherwise not relevant to the outcome.

This stationarity is a fairly strong assumption, which may not play out in reality. We are only pulling one year of SPY data from the past. Maybe we need to use more data than this to get a stationary distribution, one that does not change (to any statistically significant level) in its distribution moments as time progresses?
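
One rough, informal way to probe this assumption is to compare the return distribution from two different historical windows, for example with summary moments and a two-sample Kolmogorov-Smirnov test. This is only a sketch with synthetic placeholder data, not a substitute for a proper econometric stationarity analysis:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=11)

# Synthetic stand-ins for two one-year windows of daily returns taken from
# different historical periods (real use: actual returns from, say, two
# different calendar years).
window_a = rng.normal(loc=0.0005, scale=0.010, size=252)
window_b = rng.normal(loc=0.0003, scale=0.016, size=252)  # a higher-vol regime?

# Compare the basic moments of the two windows...
for name, w in (("window A", window_a), ("window B", window_b)):
    print(f"{name}: mean={np.mean(w):.5f} stdev={np.std(w, ddof=1):.5f} "
          f"skew={stats.skew(w):.3f} ex.kurt={stats.kurtosis(w):.3f}")

# ...and run a two-sample Kolmogorov-Smirnov test: a small p-value suggests
# the two windows were not drawn from the same distribution.
ks_stat, p_value = stats.ks_2samp(window_a, window_b)
print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.4f}")
```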

Luckily, our modeler has a backtest feature, so we can estimate how well this assumption did play out in reality.

Aggregation of random walks

If we then aggregate all of those many random price walks that we generated, we can slice through them and compute empirical price distributions at incremental days forward in our forecasted future time period. Here, we set the forward investment horizon to 100 trading days, arbitrarily for this example. Our model computes prices at a set of probabilities (stepping from 0.5% to 99.5% by 0.5%) at daily intervals from 1 day to 100 days forward in time [Note 11]. This data can be interpreted as samples upon a 3D surface, with surface dimensions of (time, price, and probability). We do not know the equation for this surface, to be sure. That is the domain of the symbol-pushing guys. We do have estimates of data points on this surface, however.
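
In code, this slicing step is essentially a percentile computation across paths at each forward day. A sketch under the same placeholder assumptions as above (synthetic returns, hypothetical starting price):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Minimal regeneration of simulated price paths (see the previous sketch);
# placeholder returns and a hypothetical starting price, for illustration only.
historical_returns = np.array([0.004, -0.011, 0.007, 0.013, -0.021,
                               0.002, -0.003, 0.009, -0.006, 0.015])
paths = 427.48 * np.cumprod(
    1.0 + rng.choice(historical_returns, size=(50_000, 100)), axis=1)

# Slice across all paths at each forward day to get an empirical price
# distribution per day, read off at probabilities 0.5% to 99.5% by 0.5%.
probabilities = np.arange(0.5, 100.0, 0.5)
price_surface = np.percentile(paths, probabilities, axis=0)
# price_surface[i, j] = price at probabilities[i] percent, forward day j+1

# Example: the 5th-percentile price estimate at the 100-day horizon.
row_5pct = np.where(np.isclose(probabilities, 5.0))[0][0]
print(f"5% probability price at day 100: {price_surface[row_5pct, -1]:.2f}")
```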

This is shown by our app as an envelope type of graph if we look at it in 2D (Figure 2), or a probability surface type of graph if we look at it in 3D.

Figure 3: Forward forecast model of SPY showing contour on Price/Time plane for probability and one cross section of the price probability graph at 100 days forward. Various display options for this surface are possible in the app.

Both graphs are the same data but visualized differently. The user can select particular points off of the surface by dragging the “cursor beams” on the graphs to select probability (top yellow graph) and time forward (bottom graph). By selecting probability and time, the app tells you the price estimate from the model you just built.

Showing user-adjustable cursor beams to select time forward and probability

This is one concept that may not be apparent immediately when you first study the ideas of “asset prices as random walks.” It is really an aggregation of many random walks that allows us to do these types of probability surface estimates. So the famous book A Random Walk Down Wall Street might be more accurately (if awkwardly) entitled: A Whole Lotta Random Walks Down Wall Street, which gives a better hint of what those guys do.

Moreover, "random" doesn't have the colloquial meaning of "anything can happen." No, the things that can happen, in these types of models, are things that have already happened, with the probability at which they already happened, but in a different time-ordering of happening; with perhaps some tuning of these assumptions, as we shall see. Sure, SPY may have jumped or dropped 2% in one day in the last year. But these were fairly rare events. Our SPY return histogram's standard deviation of 0.0109 suggests that SPY is much more likely to move less than 1% per day than it is to move > 2% per day. Yes, some rare days it might move 3% or more. But it is unlikely for a 3% move to occur in SPY many times per year, if we use history as a guide.
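
If you want to put numbers on that intuition, the empirical probability of a large daily move is simply the fraction of such days in the historical window you resample from. A tiny sketch with synthetic placeholder returns (real use: actual SPY history, which is not normally distributed):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Synthetic placeholder for one year of daily returns, with a standard
# deviation chosen to resemble the 0.0109 noted in Figure 1.
returns = rng.normal(loc=0.0004, scale=0.0109, size=252)

# In this resampling framework, the chance of drawing a large move tomorrow
# is simply the fraction of such days observed in the historical window.
for threshold in (0.01, 0.02, 0.03):
    frequency = np.mean(np.abs(returns) > threshold)
    print(f"|daily move| > {threshold:.0%}: {frequency:.1%} of days "
          f"(~{frequency * 252:.0f} days per year)")
```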

Bulk backtesting

After doing this aggregation of random walks into the future zone of our model space, we can withhold some data to see how good our model is if we compare it to what has already happened (that is, we can backtest the model). Meucci does not seem to mention backtesting per se in his 10 Step document, but the closest appears to be the last step, P10 Ex-Post Analysis, viz:

Verbatim from A.Meucci: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788

… but he works on many advanced topics, so we suspect that backtesting enters the picture at some point.

The final bit of the above excerpt is the key phrase: “P&L is no longer a random variable, but rather a number that we observe ex-post.”

From dictionary.com,

ex post: “considering actual results rather than forecasts”

Here we are not forecasting Profit and Loss yet; we are merely looking at prices of SPY, at incremental probabilities.

In our backtesting, we do not wait for the future to arrive; we merely shift our model backwards in time so that prior known data is "the future" with respect to our model.

We added backtesting to our app because of the common skepticism that we see regarding forecasting models (not just in finance, but everywhere), when a technical person presents a model to his manager: “Well that’s all well and good, but how do I know it’s true? What about the model you told me about 6 months ago?” This often occurs in startup companies: Estimates are made about how quickly customers are going to be attracted, resulting revenue, etc… but how often do those come true? Clearly, “pivots” in startup companies need to happen because original estimates and forecasts did not even approximately come true… ex post, that is. And, not to tread too far into hazardous territory, but this was one critique of early COVID19 models:

“For accuracy of prediction, all models fared very poorly. Only 10.2% of the predictions fell within 10% of their training ground truth, irrespective of distance into the future.”
— A Case Study In Model Failure… https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417851/

It is known among engineering practitioners that even deterministic physics models that replicate reproducible lab tests (let's say, mechanical engineering types of models) need validation and calibration before use, for all but the simplest setups.

There are just too many variables that are estimated and approximated in many physics models for them to be 100% accurate at first run. Now if deterministic engineering models need calibration and validation, how much more so might financial models, with their heaps of randomness and the whims of the investor populace?

Thus, we backtest.

We proceed with our example by withholding the same number of days that we forecasted (100 days in this example; trading days, that is), overlaying the past reality (blue curve) on top of the forecasted envelope.
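
The bookkeeping behind such a bulk backtest is conceptually simple: build the model only from data before the cutoff, forecast across the withheld window, and count how often withheld reality falls below the lower probability bands. A sketch with synthetic placeholder prices (our app's implementation is more involved, but the idea is the same):

```python
import numpy as np

rng = np.random.default_rng(seed=9)

# Synthetic placeholder price history: 352 trading days, last 100 withheld.
prices = 400.0 * np.cumprod(1.0 + rng.normal(0.0004, 0.011, size=352))
train, withheld = prices[:252], prices[252:]

# Build the model from the training window only.
train_returns = train[1:] / train[:-1] - 1.0
paths = train[-1] * np.cumprod(
    1.0 + rng.choice(train_returns, size=(20_000, 100)), axis=1)

# Forecast envelope: lower percentile bands at each forward day.
lower_1pct, lower_5pct = np.percentile(paths, [1, 5], axis=0)

# Backtest bookkeeping: how often did withheld reality breach each band?
print("fraction of days below the 1% band:", np.mean(withheld < lower_1pct))
print("fraction of days below the 5% band:", np.mean(withheld < lower_5pct))
```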


Figure 6: Bulk backtest in 3D view.

Well, it looks like our envelope contains reality pretty well; and maybe too well? Maybe the envelope is too wide (vertically), indicating that we are maybe over-estimating price variance in the future? Over-estimating risk might cause one to be more cautious than warranted and "leave safe-ish returns on the table," so to speak. But remember, this is only one bulk backtest for one day in time. Let's not be too hasty in judging this model from checking a single time interval of 100 days. Our app can next repeat this backtest one day at a time, back in time, to do an exhaustive backtest and see how our model checks out versus reality.

Exhaustive rolling window backtest results

Here we will roll the whole model structure back in time two years (504 trading days), doing a two year backtest. This may not be a sufficient backtest length for actual use, but it will suffice for an example.

Figure 7: Exhaustive backtest. Blue curve is reality of historical SPY. Red circled area shows time periods where forecast lower bounds at 1% and 5% probability are being exceeded by reality.

Ah ha, in Figure 7, we see that our blue reality price time series is dropping below the 1st and 5th percentile forecasted bands at the bottom of the graph far more than 1% and 5% of the time (respectively). This suggests that our model is under-estimating price risk for some periods of time. This is not good, because it means that we may think that our portfolio of one asset (SPY here) is safer than it really is.
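
For the curious, here is a rough sketch of the exhaustive version: roll the cutoff back one day at a time, rebuild the empirical distribution each time, and tally how often reality breached the 1% and 5% lower bands at a fixed horizon. The prices, horizon, and path counts below are placeholders chosen so the loop runs quickly, not the app's actual settings:

```python
import numpy as np

rng = np.random.default_rng(seed=13)

# Synthetic placeholder price history (real use: actual SPY closing prices).
prices = 350.0 * np.cumprod(1.0 + rng.normal(0.0004, 0.011, size=900))

lookback, horizon, n_paths = 252, 20, 2_000
breach_1 = breach_5 = total = 0

# Roll the cutoff back one day at a time: rebuild the empirical distribution
# from the 'lookback' days before each cutoff, forecast 'horizon' days ahead,
# and check whether the already-known future price breached the lower bands.
for cutoff in range(lookback, len(prices) - horizon):
    window = prices[cutoff - lookback:cutoff + 1]
    rets = window[1:] / window[:-1] - 1.0
    paths = window[-1] * np.cumprod(
        1.0 + rng.choice(rets, size=(n_paths, horizon)), axis=1)
    low_1pct, low_5pct = np.percentile(paths[:, -1], [1, 5])
    actual = prices[cutoff + horizon]
    breach_1 += actual < low_1pct
    breach_5 += actual < low_5pct
    total += 1

print(f"1% band breached {breach_1 / total:.1%} of the time (target ~1%)")
print(f"5% band breached {breach_5 / total:.1%} of the time (target ~5%)")
```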

Our app also presents a detailed table indicating how bad the risk band breaches are, at some key percentiles. The graphs give you a general idea of how well the backtest came out, and then you can look at the table to home in on the specifics.

Figure 8: Zooming in on the table in Figure 7. Note that what should be 5% is showing up as 18.8% and what should be 1% is showing up as 9.3%, therefore: "reality was more risky than model," as our handy hint indicates.

Let’s think about this for a moment: Even though we did not assume normality of returns, and did take into account extreme events (in the past year), and we resampled from actual return data without assuming any analytic/theoretical probability distribution of returns… our model is still under-estimating risk. Imagine if we had assumed normality? How bad would the model have been then?

Potential remedies

To remedy this partially failed backtest, in our modeler we can adjust the number of days back in time from which we get the original raw data to resample. We started by taking data from a window reaching from "yesterday" out to one year back. Maybe we should instead use data from 2 or 5 years back? We can also set the time period from which to pull data to an interval even larger than 5 years, and set a resampling weighting function that pulls less daily return data from far back in time and more from recent times, thereby weighting our model toward more recent events if we so choose.

We can also add artificial black swan events (extreme values) with varying probabilities and magnitudes to the model if we anticipate such events — say, from reading zerohedge.com too much — and adjust the model with a variety of more sophisticated techniques that have been discovered, invented, or observed over time, such as stochastic volatility (the volatility of volatility, yet not so simple as that [Note 8]). Further, we can explore models where the resampling from the past is not fully independent each day; that is, we can resample from the past with some memory, rather than resampling i.i.d. (referring to the first "i", independent).
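
To make a couple of these knobs concrete, here is a sketch of recency-weighted resampling and artificial black swan injection. The exponential-decay weighting, the half-life, and the swan magnitude and count are all arbitrary illustrative choices, not our app's defaults:

```python
import numpy as np

rng = np.random.default_rng(seed=17)

# Synthetic placeholder: two years (504 trading days) of daily returns,
# oldest first (real use: actual historical returns).
returns = rng.normal(0.0004, 0.011, size=504)

# (a) Recency weighting: give more recent days a higher chance of being
# drawn when resampling; exponential decay is one arbitrary choice of shape.
half_life = 126.0                       # in trading days; a tuning knob
ages = np.arange(len(returns))[::-1]    # age 0 = the most recent day
weights = 0.5 ** (ages / half_life)
weights /= weights.sum()
recency_sample = rng.choice(returns, size=100, p=weights)

# (b) Artificial black swan injection: append a few rare, large negative
# returns so they can be drawn with some small probability when resampling.
swan_magnitude, swan_count = -0.08, 3   # e.g. three -8% days added to 504 days
augmented = np.concatenate([returns, np.full(swan_count, swan_magnitude)])
swan_sample = rng.choice(augmented, size=100)
```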

We even have a method that automatically measures how bad the exhaustive backtest was and then adds adjustments to the empirical returns distribution that we re-sample from to make the backtest come out better… that is, to match to observed reality better. We call this our “predictor-corrector” method. We predict, using data that we have. Then, we correct this prediction based upon how far off the model was from an observed backtest.

These techniques are all covered in the Tutorial slides for our app, also available in the Help tab of the app, and we cover some of the more advanced math (how it’s done internally) in some of our white papers (these too listed in the Help tab of the app).

Back to Step P1

Now, going back to consider Meucci’s step P1…

Verbatim from A.Meucci: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788

…maybe we need to put some effort / automation into figuring out how far back we need to pull historical data (for a given asset) before it stabilizes and becomes invariant?

It is simple in our modeler to just set the ‘days backward to sample’ to a large value (many years), but is this the best idea? As we mentioned earlier, there is this idea in econometrics of “regime switching”:

Is pre-pandemic the same economic regime as post-pandemic? Is "during pandemic" a separate regime? Is pre-financial crisis (2008) the same regime as post crisis? What about pre and post 9/11? Pre and post dot com crash? This, we personally do not know, but no doubt many economists and quants have weighed in on these topics. When economic regimes change, is it better to try to compute stationarity or invariance across regime changes, or should we recompute using only data from the latest regime?

Therefore, our focus on backtests. No matter your model assumptions, if a model backtests poorly, then it deserves a closer look.

To the future…

Alas, reality is not so simple in the other direction: If a model backtests well, does this mean that it will always predict well in the future? We have to say, No. This is one reason why automated trading systems fail so often when run: “The system backtested well, but when we ran it forward in time (even with paper trading), losses resulted.” Results? Hedge funds shut down. Failed automated trading systems. Remaining assets returned to investors. There is high interest in what constitutes “robust enough” backtesting in the literature, so we will let you ponder this further. We recommend de Prado and Harvey as starting points to learn about this important topic, in identically named papers:

Backtesting https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2606462

de Prado’s caution in his abstract is sobering:

“This may invalidate a large portion of the work done over the past 70 years.”

Backtesting https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2345489

Now let’s do a thought experiment regarding the “drawing lines upon price time series” method of forecasting: If automated trading systems that backtest well over many historical years can fail in real use (as de Prado and Harvey point out), how well is your trading system involving manually ruling lines and other graphics upon price charts going to work? If it works, it is likely due to random chance (sometimes called luck), and such drawing of lines is analogous to D’s gambler making notes on paper of roulette numbers that have come up in the fictional resort town of Roulettenburg, so as to inform future bets. One may argue: “It’s not the same thing.” But we then posit: It’s much the same thing. Maybe you’ll win, maybe you’ll lose, but such fates won’t likely be due to your efforts with digital T-square and triangle.

Aside from all the fine points we should put upon backtest interpretation…

After we get a backtest that we like, we can go back to forecasting mode (instead of withholding data) and use the adjusted / tuned model for forward forecasts again.

Our 100 day forward investment horizon is maybe not what you want, personally. Maybe you want to set it up so it does 1 month ahead (thinking, okay, maybe I want to look at rebalancing my portfolio every month). Maybe you want to set up a 1 week forecast model (5 trading days) if you are a more active trader? Wouldn’t it be nice if, say, the Robinhood app gave you an estimate of a 1 week ahead price range (with probabilities) of a stock you are about to trade? Then you could estimate what type of risk you are taking immediately, before buying that asset. What could be your potential loss or gain in a week if you invested $100 in fractional shares of TSLA? Wouldn’t that be a fine addition to a trading app instead of (or in addition to) ever-fancier time series graphs of prices from the past?

Even in a casino, seasoned gamblers generally know and keep in mind the odds of roulette or blackjack or a variety of other games. That is, if they avoid doing what Dostoyevsky's Gambler does: Recording roulette numbers that "came up" on paper and trying to forecast results of the next spins from that time series. He doesn't call it a time series, but that's what it is. Though, let's be careful now: maybe he and his casino companions were merely binning numbers as unsupervised classifiers do, and not assuming time-orders… Ah, but Dostoyevsky… the Russians… fate… I guess we should be gentle toward The Gambler until we have walked a kilometer in his сапоги (boots).

Since our app works with crypto as well, the user needs to remember that crypto trades every day, so a year in crypto forecasts is really 365 or 366 days, and a month is 30 or 31 days. Furthermore, crypto “end of day” (time, price) as reported by our current data provider does not match the NYSE closing time, so the analyst needs to be careful when mixing crypto and stock market assets in the same portfolio. But these are details that you learn as you tinker with the app.

Summary

Even when going beyond return normality assumptions and using raw returns data for making forecasting models, you can find times when probability models like this fail to backtest well, even for “whole market” assets like SPY. And remember we are only forecasting probabilities, not specific prices for SPY. Imagine how much more difficult it is to forecast price risk for individual corporate equity assets (common stocks), with their often even more extreme behavior than the market as a whole.

What about the rest of Meucci's 10 Steps for Advanced Risk and Portfolio Management? The app does have some aspects of the other steps built into it, but not to the professional level described by M. For example, our app supports small portfolios of assets so that historical correlation among assets is taken into account when projecting weighted portfolios into the future, it has some portfolio optimization available (to varying targets involving risk versus reward tradeoff), and we allow inter-correlation among many assets to be evaluated using principal components methods rather than pairwise correlation.

The app also has a nifty feature inspired by Prof. Cochrane’s online courses with respect to replicating portfolios: Given a set of assets, how can we combine them to replicate the behavior of another different asset [Note 12]? Such may be useful for hedging, diversification, or just general food for thought.
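
One standard way to approach such replication (which may or may not match our app's internals) is a least-squares regression of the target asset's returns on the candidate assets' returns; the weights that come out are the replicating portfolio. A sketch with synthetic placeholder returns:

```python
import numpy as np

rng = np.random.default_rng(seed=21)

# Synthetic placeholder daily returns: three candidate assets (columns) and
# one target asset we would like to replicate.  Real use: aligned history.
basket = rng.normal(0.0004, 0.012, size=(252, 3))
target = 0.5 * basket[:, 0] + 0.3 * basket[:, 1] + rng.normal(0, 0.004, 252)

# Ordinary least squares: find weights w so that basket @ w tracks the
# target's returns as closely as possible (no constraints here on shorting,
# leverage, or weights summing to one).
weights, residuals, rank, _ = np.linalg.lstsq(basket, target, rcond=None)
print("replicating weights:", np.round(weights, 3))

# Tracking error of the replication: stdev of the daily return difference.
tracking_error = np.std(target - basket @ weights, ddof=1)
print(f"daily tracking error: {tracking_error:.5f}")
```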

In the app now, we provide more of an intro to these methods than a fully developed professional system. Our claim is that, simplifications notwithstanding, this app provides more risk analysis than you will get from the simple-to-use, modern user interface trading apps that have become popular.

Are retail traders interested in this type of risk analysis, or are they happy just picking favorite stocks or crypto coins and selling (or trying to sell) quickly when those assets start to tank? This, we do not know. But we hope that the app as it stands will give these traders some additional ideas and tools for how to analyze price risk and backtest their portfolio choices.

Further reading

For the next installment in this series, please see https://medium.com/@nttp/a-galaxy-of-daily-returns-6db3a014846f , which shows how we can use historical correlation among assets — along with the monte carlo methods described in the current article — to estimate future portfolio values at various probabilities.

For more discussion of the individual monte carlo generated price paths, you can check out our next article, Memento Fortuna.

For details of the fractional empirical motion tuning parameter of our models, please see our Fractional Empirical Motion article.

Updates

Oct 9 2023: Minor edits for style.
Oct 12 2023: Add Further reading.
Oct 18 2023: Add link to next article in series. Minor edit in Note 1.
Oct 26 2023: Correct typo, add link to the next article in the series.

Notes

[Note 1] Not to be too obvious about it, but in inflationary eras, cash is likely to (or definitely will?) have negative return: https://www.investopedia.com/ask/answers/122214/how-does-monetary-policy-influence-inflation.asp

[Note 2] See our tutorial slides for explanations of all settings available in the app.

[Note 3] https://www.linkedin.com/pulse/why-stocks-normally-distributed-its-implications-ammar-a-raja/

[Note 4] https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model

“One significant limitation is that in reality security prices do not follow a strict stationary log-normal process, nor is the risk-free interest actually known (and is not constant over time).”

[Note 5] https://www.aeaweb.org/articles?id=10.1257/jep.13.4.229

[Note 6] We support a series of apps under the names MCarloRisk3D, MCarloRisk, and MCarloRisk3DLite on several app stores for Mac, Android, iPhone and iPad, and Windows. Search for them by name!

We have also launched a web version for big screen browsers (for the Lite version of this tool) at https://mcr3d.diffent.com

[Note 7] https://medium.com/@matteo.bernard/quant-investing-financial-ratios-that-you-must-know-part-2-114313efa760

[Note 8] See our trial of a stochastic volatility approach with Bitcoin prices: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3693387

[Note 9] deleted

[Note 10] Drift, or the mean return, also is an emergent property if we resample from the past empirically. The mean return is often close to zero, and so volatility often overwhelms it in forecasts.

[Note 11] Our app can be set to forecast any number of days forward from 1 day to N days.

[Note 12] Such replicating portfolios in our app are based upon historical correlation among assets. We do not yet have any indicators as to when historical correlation might change.
