Fractional Empirical Motion In Monte Carlo Forecasts

One way to add Hurst exponents to drift diffusion models

NTTP
Photo by Szabolcs Toth on Unsplash

I am: Dali!

Fictional(?) Salvador Dali, from the motion picture Midnight In Paris (2011)

The painter Dali created a now-famous work entitled The Persistence Of Memory. Whether he titled it himself, or whether his agent or his gal(lery) pals did — Did he even have an agent? Does Dali! need an agent? “Who says that Dali! needs an agent?!” — we do not know at the moment. This could be something that you might want to research, since it seems to be interesting non-trivia. Amuse your art school friends, right? But one thing is certain: this title is, at first glance, redundant; for memory is persistent by definition. If it’s not persistent, then it’s not memory. Okay, yes: We realize that there are various time lengths of persistence that need to be taken into account, or perhaps degrees of persistence if you will.

The length or time-persistence of memory is an important concern in time series modeling just as it is with respect to Dali’s painting. To quote from one famous philosopher of code:

“Today, memory either forgets things when you don’t want it to, or remembers things long after they’re better forgotten.”

https://www.pbm.com/~lindahl/real.programmers.html

The redundancy of Dali’s painting’s title is a realization that we had just as we started writing this article. Is the apparent redundancy some kind of Surrealist word-trick he was pulling on us, the next-century enjoyers of his works? Or is there some further, more hidden, meaning involved…?

The Hurst exponent, and time series models analyzed using this concept (by Hurst and Hölder) [https://en.wikipedia.org/wiki/Hurst_exponent], are notable innovations in modeling, and we found a way to combine them with our empirical Monte Carlo models in our MCarloRisk3D apps. These model refinements address a common criticism of simple Monte Carlo models that specify i.i.d. (independent, identically distributed) sampling. The first “i” stands for “independent,” and this is the “i” that Hurst-type models address. The criticism of independence is, of course: How do you know that what happens today in the markets (return-wise) is independent of everything that happened in all prior days? There are statistical tests for this, and while we refrain from getting deep into them at the moment (it can get fairly involved), we sketch one such check below. You may find these topics engaging, if you are reading this article:

https://www.sciencedirect.com/science/article/abs/pii/S0304407611000285
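As a quick, hedged illustration of such a check, here is a minimal Python sketch using the Ljung-Box test from statsmodels. The fabricated fat-tailed return series is only a stand-in for real data:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Placeholder data: substitute a real daily return series here.
rng = np.random.default_rng(0)
returns = rng.standard_t(df=4, size=252) * 0.02  # fat-tailed stand-in

# Small p-values suggest serial dependence (autocorrelation);
# large p-values mean we fail to reject independence at the tested lags.
print(acorr_ljungbox(returns, lags=[5, 10, 20], return_df=True))
```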

But clearly, both humans and fintech computers have memory. Are we to assume that none of the terabytes of data generated every day has influence on what happens tomorrow? Why are we even saving this data, if that is the case? Maybe those of our genus and species just like to save and collect things, genetically? Baseball cards, coins, stamps, Beanie Babies, Matchbox cars, Hot Wheels, Barbie dolls, Simpsons memorabilia, wrist watches, Porsche motor vehicles (eh, uh, er… regarding the last one: for some of us; not necessarily the current author) et cetera. And finally, financial data.

Daily independence of returns is a strong assumption. This assumption of independence seems to work okay as a first order approximation, as it is deployed in several places in finance. One example from our ground truth source of portfolio risk modeling is as follows:

[https://books.google.com/books/about/Risk_and_Asset_Allocation.html?id=bAS63cyIp0EC]

From A. Meucci’s Risk and Asset Allocation book

But with Hurst methods, maybe we can get improved forecasts and refine our models a notch closer to accuracy. Remember, with models, we know that they are wrong in some ways, and perhaps in many ways; but by doing some thinking, we may be able to make them better… that is, less wrong.

Let’s use a quote from the above Hurst Wiki to get rolling:

“The Hurst exponent is referred to as the “index of dependence” or “index of long-range dependence.” It quantifies the relative tendency of a time series either to regress strongly to the mean or to cluster in a direction.”

We now try interpreting the Wikitext to make it slightly more applicable to our discussion and technique:

Hurst exponent = 0.5 implies independent samples (no dependence on prior samples when drawing new ones). This may not be exactly what the original Hurst paper says (the reference is in the Wiki… from 1951, no less!), but it is close enough for our purposes. In our code, we simply fall back to ordinary independent sampling when H is set to exactly 0.5, because the Hurst sample generation code is fairly slow, and we have not yet put effort into finding faster methods or speeding it up. The general idea: let’s see whether any of our users want to use this portion of the code before putting an enormous amount of refinement into it.

Hence, if we don’t need the Hurst-type algorithm (H = 0.5), we go back to the faster, ordinary way of resampling: taking historical return data as-is and assuming independence among samples.
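A minimal sketch of that baseline resampler (the name iid_empirical_walk and the stand-in data are our own illustration, not the app’s actual code):

```python
import numpy as np

def iid_empirical_walk(start_price, hist_returns, n_days, rng):
    """One random-walk price path: each day's return is drawn i.i.d.
    (with replacement) from the raw historical return sample."""
    draws = rng.choice(hist_returns, size=n_days, replace=True)
    return start_price * np.cumprod(1.0 + draws)

# Example with stand-in data; substitute a real daily return series.
rng = np.random.default_rng(42)
hist_returns = rng.normal(0.0005, 0.03, size=500)
path = iid_empirical_walk(100.0, hist_returns, n_days=100, rng=rng)
```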

So to summarize:

Hurst exponents > 0.5 imply “momentum” in time series.
Hurst exponents < 0.5 imply a tendency to return to mean, or oscillation.

Alternately worded:

The closer to 1 we set H, the more momentum.
The closer to 0 we set H, the more oscillation or tendency to “return to mean.”

Ordinarily, Hurst long-term memory models are applied to a single time series, to analyze or forecast that series. But the models in our MCarloRisk apps are:

  • statistical
  • Monte Carlo
  • aggregate of random walk

…types of models, as we describe in our earlier articles. So what we do in our code is apply the Hurst exponent concept to each individual Monte Carlo random walk generated during the forward estimation process, using the empirical distribution of historical returns as a base data set. See the Method section below for how we do this; the aggregation step itself looks roughly like the sketch that follows.
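A hedged sketch of that aggregation (forecast_envelope and make_path are our own placeholder names; make_path stands in for whichever single-path generator is in play, plain i.i.d. or the Hurst-driven variant sketched under Method):

```python
import numpy as np

def forecast_envelope(make_path, n_paths, n_days, pcts=(1, 5, 50, 95, 99)):
    """Aggregate many random-walk price paths into per-day percentile
    bands, forming the forecast envelope shown in the backtest plots."""
    paths = np.vstack([make_path(n_days) for _ in range(n_paths)])
    return {p: np.percentile(paths, p, axis=0) for p in pcts}
```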

It is easiest to show the overall effects of Hurst exponents on our models by example.

Baseline model, no memory

We start by generating a baseline bulk backtest so that you can see the actual price history compared to the forecast envelope. We’ll just use TSLA stock, since: Why not. This model uses the default H = 0.5, and so the samples that build up the random walk paths are truly independent.

Baseline i.i.d model, no memory, TSLA bulk backtest

Momentum model

Next, for a rather extreme example, we set the Hurst exponent to 0.7, implying much higher “momentum” behavior. Random price paths that are going up will tend to keep going up; random paths that are going down will tend to keep going down. There will still be jitters and temporary reversals in these momentum paths; they are not “pure momentum.” We highlight one of the random paths in black to show its character.

Bulk backtest of model with Hurst set to 0.7

Here we see that the price range at the end of the 100 day backtest period is much wider than in the default i.i.d. model, and ludicrously so. If we ran an exhaustive backtest on this example, we would see that the model over-estimates the future price range by a large amount. In risk models, this is bad, because an overestimate of risk might scare an investor, who might then miss out on valuable opportunities. The forecasted price range is huge and not likely realistic for the 100 day investment horizon in this test. After all, we are violating a fundamental tenet of these types of models: that samples are independent. Instead, we are saying that a (bullish) Monte Carlo random walk price curve that is tending upward will tend to keep going up, and a similar walk that is heading downward will tend to keep going down, for all of the thousands of paths generated. Random paths that are roughly flat (neither too bullish nor too bearish) likewise tend to stay flat. Conceptually, this is why we see a wider forecast range at the end of the 100 days: once one of our thousands of randomized paths takes on momentum by chance, it tends to keep going that way, due to the influence of the Hurst-style adjustments to path generation that we added to our system.

Return To Mean model

Now let’s try the other case: Hurst less than 0.5, implying not momentum but an oscillating, “return to mean” tendency stronger than would occur randomly. We set Hurst to 0.3 for this other extreme-ish example:

Bulk backtest of model with Hurst set to 0.3. One random monte carlo path highlighted in black.

Now we see that the forecast price range at the end of the 100 day backtest period is much narrower than in the default i.i.d. case. In fact, most of the historical actual (blue) price curve is outside the 5th to 95th percentile envelope. Sounds a bit sketchy, right? What kind of bizarro model is this? The historical reality is way outside the forecasted envelope. “Reject the model.” Here, if we ran our exhaustive validation backtest, we would see the actual price curve far exceed the forecasted bands on both top and bottom (1st and 99th percentile). It would be a bad model to use for any forecasting, since it would not backtest well. In this case we are under-estimating risk, whereas in the earlier Hurst = 0.7 case we were over-estimating risk.

So the first note to make is that our models are very sensitive to this Hurst exponent. Values closer to 0.5 are probably more likely to be useful than values closer to 0 or 1. But on the other hand, this makes the Hurst exponent a powerful tuning factor for our models, and throws a bone towards the believers in momentum or return-to-mean (RTM) ideas:

“How can you claim i.i.d.?! There is clearly dependence on the prior day’s results in many cases!”

“Okay, well… How about this Hurst idea?”

“Yes! That’s what I’m talkin’ about!”

In the momentum case, prior results influence the next day’s results in the same direction, and in the RTM case, prior results influence the next day’s results in the opposite direction. Hence, memory doesn’t always mean “momentum.” It could instead mean: “Do the opposite!” to quote Seinfeld’s George Costanza television character. If you don’t have memory of what happened previously, how could you do the opposite of it?

So now we have a way to modify our models to take this momentum (or lack thereof) into account, without abandoning our random walk / probability-based forecasting approach.

Not only that, the Hurst exponent has some “physicality” to it; it is not merely an abstract tuning parameter. Greater than 0.5, more momentum. Less than 0.5, more oscillation or return to mean.

Model tuning

We can demonstrate the usefulness of tuning a model with this one Hurst parameter by doing a full backtest of the default i.i.d. model (H = 0.5), then comparing this to a model roughly tuned by adjusting the Hurst exponent to a slightly higher value (which broadens the forecast range). Is this broadening of the price forecast range equivalent to tuning the model by scaling up the historical observed volatility a bit? We do have a scaling factor for historical volatility, and in fact this was one of the first tuning factors we added to our modeler “way back when.” Or are the effects of the Hurst method more complicated? This requires more research, but we suspect the latter.

1 year validate of baseline i.i.d. model

The baseline i.i.d. model is underestimating risk. What should be a 1% violation rate at the 1% bottom-end high-risk trace (in red in the snapshot) comes in at 5.6%, and the 5% target comes in at 10.3%. That is, the reality of TSLA over 100 forecasted days is a bit more extreme than our simple i.i.d. model predicted, over a 1 year backtest. No surprise there, right? TSLA more extreme than default models predict? Yeah, sounds about right.

Ah ha, so maybe when there is bearish momentum with TSLA, it stays bearish for longer than i.i.d. models predict? We check this idea and show the results in the next screen shot.

When performing Hurst exponent tuning, we went through a few iterations — and we will not show them here for brevity’s sake — but we will show the final result: An approximately tuned model with Hurst set to 0.59 (implying some momentum, both up and down):

Exhaustive backtest of model with Hurst = 0.59, some momentum

Now our 5% target risk trace is coming in at about 6.3% in reality (within our suggested tolerances and “passing” [Note 3] the chi-square test that we propose in the app), and our 1% model trace is showing zero violations (also within our approximated acceptance tolerances and agreeing with chi-square estimates).

At the top end (which we care less about, since this is a risk model where we are trying to predict worst-case bearish behavior), the 95th and 99th percentiles are showing up okay as well, at 93.7% and 100% respectively. This suggests that we are not disrupting the top end of the model too much by violating the first “i” of the i.i.d. assumption: good enough for now.

There are methods to estimate the Hurst exponent from raw data [Note 4], but for now all we offer in our app is a user interface to tweak the Hurst value. One textbook-style estimator is sketched below.
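For the curious, here is a minimal sketch of the classic rescaled-range (R/S) estimator. This is a textbook construction of our own, not the routine our app (or Gretl, mentioned in Note 4) uses, and R/S estimates are known to be noisy and biased on short samples:

```python
import numpy as np

def hurst_rs(returns, min_chunk=8):
    """Rough rescaled-range (R/S) estimate of the Hurst exponent.
    The slope of log(mean R/S) versus log(chunk size) approximates H."""
    x = np.asarray(returns, dtype=float)
    n = len(x)
    sizes, rs_means = [], []
    size = min_chunk
    while size <= n // 2:
        rs_vals = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())  # demeaned cumulative sum
            r = dev.max() - dev.min()              # range of that cumsum
            s = chunk.std(ddof=1)                  # chunk standard deviation
            if s > 0:
                rs_vals.append(r / s)
        sizes.append(size)
        rs_means.append(np.mean(rs_vals))
        size *= 2  # double the chunk size each pass
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return slope
```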

Is this the ultimate backtest? No. Maybe in reality we need to backtest over much longer than one year. But it gives you an idea of how to employ the Hurst exponent momentum factor in your probability models and tip-toe out of the pure theoretical world of i.i.d., where the past is assumed to not influence the future.

Method

We modify a solid chunk of code that implements the Hosking method [http://www.columbia.edu/~ad3217/fbm.html], which is ordinarily used to generate randomized fractional Brownian motion paths with a specified Hurst exponent (assuming normally distributed random values). We replace the normally distributed value that the Hosking method uses with an empirical daily return value from the historical reality of a specific symbol (TSLA in our examples), leading to fractional Empirical motion.
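Our app’s internals are not published, so here is a minimal sketch of the same idea under our own naming: a Hosking-style Durbin-Levinson recursion in which the usual N(0,1) innovation is replaced by a draw from the standardized empirical return sample. How the resulting noise is scaled back into return units is one reasonable choice on our part, not necessarily what the app does:

```python
import numpy as np

def fgn_autocov(H, n):
    """Autocovariances gamma(0..n-1) of fractional Gaussian noise."""
    k = np.arange(n, dtype=float)
    g = 0.5 * (np.abs(k + 1) ** (2 * H) - 2 * k ** (2 * H)
               + np.abs(k - 1) ** (2 * H))
    g[0] = 1.0
    return g

def fractional_empirical_noise(n, H, hist_returns, rng):
    """Hosking-style recursion with the N(0,1) innovation replaced by a
    draw from the standardized empirical return distribution."""
    z = (hist_returns - hist_returns.mean()) / hist_returns.std(ddof=1)
    gamma = fgn_autocov(H, n)
    x = np.empty(n)
    x[0] = rng.choice(z)
    phi = np.zeros(n - 1)  # AR coefficients; one more is added per step
    v = 1.0                # conditional (innovation) variance
    for i in range(1, n):
        prev = phi[:i - 1].copy()
        a = (gamma[i] - prev @ gamma[i - 1:0:-1]) / v  # partial correlation
        phi[:i - 1] = prev - a * prev[::-1]            # Durbin-Levinson update
        phi[i - 1] = a
        v *= 1.0 - a * a
        mean = phi[:i] @ x[i - 1::-1]   # conditional mean given the past
        x[i] = mean + np.sqrt(v) * rng.choice(z)
    return x

def fractional_empirical_path(start_price, hist_returns, n_days, H, rng):
    """One Monte Carlo price path driven by fractional Empirical motion."""
    noise = fractional_empirical_noise(n_days, H, np.asarray(hist_returns), rng)
    daily = hist_returns.mean() + hist_returns.std(ddof=1) * noise
    return start_price * np.cumprod(1.0 + daily)
```

Note that at H = 0.5 every autocovariance gamma(k), k ≥ 1, vanishes, so the recursion collapses to plain i.i.d. draws, matching the fallback described earlier; and the O(n²) cost of the recursion is consistent with why that fallback is the faster route.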

“Whoa, hold on,” you might say at this moment. “Fractional, you say?” This term refers to “fractional differencing,” a key concept in some long-term memory models for time series. We won’t divert down that path at the moment, since it is a long and winding road, but the concept involves “non-integer derivatives,” or “non-integer differences” in the discrete case. You already know about first and second derivatives from Calculus 1; but what about a 0.5 derivative? Or a 0.7 derivative?

https://en.wikipedia.org/wiki/Fractional_calculus

For example, one may ask for a meaningful interpretation of

√D = D^(1/2)

where D is the differentiation operator (from the above FC Wiki).
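To make the idea slightly more concrete without taking the full detour, here is a minimal sketch of discrete fractional differencing via the binomial-expansion weights of (1 − B)^d, where B is the lag operator; d = 1 recovers the ordinary first difference, while d = 0.5 is a “half difference”:

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Expansion weights of (1 - B)^d, B being the lag operator.
    w[k] follows the recursion w[k] = w[k-1] * (k - 1 - d) / k."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

print(frac_diff_weights(1.0, 4))  # [ 1. -1.  0.  0.     ] first difference
print(frac_diff_weights(0.5, 4))  # [ 1. -0.5 -0.125 -0.0625] half difference
```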

Getting back to our own story here, you can use the “display single random walk trace feature” of our app to highlight each fractional empirical motion generated trace to get a feeling for the qualitative geometry of these traces at different Hurst exponent levels. This is illustrated in our prior article.

Summary

Our innovation here in applying the Hurst method is using it in the context of empirical re-sampling of historical returns and the generation of drift-diffusion models via random walk price paths. Rather than assuming Brownian motion (which employs normally distributed returns), we already had Empirical motion in our code (based on actual observed returns, which are typically not normally distributed). Then, by adding the Hurst-inspired concept of long-term time series memory, we expanded our model type to what we call Fractional Empirical Motion: based on the concept of Fractional Brownian Motion, but applying actual, non-theoretical empirical returns distributions to fBm path generators:

https://en.wikipedia.org/wiki/Fractional_Brownian_motion

We see by testing that our model forecasts are very sensitive to the Hurst parameter, and we show that by adjusting this parameter, we can sometimes get more realistic models — the realism being checked by exhaustive backtesting.

Once you get a handle on whether stock returns are persistent at least some of the time, you can then return to pondering Dali’s painting title and mull: Simple redundancy, or Surrealist trick?

Further reading

Training slide set 9 for MCarloRisk3D: Fractional Empirical Motion models

https://diffent.com/mcrtrain/MCRSlideSet9V1.pdf

Notes

[Note 1] deleted
[Note 2] deleted

[Note 3] In our modeler’s exhaustive backtest (the Validate tab in the app), we report results from chi-square tests that compare two distributions (observed reality, and our model forecast) at each risk probability level of interest. In our current model, we set these interesting (to us) risk levels to 1%, 5%, 25%, 50%, 75%, 95% and 99%. We are mostly interested in the 1% and 5% levels (bearish risk), but if we can match the higher levels too, that’s good as well. The more bullish probabilities can give us an idea of whether the most probable “expected value” (50% mid-range) of our forecast is reasonable, and they help us ascertain whether any tuning we do to hit the 1% and 5% levels is making the model go completely crazy at the top end (or not).
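A minimal sketch of the kind of comparison being described, at a single risk level. We are guessing at a simple two-category formulation here (violation / no violation); the app’s exact construction may differ:

```python
import numpy as np
from scipy.stats import chisquare

def violation_check(n_days, n_violations, level):
    """Two-category chi-square test comparing the observed violation
    count at one risk level against its expected count."""
    expected = np.array([level * n_days, (1.0 - level) * n_days])
    observed = np.array([n_violations, n_days - n_violations])
    stat, p = chisquare(f_obs=observed, f_exp=expected)
    return stat, p  # a small p-value means: reject the model at this level

# Example: the 5% trace was violated 26 times over a 252 day backtest.
print(violation_check(252, 26, 0.05))
```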

This chi-square topic really deserves a full article, because it gets kind of confusing. But the important thing to remember is that we don’t really “accept” results from a statistical test like this; we merely “fail to reject” some hypothesis. It can get confusing, fast, and we propose that statistics textbooks need to be slightly reformulated because of this: instead of The New Math, The New Stats. There needs to be someone like Richard Feynman coming along (Feynman being the one who created a diagrammatic method to help clarify quantum mechanics); but instead of QM, this New Feynman needs to clarify statistics. We think that the reason one never “accepts” a conclusion from these types of tests is that when you are dealing with incomplete samples of data, you never really know if future samples will “flip over the chessboard” and tell you to make the opposite conclusion to the one you previously made.

[Note 4] It is interesting to estimate the Hurst value directly from historical TSLA data (only one time series of reality, instead of the thousands of simulated time series that our model generates) and compare it to our manually found value (based on backtesting comparisons).

Just doing a quick test using the free Gretl econometric software package on the most recent 1 year’s worth of TSLA returns (not prices) as of mid October 2023, we estimated the Hurst exponent to be about 0.65; somewhat larger than our manually found value of H = 0.59. But then again, we needed a Hurst value that refines the estimate of a future distribution of prices; we are not estimating spot prices with this type of model. It is encouraging, however, that both the historical estimate from Gretl and our manually tuned value are at least in the same direction (> 0.5, implying “some momentum” in TSLA returns). You might look at it as: Well, it’s only a 0.06 difference. About 10%, if 0.65 is your baseline. But as we noted earlier, our models are extremely sensitive to this Hurst parameter, and a 10% change in Hurst is a lot. Would we need a larger Hurst value (closer to the 0.65 that Gretl reports) to tune the model if we did a longer backtest, or checked an investment horizon of something other than 100 trading days? Such is an interesting topic to ponder, and test, but we leave that for later. Interested readers can try this themselves, with our apps.

Also recall, as we pointed out in a prior article, that this pattern of momentum (quantified by the Hurst exponent) may not be predictive of future behavior. Whether a historical pattern is identified by humans or by machines does not imply that it is predictive. It may be, but more thorough study is needed to estimate predictiveness. Could the Hurst exponent be an example of an “invariant” of a time series, of the kind that Meucci says are important to find in his “quest for invariance” [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1753788]? Perhaps. But we do not know, at the moment.

On this page of the Gretl documentation, search for function “hurst” (for estimating H from a returns series):

https://gretl.sourceforge.net/gretl-help/cmdref.html
