Did the Market Really Freak Out About a Fake Photo? (Or Maybe We Just Need a Better Way to Put Price Moves in Context)

Allison Bishop
Published in Proof Reading
Jun 8, 2023

On May 22, 2023, the S&P 500 fell about 0.7% over the course of roughly 28 minutes before substantially rebounding. We can see this movement in the graph of trade prices of SPY as a function of time:

This graph spans the regular trading day from 9:30 am to 4:00 pm, and we can see the somewhat sharp dip happening in the morning, roughly around 10 am.

Staring solely at this, we might guess that something unusual was driving this level of price motion. In fact, it was reported in several places that the market had sharply declined due to the circulation of a fake photo of an explosion near the Pentagon. A Bloomberg article described the events as follows:

“Just past 10 a.m. New York time, when the photo was circulating, the S&P 500 declined by about 0.3% to a session low. As news emerged that the image was a hoax, the index quickly rebounded.”

There is one thing immediately strange about this description relative to the SPY price graph. Our description of a larger 0.7% decline is based on the full (and relatively continuous) drop from the high price of $420.45 to the low of $417.35, so where is this 0.3% number coming from? We can see it if we mark 10 am on our graph with a vertical line:

This context makes the causal attribution of the following 0.3% drop seem questionable. Another plausible interpretation is that it was a somewhat natural continuation of a movement already in progress. Even putting the thorny issue of causality aside, looking at market movements with more context can help us internalize how common or uncommon a given price movement is in the grander scheme of things. In other words, we should want to know: how often do things like this happen? Is this an unusual and scary occurrence? Or do things like this happen every day and we just don’t always notice?

On the surface, this may seem like a straightforward question. But there is a lot of nuance hiding in the phrase "things like this." One very direct interpretation of "things like this" could be: price decreases of at least 0.3% over a similar number of minutes. This version of the question, though, is clearly shaped by the exact example we are considering. Would we really say a faster decrease of 0.2% doesn't count as equally interesting? Or a decrease of 0.9% that happens more slowly? The more we lean on our particular example to inspire our criterion, the more arbitrary it seems. Ideally, a notion of "things like this" would be something we could define more generically, with a motivation that makes sense without reference to this particular example.

Similar issues arise more mundanely in our daily lives. We might wonder: my best friend’s husband has the same birthday as my cousin — is that weird? We might think through this by then asking: well, what’s the chance that these particular two people have the same birthday? The answer is 1 out of 365, which makes it seem somewhat weird. But a better question might be: what’s the chance that two of my acquaintances will have the same birthday? Once I’ve accounted for the larger pool of people who are roughly at least as connected to me as my best friend’s husband and my cousin, it probably won’t seem weird.

Sometimes, the right generalization of our question is not obvious. We might ask: my coworker and I have the same favorite movie — is that weird? In this case, the chance of two particular people having the same favorite movie is itself rather nuanced, as we have to account for the relative popularity of the movie, affinity based on shared cultural context or age, etc. But even then, this does seem like a particular instance of a more general question — would I find it equally interesting if we have the same favorite food? What should be the universe of comparable events like “having the same favorite movie” that I should consider for context? And then, what’s the universe of comparable people? All of my coworkers? All of my acquaintances? Perhaps after accounting for all of this, it won’t seem weird. But then again, maybe it will. Probably not too many people say “A Serious Man” is their favorite movie.

It is not a particularly natural impulse to evaluate observations this way. It is often more tempting to zoom in rather than zoom out: looking into more specific details of something that has caught our interest, rather than stepping back first to clarify what is truly worthy of our attention.

In this report, we will zoom out and attempt to define what may count as an “interesting” price movement in a way that captures the example of SPY on May 22, 2023. We will then screen for such examples across other days, in order to gain a sense of how common such movements are. We will look at further examples that our screen flags as potentially the most interesting, to see what we might learn from them.

Defining Price Movements

For our purposes here, we will stick to trade prices (as opposed to quotes) as our raw data. For now, we will not filter trades based on condition codes, but will rather look at all trades reported in NYSE TAQ data during regular trading hours (9:30 am to 4 pm) for each day. Since it will be nice to have a framework that generalizes well across symbols and time periods, looking at price movements in percentage terms is likely to be more informative than looking at price movements in terms of absolute dollars and cents. So for each symbol and day of data, we'll divide each trade price by the price of the first trade reported. Applying this normalization to our trade prices for SPY on May 22, 2023, we get

which looks basically the same as before, but now the price scale is normalized to begin at 1.
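In code, this normalization is a one-liner. Here is a minimal sketch, assuming a hypothetical pandas DataFrame `trades` holding one symbol-day of TAQ trades sorted by time, with a `price` column (both names are placeholders, not from the original analysis):

```python
import pandas as pd

def normalize_prices(trades: pd.DataFrame) -> pd.Series:
    # Divide every trade price by the day's first reported trade price,
    # so the normalized series starts at 1.0.
    return trades["price"] / trades["price"].iloc[0]

trades["norm_price"] = normalize_prices(trades)
```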

We see in this graph some individual trade prices that appear as isolated dots, substantially separated from the general curve. Understanding and filtering out all such examples on a per-trade basis is not really practical if we want the ability to screen for price movement patterns across a lot of symbols and days, but leaving such examples in is also problematic. We probably don't want to treat these isolated prices as representative of market-wide movement.

Turning this kind of intuition into a specific filter involves making some arbitrary choices, so here goes. For each trade, we'll compare its (normalized) price to the average of the 50 trades preceding and 50 trades following it. [Note: this average is taken without volume-weighting.] We'll then look at the distribution of these deviations from the local average for that symbol and day, and compute the 99th percentile. Finally, we'll discard all trades that fall above that 99th percentile in deviation.
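As a rough sketch, this filter might be implemented as follows, again assuming a pandas Series of normalized prices in time order. (For simplicity, the centered window here also includes the trade itself, a slight deviation from the 50-before/50-after description.)

```python
import pandas as pd

def filter_local_outliers(norm_prices: pd.Series, window: int = 50) -> pd.Series:
    # Average over a centered window of `window` trades on each side
    # (plus the trade itself), without volume-weighting.
    local_avg = norm_prices.rolling(2 * window + 1, center=True, min_periods=1).mean()
    deviation = (norm_prices - local_avg).abs()
    # Keep trades whose deviation is at or below the day's 99th percentile.
    cutoff = deviation.quantile(0.99)
    return norm_prices[deviation <= cutoff]
```

Here is what our SPY example looks like after this filtering: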

Now that we’ve removed some local outliers, we’re interested in defining overall price “movements” inside our time series of remaining trades for each symbol and day. Intuitively, we want to compare prices at various points in time to look at how much prices have changed. But which points in time? There are a few different approaches we could take. One would be to compare the price of trade j to the price of trade i for all pairs of trades j and i. We could compute the slope of a straight line connecting these (normalized) prices, as well as the length of time separating trade j and trade i. However, this computation would be prohibitively expensive. If we have tens of thousands or hundreds of thousands of trades, then we will have hundreds of millions to hundreds of billions of pairs of trades: just for one symbol and one day! In particular, our remaining data set for SPY on May 22, 2023 contains 476,178 trades, resulting in over 100 billion pairs of trade prices.

One way to reduce our computational problem would be to divide time into fixed increments and sample prices at these increments: for example, looking at the most recent trade price as of each minute. We don't want to make our time increments too long here, since we can potentially miss interesting phenomena that occur within a single increment. Once we had our price samples per minute, we could potentially look at all "movements" spanning from some minute i to some other minute j. If we look over all pairs of minutes in the regular trading day, we would have 390 × 389/2 = 75,855 computations to perform. This might be workable, but the tension here between wanting finer increments so that we don't miss meaningful movements and wanting coarser increments so that our computation is feasible could be hard to navigate.
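Should we want to try this, the per-minute sampling is easy with pandas (assuming the hypothetical `trades` DataFrame from before carries a DatetimeIndex):

```python
# Last trade price as of each minute; forward-fill minutes with no trades.
per_minute = trades["norm_price"].resample("1min").last().ffill()

# Number of (minute i, minute j) pairs in a 390-minute regular session:
n = 390
print(n * (n - 1) // 2)  # 75855
```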

A different approach is spline fitting, which tries to approximately trace the shape of a data set using a sequence of line segments or low degree polynomials. The “knots” where the line segments or polynomials are glued together can be chosen dynamically in order to create a good fit, rather than being fixed ahead of time. Here is a spline fit for our SPY data on May 22, 2023, for example, that uses 214 knots connected by line segments:

Our hope with using splines is that the knots identify a superset of the truly important points in time for understanding major price movements, and that the number of knots will be much, much smaller than the number of trades. This potentially gives us a good balance of computational efficiency and flexibility for catching phenomena of varying lengths that don’t fall neatly across fixed length increments.

Fitting Splines

When we fit a spline to a data series, we must specify a few parameters. The first is what kind of pieces we want to use between knots. Typical choices include lines, quadratics, or cubics. For our purposes here, we'll keep it simple and just use lines. The second parameter is how well we want the spline to fit. A perfect fit would presumably require having a knot at each trade price and simply connecting adjacent trade prices with a line segment. This would perfectly "describe" our time series of trade prices as a sequence of little line segments, but it wouldn't help us narrow down our focus to price movements in a larger or more fluid sense. So we want to give the spline a tolerance for missing some of the individual data points and encourage it to use fewer knots.

The more jittery our time series is, the harder it will be to fit it well with a smaller number of knots. Hence, the tolerance we provide should likely be a function of the variance of our prices. For this example, I found that computing the variance of the normalized prices and multiplying it by 5000 gave a good value for the "s" parameter of scipy.interpolate.UnivariateSpline, the python function we use to fit splines; this parameter governs how many knots the fit chooses to use. These parameter settings were used to make the spline pictured above.
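Putting that together, the fit might look like the following sketch, where `t` holds trade times in minutes since the open and `p` the filtered, normalized prices (as numpy arrays; both names are assumptions). Note that UnivariateSpline requires increasing x-values, so trades sharing a timestamp should be collapsed first, e.g., by averaging.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

k = 1                  # connect knots with line segments
s = 5000 * np.var(p)   # smoothing tolerance scaled by the price variance

spline = UnivariateSpline(t, p, k=k, s=s)
knot_times = spline.get_knots()    # knot locations chosen dynamically by the fit
knot_prices = spline(knot_times)   # spline values at those knots
```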

From Splines to Price Movements

In some cases, it may make sense to view the linear segments between the knots of our spline fit as a full set of price "movements" to consider. We might look at the steepest slope of our segments and consider how long it lasted. But if we look closer at our spline fit above, we see that it has broken up what we would intuitively view as continuous movements here into multiple line segments, so viewing only single segments will not give us a larger sense of the full downward movement around 10 am, for example.

An alternative consideration could be: how much time is spent in line segments whose slopes are at least X, or at most −X? If we compute the answer to this question across an array of different values of X, we might hope to capture something meaningful about drastic price movements. However, we might be capturing jitter with this more than we are focusing on relatively smooth increases or decreases. As we can see in our spline fit above, many of the short individual line segments are sharply trending up or down to fit what looks more like noise than a general price trend. It can be tempting to say — well, let’s just change the fitting parameters to force it to fit a smoother spline — but it’s probably quite challenging to make such decisions well globally for all days and all stocks in an automated fashion. This means we ideally should not assume that our line segments themselves truly capture everything we’d want to know about price movements.

Instead, we’ll take our set of knots and look at every pair of them. For each pair, we’ll compute the slope and length of a line segment that directly connects them. Looking at our example splines above, we might hope that this set of pairs will contain line segments that capture fairly well the bigger falls or jumps in price that we see. We’ll call each such line segment a price movement, as it connects two knots on the spline that approximates our price curve. We’ll limit our goal here to studying price movements that persist for at least 5 minutes at a time. This allows us to discard any price movements we compute for knots that are less than 5 minutes apart from each other.
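A sketch of this enumeration, reusing the `knot_times` and `knot_prices` from the spline fit sketched above:

```python
from itertools import combinations

# Every pair of knots defines a candidate price movement; keep only
# those spanning at least 5 minutes.
movements = []
for (t1, p1), (t2, p2) in combinations(zip(knot_times, knot_prices), 2):
    length = t2 - t1                # minutes between the two knots
    if length >= 5:
        slope = (p2 - p1) / length  # normalized price change per minute
        movements.append((slope, length))
```

Since a few hundred knots yield only tens of thousands of pairs, this is vastly cheaper than the hundreds of billions of trade-to-trade pairs we started with.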

Defining Price Movement Profiles

Carrying around a set of line segments that (hopefully!) contains the most meaningful price movements is a bit like bringing a fully packed suitcase for a one-hour car ride. Yes, it technically "solves" the problem of packing that book that you end up wanting to read, but you'll waste most of the time just digging through your bag. Ideally, we want a procedure for concisely describing what might be most interesting inside the set of price movements we've collected for a particular symbol and day. Here we have a set of choices to make.

So what makes a price movement interesting to us? Inherently, this feels like it must be a function of both slope and length. If two price movements have the same length but different slopes, then the steeper one is more interesting. If two price movements have the same slope but different lengths, then the longer one is more interesting. However, what about two price movements where one has a steeper slope and a shorter length? It is not clear between these which one is more interesting. To evaluate this, we will probably need more context about what is "typical" for a given symbol over a given time period.

In mathematical terms, this gives us a partial ordering on the set of price movements we’ve collected. Given two price movements, we may be able to declare that one is strictly more interesting than the other, or we may declare them as incomparable. Since our goal is to capture the most dramatic or unusual price movements, we can then cull our set to those that are maximal in this partial ordering. (Being maximal here means that no other price movement is strictly more interesting.)
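Culling to the maximal movements is essentially a Pareto-frontier computation. Here is a minimal quadratic-time sketch for upward movements (downward movements can be handled symmetrically by negating slopes):

```python
def maximal_movements(movs):
    # Keep a movement unless another movement is at least as steep and at
    # least as long, and strictly better in at least one of the two.
    keep = []
    for s_i, l_i in movs:
        dominated = any(
            s_j >= s_i and l_j >= l_i and (s_j > s_i or l_j > l_i)
            for s_j, l_j in movs
        )
        if not dominated:
            keep.append((s_i, l_i))
    return keep
```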

We may want to encode information about our maximal line segments in a more convenient form. For this, we'll fix a vector x = (x₁, . . . , xₙ) of possible slope values. For each positive value of xᵢ, we'll define yᵢ to be the max length among all of our price movements that have slopes ≥ xᵢ. For each negative value of xᵢ, we'll define yᵢ to be the max length among all of our price movements that have slopes ≤ xᵢ. The resulting two vectors x and y encode information about our maximal price movements, and we'll refer to these collectively as a price movement profile.
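A sketch of building a profile from our (slope, length) pairs, using a 1 bps slope grid (the grid itself is described in more detail below):

```python
import numpy as np

def movement_profile(movements, slope_grid):
    # For each slope x >= 0: longest movement with slope >= x.
    # For each slope x < 0:  longest movement with slope <= x.
    profile = []
    for x in slope_grid:
        lengths = [l for s, l in movements if (s >= x if x >= 0 else s <= x)]
        profile.append(max(lengths) if lengths else 0.0)
    return np.array(profile)

# Slope grid from -100 bps to +100 bps per minute, in 1 bps (0.0001) steps.
slope_grid = np.arange(-100, 101) * 1e-4
y = movement_profile(movements, slope_grid)
```

Let's take a look at the price movement profile for SPY on May 22, 2023: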

The x-axis here represents slope, in units of relative increases/decreases in price per minute. The y-axis represents the number of minutes of the longest price movement with a steeper slope. We’ve chosen (somewhat arbitrarily) to plot a point for each increment of 0.0001 (a.k.a. 1 bps) along the x-axis. It’s hard to know in isolation what to make of this graph, so let’s zoom in on the part that reflects the 0.7% decline over about 28 minutes and compare this to the price movement profile of the following trading day (May 23) (graphed in blue):

We note that a 0.7% decline over 28 minutes translates to an average slope of about −0.00025 per minute (i.e., −2.5 bps per minute). We can see the red points of our "news worthy" day here peaking above the blue points of the next day, indicating that May 22 prices achieved these levels of decline on average over longer time periods. However, this isn't even the largest gap between the movement profiles of the two days.

Scoring Deviations from Average Behavior

Looking at such examples by hand can only get us so far in properly contextualizing what's going on. But since movement profiles do seem capable of capturing the kind of phenomena we are interested in, how can we automatically screen them for anomalies worthy of attention? If we are looking across time periods and/or across symbols, we should expect that time periods or symbols which are more volatile will have longer price movements at steeper slopes, so we should adjust somehow for volatility or what is "normal" for a given symbol in a given time period. One way to do this is to compute a 20-day rolling average price movement profile for each symbol, and then compare each current day to the prevailing average.

This comparison could theoretically take many forms. For now, what we'll do is take each value of our movement profile and divide it by the trailing 20-day average value. This ratio will be a score of how "unusual" this particular value is. For example, suppose the 20-day average indicates an average max length of 100 minutes for slopes above +1 bps per minute, and our current movement profile has a max length of 200 minutes for these slopes. Then our ratio here will be 2. In general, a high score on this ratio means that we have a price movement of a given slope that is longer than typical for such slopes. The highest score among all the different slope values in our profile will serve as our overall score for a given symbol and trading day.

More concretely, we’ll fix a set of slopes to consider in our movement profile. For now, we’ll consider slope values from -100 bps to 100 bps per minute, in increments of 1 bps. Thus, our movement profiles will have about 200 entries. For each entry, we’ll take the length of the longest price movement exceeding this slope, and we’ll divide that length by the 20-day rolling average for this value. Then we’ll look over our roughly 200 ratios computed this way, and take the max of these values as our final score.
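A sketch of this scoring, assuming a hypothetical DataFrame `profiles` with one row per trading day and one column per slope-grid entry. (The text above doesn't specify whether the trailing window includes the current day; this version excludes it.)

```python
import pandas as pd

def daily_scores(profiles: pd.DataFrame) -> pd.Series:
    # Trailing 20-day average of each profile entry, excluding the current day.
    baseline = profiles.shift(1).rolling(20).mean()
    # Ratio of each day's max length to its trailing average, per slope entry.
    ratios = profiles / baseline
    # A day's overall score is its largest ratio across all slope entries.
    return ratios.max(axis=1)
```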

Looking at Score Distributions Over Time

Once we have defined a score per symbol per day, we want to get a sense of what typical scores look like. For now, we’ll look at scores for SPY, over the time period from Jan 1, 2023 through May 31, 2023. Here is a histogram of the daily scores, rounded to the nearest integer:

There are 103 trading days represented here, and for most of them, the score rounds to 1 or 2. The score for May 22 is about 2.98, which ranked 14th highest among the 103 scores. Interestingly, this score of 2.98 didn't even come from the morning decline we were focusing on above, but rather from a positive slope value being sustained for an unusually long period. This fuller context reveals that the market movements in SPY on May 22nd were not particularly remarkable, but rather the sort of thing that happens on more than 10% of days in recent market conditions.

The Highest Scoring Examples

Naturally we may be curious — so what were the highest scoring days in SPY on this metric, and what do their price movements look like? The highest score over this time period was about 7.69, achieved on February 1, 2023. Here’s what the (normalized) prices and spline fit look like for that day:

To the naked eye, this does indeed look like a more dramatic price movement than the graph for May 22. However, this becomes much clearer if we force both graphs to be plotted on the same range for the y-axis, instead of letting our python plotting tools choose the ranges dynamically. Here is the apples-to-apples version:

In this view, it is clear that the much higher score is appropriately earned. Just for fun, here are another couple of higher scoring days, plotted in this same fixed range for the y-axis:

The score for February 7 was 5.04, and the score for March 13 was 6.20.

Conclusion and Further Directions

Now that we have a framework for scoring how “weird” a price move is, there are a few things we could do with it. Journalists could use a framework like this to give much needed context to the price movements they report on. Every price movement reported, for example, could be accompanied by a score like this to give readers a quick sense of how commonly/rarely things like this happen.

Another thing we could do is to try to correlate the weirder movements we identify with news or other external events. This could tell us what fraction of "weird" things are potentially explainable by these observable external phenomena, though the small sample size does make it particularly fraught to assume causality.

We could also try to develop this further into an alerting system that would detect "weird" things while they are happening, and/or score live events as they unfold so that human observers have context in real-time about just how rare a price movement they are seeing.

Finally, we could use this kind of scoring framework to drive content and case studies for research. At the end of every quarter, for example, we could do a retrospective screen and study some of the most anomalous price movements across different sets of symbols. We could also use this framework to find similar clusters of examples and see if we can improve volume prediction or other model performance metrics on these kinds of examples specifically by honing in on any features that are common to these kinds of price movements.
