Curve-fitting and Optimization in Trading Strategy Development
Trading strategy development usually involves optimization and curve-fitting of some sort. There are different and even conflicting views about the meaning of optimization and curve-fitting and their impact on strategy performance. Some claim that optimization and curve-fitting are unavoidable and even necessary, while others insist that any trading strategy that is optimized or curve-fitted will eventually fail.
What is curve-fitting?
In mathematics, curve fitting is the process of finding the curve that best fits a collection of data points, in the sense that some objective function, possibly subject to constraints, is maximized (or minimized). For example, least squares is a curve-fitting method whose objective function is the sum of squared residuals, where a residual is the difference between a fitted and an actual value; the best fit is the curve that minimizes this sum. Note that a “best fit” is defined only relative to the chosen objective and that curve fitting is essentially the result of optimization.
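To make the least squares example concrete, here is a minimal sketch in Python that fits a straight line to a handful of points. The data values and the function names fit_line and sum_squared_residuals are illustrative assumptions, not part of the original discussion.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Objective function: sum of squared (fitted - actual) values."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))


# Made-up data points, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Any other line gives a larger value of the objective.
shifted = sum_squared_residuals(xs, ys, a + 0.1, b)
print(a, b, best, shifted)
```

The fitted line is "best" only with respect to this objective; a different objective, such as the sum of absolute residuals, would in general select a different line.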
Curve-fitting and optimization
When one adopts the definition that trading strategies are processes that generate collections of entry and exit signals, one realizes that adjusting any parameters via back-testing essentially varies the timing of the signals so that they are fitted to historical data in a way that optimizes some objective function. This is not curve-fitting in the usual sense, because one is not merely trying to find a curve that best fits the historical data but rather the collection of entry signals that, in conjunction with the exit signals, maximizes some objective. This process is much more involved than simple curve-fitting: it involves the selection, or timing, of entry and exit signals so that an objective function related to performance is optimized. It is an optimization problem rather than just a curve-fitting problem. As already mentioned, curve-fitting may involve optimization, but the latter has a much broader scope and includes many more possibilities than the former. Therefore, it is better to refer to optimized rather than curve-fitted strategies, although this turns out to be mostly a matter of semantics for those who understand the process in depth.
Let us consider a simple moving average crossover strategy that generates long entry signals when SMA(t1) > SMA(t2), where t1 and t2 are the periods with t2 > t1, and short entry signals when SMA(t1) < SMA(t2). In its simplest form, this is a stop and reverse strategy, i.e., when an opposite signal is generated, the previous position is closed and reversed. This strategy cannot be used in practice until values of t1 and t2 are selected. The values are usually determined by optimizing performance in back-tests on historical data. Many believe that this process results in strategies that fail in actual trading because they are curve-fitted. Is this a valid claim?
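The crossover rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: signals are evaluated on closing prices, the position taken at the close of a bar is held until the next close, costs are ignored, and the names sma, crossover_positions and total_pnl, as well as the price series, are made up for the example.

```python
def sma(prices, period):
    """Simple moving average; None until `period` prices are available."""
    out = [None] * len(prices)
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= period:
            running -= prices[i - period]
        if i >= period - 1:
            out[i] = running / period
    return out


def crossover_positions(prices, t1, t2):
    """Stop-and-reverse position per bar: +1 long, -1 short, 0 before the first signal."""
    if not t1 < t2:
        raise ValueError("expected t1 < t2")
    positions = []
    pos = 0
    for fast, slow in zip(sma(prices, t1), sma(prices, t2)):
        if fast is not None and slow is not None:
            if fast > slow:
                pos = 1
            elif fast < slow:
                pos = -1
            # If the averages are equal, the previous position is kept.
        positions.append(pos)
    return positions


def total_pnl(prices, positions):
    """Points earned by holding positions[i] from the close of bar i to the close of bar i + 1."""
    return sum(positions[i] * (prices[i + 1] - prices[i])
               for i in range(len(prices) - 1))


# Made-up closing prices, for illustration only.
prices = [10, 11, 12, 13, 12, 11, 10, 9, 10, 11, 12]
positions_2_3 = crossover_positions(prices, 2, 3)
pnl_2_3 = total_pnl(prices, positions_2_3)
print(positions_2_3)  # [0, 0, 1, 1, 1, -1, -1, -1, -1, 1, 1]
print(pnl_2_3)
```

Until t1 and t2 are given values, as in the call with 2 and 3 here, the rule produces no signals at all, which is why some form of parameter selection is unavoidable for this kind of strategy.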
Strategy failures may be related to changing market conditions
Actually, no one has ever proven mathematically that the well-documented failures of optimized strategies are primarily due to the optimization, or what is commonly referred to as curve-fitting. It may be that the failures are merely due to the nature of the strategies and their inability to adapt to changing market conditions. It is more probable that optimized trading strategies will fail at some point for any values of their parameters. In that case, it is the nature of the strategy and not the optimization that causes the failure. The large class of trading strategies based on technical analysis indicators has a high probability of failure, but in my experience this has been wrongly attributed to the optimization process for setting parameters. It does not even matter whether the parameters are set so that small changes in their values result in stable performance. This is not an issue of the integrity of the optimization method used but of the nature of these trading strategies.
In my paper “Limitations of Quantitative Claims About Trading Strategy Evaluation” I have an example that shows how changing market conditions affect strategy performance and that selection of parameters is irrelevant.
However, any optimization that causes selection of entry and exit collections is in general a problematic process because it may introduce bias. Selecting collections that performed best in the past overlooks the fact that many other similar collections failed.
Going back to the simple moving average crossover strategy, it is easy to understand that given a specific historical data series, changing the values of t1 and t2 will cause a change in the timing of the entry and exit signals. In this case, selecting any collection of entry and exit signals that results from specific values of the parameters such that some objective function is maximized introduces bias. This is because it may be due to chance that the specific collection survived in specific market conditions. In the simple example, each collection is completely different from the others in the sense that both the entry and the exit points are different. What can we do to minimize the bias so that the integrity of the optimization process is not compromised? This question can be answered if we first understand how different types of strategies are affected by optimization of their parameters.
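The selection step described above can be sketched as a grid search. The Python example below is a hedged illustration, not the example from my paper: it assumes a driftless Gaussian random walk as the price series (so no parameter pair has a real edge by construction), a fixed seed, closing-price signals, and no costs. The helper names and the limit of 20 on the periods are arbitrary choices.

```python
import random


def sma(prices, period):
    """Simple moving average; None until `period` prices are available."""
    out, running = [], 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= period:
            running -= prices[i - period]
        out.append(running / period if i >= period - 1 else None)
    return out


def positions(prices, t1, t2):
    """Stop-and-reverse position per bar: +1 long, -1 short, 0 before the first signal."""
    pos, out = 0, []
    for fast, slow in zip(sma(prices, t1), sma(prices, t2)):
        if fast is not None and slow is not None and fast != slow:
            pos = 1 if fast > slow else -1
        out.append(pos)
    return out


def pnl(prices, pos):
    """Points earned by holding pos[i] from the close of bar i to the close of bar i + 1."""
    return sum(pos[i] * (prices[i + 1] - prices[i]) for i in range(len(prices) - 1))


def grid_search(prices, max_period):
    """Return {(t1, t2): pnl} for every pair with t1 < t2 <= max_period."""
    return {(t1, t2): pnl(prices, positions(prices, t1, t2))
            for t1 in range(1, max_period)
            for t2 in range(t1 + 1, max_period + 1)}


# Assumed data: a driftless random walk, so any "best" pair is best by chance.
rng = random.Random(0)
walk = [100.0]
for _ in range(999):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

in_sample, out_sample = walk[:500], walk[500:]
in_results = grid_search(in_sample, max_period=20)
out_results = grid_search(out_sample, max_period=20)

# Each parameter pair yields its own collection of entry and exit signals.
distinct = len({tuple(positions(in_sample, t1, t2)) for (t1, t2) in in_results})

best_pair = max(in_results, key=in_results.get)
print("pairs tested:", len(in_results), "distinct signal collections:", distinct)
print("best in-sample pair:", best_pair, "pnl:", in_results[best_pair])
print("same pair out-of-sample pnl:", out_results[best_pair])
```

Because the series has no structure to exploit, whatever the best in-sample pair earns is due to chance, and comparing it with the out-of-sample figure for the same pair gives a rough feel for how much of an optimized result can come from selection alone.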
A three-level classification of optimized trading strategies
We can distinguish three types of strategies depending on how optimization affects their collection of entry and exit points:
Type-I curve-fit: When the parameters of Type-I strategies are adjusted, both the entry and the exit signals are affected, as for example in the simple moving average crossover strategy considered before. In this case, optimization and curve fitting result in collections of entry and exit signals that differ, and selecting the one that performs best introduces selection bias. These strategies have the highest probability of failure.
Type-II curve-fit: When the parameters of Type-II strategies are adjusted, only the entry signals are affected. In this case, optimization and curve fitting result in collections of entry and exit signals that differ only in their entry part. Selection introduces less bias than with Type-I strategies. These strategies have a lower probability of failure than Type-I strategies. Example: Enter long if SMA(t1) > SMA(t2) and Price < P, and exit long at P1 or P2, where P1 and P2 are fixed prices (profit price and stop price).
Type-III curve-fit: When the parameters of Type-III strategies are adjusted, only the exit signals are affected. In this case, optimization and curve fitting result in collections of entry and exit signals that differ only in their exit part. Selection introduces less bias than in the case of Type-I or Type-II. These strategies have the lowest probability of failure because the timing of entry signals is not affected by optimization. Example: Enter long if Close of today > Close of 2 days ago, and exit long at entry price + x points or at entry price - y points, where x and y are the parameters to optimize (profit target and stop-loss).
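The Type-III example can be sketched in Python to show where the parameters enter. The sketch makes several assumptions that are not part of the definition above: exits are checked on closing prices only and filled at that close, only one position is open at a time, a new entry is considered from the bar after an exit, and a trade still open at the end of the data is ignored. The function names and the price series are made up.

```python
def entry_signals(closes):
    """Bars where the close is above the close two bars earlier.

    The rule has no parameters, so optimizing x and y cannot change this list.
    """
    return [i for i in range(2, len(closes)) if closes[i] > closes[i - 2]]


def backtest(closes, x, y):
    """Return trades as (entry_bar, exit_bar, points).

    Enter long at the close of a signal bar; exit at the first later close
    at or above entry + x (profit target) or at or below entry - y (stop-loss).
    """
    signals = set(entry_signals(closes))
    trades = []
    n = len(closes)
    i = 0
    while i < n:
        if i not in signals:
            i += 1
            continue
        entry = closes[i]
        j = i + 1
        while j < n and entry - y < closes[j] < entry + x:
            j += 1
        if j == n:
            break  # trade still open at the end of the data: ignored
        trades.append((i, j, closes[j] - entry))
        i = j + 1  # look for the next entry after the exit bar
    return trades


# Made-up closing prices, for illustration only.
closes = [10, 11, 12, 11, 13, 14, 12, 15]
signals = entry_signals(closes)
trades_a = backtest(closes, x=2, y=1)
trades_b = backtest(closes, x=1, y=3)
print(signals)   # [2, 4, 5, 7]
print(trades_a)  # [(2, 3, -1), (4, 6, -1)]
print(trades_b)  # [(2, 4, 1), (5, 7, 1)]
```

The list of entry signals is the same for every choice of x and y; only the exits move. Note that under the one-position-at-a-time assumption made here, which of those signals are actually taken can still depend on when the previous trade exits, as the two runs show.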
In general, strategies that include indicators involve Type-I curve-fit. Type-II curve-fit is rarely present in practice. Type-III curve-fit includes the broad class of strategies based on parameter-less price patterns.
Most software programs that discover trading strategies automatically generate Type-I strategies. It is irrelevant how many statistical tests they perform to measure the significance of the results, as these strategies have a high probability of failure during actual trading because of their nature and changing market conditions. Note that not all Type-III strategies make sense. For example, trying to discover such strategies without a guiding market model is an exercise in futility, since there are billions of combinations of price action features that can result in this type of strategy and selection bias is extremely high. However, it appears that in short time frames these strategies can be more effective if designed properly.
The important issue is not whether a strategy is optimized, because all strategies are in one way or another, but to what degree optimization impacts the probability that the strategy fails due to its nature and changing market conditions. Strategies can fail for many reasons, but in this article we dealt with optimization and curve-fitting. Type-III curve-fit strategies, as defined above, appear to have the lowest probability of failure if properly designed. However, in most cases the design is naive and not guided by an appropriate market model.
This article was originally published in Price Action Lab Blog.
If you have any questions or comments, happy to connect on Twitter: @mikeharrisNY
About the author: Michael Harris is a trader and best-selling author. He is also the developer of the first commercial software for identifying parameter-less patterns in price action 17 years ago. In the last seven years he has worked on the development of DLPAL, a software program that can be used to identify short-term anomalies in market data for use with fixed and machine learning models.