Backtest Fast and Slow Stochastic Crossover Strategy in Elasticsearch
My previous article, “Wow! Backtesting RSI crossover strategy in Elasticsearch”, introduced how to backtest the RSI crossover strategy. In this article, we implement a Stochastic crossover strategy and compare its performance with RSI. Although the Stochastic indicator was developed by George C. Lane in the late 1950s, it is still very popular. Like the RSI, the Stochastic is a momentum indicator whose value fluctuates between 0 and 100, which is why both are called oscillators. Both indicators can be used to identify overbought and oversold regions. For the Stochastic indicator, the oversold area is below 20 and the overbought area is above 80.
The Stochastic converts the price change into a ratio of two distances: the distance between the most recent closing price C and the lowest price of the last n periods, and the distance between the highest price and the lowest price of the last n periods. Using the moving functions of this article, the fast %K can be written as follows:

$$\%K = 100 \times \frac{C - MMin_{n,1}}{MMax_{n,1} - MMin_{n,1}}$$

Here $MMax_{n,1}$ and $MMin_{n,1}$ are the moving maximum and moving minimum values. They correspond to the Elasticsearch moving function with a window of n, shifted 1 position to the right so that the window includes the current data. In the implementation below, both are computed from the daily closing price. A window of n = 14 periods is commonly used.
Similar to MACD, Stochastic also defines a signal line named %D, which is a three-period SMA of %K.
There are two types of Stochastic indicators, fast and slow. The fast one is more sensitive to price changes than the slow one, so it generates more buy or sell signals. The slow Stochastic takes the fast %D as its %K, and its %D is a three-period SMA of the slow %K:

$$\%K_{slow} = \%D_{fast} = SMA_3(\%K_{fast}), \qquad \%D_{slow} = SMA_3(\%K_{slow})$$
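The definitions above can be sketched in a few lines of Python. This is only an illustration, not the article's implementation: the helper names `moving`, `mean`, and `stochastic` are hypothetical, the moving maximum and minimum are taken over closing prices (as in the Elasticsearch request later in this article), and a day whose window has no price range is marked as `None`.

```python
def moving(values, window, fn):
    """Apply fn over a trailing window that includes the current element.

    Missing values (None) are skipped; an empty window yields None.
    """
    out = []
    for i in range(len(values)):
        w = [v for v in values[max(0, i - window + 1): i + 1] if v is not None]
        out.append(fn(w) if w else None)
    return out


def mean(xs):
    return sum(xs) / len(xs)


def stochastic(closes, n=14, m=3):
    """Return (fast %K, fast %D which is also slow %K, slow %D)."""
    mmax = moving(closes, n, max)
    mmin = moving(closes, n, min)
    fk = []
    for c, hi, lo in zip(closes, mmax, mmin):
        # Assumption: with no price range in the window, %K is undefined.
        fk.append(None if hi == lo else 100.0 * (c - lo) / (hi - lo))
    fd = moving(fk, m, mean)  # fast %D, also used as slow %K
    sd = moving(fd, m, mean)  # slow %D
    return fk, fd, sd


fk, fd, sd = stochastic([10, 11, 12, 11, 13], n=3)
print(fk)  # [None, 100.0, 100.0, 0.0, 100.0]
```

Early values are computed from partial windows, so in practice some warm-up data before the backtesting period is needed.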
The Stochastic crossover strategy emits a sell signal when the %K line crosses below the %D line in the overbought zone (> 80) and a buy signal when the %K line crosses above the %D line in the oversold zone (< 20). For other values, be patient and wait for the next buy or sell signal.
It is much easier to observe changes in values with graphs. In this article, we apply backtesting to commission-free exchange-traded funds (ETFs) and focus on Elasticsearch as the analysis tool. The following example uses a randomly selected fund, “Fidelity International Multifactor ETF”, whose ticker symbol is FDEV. Ten more randomly selected ETFs are also run, and the final results are shown later. The data, provided by IEX (Investors Exchange), covers the time range between 2021-01-15 and 2021-05-31. The chart below shows the Stochastic (FK/SK) and the signal line (FD/SD) drawn together with the daily closing price. In the daily price curve, prices with sell signals are marked in red, and prices with buy signals are marked in blue. As shown in the figures below, the slow Stochastic generates fewer signals than the fast Stochastic.
Here, we present a simple Stochastic crossover strategy and use Elasticsearch to show the implementation details.
◆ Only 1 share can be bought and held at a time; no further purchase occurs until the held share is sold.
◆ Buy 1 share when FK/SK crosses above FD/SD in the oversold region (< 20).
◆ Sell 1 share when FK/SK crosses below FD/SD in the overbought region (> 80).
◆ At the end of the backtesting period, a share that is still held is cashed out at the current price.
According to the Stochastic trading strategy, there are 5 blue points and 11 red points for the fast Stochastic, but only 3 buy and 3 sell transactions are allowed. There are 3 blue points and 10 red points for the slow Stochastic, but only 2 buy and 2 sell transactions are allowed. Let’s describe the implementation using Elasticsearch. Suppose there is an Elasticsearch index populated with data, and its data mapping is the same as described in the previous article. The following steps walk through the REST API request body.
Collect all relevant documents through the search operation
Use a “bool” query with a “must” clause to collect documents with the symbol FDEV and a date between 2021-01-15 and 2021-05-31. Because the slow Stochastic needs a 14-period moving max/min and two 3-period SMAs, one additional month of data (from 2020-12-15 to 2021-01-14) is included as a warm-up period.
{
"query": {
"bool": {
"must": [
{"range": {"date": {"gte": "2020-12-15", "lte": "2021-05-31"}}},
{"term": {"symbol": "FDEV"}}
]
}
},
Extract the close value of the fund
Use a “date_histogram” aggregation, named Backtest_Stochastics, with the parameter “field” set to “date” and the parameter “interval” set to “1d” to extract the daily prices of the fund. It is followed by an “avg” aggregation, named Daily, to retrieve the close price, since the subsequent pipeline aggregations cannot use document fields directly.
"aggs": {
"Backtest_Stochastics": {
"date_histogram": {
"field": "date",
"interval": "1d",
"format": "yyyy-MM-dd"
},
"aggs": {
"Daily": {
"avg": {"field": "close"}
},
Extract the date of the bucket
Because of the additional warm-up data, subsequent operations need to filter out the out-of-range buckets. A “min” aggregation named “DateStr” gets the date of the bucket. In the Elasticsearch server, the date is stored as epoch time in milliseconds, and the time zone is UTC.
"DateStr": {
"min": {"field": "date"}
},
Select the buckets with at least 1 document
To filter out the empty buckets (non-trading days), a “bucket_selector” aggregation, named SDaily, selects the buckets whose document count is greater than 0.
"SDaily": {
"bucket_selector": {
"buckets_path": {"count":"_count"},
"script": "params.count > 0"
}
},
Calculate the daily simple moving maximum and minimum of the close price
Use two “moving_fn” aggregations, named MMax and MMin, with the parameter window as 14 and the parameter “buckets_path” as Daily. The parameter “shift” is set to 1 to include the most recent data. MMax and MMin are calculated using the functions MovingFunctions.max() and MovingFunctions.min().
"MMax": {
"moving_fn": {
"script": "MovingFunctions.max(values)", "window": 14, "buckets_path": "Daily", "shift":1
}
},
"MMin": {
"moving_fn": {
"script": "MovingFunctions.min(values)",
"window": 14, "buckets_path": "Daily", "shift":1
}
},
Calculate %K and %D of fast Stochastic and slow Stochastic
Use a “bucket_script” aggregation, FK, for the fast %K, and two “moving_fn” aggregations: FDSK for the fast %D (which is also the slow %K) and SD for the slow %D. For FK, use the parameter “buckets_path” to specify the results from Daily, MMin, and MMax; the fast %K is then calculated according to the equation in the script. FDSK is the 3-period SMA of FK, and SD is the 3-period SMA of FDSK. In both moving functions, the parameter “shift” is set to 1 to include the most recent data.
"FK": {
"bucket_script": {
"buckets_path": {"Daily": "Daily", "MMin": "MMin", "MMax": "MMax"},
"script": "100 * (params.Daily - params.MMin)/(params.MMax - params.MMin)"
}
},
"FDSK": {
"moving_fn": {
"script": "MovingFunctions.unweightedAvg(values)", "window": 3,
"buckets_path": "FK", "shift": 1
}
},
"SD": {
"moving_fn": {
"script": "MovingFunctions.unweightedAvg(values)", "window": 3,
"buckets_path": "FDSK", "shift": 1
}
},
Identify the crossover type of %K and %D
a) Use a “bucket_script” aggregation, named FKFD_Diff, with the parameter “buckets_path” specifying the FK and FDSK values to determine whether their difference is positive or negative. If FK is above FDSK, set it to 1. If FK is below FDSK, set it to -1. If they are equal, set it to 0. The SKSD_Diff aggregation is defined in the same way.
"FKFD_Diff": {
"bucket_script": {
"buckets_path": {"FK": "FK", "FDSK": "FDSK"},
"script": "(params.FK - params.FDSK) > 0 ? 1 : ((params.FK - params.FDSK) == 0 ? 0 : -1)"
}
},
"SKSD_Diff": {
"bucket_script": {
"buckets_path": {"FDSK": "FDSK", "SD": "SD"},
"script": "(params.FDSK - params.SD) > 0 ? 1 : ((params.FDSK - params.SD) == 0 ? 0 : -1)"
}
},
b) Use a “derivative” aggregation, named F_Diff, with the parameter “buckets_path” set to FKFD_Diff to find the difference from the value of the previous bucket. For the fast Stochastic, a value of -1 or -2 indicates that FK crosses below FD, and a value of 1 or 2 indicates that FK crosses above FD. The S_Diff aggregation is defined in the same way for the slow Stochastic.
"F_Diff": {
"derivative": {
"buckets_path": "FKFD_Diff"
}
},
"S_Diff": {
"derivative": {
"buckets_path": "SKSD_Diff"
}
},
c) The crossing of %K and %D may involve one or two trading days. Therefore, we need data from the previous trading day. Use a “moving_fn” aggregation, named PRE_FK, with the parameter “window” set to 1, the parameter “buckets_path” set to FK, and the function MovingFunctions.sum(). Because “shift” is left at its default, the window contains only the data of the previous day. The PRE_FDSK and PRE_SD aggregations are defined in the same way.
"PRE_FK": {
"moving_fn": {
"script": "MovingFunctions.sum(values)",
"window": 1, "buckets_path": "FK"
}
},
"PRE_FDSK": {
"moving_fn": {
"script": "MovingFunctions.sum(values)",
"window": 1, "buckets_path": "FDSK"
}
},
"PRE_SD": {
"moving_fn": {
"script": "MovingFunctions.sum(values)",
"window": 1, "buckets_path": "SD"
}
},
d) When the crossing of %K and %D involves two trading days, one of the trading days may not be in the overbought or oversold region. To ensure that the crossover occurs in the right region, the first trading day of the crossover must be in the overbought or oversold region. In our experiments, requiring both trading days to be in the overbought or oversold region did not produce good results. Therefore, we do not check the region on the second trading day of the crossover. To determine whether the crossover is valid, the F_Type aggregation checks the following criteria. If it is a sell signal, set F_Type to 1. If it is a buy signal, set F_Type to -1. Otherwise, set F_Type to 0. The S_Type aggregation is defined in the same way.
◆ Within overbought region
params.PRE_FK > 80 && params.PRE_FDSK > 80
◆ Within oversold region
params.PRE_FK < 20 && params.PRE_FDSK < 20
◆ Crossover of concern within the overbought region (FK moves below FDSK)
params.F_Diff == -1 || params.F_Diff == -2
◆ Crossover of concern within the oversold region (FK moves above FDSK)
params.F_Diff == 1 || params.F_Diff == 2
◆ FK crosses below FDSK
params.FK <= params.FDSK
◆ FK crosses above FDSK
params.FK >= params.FDSK
"F_Type": {
"bucket_script": {
"buckets_path": {
"F_Diff": "F_Diff", "FK": "FK", "FDSK": "FDSK", "PRE_FK": "PRE_FK", "PRE_FDSK": "PRE_FDSK"
},
"script": "((params.F_Diff == -1 || params.F_Diff == -2) && params.PRE_FK > 80 && params.PRE_FDSK > 80 && params.FK <= params.FDSK) ? 1 : (((params.F_Diff == 1 || params.F_Diff == 2) && params.PRE_FK < 20 && params.PRE_FDSK < 20 && params.FK >= params.FDSK) ? -1 : 0)"
}
},
"S_Type": {
"bucket_script": {
"buckets_path": {
"S_Diff": "S_Diff", "FDSK": "FDSK", "SD": "SD", "PRE_FDSK": "PRE_FDSK", "PRE_SD": "PRE_SD"
},
"script": "((params.S_Diff == -1 || params.S_Diff == -2) && params.PRE_FDSK > 80 && params.PRE_SD > 80 && params.FDSK <= params.SD) ? 1 : (((params.S_Diff == 1 || params.S_Diff == 2) && params.PRE_FDSK < 20 && params.PRE_SD < 20 && params.FDSK >= params.SD) ? -1 : 0)"
}
},
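To make sub-steps a) to d) easier to follow, here is a minimal Python sketch of the same decision logic applied to two plain lists of %K and %D values. It is an illustration only: the function names `sign` and `signal_types` are hypothetical, and the first day is simply given 0 because it has no previous day to compare with.

```python
def sign(x):
    """Return 1, 0, or -1, like the FKFD_Diff script."""
    return (x > 0) - (x < 0)


def signal_types(k, d, overbought=80.0, oversold=20.0):
    """Return 1 (sell), -1 (buy), or 0 (hold) for each day."""
    types = [0]  # assumption: no signal on the first day
    for i in range(1, len(k)):
        # Like F_Diff: change of the sign of (%K - %D) from the previous day
        diff = sign(k[i] - d[i]) - sign(k[i - 1] - d[i - 1])
        pre_k, pre_d = k[i - 1], d[i - 1]  # like PRE_FK and PRE_FDSK
        if diff < 0 and pre_k > overbought and pre_d > overbought and k[i] <= d[i]:
            types.append(1)   # %K crosses below %D, starting in the overbought region
        elif diff > 0 and pre_k < oversold and pre_d < oversold and k[i] >= d[i]:
            types.append(-1)  # %K crosses above %D, starting in the oversold region
        else:
            types.append(0)
    return types


print(signal_types([85, 90, 82, 70], [80, 85, 84, 78]))  # [0, 0, 1, 0]
print(signal_types([10, 12, 18, 30], [15, 16, 17, 20]))  # [0, 0, -1, 0]
```

Only the previous day's values are tested against the 80 and 20 thresholds, which corresponds to checking the region on the first trading day of the crossover.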
Filter out the additional documents for output
Use a “bucket_selector” aggregation, named S_Date, with the parameter “buckets_path” set to “DateStr” to select the correct buckets. The selection criterion is a bucket date on or after 2021-01-15 (epoch time 1610668800000 in milliseconds, UTC).
"S_Date": {
"bucket_selector": {
"buckets_path": {"DateStr": "DateStr"},
"script": "params.DateStr >= 1610668800000L"
}
}
}
}
},
"from": 0, "size": 0
}
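The cutoff used in the S_Date script is an epoch time in milliseconds (UTC). When the backtesting period changes, the value can be computed rather than typed by hand. The helper below is a hypothetical convenience and not part of the original project.

```python
from datetime import datetime, timezone


def to_epoch_millis(date_str):
    """Convert a 'YYYY-MM-DD' date at midnight UTC to epoch milliseconds."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


print(to_epoch_millis("2021-01-15"))  # 1610668800000
```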
After collecting the results, we can draw the figures shown earlier.
The request emits buy, sell, or hold signals; however, those signals only satisfy the second and third rules of the simple Stochastic crossover strategy. For the first and fourth rules, we use a Python program. The program includes four parts.
◆ Read two command-line parameters. One is the selected ticker symbol, and the other is the name of the file containing the trading strategy, written as an Elasticsearch REST API request body in JSON format.
◆ Get the data from the Elasticsearch server.
◆ Parse the response data and refine the buy and sell signal.
◆ Report the backtest statistics.
The main function is shown as follows:
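def main(argv):
    # read the request file name, ticker symbol, and signal type
    inputfile, symbol, type = get_opt(argv)
    # get the data from the Elasticsearch server
    resp = get_data(inputfile, symbol)
    # parse the response data and refine the buy/sell signals
    transactions = parse_data(resp, type)
    # report the backtest statistics
    report(transactions, type)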
In this article, only the code segment that refines the buy and sell signals is shown. Readers can refer to the open-source project on GitHub (Backtest_Stochastics) for the rest. To ensure that one share is bought and held at a time, and that no purchase occurs before the held share is sold, we use the boolean variable “hold” to ensure that the transaction meets the following conditions.
◆ A buy signal (value equal to -1) is honored when the hold flag is False
◆ A sell signal (value equal to 1) is honored when the hold flag is True
The parse_data() function is shown below. At the end, the transactions array contains the valid signals.
import json

# parse the response data and refine the buy/sell signal
def parse_data(resp, type):
    result = json.loads(resp)
    transactions = []
    aggregations = result.get('aggregations')
    # the key must match the date_histogram aggregation name in the request
    if aggregations and 'Backtest_Stochastics' in aggregations:
        Backtest_Stochastics = aggregations['Backtest_Stochastics']
        hold = False
        if Backtest_Stochastics and 'buckets' in Backtest_Stochastics:
            for bucket in Backtest_Stochastics['buckets']:
                transaction = {}
                transaction['date'] = bucket['key_as_string']
                transaction['Daily'] = bucket['Daily']['value']
                # honor buy signal if there is no share hold
                if bucket[type]['value'] == -1:
                    transaction['original'] = 'buy'
                    if not hold:
                        transaction['buy_or_sell'] = 'buy'
                    else:
                        transaction['buy_or_sell'] = 'hold'
                    hold = True
                # honor sell signal if there is a share hold
                elif bucket[type]['value'] == 1:
                    transaction['original'] = 'sell'
                    if hold:
                        transaction['buy_or_sell'] = 'sell'
                    else:
                        transaction['buy_or_sell'] = 'hold'
                    hold = False
                # for other situations, just hold the action
                else:
                    transaction['original'] = 'hold'
                    transaction['buy_or_sell'] = 'hold'
                transactions.append(transaction)
    return transactions
The Python program provides statistics on the trading strategy, including the number of winning and losing buy/sell round trips. The following is the result for FDEV using the F_Type signal.
number of buy: 3
number of sell: 3
number of win: 3
number of lose: 0
total profit: 1.52
profit/transaction: 0.51
maximum buy price: 28.90
profit percent: 5.26%
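Statistics of this kind could be derived from the transactions list returned by parse_data(). The sketch below is a hypothetical `summarize` function, not the project's report() function, and it makes several assumptions: a share still held at the end is cashed out at the last price and counted as a sell, a round trip with zero profit counts as a loss, and the profit percent is the total profit divided by the maximum buy price. The last assumption is consistent with the FDEV figures above (1.52 / 28.90 ≈ 5.26%).

```python
def summarize(transactions):
    """Pair each honored buy with the next honored sell and collect statistics."""
    buys = 0
    max_buy_price = 0.0
    buy_price = None
    closed = []  # profit of each completed buy/sell round trip
    for t in transactions:
        if t['buy_or_sell'] == 'buy':
            buy_price = t['Daily']
            buys += 1
            max_buy_price = max(max_buy_price, buy_price)
        elif t['buy_or_sell'] == 'sell' and buy_price is not None:
            closed.append(t['Daily'] - buy_price)
            buy_price = None
    # Assumption: a share still held at the end is cashed out at the last price.
    if buy_price is not None:
        closed.append(transactions[-1]['Daily'] - buy_price)
    total = sum(closed)
    wins = sum(1 for p in closed if p > 0)
    return {
        'buy': buys,
        'sell': len(closed),
        'win': wins,
        'lose': len(closed) - wins,  # assumption: zero profit counts as a loss
        'total_profit': total,
        'profit_per_transaction': total / len(closed) if closed else 0.0,
        'max_buy_price': max_buy_price,
        'profit_percent': 100.0 * total / max_buy_price if max_buy_price else 0.0,
    }


sample = [
    {'Daily': 10.0, 'buy_or_sell': 'buy'},
    {'Daily': 12.0, 'buy_or_sell': 'sell'},
    {'Daily': 11.0, 'buy_or_sell': 'buy'},
    {'Daily': 10.5, 'buy_or_sell': 'hold'},
]
print(summarize(sample))
```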
The following table collects the statistics of 11 randomly picked ETFs using the fast Stochastic crossover trading strategy from 2021-01-15 to 2021-05-31. The results suggest that this was a good period for trading, because all selected symbols except the FTEC ETF made a profit. However, as most traders recommend, do not rely on a single indicator for trading.
The table below shows the result using the slow Stochastic crossover trading strategy from 2021-01-15 to 2021-05-31. It shows that no transaction is made for FQAL.
The table below compares the trading results of the fast Stochastic, slow Stochastic, and RSI strategies. It shows that RSI has a higher gain than the other two indicators.
The table below summarizes the number of buys, sells, wins, and losses. The RSI crossover trading strategy performs better.
Remarks:
I. Thanks to IEX (Investors Exchange) for providing ETF data and to GitHub for providing open-source project storage.
II. This article is based on a technical idea and does not constitute investment advice. Readers must take their own responsibility when using it.
III. There may still be errors in the article, and I encourage readers to correct me.
IV. Interested readers can refer to the writer's book for the basic skills of Elasticsearch: “Advanced Elasticsearch 7.0”, August 2019, Packt, ISBN: 9781789957754.