Demystifying market patterns: S&P 500, EDA, Bayes engine, cryptography and much more

Gabriel Zenobi
Coinmonks
45 min read · Mar 5, 2022


Topics to be treated

  • Download historical prices of the S&P500
  • EDA — Exploratory data analysis
  • Encoding patterns in strings, combinatorial analysis, filtering and optimization
  • Normal distribution: Calculation of price ranges
  • Chain Generator Class
  • Introduction to pattern probability analysis on the S&P500
  • Probability chains for independent stochastic events
  • Conditional probability and Bayes theorem for dependent chains of events
  • Bayes Engine: Developing heuristics and statistical engine
  • Analysis of classic patterns and candlesticks on S&P500
  • Final conclusion

Introduction

The following study performs a frequentist and Bayesian analysis of market patterns. These patterns are often mentioned in articles, in supposedly profitable trading systems all over the internet, and even in entire books written on the subject. My intention is not to discredit or insult anyone, but rather to shed light on a topic that has many “believers” about what the market is or could be. Much is said about specific price patterns whose exploitation would supposedly guarantee profitable, operational trading, when what really happens is quite another thing. The market is stochastic in nature, and throughout this study we will test that mathematical nature with experiments that anyone can happily take to other markets and/or instruments. Ignoring the statistical tools in this case is a serious mistake, and in my opinion it is what leads many people to have faith in these patterns, as if they were something static that must repeat in every market. They are not, and in fact proper analysis shows how they vary with several factors (time frame, instrument, period range).

Necessary libraries

To download the market data:
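The original install cell was not preserved; assuming the market data comes from the `yfinance` package (a common choice and consistent with the rest of the study), the setup would be:

```shell
pip install yfinance
```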

And then we import it together with the rest of the libraries we will use (pandas, scipy, matplotlib):

Download historical prices of the S&P500
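The download cell is missing from the extracted text; a minimal sketch, assuming the `yfinance` package and the `^GSPC` ticker (the start date and interval are illustrative defaults, not the article's exact parameters):

```python
def download_sp500(start="2000-01-01", interval="1d"):
    """Fetch S&P 500 OHLC history (ticker ^GSPC) as a pandas DataFrame."""
    import yfinance as yf  # third-party: pip install yfinance
    return yf.download("^GSPC", start=start, interval=interval)

# df = download_sp500()
# df[["Open", "High", "Low", "Close"]].tail()
```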

EDA — Exploratory data analysis

Let’s visually check the historical prices of the S&P500

Before classifying sequences of patterns we need to discriminate them correctly. For this we will create a function that obtains the “delta” of the candlestick body size (formed by its respective prices), determined by the maximum and minimum of each price.

Delta refers to the difference (or distance) of the candlestick pattern, from its minimum to its maximum in this case.

Delta-Type function
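The original Delta-Type cell is missing; a minimal sketch, assuming standard OHLC columns and signing the delta by candle direction (bullish positive, bearish negative), as the later bearish/bullish analysis requires:

```python
import pandas as pd

def get_delta(df):
    """Candle-body delta: the high-low distance, signed by the candle's
    direction (positive if close >= open, negative otherwise)."""
    size = df["High"] - df["Low"]
    sign = (df["Close"] >= df["Open"]).map({True: 1.0, False: -1.0})
    return size * sign
```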

Delta distribution

Let’s check graphically which sentiment dominates the market: bullish or bearish?

Outliers

Let’s look analytically at the relevant data of the distribution.

The most noticeable characteristics are:

  • The values come from a quasi-normal distribution.
  • It is sufficiently symmetric. The general rule says that if the skewness is between (-0.5, 0.5) the distribution is quasi-symmetric.
  • Existence of outliers
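These conclusions can be checked numerically; a small helper, assuming the deltas live in a pandas Series:

```python
import pandas as pd

def distribution_summary(deltas):
    """Statistics used to judge quasi-normality and symmetry."""
    return pd.Series({
        "mean": deltas.mean(),
        "std": deltas.std(),
        "skewness": deltas.skew(),  # quasi-symmetric if within (-0.5, 0.5)
        "kurtosis": deltas.kurt(),  # heavy tails hint at outliers
    })
```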

Encoding patterns in strings, combinatorial analysis, filtering and optimization.

It is now when the real challenge begins. The final objective will be to create a function capable of generating probabilities over a sequence of candlesticks, determined by the price frequency of the market under analysis. But first we must ask ourselves: what kind of candles are we studying? What do we call a sequence? Is it reliable to use this word? Are its characteristics robust to volatility? Is it possible to find a pattern we can rely on to make predictions?

In our analysis we will look more into the latter, and perhaps incorporate new concepts as we get rid of old ideas.

First step: rank determination criteria.

The first objective is to generate a series of ranges that tell us the size of each candlestick to study (i.e. its body). In other words, we want a criterion able to determine the appropriate candlestick size to select according to the time series, and it should apply to any asset on any time frame. Since prices are continuous values, every series is different, so we must generate a series of minimum-maximum intervals; otherwise an exact value would never be repeated twice, which is a property of the laws of probability.

In addition, each market has different dynamic factors (such as volatility, average historical prices, etc..) and therefore this analysis would be seriously affected if the conditions are not accurate, so the ranges would vary by having different price distributions, even within the market itself if we extend the analysis to past data and the above factors change due to different causes (crisis, cycles, fundamentals, etc..).

Recall that we start from the assumption that historical market prices form a normal distribution; therefore we will use the standardization formula, which allows us to choose an interval using the distribution’s mean and standard deviation. I will explain shortly why this is the case; for the moment, let’s look at the mathematical formula:

Where:

  • z (z-score): critical value; depends on the interval chosen in the distribution.
  • x: the term we solve for; it gives us the numerical value of the interval we are looking for.
  • mu: population mean.
  • sigma: population standard deviation.

To select the z-value we will look at a pre-calculated table. The reason is simple: we will propose a constant interval and, starting from it, take different body sizes or price deltas. For this case I will take an interval of 20% (i.e. each body size will allow 20% of noise or variation; this is the accepted error). Let’s look at a fragment of the table for this example.

That is, starting from an interval of 20% (if we look at the red box of the table, the corresponding z-value is 0.53), the successive values taken will increase by this measure; for this example the generated ranges will be 20%, 40%, 60%, etc. Using this formula we will not depend on fixed values, and with this we will have generalized the problem to any market or time scale.

Now we only need to choose a stopping criterion, i.e. how many 20% intervals can be taken at most within the distribution. For this I will use 2.5 standard deviations, so the number of ranges will be determined by the deviations we allow; if we widen the candle range (say to 40%), this number will decrease, because the acceptance value is larger.

Let’s write the function:
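The original cell was not preserved; the following reconstruction follows the description above (z ≈ 0.53 for the 20% interval, uniform steps, and a 2.5-deviation stop), so the exact numbers depend on the data:

```python
import statistics

def get_candles_body_range(deltas, z=0.53, max_std=2.5):
    """Symmetric delta ranges: uniform steps of z*sigma around the
    mean, stopping once max_std standard deviations are covered."""
    mu = statistics.fmean(deltas)
    sigma = statistics.pstdev(deltas)
    step = z * sigma            # x = mu + z*sigma, standardization solved for x
    n_steps = int(max_std / z)  # how many steps fit within max_std deviations
    return [mu + k * step for k in range(-n_steps, n_steps + 1)]
```

With z = 0.53 and 2.5 deviations this yields int(2.5 / 0.53) = 4 steps per side, i.e. 9 ranges in total, matching the count in the output.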

Output:

[-97.43731612956171, -73.05217553169632, -48.667034933830934, -24.28189433596555, -0.10324626189983444, 24.28189433596555, 48.667034933830934, 73.05217553169632, 97.43731612956171]
total ranges: 9

As can be seen, we do not integrate the distribution to obtain a new value of z at each iteration of X; instead we calculate the 20% range once, and from then on all the increments of X are uniform (based on the initial value of the formula).

This gives us more ranges than if we used intervals determined by z (integrating the Gaussian probability function), since those would accumulate toward 100%, which corresponds to the right tail of the distribution; by the inverse logic they would end at 0% for the left tail (because those values are negative).

Another important detail to keep in mind is that we calculated the ranges in two parts; first the positive and then the negative (symmetric), finally these generated lists were merged into a single one which contains all the ranges...

A more detailed explanation of this will be given in the optional calculation of price ranges section.

As mentioned at the beginning, the formula solved for X is the standardization formula; X represents an observation, and with it we determine the numerical value that gives us the range. We should also mention that these can be negative, since they correspond to bearish candles! Now we need a function that generates intervals from the calculated ranges.

Remark: When we finish our study this formula will be modified to generalize any type of interval without the need to look at a pre-calculated z-table, for the moment we do it this way for learning purposes.

Interval coding.

The next step will be to encode the generated ranges in order to form the candlestick patterns in a simpler way. I will create an algorithm that assigns a letter to each distinct continuous value, although the symbology is indifferent.
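A minimal sketch of such an encoder (assuming at most 26 ranges, one uppercase letter each):

```python
import string

def encode_ranges(ranges):
    """Map one uppercase letter to each range value; the symbols
    themselves are arbitrary, only their order matters."""
    return dict(zip(string.ascii_uppercase, ranges))
```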

Output:

{‘A’: -97.43731612956171, ‘B’: -73.05217553169632, ‘C’: -48.667034933830934, ‘D’: -24.28189433596555, ‘E’: -0.10324626189983444, ‘F’: 24.28189433596555, ‘G’: 48.667034933830934, ‘H’: 73.05217553169632, ‘I’: 97.43731612956171}

At this point we need a function that is able to take the ranges and transform them into intervals, with this we will form pairs of lows and highs. In this way we will be able to permute them to create the sequences that will represent the different types of candlestick bodies, i.e. different combinations of price ranges that the market can generate according to our estimates.

This is not a simple problem in computational terms; in fact it is a problem of variations with repetition, studied in combinatorics. As the strings grow in length, the computational cost grows exponentially. Let’s understand the following: what we are looking for is to evaluate every existing combination of intervals or parameters; this is a brute-force problem. Before programming the functions for this task we will turn to the mathematical expression.

Variations with repetition

Formal definition: each of the tuples that can be formed by taking elements of a set is called a variation. In finite set combinatorics it is often necessary to know the number of variations of a set of m elements taken in tuples of n elements (with or without repeated elements in the tuples). The variations with repetition of sets of m elements taken in n-element tuples is the number of different n-tuples of a set of m elements.

It sounds confusing, but it is simpler if we evaluate it with an example. For our first case we will look for sequences of pairs (2 elements). We must remember that in total we have X possible ranges generated by the function get_candles_body (this value will change if we use more standard deviations, which is equivalent to taking more intervals). Following the formula, we will have the following number of possible combinations for our intervals:

VR(9, 2) = 9² = 81.

The result of VR(m, n) is the number of possible two-letter elements we can form. These represent the candlestick interval pairs, such as: {AA, AB, AC, BB, …}.

To generate these pairs or intervals I will use the itertools library, this will save us work and lines of code. Let’s write the function.
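A sketch using itertools.product, which generates exactly the VR(m, n) = m**n variations with repetition:

```python
import itertools

def get_intervals(letters, n=2):
    """All length-n tuples over the coded letters: VR(m, n) = m**n."""
    return list(itertools.product(letters, repeat=n))
```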

Output:

[(‘A’, ‘A’), (‘A’, ‘B’), (‘A’, ‘C’), (‘A’, ‘D’), (‘A’, ‘E’), (‘A’, ‘F’), (‘A’, ‘G’), (‘A’, ‘H’), (‘A’, ‘I’), (‘B’, ‘A’), (‘B’, ‘B’), (‘B’, ‘C’), (‘B’, ‘D’), (‘B’, ‘E’), (‘B’, ‘F’)]
total intervals: 81

Printing the result of the function, we can check the number of pairs formed; this is just what we wanted.

Before continuing with the next step, I must say that we are now in a position to understand the mathematical reason for this. I will repeat something already mentioned previously:

we need interval pairs because they represent the body size of each candlestick pattern to be studied, regardless of how many candlesticks our future pattern chains have; we may also be interested only in how many times a particular candlestick repeats, and with this we can carry out any study to do with patterns.

On the other hand, the terms “minimum and maximum” refer to the allowed “error”, i.e. the minimum where our candlestick starts and the maximum that bounds it. If we say, as an example, that a candle is of the form (20, 80), this translates as follows: its body can span any distance X with 20 <= X <= 80 units, so any continuous value within that range is allowed. For example, a price delta of 70 would be accepted because it falls within the interval, while 10 or 90 would not, and so would be rejected. This seems very obvious, but I wanted to emphasize it one last time. The same reasoning applies to negative intervals, which represent bearish candlesticks.

Interval filtering

Before generating the sequences or chains that will finally serve to study their frequency in the market, we must reason about the following: not all intervals are valid. Although our formula counts the total number of possible intervals, it is not true that all of them will be useful in this context. For example, the value where the minimum of a candle starts can never be its maximum; therefore, if A < B, the interval (A, B) is allowed, but (B, A) never is, because it is meaningless.

The criteria used will be as follows:

  • If the values are positive and MIN < MAX, the interval is accepted; otherwise it is rejected and discarded.
  • If the values are negative the comparison is the same, MIN < MAX, since the closed interval notation is [MIN, MAX]; otherwise the interval is rejected.

In addition, the function takes a parameter we must not forget: the dictionary that stored the coding information from the initial steps, which contains the continuous value represented by each letter.
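A sketch of the filter following these criteria; since the coded letters are ordered by value, both the positive and the negative case reduce to requiring the decoded MIN to be strictly below the decoded MAX (the strict inequality also discards the degenerate equal pairs):

```python
def filter_intervals(intervals, coding):
    """Keep intervals whose decoded minimum is strictly below the
    decoded maximum; reversed and degenerate pairs are discarded.
    `coding` is the letter -> continuous value dictionary."""
    return [(lo, hi) for lo, hi in intervals if coding[lo] < coding[hi]]
```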

Output:

[(‘A’, ‘B’), (‘A’, ‘C’), (‘A’, ‘D’), (‘A’, ‘E’), (‘A’, ‘F’), (‘A’, ‘G’), (‘A’, ‘H’), (‘A’, ‘I’), (‘B’, ‘C’), (‘B’, ‘D’), (‘B’, ‘E’), (‘B’, ‘F’), (‘B’, ‘G’), (‘B’, ‘H’), (‘B’, ‘I’), (‘C’, ‘D’), (‘C’, ‘E’), (‘C’, ‘F’), (‘C’, ‘G’), (‘C’, ‘H’), (‘C’, ‘I’), (‘D’, ‘E’), (‘D’, ‘F’), (‘D’, ‘G’), (‘D’, ‘H’), (‘D’, ‘I’), (‘E’, ‘F’), (‘E’, ‘G’), (‘E’, ‘H’), (‘E’, ‘I’), (‘F’, ‘G’), (‘F’, ‘H’), (‘F’, ‘I’), (‘G’, ‘H’), (‘G’, ‘I’), (‘H’, ‘I’)]
total intervals: 36

As we can see, our function includes equality in the MIN and MAX conditionals; this is because we also want to filter out repeated intervals, for example (A, A), (B, B), etc. If the interval were exactly the same on both sides, this would amount to asking the candle body to take one exact value: if A = 50 then (A, A) is equivalent to (50, 50), so the candle value found in the market would have to equal that value exactly (which is the same as having no interval at all, since with equal minimum and maximum the allowed error is 0). On a continuous scale this is practically impossible, as the laws of probability demonstrate.

Let’s see how many intervals we managed to eliminate with our filter function.

Output: filtered intervals: 45

Chain coding.

For the moment we have only generated the intervals that represent the allowed limits of our candle body; now we have to form the patterns (i.e. the chains of candles) of whatever length we choose. This is the most complex part in computational terms, and although we already noticed this, it is time to revisit the formula studied in combinatorics. Let’s see how many possible candlestick chains we can form given a number of total intervals:

Let’s see what each thing means:

  • VR(m, n) : Total intervals created, for example {AA, BA, BB, AC, CC, …}, we saw this formula in a previous analysis.
  • VR(t, k): Total possible chains given the total number of available intervals.

This quickly becomes excessively expensive even if our set of generated intervals is small; luckily the filter function helped us here by keeping only the most relevant values. Without it, the actual number of chains would be titanic. Let’s see a brief example:
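The brief example is just the formula applied twice: with 9 ranges the unfiltered interval pairs are VR(9, 2) = 81, and chains of 3 intervals give VR(81, 3):

```python
total_intervals = 9 ** 2              # VR(9, 2): unfiltered interval pairs
total_chains = total_intervals ** 3   # VR(81, 3): chains of length 3
print(total_chains)                   # 531441
```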

Output: 531441

If we assume that we generate 9 price ranges, this gives us a total of 81 possible intervals (remember that the exponent is 2 because the intervals are pairs); with these we can form the chains or sequences of market patterns to study. Let’s make the following calculations.

The number of candlestick patterns that we can form by increasing the length of the candlestick sequences or chains are:

  • If X_chains=2 : 81² = 6561
  • If X_chains=3 : 81³ = 531441
  • If X_chains=4 : 81⁴ = 43046721
  • If X_chains=5 : 81⁵ = 3486784401

That is, if we only have 81 price intervals (min, max) and we want to analyze a sequence or chain of 3 candlestick patterns, for example {(AA, AB, BC), (AB, AA, CC), etc.}, we end up with a total of 531441 possible combinations: extremely expensive! (Not to mention chains of length 4 and 5.)

In other words, had we not optimized the number of intervals with the filter function, the total number of chains would be (I²)^N_chains, which would enormously increase the complexity of our problem.

String generator function

Finally we are able to generate our strings for further market analysis, for the simplest example I will use strings of 3 patterns in length. Let’s write the function.
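A sketch of the generator, again with itertools.product over the filtered intervals (each (MIN, MAX) pair joined into a two-letter code):

```python
import itertools

def generate_chains(intervals, length=3):
    """All pattern chains of the given length over the filtered
    intervals; each (MIN, MAX) pair becomes a two-letter code."""
    codes = ["".join(pair) for pair in intervals]
    return list(itertools.product(codes, repeat=length))
```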

Output:

[(‘AB’, ‘AB’, ‘AB’), (‘AB’, ‘AB’, ‘AC’), (‘AB’, ‘AB’, ‘AD’), (‘AB’, ‘AB’, ‘AE’), (‘AB’, ‘AB’, ‘AF’), (‘AB’, ‘AB’, ‘AG’), (‘AB’, ‘AB’, ‘AH’), (‘AB’, ‘AB’, ‘AI’), (‘AB’, ‘AB’, ‘BC’), (‘AB’, ‘AB’, ‘BD’), (‘AB’, ‘AB’, ‘BE’), (‘AB’, ‘AB’, ‘BF’), (‘AB’, ‘AB’, ‘BG’), (‘AB’, ‘AB’, ‘BH’), (‘AB’, ‘AB’, ‘BI’)]
total chains: 46656

We will continue this study in the Chain Generator Class section where we will formalize the concepts learned in a class, and add what is necessary to continue with our market analysis.

Normal distribution: Calculation of price ranges

This section is not strictly necessary to complete our study, but it is interesting to understand the notion of ranges and intervals taken by our function. Let’s remember that, after all, these will be combined to form candlestick intervals, which will finally form the market patterns we are looking for.

In the previous section we saw the get_candles_body_range function, which was incomplete. Now we will use the python scipy module to calculate the z-score independently and generalize our formula for any interval we take.

Output:

If we look at the formula we will also observe the following:

st.norm.ppf(0.5 + interval_percent/2)

Where:

  • interval_percent/2: our chosen interval percentage, on which the interval expansion bases its increment. It is divided by 2 because the cumulative distribution runs from 0 (left tail end) to 1 (right tail end) and the interval is symmetric, so half of it falls on each side of the center; if we chose an interval of 0.5 per side we would reach the end of the distribution and the calculation would overflow.
  • 0.5 is the cumulative probability at the mean of the standard normal distribution (remember it is defined as N(0, 1)), whose z-score is 0. We start from there, adding our interval, which is consistent with the value of the mean when we solve the standardization formula.
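The same calculation can be done with the standard library alone; `NormalDist().inv_cdf` is the equivalent of scipy’s `st.norm.ppf` (passing 0.4 recovers the z ≈ 0.53 read from the pre-calculated table, since the 20% interval corresponds to 40%/2):

```python
from statistics import NormalDist

def z_for_interval(interval_percent):
    """z-score bounding a symmetric interval of the given width on
    N(0, 1); equivalent to st.norm.ppf(0.5 + interval_percent / 2)."""
    return NormalDist().inv_cdf(0.5 + interval_percent / 2)
```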

Let’s do a simple test. To continue with the previous example, the intervals will be 20%…

Output:

As we can see, the increments went from 0.5 toward 1.0 (actually we used an approximate upper value, otherwise the result would have been +Inf); these represent the z-scores for intervals of 20%, 40%, 60%, etc.

The logic for negative values is similar, let’s see it.

Output:

Exactly the same, only the interval now decreases in steps of -20% and the z-score carries a negative sign.

Chain Generator Class

In the section on Encoding patterns… we studied and reasoned about all the elements needed to generate sequences or chains of candlesticks in any type of market, regardless of its conditions or time frame. We saw the meaning of ranges and intervals and, using coding and optimization, we concluded a robust combinatorial analysis. However, this is not all: to complete our study we must be able to decode these sequences and use them to calculate the frequency of these patterns in the market.

In the following, we will recapitulate what we have seen in a class that does all of this, and which will also have a decoding and predictive function.

As an additional step we will store the value of X, which is the increment value of the interval(s). This value is very important and in the next section I will explain why and how we will use it. For the moment let’s continue.

Let’s look at all the possible delta price ranges.

Output:

Let’s see all the possible coded intervals (the intervals representing the Minimum-Maximum of each candlestick pattern)

List of decoded ranges, these are used to form the delta price intervals.

Output:

Let’s see all the possible strings that can be formed, this is the same information seen in the first step but without decoding

If we check the basic information we can also observe the total number of chain combinations (after applying the filter function and removing the equal intervals), as well as the total intervals used to create these chains, i.e. those used for the permutations that form them after the optimization; finally we can see the size of this optimized set.

For chains of only length 3 the possible combinations are immense; this already anticipates how improbable it can be to find a specific chain (even if we seek to maximize a sequence of patterns).

Despite the optimization applied, we see that it is still not enough, so we will calculate this more efficiently making use of the laws of probability and mathematics. All of this in the next section!

Introduction to pattern probability analysis on the S&P500.

From this moment on we are able to implement the necessary routines to study the market. We will complement everything seen so far, creating a new class which will be our base to study the market patterns, this will allow us to calculate the frequencies or probabilities using the Laplace rule and from there we will classify the different chains of patterns that we can create forming a matrix.

Laplace Probability Rule

This is equivalent to calculating the number of favorable events over the total number of observations or events.

We will implement this in our class, in addition to a function that will store in a dictionary the type of pattern with its frequency.
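A stripped-down sketch of that idea, outside of any class; `classify` stands in for the interval-matching predict step and is a hypothetical helper:

```python
def pattern_frequencies(deltas, classify):
    """Laplace rule per pattern: occurrences of each label divided by
    the total number of observations."""
    counts = {}
    for delta in deltas:
        label = classify(delta)
        counts[label] = counts.get(label, 0) + 1
    total = len(deltas)
    return {label: n / total for label, n in counts.items()}
```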

As we can see, the nested loop calculates the frequency of each pattern generated by the predict function of the PatternsChainsGenerator class; these are the intervals used to classify such patterns given all the previously calculated market bodies (price deltas), a task handled by the main loop. Note that the parameter used is a chain of length 1, so the generated chains consist of a single interval, which allows us to study each pattern individually along with its frequency or probability.

Let’s test the function by giving as parameter the dataframe created at the beginning of our analysis.

Output:

This is interesting: we can observe that certain intervals forming the acceptance pattern in fact have a very high frequency, ranging from almost 70% to 90%. Remember that the intervals used are 20% (i.e. 40%/2 according to the calculations of the standardization formula).

It is right here that we must pause: these values are not accidental. One might mistakenly think we are in front of the holy grail of patterns and can therefore build an expert trading system on it; we will demonstrate that it is not so easy, and that we are far from it.

To understand the above we must emphasize (and repeat) the following: a pattern is nothing more than an acceptance interval. That is how we defined it, and based on that notion we created a criterion that accepts values depending on the price delta. For this we used the concept of “error”, since after all that is what the intervals symbolize.

So the values with a high probability are nothing but excessively open intervals, so wide in range that they accept both positive and negative values. To solve this we need another filter function that eliminates the intervals crossing a certain gap; this is proposed below along with its class.

Implementation of the class for pattern validation.

Output:

We see how our class calls an additional function (which was implemented in the Encoding patterns section), this is get_interval_increment which returns the increment value of each generated interval. Recall that this was calculated in previous steps using the standardization function.

Now it makes sense: as we can see, the probabilities dropped drastically when the distance of the intervals was reduced. This was expected, and now we understand why. Let’s sort this information in a dataframe.

Let’s make a plot on the best probabilities found

As we can see, this graph gives us a visual reference of the probabilities we calculated: the X axis represents the total intervals and the Y axis the frequency or probability.

Probability chains for independent stochastic events

To be able to fully model the probability chains in the market we must first understand some notions about probability, these play a fundamental role in our search and have something important to tell us, so let’s start with the definition of probability for independent events.

It is said that random events are independent of each other when the probability of each of them is not influenced by the occurrence or not of the other, i.e. the events are not related in any way. In probability theory this rule is expressed as follows:

For example, given events {A1, A2, …, An}, if these are independent then their joint probability (the intersection of the events) is the product of the individual probabilities.

For only two events this rule is summarized as P(A ∩ B) = P(A) · P(B), which is how it is usually studied.

Taking this theory to the market, we will say that A and B represent certain candlestick patterns (their characteristics are irrelevant at this point). If we want to know the probability of a chain of pattern events, we must first calculate the probability of each one separately and then multiply, chain by chain, to obtain the respective probabilities. This is easy to put into practice; the big flaw is the following: for independent events the order of the sequence is irrelevant, so it would be the same to calculate a chain of type AAB, ABA or BAA (where each letter represents a different candle pattern), since multiplication is commutative. Using the property of intersection of events, in the example we would have:

P(AAB) = P(BAA) = P(A) · P(A) · P(B) = P(B) · P(A) · P(A)
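A two-line check of that commutativity (the probabilities here are made up for illustration):

```python
import math

def independent_chain_probability(chain, probs):
    """Joint probability of an independent chain: the plain product of
    the individual pattern probabilities, so order cannot matter."""
    return math.prod(probs[pattern] for pattern in chain)

probs = {"A": 0.5, "B": 0.2}
assert independent_chain_probability("AAB", probs) == independent_chain_probability("BAA", probs)
```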

From the studies done so far we were able to show that this is not true: order is fundamental for our model, since each chain has a different likelihood, yet if we base our calculations on the independence of events we are forced to ignore that order. As a result, apparently different chains share the same probability of occurrence, which, as said before, makes no real sense in markets. However, it is not all wrong, because we have no theory that affirms or refutes that the close of a price A implies the formation of a future price B; in fact, thanks to the Markov property we can treat the market as a memoryless stochastic process, so an event A does not necessarily imply an event B. We will discuss this in the section on the probability of dependent events; still, the great flaw of modeling the market only in this way had to be exposed.

Coding the intervals to calculate the probability matrix.

The first step will be to code the intervals (recycling some of the functions we already created) in order to form the chains of independent events.

Output:

This function gives us as a result a dictionary of probabilities sorted by letters, which represent the ‘coded’ values we will use to generate the probability chains. Let’s use combinatorics and store the results in a pandas data frame (just for a better visual aspect).

Finally we can generate the probability matrix of the sequences or events. Let’s implement all this plus the decoder function in a class.

Let’s compile the class and build the object with a single chain-length parameter of 3 (as in our example).

As we can see our new class returns in a data-frame the chains of patterns formed together with their respective joint probability. Let’s plot this and see if there are any relevant features.

As expected, the probabilities are too weak to rely on. Moreover, their increments are minimal and always oscillate at low levels.

The last graph of interest will be its distribution.

Let us prove that the probability of intersection of events ignores the order of our pattern-chain.

This is what we said at the beginning, and as we can see it has now been demonstrated. The probability of independent events is insufficient to conclude our study due to the nature of the market: although each price is independent of the previous one, we must take the order into account as an additional step.

Conclusion

Both the average and the distribution are irrelevant, and therefore we conclude the following: if we want to build a system based on these patterns, we must analyze and think carefully about all of the above. Understand that in the real world there are risks associated with the available capital in our account and the time frame in which we operate (it would make no sense to build a trading system on a long-term time scale if our account cannot support such spreads). Such patterns rest on what was said above, and they will fail even more if the intervals that form them are excessively large, making us believe we are facing the discovery of the century when in fact they do not correctly represent the average price intervals of the market.

An expert system incapable of correctly classifying the order of the patterns it seeks to exploit is a system doomed to failure; if we persist with this idea, we already know what results await us. Therefore we should never trust anyone who claims to have found the holy grail of market patterns; they would be liars. Mathematics is our witness and great teacher in this search.

Conditional probability and Bayes theorem for dependent chains of events

In the previous section we talked about independent events and how to use probability and combinatorics to model such chains of events; we also demonstrated the danger of using that theory alone. In the following we will complete the missing mathematical notions and definitions in order to conclude our analysis of the efficiency of patterns in the market and how convenient they are to use in a theoretical trading system. Let’s start with the definition of conditional probability for dependent events, followed by an explanation.

This is the simplest form of the rule for knowing the probability of two dependent events, which is read as “the probability of A given B”. The following tree diagram explains this, we will continue with the explanation.

In other words, what we are looking for is an ordered sequence of events, so we can abstractly represent all the possible “chains” (dependent events) of probabilities. However this is not all, because in the markets there are not only two price patterns but thanks to this scheme we managed to simplify the base model, now we only need to generalize the concept of conditional probability.

Bayes’ Theorem

In summary:

Now that we understand this, there is one last notion we must add in order to model the market correctly. Let’s look at the Markov property:

This property tells us that it is not necessary to know the whole past (historical data) to explain the future; the current closing price of the market is enough. This is relevant because it lets us say the events are independent in the sense that a price (or price interval) does not influence the formation of the next beyond the present state, so we get a much simpler model, as this implies that markets have no memory.

With this in mind we can now write the function that calculates the probability of a chain of patterns using Bayes’ theorem and the law of total probability.

Bayes function for chain of events: Conditional probability

This gives us 0.25 as output.

As we can see, this first function calculates the joint probabilities; the example deals with a simple chain composed of two possible events, A and B, with their associated probabilities (needed to run the calculation).
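Since the code embed does not render here, the idea can be sketched as follows. The function name and signature are illustrative, not the article’s original code; it simply applies the chain rule by multiplying the supplied probabilities:

```python
def chain_joint_probability(probs):
    """Chain rule: P(A, B, ...) = P(B) * P(A|B) * ...

    probs: the (conditional) probability of each event in the chain,
    e.g. [P(B), P(A|B)] for the chain 'A given B'.
    (Hypothetical sketch, not the article's exact implementation.)
    """
    joint = 1.0
    for p in probs:
        joint *= p
    return joint

# Two events A and B, each with probability 0.5:
print(chain_joint_probability([0.5, 0.5]))  # 0.25
```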

Bayes function for chain of events: Law of total probability

Output:

This is the denominator of the formula, which corresponds to the total probability. We first need to obtain all possible event sequences (paths) for the chain we want to calculate; then we multiply along each path and apply the summation (one term per chain).
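A minimal sketch of this denominator, with hypothetical names and numbers (the real function works over the engine’s pattern chains, not over plain dictionaries):

```python
def total_probability(priors, likelihood):
    """Law of total probability: P(E) = sum over H of P(E|H) * P(H).

    priors:     {'H': P(H), ...} for every possible pattern H
    likelihood: {'H': P(E|H), ...}
    (Illustrative sketch, not the article's exact implementation.)
    """
    return sum(likelihood[h] * priors[h] for h in priors)

# Hypothetical two-pattern example:
priors = {'A': 0.6, 'B': 0.4}       # P(A), P(B)
likelihood = {'A': 0.5, 'B': 0.25}  # P(E|A), P(E|B)
print(total_probability(priors, likelihood))  # ≈ 0.4
```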

Putting all this together I will call the final function Bayes.

Output:

Note: The parameter passed to the function is read as ‘A’ given ‘B’, given ‘C’. That is, it respects the order of the formula.
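Putting the two pieces together, a self-contained sketch of what such a Bayes function could look like (names and example numbers are assumptions, not the original code):

```python
def total_probability(priors, likelihood):
    """P(E) = sum over H of P(E|H) * P(H) -- the Bayes denominator."""
    return sum(likelihood[h] * priors[h] for h in priors)

def bayes(h, priors, likelihood):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E).
    (Hypothetical sketch, not the article's exact implementation.)"""
    return likelihood[h] * priors[h] / total_probability(priors, likelihood)

# Hypothetical two-pattern example:
priors = {'A': 0.6, 'B': 0.4}       # P(A), P(B)
likelihood = {'A': 0.5, 'B': 0.25}  # P(E|A), P(E|B)
print(bayes('A', priors, likelihood))  # ≈ 0.75
```

Note that the posteriors over all hypotheses sum to 1, as they must.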

Using everything we have learned, let’s implement a class we will call DEChains; at this point we will recycle some of the functions and classes already created. Remember that generating the price ranges is mandatory: this is done by the PatternsEngine class, which calculates the probabilities of the patterns in chains of length 1 (as we saw in the Chain generator class section); these are then used to form all possible sequences by means of combinatorics. At the same time, each pattern implicitly has an associated interval, which is encoded and can be retrieved if desired. This matters because if we want to supply our own custom intervals (rather than letting the function build them) we must pass those parameters to the class.

Remember that PatternsEngine inherits the methods of the base class PatternsChainsGenerator, which executes the calculations explained and tested in the Encoding patterns section, that is: the interval or delta that makes up each candle (each pattern), and the total ranges taken using N standard deviations.

As a first basic example, we create the class object and calculate a single conditional probability chain; the only parameter passed represents an event of type A|B,C.

That is, probability of pattern A given that pattern B happened and given that pattern C happened first (respecting the order).

This gives as output: 0.011605115565991968

Let’s make the calculation more complex and compute all possible pattern chains with length=3.

Let’s plot the different possible probability chains along with their respective distributions, and we will repeat these complex calculations for chains with lengths of: 2, 3 and 4.

As can be seen, as the chains grow in length their probabilities decrease drastically.

The following is the probability matrix of all the patterns classified in the market.

Conclusion

As we can see, the probabilities keep falling as our chain of events (patterns) grows. This was expected, since probability theory warns us of it: as a sequence of events grows longer (tending to +infinity), the probability of it repeating gets ever closer to 0.
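Formally, if every step probability in the chain is bounded by some p < 1, the joint probability is squeezed to zero:

```latex
P(E_1 \cap E_2 \cap \dots \cap E_n) = \prod_{i=1}^{n} p_i \;\le\; p^{\,n} \xrightarrow[n \to \infty]{} 0, \qquad p_i \le p < 1
```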

Bayes Engine: Developing heuristics and statistical engine

This section implements everything learned so far (mainly in the last two sections); in addition to building the class, we will optimize things left pending until now. The fact that certain functions are discarded (not used in our final statistical engine) does not mean they are not useful: they were a support that, at the time, helped us understand what we needed to.

We will also add new functions to create custom pattern characteristics and to complement the study of probability for independent events. The final class will be called BayesEngine.

Once the class has been defined, we create the object.

Let’s generate a set of patterns using a 40% acceptance interval. As we know, the candlestick body sizes (patterns) will increase N times, but the increments (the distances between the intervals) always remain constant. N represents the number of standard deviations taken to generate these ranges; in this example we will take 3.5.
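The mechanism described above (a constant step, a coverage of N standard deviations) can be sketched as follows. This is an illustrative toy, not the article’s PatternsEngine; the function name and parameters are assumptions:

```python
import numpy as np

def make_intervals(std, acceptance=0.40, n_devs=3.5):
    """Illustrative sketch: build candlestick-body intervals whose step
    is a fixed fraction (acceptance) of one standard deviation, covering
    the range [0, n_devs * std]. Not the article's actual PatternsEngine."""
    step = acceptance * std
    edges = np.arange(0.0, n_devs * std + step, step)
    # Pair consecutive edges into (lower, upper) body-size intervals
    return list(zip(edges[:-1], edges[1:]))

intervals = make_intervals(std=1.0)   # 40% steps up to 3.5 deviations
print(len(intervals))                  # number of generated intervals
```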

Using the intervals generated above (which represent our patterns), let’s see what their probability of occurrence is in the S&P500. Recall that this can be applied to any market.

Now let’s calculate all the possible combinations of patterns that can be formed with length=3 (this is what we conceptually call chains). In addition, let’s look at the individual probability of each pattern that makes up the chains (not to be confused with the probability of the chain itself); the dictionary stores each value encoded in the letters used.
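The combinatorial step is just a Cartesian product over the encoded pattern letters. A minimal sketch (the three-letter alphabet here is hypothetical; the engine uses one letter per generated interval):

```python
from itertools import product

# Hypothetical encoded pattern letters (the article encodes each
# price interval as a letter)
patterns = ['A', 'B', 'C']

# All possible ordered chains of length 3
chains = [''.join(c) for c in product(patterns, repeat=3)]
print(len(chains))  # 27 = 3**3 possible chains
```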

If we only want the list of individual patterns analyzed in the market (and coded).

Calculation of the probability of a single chain of patterns; recall that two probabilities are calculated: the Bayesian probability (the chain should be read as A|B,C) and the joint probability (the chain should be read as A&B&C) which is the intersection of events.

Output:

Let’s calculate the probability of a pattern chain given custom intervals.

Recall that these numbers represent the accepted candlestick body sizes based on the price distribution of the S&P500 index, so these values are fixed for this market. If you want to apply this same pattern chain to other markets, you should run the corresponding statistical analysis on their probability distribution to choose appropriate values (or use the predict_chains function with percentage intervals).

Output:

At this point there is something I must repeat: the pattern chains, along with their probabilities, were calculated using conditional probability and Bayes’ theorem. A high probability therefore does NOT mean that the pattern is reliable, but simply that in that order it is more likely to happen than in another (for example, a pattern can have a high probability in the order A|B,C while in the order B|C,A its probability can be very low); this can help us find out whether the market shows any preference for, or causality between, such events. In addition we obtain the joint probability, which is the intersection of the events; as explained before, this is the probability for independent events regardless of order (in the examples you can see that the probability of finding any given set of patterns in a chain is practically zero).

Continuing with the example we will calculate all possible chains with length=3.

Let’s plot the probability distributions

As we have been saying, as the chains grow in length their probability decreases drastically; even if we stick to chains of two patterns, the probability remains a rarity. When we ran the same analysis looking at each pattern (and its characteristics) separately we saw the same thing, so it is to be expected that even the most “famous” market patterns appear only occasionally (and not in the same way in all markets); more on this in the last section.

Analysis of classic patterns and candlesticks on the S&P500. Final part

It is time to use our Bayesian engine to test the most common patterns found in books and all over the internet. These proliferate everywhere, and we are told how important they are in “price action”; many claim that learning them is enough to guarantee a good reading of what the market will do, and therefore reliable predictions, creating the illusion of becoming true masters of price reading. The big flaw in this premise is that we know in advance that patterns have unique characteristics in each market (and on each time scale). If we ignore this and accept the usual visual analysis, we are doomed to fail: the statistical studies conducted throughout this article demonstrate empirically that such sequences, as they grow, tend toward virtually zero, negligible probabilities, which makes them rarities in the markets. Even using brute force and searching all combinations to maximize the best probability chains, the results remain far below what could be considered reliable enough to be credible, let alone used to design a trading system.

Next we will recap what we saw in the first phase and analyze a longer span of market data. I will also show some classic candlestick pattern images and feed them (as estimates) into our Bayesian statistical engine to see what it has to tell us about them.

Let’s load the data from the beginning of 2015 to the beginning of 2021.

Candlestick body generator function; these will be used by our statistical engine to calculate the probabilities of the pattern chains in the market.
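Since the embed does not render here, a minimal sketch of such a body generator, following the article’s convention of taking the body as the High minus Low difference (function and column names are assumptions):

```python
import pandas as pd

def candle_bodies(df):
    """Sketch of the body-generator: the article takes each candle's
    'body' as the difference between High and Low. Column names follow
    the usual yfinance OHLC layout (an assumption here)."""
    return (df['High'] - df['Low']).abs()

# Tiny illustrative OHLC frame
ohlc = pd.DataFrame({'High': [102.0, 105.0], 'Low': [100.0, 101.0]})
print(candle_bodies(ohlc).tolist())  # [2.0, 4.0]
```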

Price delta plot, on time scale

Probability distribution

Boxplot + Swarmplot to have a better visual aid, we look for where the largest number of price bodies accumulate.

As you can see, there are no candlestick bodies around 0. This makes sense, because null candlestick bodies cannot exist; remember that we take the body as the difference between the High and the Low.

Now we can create the object of the class BayesEngine, this will be used to study the most “classic” market patterns.

Pattern 1.: Harami

Bearish Harami.

Output:

Bearish Harami[big intervals]

Output:

Bullish Harami.

Output:

Bullish Harami[big intervals]

Output:

Pattern 2.: Hammer

Bullish Hammer

Output:

Bullish Hammer[big intervals]

Output:

Bullish Inverted Hammer

Output:

Bullish Inverted Hammer[big intervals]

Output:

Pattern 3.: Shooting Star & Hanging Man

Shooting Star

Output:

Shooting Star[big intervals]

Output:

Hanging Man

Output:

Hanging Man[big intervals]

Output:

Pattern 4.: Engulfing

Bullish Engulfing

Output:

Bullish Engulfing[big intervals]

Output:

Bearish Engulfing

Output:

Bearish Engulfing[big intervals]

Output:

Pattern 5.: Three white soldiers

Output:

Three white soldiers[big intervals]

Output:

Pattern 6.: Three black crows

Output:

Three black crows[big intervals]

Output:

Final conclusion

Throughout our market research we discovered tools and methods that helped us answer what we were looking for, and in my opinion the question no longer stands: is it profitable to trade the market following patterns? Clearly the answer is a resounding no. Statistically speaking there is enough evidence to refute it: probabilities decrease as we look for more ambitious candlestick sequences, so building a trading system on them is doomed to failure. There will always be gurus claiming to have found the key to ‘price action’; what really happens is that certain patterns occur with an apparent frequency, not because they must necessarily appear. Anyone can replicate this same study over longer periods, other time-frames, and other markets (such as FOREX); even so, nothing guarantees finding exploitable patterns, at least not in the long term, and much less for long price chains. The stochastic nature of the market showed us this, and the laws of probability are clear: the longer we prolong a sequence, the lower the probability of its recurrence in the future (mathematically it tends to 0).

All is not lost, because there are different methods to filter the (Gaussian) noise of the market, and using machine learning and deep learning it is possible to tune a system optimally. That is the way forward for a profitable expert trading system that remains maintainable over time, with all its risks and metrics quantifiable. I want to finish by saying: what cannot be measured cannot be quantified, and what cannot be quantified becomes uncertainty.
