Recognizing over 50 Candlestick Patterns with Python

An easy-to-follow guide to leveraging candlestick patterns for ML

Caner Irfanoglu
Feb 11, 2020 · 5 min read
Image Credit. Photo by Austin Distel on Unsplash

When making trading decisions, we can draw on several different information sources in our technical analysis. One of these sources is OHLC (open, high, low, close) data. Candlestick charts can be plotted to extract patterns from OHLC data for any tradable instrument.

A candlestick pattern is a movement in prices shown graphically on a candlestick chart that some believe can predict a particular market movement.¹


The full list of simple and complex candlestick patterns with visual examples can be found in this Wikipedia article.

Candlestick patterns are good candidate features for training machine learning models that attempt to predict future prices. In this article, we will go over the feature engineering steps of creating a predictor from candlestick patterns and then visualize our results. We will use Python, the TA-Lib module, and the performance rankings from www.thepatternsite.com.

The three steps of the process are:

  1. Extract the patterns using TA-Lib
  2. Rank the patterns using the “Overall performance rank” from the patternsite
  3. Pick the best-performing pattern for each candle

Extracting the patterns using TA-Lib

With TA-Lib, extracting patterns is simple. We can start by installing the module from https://github.com/mrjbq7/ta-lib. The repository contains easy-to-follow instructions for the installation process.

After the installation, we start by importing the module:

import talib

Then, we get a list of available patterns by running:

candle_names = talib.get_function_groups()['Pattern Recognition']

The “candle_names” list should look as follows:

candle_names = [
'CDL2CROWS',
'CDL3BLACKCROWS',
'CDL3INSIDE',
'CDL3LINESTRIKE',
.
.
.
]

We are ready to extract candles! We just need a sample dataset with open, high, low, close values.

Sample Bitcoin dataset preparation
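
The preparation snippet itself is not shown in this text. As a minimal stand-in, the sketch below builds a synthetic OHLC DataFrame with the column names used in the rest of the article (open, high, low, close). The helper name make_sample_ohlc, the random-walk parameters, and the dates are all assumptions made for illustration. The values are generated, not real Bitcoin prices, so swap in your own price data for actual analysis.

```python
import numpy as np
import pandas as pd


def make_sample_ohlc(n_rows=200, start_price=9000.0, seed=42):
    """Build a synthetic daily OHLC DataFrame (random walk, NOT real prices)."""
    rng = np.random.default_rng(seed)

    # Close prices follow a multiplicative random walk.
    returns = rng.normal(loc=0.0, scale=0.02, size=n_rows)
    close = start_price * np.exp(np.cumsum(returns))

    # Each candle opens at the previous close; the first opens at start_price.
    open_ = np.concatenate(([start_price], close[:-1]))

    # Wicks extend past the candle body by a random non-negative amount.
    high = np.maximum(open_, close) * (1 + rng.uniform(0, 0.01, size=n_rows))
    low = np.minimum(open_, close) * (1 - rng.uniform(0, 0.01, size=n_rows))

    dates = pd.date_range("2020-01-01", periods=n_rows, freq="D")
    return pd.DataFrame(
        {"date": dates, "open": open_, "high": high, "low": low, "close": close}
    )


df = make_sample_ohlc()
print(df.head())
```

All four price columns are float64, which is the dtype the TA-Lib pattern functions expect.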

Voila! Bitcoin dataset is ready. Let’s extract the OHLC data and create pattern columns.

# extract OHLC
op = df['open']
hi = df['high']
lo = df['low']
cl = df['close']
# create columns for each pattern
for candle in candle_names:
    # below is same as;
    # df["CDL3LINESTRIKE"] = talib.CDL3LINESTRIKE(op, hi, lo, cl)
    df[candle] = getattr(talib, candle)(op, hi, lo, cl)

TA-Lib creates individual columns for each pattern. While 0 corresponds to no pattern, positive values represent bullish patterns and negative values represent bearish patterns.
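
To make the sign convention concrete, here is a small sketch that counts bullish, bearish, and empty cells per pattern column. The signals DataFrame is hand-written, hypothetical data in the shape TA-Lib produces, so the example runs without TA-Lib installed. On real output you would select the pattern columns with df[candle_names] instead.

```python
import pandas as pd

# Hypothetical TA-Lib-style output for five candles and two patterns.
signals = pd.DataFrame(
    {
        "CDLENGULFING": [0, 100, 0, -100, 0],
        "CDLDOJI": [0, 100, 0, 0, 100],
    }
)

# Positive values are bullish hits, negative values bearish, zero means no pattern.
summary = pd.DataFrame(
    {
        "bullish": (signals > 0).sum(),
        "bearish": (signals < 0).sum(),
        "none": (signals == 0).sum(),
    }
)
print(summary)
```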

Candlestick Patterns found on Bitcoin Data

Congratulations! We just obtained our first dataset with algorithmically extracted patterns.

Ranking the patterns

We successfully extracted candlestick patterns using TA-Lib. With a few lines of code, we can condense this sparse information into a single column with pattern labels. But first, we need to handle the cases where multiple patterns are found for a given candle. To do that, we need a performance metric to compare patterns. We will use the “Overall performance rank” from the patternsite.

candle_rankings = {
"CDL3LINESTRIKE_Bull": 1,
"CDL3LINESTRIKE_Bear": 2,
"CDL3BLACKCROWS_Bull": 3,
"CDL3BLACKCROWS_Bear": 3,
"CDLEVENINGSTAR_Bull": 4,
"CDLEVENINGSTAR_Bear": 4,
"CDLTASUKIGAP_Bull": 5,
"CDLTASUKIGAP_Bear": 5,
.
.
.
.
}

After some manual scraping, the patterns are combined in the “candle_rankings” dictionary. When multiple patterns exist for a candle, we will use the values in the above dictionary to decide on the best-performing pattern. The full dictionary of patterns, along with explanations of the naming and ranking decisions, can be found here.

Picking the best performance candle

Here comes the fun part. We will code the logic for creating the labels. We basically have 3 cases.

  • No Pattern: Fill the cell with “NO_PATTERN”
  • Single Pattern: Fill the cell with Pattern Name
  • Multiple Patterns: Fill the cell with lowest (best) ranking Pattern Name

Below is the code for creating the pattern labels and found pattern counts.

Logic for picking best pattern for each candle
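
The three cases above can be sketched as follows. This is an illustrative version, not necessarily identical to the recognize_candlestick function in the author's repository. The function name label_best_pattern, the output column names, and the choice to sort unranked patterns last are assumptions. The demo data is hypothetical, while the rankings are the four entries quoted earlier in the article.

```python
import pandas as pd

NO_PATTERN = "NO_PATTERN"


def label_best_pattern(df, candle_names, candle_rankings):
    """Return a copy of df with a pattern label and a match count per candle."""
    labels, counts = [], []
    for _, row in df[candle_names].iterrows():
        # Name every nonzero signal "<PATTERN>_Bull" or "<PATTERN>_Bear".
        found = [
            f"{name}_Bull" if value > 0 else f"{name}_Bear"
            for name, value in row.items()
            if value != 0
        ]
        counts.append(len(found))
        if not found:
            labels.append(NO_PATTERN)
        else:
            # Lowest rank wins; names missing from the rankings sort last (assumption).
            labels.append(
                min(found, key=lambda n: candle_rankings.get(n, float("inf")))
            )
    out = df.copy()
    out["candlestick_pattern"] = labels
    out["candlestick_match_count"] = counts
    return out


rankings = {
    "CDL3LINESTRIKE_Bull": 1,
    "CDL3LINESTRIKE_Bear": 2,
    "CDLEVENINGSTAR_Bull": 4,
    "CDLEVENINGSTAR_Bear": 4,
}
demo = pd.DataFrame(
    {
        "CDL3LINESTRIKE": [0, 100, -100, 0],
        "CDLEVENINGSTAR": [0, 0, -100, -100],
    }
)
labeled = label_best_pattern(demo, ["CDL3LINESTRIKE", "CDLEVENINGSTAR"], rankings)
print(labeled[["candlestick_pattern", "candlestick_match_count"]])
```

A row-by-row loop keeps the three cases easy to read. For large datasets a vectorized approach would be faster, at some cost in clarity.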

Visualizing and validating the results

So far, we have extracted many candlestick patterns using TA-Lib (which supports 61 patterns as of Feb 2020). We ranked them based on the “Overall performance rank” and selected the best-performing pattern for each candle. Next, we can validate our results by plotting the candles and visually checking them against the patterns found. Below is a sample script for visualizing the data using Plotly. The dataset and the plot can be compared side by side, and the patterns can be validated easily by matching the indexes.

Code plotly for visualizing candlesticks
Candlesticks chart for bitcoin data using plotly

You may find this Tableau Viz a more convenient way to quickly inspect the patterns with their annotations.

Tableau Viz for Bitcoin Data

When the patterns found in our dataset are compared to the actual patterns, the results look consistent. We can test on larger datasets as part of future work. Also, since some patterns only have a single version, the ‘Bull’ and ‘Bear’ tags can be removed from them.

All scripts and contents of this post, including the recognize_candlestick function, can be found at https://github.com/CanerIrfanoglu/medium.

You can also see candlestick recognition in action as part of the cryptocurrency technical analysis dashboard “Crypto Dash” at https://youtu.be/HS3gAmtET9k?t=121.

The content here is mainly based on the work of the creators of the TA-Lib module and on Thomas Bulkowski’s long-running studies of candlestick patterns. I would like to thank them for making their work publicly available.

This article will be followed by more feature engineering and modelling work for predicting cryptocurrency prices using machine learning. If you enjoyed my work or found it valuable, please stay synced and feel free to connect on LinkedIn. I would be delighted to hear your comments and suggestions. Cheers!

References

[1] Candlestick pattern. (n.d.). In Wikipedia. Retrieved February 11, 2020, from https://en.wikipedia.org/wiki/Candlestick_pattern
