LecTrade: machine learning with technical patterns and real-time alerts for buying and selling in the stock market

Leci37

14 min read · Nov 14, 2022

Why this stock prediction project?

What this project offers that I did not find in other free projects:

Testing with around 30 models: multiple combinations of features and multiple choices of model (TensorFlow, XGBoost and Sklearn)
Evaluation of model thresholds and quality
Use of about 1,000 technical indicators (1068)
A method for selecting the best features (technical indicators)
A simple, dynamic categorical target (buy, sell or do nothing) instead of a continuous target variable
A powerful system for real-time evaluation while the market is open
Versatile integration with Twitter, Telegram and mail
Training of the machine learning models with fresh stock data from today

If you have problems with the installation, let me know. I am looking for collaborators for this project. If you have experience and want to collaborate, write to me by email or through GitHub Issues.

I will explain my experience of predicting good buy and sell points for stocks: “every stock is good if you buy and sell at the right time”.

I have created a tool, LecTrade, that uses powerful machine learning libraries such as TensorFlow to analyze all the technical patterns of a stock (1068 patterns).
Necessarily, the tool makes its predictions of the operating points (buy-sell) in real time.

Code GitHub https://github.com/Leci37/stocks-prediction-Machine-learning-RealTime-telegram

Instructions for use:

RealTime Twitter https://twitter.com/Whale__Hunters
RealTime Telegram @Whale_Hunter_Alertbot. This group is limited; to sign up for alerts, ask via GitHub or Twitter.

The models have been trained on 15-minute intervals, so the alerts expire in about 7 minutes: once the tweet goes out, you have roughly 7 minutes to decide whether to trade or not. It also means the models should be used for intra-day trading. Never follow the alerts blindly; check first. The alerts indicate points where technical patterns alone have brought strong trend changes in the last 5 months, i.e. if these models were applied to the last 5 months they would hit 91% of the BUY and SELL points. No one can know what will happen in the future. In other words, it is not an absolute truth.

The alert consists of the following:

Can be BUY or SELL.
The ID (ticker) of the stock, always from the USA market; crypto assets have the suffix -USD. For Tesla it is TSLA; if in doubt about a company's ticker, a simple Google search for “Stocks XXX” will find it.
Link to Investing.com news; check it before making the final decision.
Link to the candlestick chart on TraderView.com; it is the image attached to the alert.
𝙈𝙤𝙙𝙚𝙡 𝙏𝙧𝙪𝙨𝙩:⬆⬇, a level of strength indicating whether there is a positive or negative trend, followed after the / by the number of models used to obtain the percentage. Both the uptrend POS and the downtrend NEG may have a high score, which indicates increased volatility.
📊⚙𝙉𝙖𝙢𝙚𝙨: The name of the selected models with which the prediction has been made and the percentage of strength.

Example of how the alerts look on Twitter:

Example of how the alerts look on Telegram:

It is possible to add to the alerts the summary of the financial information from TraderView.com; this image is always loaded with the financial-analyst information in real time.

PROGRAM DESCRIPTION

This program performs the following functions:

Collection of historical OHLCV data for the last years and calculation of technical patterns (momentum, volatility, Japanese candlesticks, statistics…): 1068 patterns.
Calculation of which of the 1068 patterns are the most valuable and most relevant for detecting good trading points (buy-sell).
Training of several machine learning models using powerful libraries: Google TensorFlow, Sklearn and XGB.
Evaluation of the multiple models, to discard the less reliable ones.
Real-time OHLCV data collection and predictions from the models; when any of the multiple predictions is considered valid, a real-time alert is sent to Telegram and mail.

INTRODUCTION

I appreciate any feedback. If someone wants to collaborate, new versions need analysis of news, financial results and sentiment on social networks, as well as predictions with multidimensional arrays (explained in https://github.com/Leci37/stocks-Machine-learning-RealTime-telegram#review-the-way-ground-true-is-obtained of the installation documentation).

You can find the source code at https://github.com/Leci37/stocks-Machine-learning-RealTime-telegram

The stock market is moved by technical indicators. There are several types: volatility, cycle, volume, candlesticks, supports, resistances, moving averages…

An excellent site to see all the stock market technical indicators is webull https://app.webull.com/trade?source=seo-google-home.

Image: webull with Stochastic, MACD and RSI indicators

EVERY possible way of predicting the stock market has been invented on its charts, with mixed results, which makes clear how difficult it is to predict human behavior.

These indicators suggest where to buy and sell, and there are many beliefs about them (we say beliefs because, if they always worked, we would all be rich).

Any technical indicator can be obtained by means of programmable mathematical operations.

Three examples:

RSI or Relative Strength Index is an oscillator that reflects relative strength
Greater than 70: overbought, suggests that the price will go down.
Less than 30: oversold, suggests that the price will go up.

MACD is the acronym for Moving Average Convergence / Divergence. The MACD is used in the stock market to measure the robustness of a price movement, through the crossing of the indicator's line and its moving average (the signal line).
It is traded on the basis of the crossovers between these two lines,
or when both exceed zero.

Candlestick: Morning Star The morning star pattern is considered a hopeful sign in a bearish market trend.
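To make the idea of "programmable mathematical operations" concrete, here is a minimal sketch of RSI and MACD in plain Python. It is not the code LecTrade uses (the project relies on pandas_ta and talib); the function names are hypothetical, the RSI shown is the simple-average variant rather than Wilder's smoothed version, and the 12/26/9 spans are the usual MACD defaults, assumed here.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def rsi(closes, period=14):
    """RSI over the last `period` price changes (simple-average variant).

    Needs at least period + 1 closing prices.
    """
    window = closes[-(period + 1):]
    changes = [window[i + 1] - window[i] for i in range(period)]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)


def macd(closes, fast=12, slow=26, signal=9):
    """Return the MACD line and its signal line as two lists."""
    macd_line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    return macd_line, ema(macd_line, signal)


closes = [float(price) for price in range(100, 140)]  # made-up, steadily rising closes
macd_line, signal_line = macd(closes)
print("RSI:", rsi(closes))
print("MACD above signal:", macd_line[-1] > signal_line[-1])
```

An RSI above 70 or below 30, or a crossover between the two MACD lines, would then be read as described in the examples above.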

These indicators are available on reputable and popular websites such as investing.com, where the market can analyze them: https://es.investing.com/equities/apple-computer-inc-technical

It is extremely difficult to predict the price of any stock. Inflation, wars, populism: all these conditions affect the economy, and it becomes difficult, if not impossible, to predict what the news will bring tomorrow.

Here the principle of the self-fulfilling prophecy comes in: it is, at first, a “false” definition of the situation that awakens a new behavior, which makes the original false conception of the situation become “true”.

The project is long and complex and takes time to install, but the result is very beautiful.

Note, 29 December 2022. Improvements in the predictive models, using multidimensional input. This development is complete in the stocks-prediction-multi branch; request access without any problem.

The development explained in this readme takes ONE time partition (e.g. from 9:00 to 9:15), analyzes all the technical patterns, and reaches a conclusion.
With the multidimensional development, the model analyzes TEN time partitions (e.g. from 9:00 to 12:30) and makes a decision with all the technical patterns of that time.
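The difference between the two developments can be pictured as a change in the shape of each training sample. The sketch below is only an illustration with made-up numbers, not code from the stocks-prediction-multi branch; make_windows is a hypothetical helper.

```python
def make_windows(rows, window=10):
    """Stack each run of `window` consecutive rows into one sample."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]


# 12 time partitions with 2 made-up indicator values each
rows = [[float(t), float(t) * 2] for t in range(12)]

mono = rows                     # mono-dimensional: one time partition per sample
multi = make_windows(rows, 10)  # multidimensional: ten time partitions per sample

print(len(mono), "samples of", len(mono[0]), "values")
print(len(multi), "samples of", len(multi[0]), "x", len(multi[0][0]), "values")
```

Each multidimensional sample carries the indicators of ten consecutive partitions, so the model can look at how they evolved rather than at a single snapshot.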

The generated .csv files named SCALA are for mono-dimension and those named PLAIN are for multidimension; there is some mix in this branch.

OBJECTIVE

Once the principle of the self-fulfilling prophecy is understood, it is possible to find its pattern through the massive collection of technical patterns, their calculation and the study of their behavior.

For this, big data techniques will be used through the Pandas Python library, machine learning through Sklearn and XGB, and neural networks through Google's open TensorFlow library.

The result will be displayed in a simple and friendly way through alerts on mobile or computer.

Example of a real-time alert via the Telegram bot channel. To receive alerts you need to register; contact Leci37 support on GitHub. https://t.me/Whale_Hunter_Alertbot

The image shows: MACD, RSI , Stochastic and Balance of power (Elder Ray)

The alert is sent at the vertical line (the only vertical line that crosses the whole image); during the next 4 periods the stock falls by 2.4% (so it is indicated as SELL). Each candlestick in the image covers 15 minutes.

OPERATION

1.1 Data collection

Collect data to train the model

yhoo_generate_big_all_csv.py

The closing data is obtained through the Yahoo Finance API, and hundreds of technical patterns are calculated using the pandas_ta and talib libraries.

yhoo_history_stock.get_SCALA_csv_stocks_history_Download_list()

To train the model to detect buy and sell points, the column buy_seel_point is created, with a value of 0, -100 or 100. These points are detected from the largest changes (positive 100, negative -100) in the history of the last months; they are the points the model is trained against, also called the ground truth.

A value is assigned in buy_seel_point if the stock rises or falls by more than 2.5% within a period of 3 hours, using the get_buy_sell_points_Roll function.
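The labeling rule can be sketched as follows. This is not the real get_buy_sell_points_Roll function; it is a simplified illustration that assumes 15-minute candles (so 3 hours are 12 periods) and compares each close with the close 12 periods later. The name label_points and its parameters are hypothetical.

```python
def label_points(closes, horizon=12, threshold=0.025):
    """Label each candle 100, -100 or 0 from the change over the next `horizon` candles."""
    labels = []
    for i, price in enumerate(closes):
        if i + horizon >= len(closes):
            labels.append(0)  # not enough future data to decide
            continue
        change = (closes[i + horizon] - price) / price
        if change > threshold:
            labels.append(100)   # rise of more than 2.5%
        elif change < -threshold:
            labels.append(-100)  # fall of more than 2.5%
        else:
            labels.append(0)
    return labels


closes = [100.0] * 12 + [103.0]  # made-up prices: +3% after 12 candles
print(label_points(closes))
```

Only the first candle is labeled 100 here, because it is the only one with 12 later candles to compare against.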

Once the historical data of the stock has been obtained and all the technical indicators have been calculated, a total of 1068, files of type AAPL_stock_history_MONTH_3_AD.csv are generated.

Example of the file with the first eight indicators:

This data collection is customizable: you can obtain data and train models for any Nasdaq stock. It is also possible for other indices or crypto-assets, with small changes.

Through the Option_Historical class it is possible to create historical data files: annual, monthly and daily.

from enum import Enum

class Option_Historical(Enum):
    YEARS_3 = 1
    MONTH_3 = 2
    MONTH_3_AD = 3
    DAY_6 = 4
    DAY_1 = 5

The files \d_price_maxAAPL_min_max_stock_MONTH_3.csv are generated; they store the max and min value of each column, to be read in Model_predictions_Nrows.py for a quick fit_scaler() (the “cleaning” process that the data requires before entering the AI training models). This operation is vital for reading data efficiently in real time.
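Why stored min and max values make the real-time path fast can be shown with a small sketch. It assumes plain min-max scaling to the range 0 to 1, which may not be the range the project actually uses; scale_row and the sample numbers are hypothetical.

```python
def scale_row(row, col_min, col_max):
    """Scale one row of raw values with per-column min and max read from a file."""
    scaled = {}
    for name, value in row.items():
        span = col_max[name] - col_min[name]
        # A constant column has no range; map it to 0.0 instead of dividing by zero
        scaled[name] = 0.0 if span == 0 else (value - col_min[name]) / span
    return scaled


# Values that would come from the min_max file (made up here)
col_min = {"Close": 100.0, "Volume": 1000.0}
col_max = {"Close": 200.0, "Volume": 5000.0}

new_row = {"Close": 150.0, "Volume": 2000.0}  # one fresh real-time row
print(scale_row(new_row, col_min, col_max))
```

Because the min and max are already on disk, a new row can be scaled on its own, without reloading the whole history to refit the scaler.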

1.2 Types of indicators

During the generation of the data collection file of point 1.1, AAPL_stock_history_MONTH_3_AD.csv, 1068 technical indicators are calculated. They are divided into subtypes based on prefixes in the name.

List of prefixes, with example names for each.

Overlap: olap_

olap_BBAND_UPPER, olap_BBAND_MIDDLE, olap_BBAND_LOWER,

Momentum: mtum_

mtum_MACD, mtum_MACD_signal, mtum_RSI, mtum_STOCH_k,

Volatility: vola_

vola_KCBe_20_2, vola_KCUe_20_2, vola_RVI_14

Cycle patterns: cycl_

cycl_DCPHASE, cycl_PHASOR_inph, cycl_PHASOR_quad

Candlestick patterns: cdl_

cdl_RICKSHAWMAN, cdl_RISEFALL3METHODS, cdl_SEPARATINGLINES

Statistics: sti_

sti_STDDEV, sti_TSF, sti_VAR

Moving averages: ma_

ma_SMA_100, ma_WMA_10, ma_DEMA_20, ma_EMA_100, ma_KAMA_10,

Trend: tend_ and ti_

tend_renko_TR, tend_renko_brick, ti_acc_dist, ti_chaikin_10_3

Resistance and support suffixes: _s3, _s2, _s1, _pp, _r1, _r2, _r3

fibo_s3, fibo_s2, fibo_s1, fibo_pp, fibo_r1, fibo_r2, fibo_r3

demark_s1, demark_pp, demark_r1

Intersection point with resistance or support: pcrh_.

pcrh_demark_s1, pcrh_demark_pp, pcrh_demark_r1

Intersection point with moving average or of moving averages between them: mcrh_.

mcrh_SMA_20_TRIMA_50, mcrh_SMA_20_WMA_50, mcrh_SMA_20_DEMA_100

Indicators of changes in the stock index, nasdaq: NQ_.

NQ_SMA_20, NQ_SMA_100

Note: To see the 1068 indicators used go to the attached sheets at the end of the document.
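Since the subtype is encoded in the column name, the columns can be grouped with a few lines of code. This is a small sketch, not taken from the repository; group_by_prefix is a hypothetical helper and the column list is a short sample of the names above.

```python
from collections import defaultdict


def group_by_prefix(columns):
    """Group indicator column names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in columns:
        prefix = name.split("_", 1)[0] + "_"
        groups[prefix].append(name)
    return dict(groups)


columns = ["olap_BBAND_UPPER", "mtum_RSI", "mtum_MACD", "cdl_RICKSHAWMAN", "NQ_SMA_20"]
for prefix, names in group_by_prefix(columns).items():
    print(prefix, names)
```

The resistance and support columns are identified by suffix rather than prefix, so with this helper they would be grouped under fibo_ or demark_.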

2 Indicator filtering

Execute to find out which columns are relevant for the model output

Feature_selection_create_json.py

It is necessary to know which of the hundreds of columns of technical data are valid for training the neural model and which are just noise. This is done through correlations and Random Forest models.

Answer the question:

Which columns are most relevant for buy or sell points?

Generate the best_selection files, which are a ranking of the best technical data to train the model; the aim is to go from 1068 columns to about 120.
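The correlation half of that ranking can be sketched in plain Python. This is only an illustration with made-up columns; it does not reproduce Feature_selection_create_json.py and leaves out the Random Forest part. The names pearson and rank_features are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return 0.0 if dev_x == 0 or dev_y == 0 else cov / (dev_x * dev_y)


def rank_features(features, target, keep):
    """Return the `keep` column names most correlated (in absolute value) with the target."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


target = [0, 0, 1, 1, 0, 1]  # 1 where a buy point was detected (made up)
features = {
    "a": [0, 0, 2, 2, 0, 2],  # moves exactly with the target
    "b": [1, 1, 1, 1, 1, 1],  # constant, pure noise for this purpose
    "c": [0, 1, 1, 0, 0, 1],  # partly related
}
print(rank_features(features, target, keep=2))
```

With 1068 real columns, keep would be set to something close to 120, in line with the reduction described above.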

For example, for detecting buy points in the Amazon stock in the period June to October 2022, the most valuable indicators are:

Chikou span of the Ichimoku Cloud
Chaikin Volatility
On-balance volume

Example of the plots_relations/best_selection_AMZN_pos.json file

"index": {
"12": ["ichi_chikou_span"],
"10": ["volu_Chaikin_AD"],
"9": ["volu_OBV"],

Plots with the 3 best technical data are printed in the folder plots_relations/plot.

Example name: TWLO_neg_buy_sell_point__ichi_chikou_span.png

3 Training TensorFlow, XGB and Sklearn models

Model_creation_models_for_a_stock.py

This requires the selection of the best columns from point #2.

There are four types of predictive algorithms, AI models:

Gradient Boosting consists of a set of individual decision trees, trained sequentially, so that each new tree tries to improve on the errors of the previous trees. (Sklearn library)
Random Forest is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. (Sklearn library)
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. (XGBoost library)
TensorFlow (TF) is an open source library for machine learning across a range of tasks, developed by Google to meet its needs for systems capable of building and training neural networks to detect and decipher patterns and correlations, analogous to the learning and reasoning used by humans. (TensorFlow library)

There are POS (buy) and NEG (sell) models, and there is a BOTH model (BOTH is discarded, since the prediction models are binary: they only accept 2 positions, true or false).

This point generates the prediction models: .sav for XGB and Sklearn, .h5 for TensorFlow.

Naming Examples: XGboost_U_neg_vgood16_.sav and TF_AMZN_pos_low1_s128.h5

Format of the names:

The type of AI it is trained with, which can be:
XGboost, TF, TF64, GradientBoost and RandomForest
The stock ticker: AMZN for Amazon, AAPL for Apple…
Whether it detects buy or sell points: pos or neg
How many indicators have been used in the learning; there are 4 types, depending on the relevance given by point #2, Indicator filtering. This ranking is organized in the MODEL_TYPE_COLM class:
vgood16 the best 16 indicators
good9 the best 32 indicators
reg4 the best 64 indicators
low1 the best 128 indicators
Only for TF models: the density of the neurons used, defined in the class a_manage_stocks_dict.MODEL_TF_DENSE_TYPE_ONE_DIMENSI, which can take the values s28, s64 and s128

These combinations mean that for each stock 5 types of AI are created, each in pos and neg, and each of those in the 4 technical indicator configurations. This generates 40 AI models, which are passed to point #4 to evaluate their QUALITY.
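The count of 40 follows directly from the naming scheme: 5 types of AI, 2 directions and 4 indicator configurations. A short sketch that enumerates them is given below; model_names is a hypothetical helper, and the names are simplified (the real files also carry a trailing underscore, an extension and, for TF models, the density suffix).

```python
from itertools import product

AI_TYPES = ["XGboost", "TF", "TF64", "GradientBoost", "RandomForest"]
DIRECTIONS = ["pos", "neg"]
COLUMN_SETS = ["vgood16", "good9", "reg4", "low1"]


def model_names(ticker):
    """List every AI type / direction / indicator-set combination for one stock."""
    return [
        f"{ai}_{ticker}_{direction}_{columns}"
        for ai, direction, columns in product(AI_TYPES, DIRECTIONS, COLUMN_SETS)
    ]


names = model_names("AMZN")
print(len(names))  # 5 * 2 * 4 combinations
print(names[0], names[-1])
```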

Each time an AI model is generated, a log file is also generated: TF_balance_TF_AAPL_pos_reg4.h5_accuracy_87.6%__loss_2.74__epochs_10[160].csv

It contains the accuracy and loss data of the model, as well as the model training records.

4.1 Assessing the QUALITY of these models

Model_creation_scoring.py

To make a prediction with the AIs, new data is collected and the technical indicators with which each model was trained are calculated, according to the best_selection files.

When the .h5 and .sav models are asked:

Is this a sell point?

they answer with a number that can vary between 0.1 and 4.

The higher the number, the more likely it is to be a correct buy/sell point.

Each model has its own rating scale for what counts as a trading point. For some models a rating above 0.4 is enough (usually the XGboost models), while others require more than 1.5 (usually the TF models).

How do you know what the threshold score is for each model?

The Model_creation_scoring.py class generates the threshold score files, which tell which threshold point is considered the buy-sell point.

Each AI model will have its own file of the type:

Models/Scoring/AAPL_neg__when_model_ok_threshold.csv

For each stock, point #3 (training the TF, XGB and Sklearn models) generates 40 AI models. This class evaluates them and selects the most accurate ones, so that only those are executed in real time (usually between 4 and 8 are selected).

Models/Scoring/AAPL_neg__groupby_buy_sell_point_000.json

“list_good_params”: [
“r_rf_AFRM_pos_low1_”,
“r_TF64_AFRM_pos_vgood16_”,
“r_TF64_AFRM_pos_good9_”,
“r_TF_AFRM_pos_reg4_”
],
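How per-model thresholds are applied can be sketched like this. The threshold values echo the 0.4 and 1.5 mentioned above, but the model names, scores and the helper models_over_threshold are made up for the example; the real values come from the threshold .csv files.

```python
def models_over_threshold(scores, thresholds):
    """Return the models whose raw score exceeds their own threshold."""
    return [name for name, score in scores.items() if score > thresholds[name]]


# Hypothetical thresholds, as they might be read from the scoring files
thresholds = {"XGboost_AAPL_neg_vgood16": 0.4, "TF_AAPL_neg_reg4": 1.5}

# Hypothetical raw answers of the two models for one new candle
scores = {"XGboost_AAPL_neg_vgood16": 0.55, "TF_AAPL_neg_reg4": 1.2}

print(models_over_threshold(scores, thresholds))
```

The same raw score can therefore be a signal for one model and noise for another, which is why each model needs its own threshold.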

4.2 Evaluating the real BENEFITS of the models

Model_predictions_N_eval_profits.py

Answer the question:

If you leave it running for N days, how much hypothetical money do you make?

Note: preferably, this should be run on data that has not been used to train the model.

Models/eval_Profits/_AAPL_neg_ALL_stock_20221021__20221014.csv
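The question can be answered with a very simple backtest. The sketch below is not Model_predictions_N_eval_profits.py; it assumes a fixed stake per signal, a holding time of 4 candles, no fees, and that a negative signal profits from a falling price. The name hypothetical_profit and all numbers are made up.

```python
def hypothetical_profit(closes, signals, hold=4, stake=1000.0):
    """Sum the profit of following every signal and closing it `hold` candles later."""
    total = 0.0
    for i, signal in enumerate(signals):
        if signal == 0 or i + hold >= len(closes):
            continue  # no signal, or not enough candles left to close the trade
        change = (closes[i + hold] - closes[i]) / closes[i]
        direction = 1 if signal > 0 else -1  # buy profits from rises, sell from falls
        total += stake * change * direction
    return total


closes = [100.0, 101.0, 102.0, 103.0, 104.0, 100.0]  # made-up prices
signals = [100, 0, 0, 0, 0, 0]                       # one buy signal on the first candle
print(hypothetical_profit(closes, signals))
```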

5.1 Making predictions for the past week

Model_predictions_Nrows.py

You can make predictions with the real-time data of the stock.

Every 10 to 12 minutes, the following function call downloads the real-time stock data through the Yahoo Finance API:

df_compare, df_sell = get_RealTime_buy_seel_points()

This run generates the log file d_result/prediction_results_N_rows.csv

This file and the notifications (Telegram and mail) contain information about each prediction that has been made. This point is deprecated by the stocks-prediction-multi branch. The file contains the following columns:

Date: date of the prediction
Stock: stock
buy_sell: either POS or NEG, depending on whether it is a buy or a sell operation.
Close: the scaled value of the close (not the actual value).
Volume: the scaled value of the volume (not the actual value).
88%: fractional format (5/6). How many models have predicted a valid operating point above 88%? Five of the six analyzed.
93%: fractional format (5/6), number of models above 93%.
95%: fractional format (5/6), number of models above 95%.
TF: fractional format (5/6), number of models above 93% whose prediction has been made with TensorFlow models.
Models_names: names of the models that have tested positive, with the hit % (88%, 93%, 95%) as suffix

Example record

2022-11-07 16:00:00 MELI NEG -51.8 -85.80 5/6 0/6 0/6 0/6 1/2 TF_reg4_s128_88%, rf_good9_88%, rf_low1_88%, rf_reg4_88%, rf_vgood16_88%,

To be considered a valid prediction to trade on, it must score at least half of the fraction in both the 93% and TF columns.

More than half of the models have predicted, with a score above 93%, that it is a good point for trading.
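The validity rule on the fraction columns can be written in a few lines. This is a sketch of the rule as stated above ("at least half"), with hypothetical helper names; it is not the check used in the project's code.

```python
def at_least_half(fraction):
    """True if a 'hits/total' string such as '5/6' is at least one half."""
    hits, total = (int(part) for part in fraction.split("/"))
    return total > 0 and hits * 2 >= total


def is_valid_prediction(col_93, col_tf):
    """A prediction is tradable when both the 93% and the TF columns reach one half."""
    return at_least_half(col_93) and at_least_half(col_tf)


print(is_valid_prediction("5/6", "1/2"))
print(is_valid_prediction("0/6", "1/2"))
```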

5.2 Sending real-time alerts

predict_POOL_enque_Thread.py (multithreaded, about 2 s per stock)

It is possible to run it without configuring Telegram (point 5.2); in that case no alerts will be sent to Telegram, but the results will still be recorded in real time in: d_result/prediction_real_time.csv

Alerts for buying and selling the stock can be sent to Telegram or mail.
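For the Telegram side, a message is sent by calling the sendMessage method of the Telegram Bot API. The sketch below only builds the request URL and does not send anything; the token, chat ID and alert text are placeholders, and telegram_send_url is a hypothetical helper, not a function from the repository.

```python
from urllib.parse import urlencode


def telegram_send_url(token, chat_id, text):
    """Build the Bot API sendMessage URL for one alert (nothing is sent here)."""
    query = urlencode({"chat_id": chat_id, "text": text})
    return f"https://api.telegram.org/bot{token}/sendMessage?{query}"


# Placeholder token and chat ID; the real ones come from your own bot configuration
url = telegram_send_url("123456:ABC", 42, "SELL MELI")
print(url)
```

Opening that URL with any HTTP client (for example urllib.request.urlopen) would deliver the message, provided the token and chat ID are valid.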

The multiple trained AI models are evaluated, and only the predictions with a probability above 96% (according to the previous training) are reported.

Every 15 minutes, all necessary indicators are calculated in real time for each stock and evaluated in the AI models.

The alert indicates which models are detecting the correct buy and sell points at which to execute the transaction.

These buy and sell alerts expire in about 7 minutes, given the volatility of the market.
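The timing described here can be sketched with the standard datetime module. This is an assumption about how one might schedule the checks, not the scheduling code of the project; next_quarter_hour and is_expired are hypothetical names, and the 7-minute lifetime is the approximate figure given above.

```python
from datetime import datetime, timedelta


def next_quarter_hour(now):
    """Return the next 15-minute boundary strictly after `now`."""
    minutes = (now.minute // 15 + 1) * 15
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(minutes=minutes)


def is_expired(sent_at, now, lifetime=timedelta(minutes=7)):
    """True once an alert is older than its lifetime."""
    return now - sent_at > lifetime


sent_at = datetime(2022, 11, 7, 16, 0, 0)
print(next_quarter_hour(datetime(2022, 11, 7, 15, 52, 30)))
print(is_expired(sent_at, datetime(2022, 11, 7, 16, 8, 0)))
```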

Also attached is the price at which it was detected, the time, and links to news websites.

Note: financial news should always prevail over technical indicators.

What is displayed in a DEBUG alert is the information from d_result/prediction_results_N_rows.csv described in point 5.1, Making predictions for the past week.

To understand the complete information of the alert, see point 5.1, Making predictions for the past week.

Quick start-up

See also detailed installation guide at
https://github.com/Leci37/stocks-Machine-learning-RealTime-telegram#detailed-start-up .

Install requirements

pip install -r requirements.txt

Run Utils/API_alphavantage_get_old_history.py

Run yhoo_generate_big_all_csv.py

Run Model_creation_models_for_a_stock.py

Run Model_creation_scoring.py

Run Model_predictions_Nrows.py (optional, last week's predictions)

Real time forecasts:

Run Utils/Volume_WeBull_get_tikcers.py (ignore this step if you use the default configuration)

Configure the bot token; see point 5.2, Configuring chatID and tokens in Telegram

Run predict_POOL_inque_Thread.py

It is possible to run it without configuring Telegram (point 5.2); in that case no alerts will be sent to Telegram, but the results will still be recorded in real time in: d_result/prediction_real_time.csv

CODE, with details of the installation and possible improvements: https://github.com/Leci37/stocks-prediction-Machine-learning-RealTime-telegram

Recommended reading: it does not take into account the principle of self-fulfilling prophecy (explained at the beginning), but it is worth considering. LSTM time series + stock price prediction = FAIL https://www.kaggle.com/code/carlmcbrideellis/lstm-time-series-stock-price-prediction-fail

USE THE SOFTWARE AT YOUR OWN RISK. THE AUTHORS AND ALL AFFILIATES ASSUME NO RESPONSIBILITY FOR YOUR TRADING RESULTS. Do not risk money which you are afraid to lose. There might be bugs in the code — this software DOES NOT come with ANY warranty.
Free use and modification are permitted, but not commercialization to third parties without authorization.

https://github.com/Leci37/LecTrade LecTrade is a tool created by GitHub user @Leci37. Instagram @luis__leci. Shared on 2022/11/12.
No warranty, all rights reserved