Jun 7 · 17 min read

AI Trend forecasting | Stock prediction

Disclaimer: This article was written for learning purposes, to practice applying machine learning and a prediction model to forecasting future stock prices.

The third project | Trend forecasting | DSE3 G.9

Let’s start with AI

What’s AI? AI isn’t just a new set of tools. It’s the new world. AI has the potential to create huge economic and societal benefits for businesses, governments, and individuals worldwide. Source: PwC

PwC 2021 AI Predictions | What degree have your AI Investments in these areas lived up to expectations?

The PwC 2021 AI Predictions survey shows that companies have realized benefits from their AI investments, such as improved decision-making, increased productivity, and cost savings. However, we think that companies are not the only ones who can benefit: you, as a retail investor, can also use AI technology to win in the stock market.

Have you invested in the stock market?

  • Have you won in the short run but lost in the long run?

“If you face the problems above, read this article. We will show you how to apply AI to improve decision-making, reduce bias and risk and, ultimately, save time when investing in the stock market!”

Here is our story

After we completed two challenges, BOTNOI announced a new weekly challenge. This is our third project in the Data Science Essential program, called the Trend forecasting project. We spent one week building this model and writing this article. Even though the timeline was tight, it was worthwhile and enjoyable. Our team members learned from each other by exchanging investing, modeling, and programming perspectives.

Our project requirement

  • Pick one stock from SET100 based on the potential highest gain from the prediction model


  • The SET100 index is the primary stock index of Thailand. The constituents of the list are companies listed on the Stock Exchange of Thailand (SET) in Bangkok.

Our investing strategy

When you play a game and want to win, you need a strategy! Investing in stocks is the same: you need an investing strategy. Ours is called “a hundred to one”.

Investment strategy — a hundred to one

What do you need to know to build up this investment strategy? Technical analysis, Money management, Fundamental analysis, AI model for stock prediction, and Stock selection model.

This article will walk you through these concepts before jumping to the AI Prediction Pipeline and results.

I. Technical analysis

Technical analysis is widely used by technical investors, and some hybrid investors combine it with fundamental analysis when deciding whether to invest in a stock.

Technical analysis is a means of examining and predicting price movements in the financial markets, by using historical price charts and market statistics. It is based on the idea that if a trader can identify previous market patterns, they can form a fairly accurate prediction of future price trajectories.

Source | IG.COM/Technical analysis definition

You might have heard from your parents or friends that the stock market is like gambling. Do you believe that? How can you create a better chance to win this game in the stock market?

You should know the trend!

What is a Trend? A trend is the overall direction of a market or an asset’s price. In technical analysis, trends are identified by trend lines or price action that highlight when the price is making higher swing highs and higher swing lows for an uptrend, or lower swing lows and lower swing highs for a downtrend.

Source | Investopedia

There are two types of trends: Uptrend and Downtrend.

Why should you buy stock in uptrends?

  • Uptrends are characterized by higher peaks over time and imply bullish sentiment among investors.

Source | Investopedia

The illustration shows an uptrend and a downtrend. An uptrend is where the overall stock price rises over a given period, while a downtrend is where the price keeps falling. In this article we go one step further: for the stock selection model, our team uses two indicators, the 25-day EMA and the 100-day EMA, to represent the short-term and long-term price lines. When the green line is above the red line, the stock may be in an uptrend. You are not limited to the 25-day and 100-day EMA; you might compare the 12-day and 50-day EMA or other pairs, depending on your experience, the stock, or the market.

Remark: The exponential moving average (EMA) is a technical chart indicator that tracks the price of an investment (like a stock or commodity) over time. The EMA is a type of weighted moving average (WMA) that gives more weighting or importance to recent price data.
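
As a minimal sketch of how the EMA and the uptrend check could be computed by hand, consider the code below. The function names `ema` and `is_uptrend`, the smoothing factor 2 / (span + 1), the choice to seed the average with the first price, and the synthetic price series are assumptions for illustration, not the project's actual code.

```python
def ema(prices, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    The average is seeded with the first price (one common convention).
    """
    alpha = 2 / (span + 1)
    values = [prices[0]]
    for price in prices[1:]:
        # Recent prices get weight alpha; the running average keeps the rest.
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def is_uptrend(prices, short_span=25, long_span=100):
    """True when the latest short-term EMA sits above the long-term EMA."""
    return ema(prices, short_span)[-1] > ema(prices, long_span)[-1]


# Synthetic series, purely illustrative: one steadily rising, one falling.
rising = [10 + 0.1 * day for day in range(200)]
falling = [30 - 0.1 * day for day in range(200)]
print(is_uptrend(rising), is_uptrend(falling))
```

Because the short EMA reacts faster to recent prices, it sits above the long EMA while prices rise and below it while they fall, which is the comparison the trend check relies on.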

Some retail investors might ask whether EMA is the only indicator used in technical analysis. The answer is “No”. There are plenty of indicators (e.g. Simple Moving Average and Relative Price Strength). You can see this in the picture below.

Which indicators do we use?

We will not go through a very detailed calculation of each indicator. However, there are some useful sources that explain indicators in visualization and mathematical description.

II. Money management

As in real life, your resources and budget are always limited. You need to optimize your return given the budget constraint.

Consideration of Price change vs Absolute return:

There are three constraints: (1) the budget is Baht 20,000, (2) we can buy only one stock at a time, and (3) we can buy only in multiples of 100 units of each stock, as required by the Thai stock market.

For example, if the stock price is Baht 150, we can buy only 100 units, not 133 units. We therefore invest Baht 15,000 and keep Baht 5,000 in cash. To make this easier to understand, the following picture shows how the price change and the absolute return differ.

Illustrative of money management comparison

You can see that Stock A’s price has increased by 33%, more than Stock B’s 30%. However, investing in Stock B gives the greater absolute result: the value change from investing in Stock B is 30%, while for Stock A it is 25%.

Remark | %Price change is the target stock price divided by the purchase price, minus one, while %Value change is the total value divided by the initial investment, minus one.

Do you know the reason? The answer is “money management”. Investing in Stock B uses 100% of the cash. Assuming other factors are held constant, investing in Stock B is the wiser choice. Therefore, our investment strategy focuses on the absolute return on investment instead of the price change.
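
The arithmetic behind this comparison can be sketched as follows. The Baht 150 price, the Baht 20,000 budget, and the 100-unit lot come from the text; the target price for Stock A and both prices for Stock B are assumed values chosen to reproduce the 33% and 30% price changes, and the helper names are hypothetical.

```python
LOT_SIZE = 100  # units per lot, per the constraint described above


def buy_in_lots(budget, price, lot_size=LOT_SIZE):
    """Return (units bought, cash left) when only whole lots can be bought."""
    lots = int(budget // (price * lot_size))
    units = lots * lot_size
    return units, budget - units * price


def value_change(budget, buy_price, sell_price):
    """Fractional change in total value: units held plus idle cash."""
    units, cash = buy_in_lots(budget, buy_price)
    return (units * sell_price + cash) / budget - 1


budget = 20_000
# Stock A: Baht 150 -> 200 (about +33%). Stock B: Baht 100 -> 130 (+30%).
# The target price of A and both prices of B are assumptions.
print(buy_in_lots(budget, 150))        # (100, 5000)
print(value_change(budget, 150, 200))  # 0.25
print(round(value_change(budget, 100, 130), 4))
```

Stock A leaves Baht 5,000 idle, so its 33% price gain turns into only a 25% gain on the whole budget, while Stock B puts every baht to work.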

III. Fundamental analysis

“‘Market action discounts everything’, says Technical Analysis, but sometimes price charts lie.”

Source |

Looking only at indicators or graphs can sometimes lead to a significant loss. That is why we also look at key fundamental factors, such as Return on Equity (ROE), Net Profit Margin (NPM), and Dividend Payout (DIV), together with recent NEWS. This might improve the chance to win. We have summarized the benefit of these factors as follows.

  • NPM indicates how much net income a company makes from the total sales it achieves.

NEWS is a very important factor in stock price movement because investors act on emotion. Good NEWS or stories tend to help stock prices go up. So, NEWS is the final factor that our team uses in the stock selection model.

IV. AI Model for stock prediction

This project is about stock prediction: it needs to forecast the stock price. Do you know what kind of data a stock price is? Yes, it’s a time series.

What’s Time series forecasting? It uses information regarding historical values and associated patterns to predict future activity. Most often, this relates to trend analysis, cyclical fluctuation analysis, and issues of seasonality. As with all forecasting methods, success is not guaranteed.

What time series forecasting looks like

What is XGBoost?

XGBoost is a decision-tree-based ensemble Machine Learning algorithm that uses a gradient boosting framework. In prediction problems involving unstructured data (images, text, etc.), artificial neural networks tend to outperform other algorithms or frameworks. For small-to-medium structured or tabular data, such as the price and indicator tables used here, decision-tree-based algorithms are widely regarded as a strong choice. Our team has summarized XGBoost in the infographic below.

XGBoost explanation

“The most important factor behind the success of XGBoost is its scalability in all scenarios. The system runs more than ten times faster than existing popular solutions on a single machine and scales to billions of examples in distributed or memory-limited settings.”

Source | XGBoost: A Scalable Tree Boosting System, 2016

XGBoost is designed for classification and regression on tabular datasets, but it can also be used for time series forecasting. The link below shows how to apply XGBoost to time series forecasting.

Limitation of XGBoost

XGBoost is a tree-based algorithm, so it cannot extrapolate. For the algorithm, the bounds of the training set are the bounds for all other data. It is therefore a poor fit for a stock market that is reaching a new high or a new low, because the prediction will not go beyond the bounds of the training set.

Looking at the SET100 chart, our team sees that the SET100 index has not reached a new high or low, so we expect XGBoost to remain adequate for stocks in the SET100.
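
To see why trees cannot extrapolate, here is a toy one-feature regression tree written from scratch. It is not XGBoost, and every name in it is hypothetical. Each leaf stores the mean of the training targets that fall into it, so any input beyond the training range simply lands in the outermost leaf.

```python
def _sse(values):
    """Sum of squared errors around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(xs, ys, depth=3):
    """Fit a tiny regression tree on one feature; a leaf is a plain float."""
    if depth == 0 or len(set(xs)) == 1:
        return sum(ys) / len(ys)

    def split(threshold):
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        return left, right

    # Pick the threshold that minimizes the total squared error of both sides.
    candidates = sorted(set(xs))[1:]
    threshold = min(candidates, key=lambda t: sum(_sse(s) for s in split(t)))
    left_y, right_y = split(threshold)
    left_x = [x for x in xs if x < threshold]
    right_x = [x for x in xs if x >= threshold]
    return (threshold,
            fit_tree(left_x, left_y, depth - 1),
            fit_tree(right_x, right_y, depth - 1))


def predict(tree, x):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


days = list(range(20))
prices = [10 + 0.5 * day for day in days]  # synthetic, steadily rising
tree = fit_tree(days, prices)
print(max(prices), predict(tree, 19), predict(tree, 100))
```

Day 100 gets the same prediction as day 19, and neither exceeds the highest training price, even though the underlying series keeps rising.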

SET100 Index (2018–2021)

V. Stock selection model

Some retail investors might ask, “Can we rely only on the result from the AI prediction model to win in the stock market?” The answer is “probably not”.

Pre-screening criteria

Since our population is 100 stocks, we first apply pre-screening criteria.

  • Trend: We buy only when the stock trend is up. One indicator is the 25-day EMA being higher than the 100-day EMA, which suggests an increasing trend in the stock price and therefore a better chance to win this game.
Pre-screening criteria

Based on the result of the AI Prediction model and pre-screening results, our team selects the 12 stocks with the highest target profit to perform stock selection as shown in the following illustration.

Stock selection model

Our team brainstormed and came up with the stock selection model illustrated above. For stock screening, we combine three main components: (1) the profit result from the prediction model, (2) the result from back-testing, and (3) fundamental analysis.

The result from the prediction model, the profit, is of course the key factor in this stock selection model. As discussed under money management, the targeted profit is the profit amount on the initial investment (i.e. Baht 20,000) rather than the percentage increase of each stock. We give this factor the highest weight, 40%, when calculating the score.

The back-testing result of the model is the second factor in our stock selection: the more often the model predicts correctly, the more confidence we can place in the AI prediction model. We allocate a weight of 30% to this factor.

We use the growth of return on equity (%ROE change), dividend payout (%Div change), and net profit margin (%NPM change) to represent fundamental analysis. We allocate a weight of only 10% to each of these three factors, which adds up to 30% for fundamental analysis.

Scoring system

Scoring by tier

We wanted a systematic way of scoring, so for each factor we split the stocks into three tiers (e.g. the top 4 each get a score of 1, while the bottom 4 each get a score of 3). We then combine the scores from each factor into the final score and select the 3 stocks with the lowest final scores for the final round.
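
The tiering and weighting can be sketched as below. The 40/30/10/10/10 weights and the three tiers of four stocks come from the text. The placeholder tickers `S01` to `S12`, the made-up factor values, the rule that higher raw values are better, and the function names are assumptions for illustration only.

```python
# Weights in percent, as described in the text.
WEIGHTS = {"profit": 40, "backtest": 30, "roe": 10, "div": 10, "npm": 10}


def tier_scores(values, n_tiers=3):
    """Score 1 for the best tier through n_tiers for the worst.

    Assumes that a higher raw value is better for every factor.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    tier_size = len(ranked) / n_tiers
    return {stock: int(pos // tier_size) + 1 for pos, stock in enumerate(ranked)}


def final_scores(factors):
    """Weighted average of the tier scores; the lowest score is the best."""
    tiers = {name: tier_scores(values) for name, values in factors.items()}
    stocks = next(iter(factors.values()))
    return {
        stock: sum(WEIGHTS[name] * tiers[name][stock] for name in factors) / 100
        for stock in stocks
    }


# Twelve placeholder tickers with made-up values (not the project's data).
tickers = [f"S{i:02d}" for i in range(1, 13)]
descending = {t: 13 - i for i, t in enumerate(tickers, start=1)}
ascending = {t: i for i, t in enumerate(tickers, start=1)}
factors = {"profit": descending, "backtest": descending,
           "roe": descending, "div": descending, "npm": ascending}

scores = final_scores(factors)
top3 = sorted(scores, key=scores.get)[:3]
print(top3, scores["S01"], scores["S12"])
```

Ties between stocks with the same final score are broken here by input order; the article resolves the final choice through the NEWS screening described next.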

Final selection by NEWS screening

We select the 3 stocks with the lowest scores. We then research stories that support a price increase, based on NEWS and the IAA Analyst consensus. Finally, we pick only one stock, based on the best stories at the time together with the final score.

Machine Learning Pipeline

5 Steps of Machine Learning Pipeline

Step 1 | Get data

  • We listed the stocks in the SET100.
100 stocks in the SET100 index
  • We extracted stock information through Yahoo Finance’s API using the “pandas_datareader” library. Note that we did not specify a period for the extraction, so we received stock information from 2 June 2016 to 2 June 2021 (the date of extraction). We understand that the extracted data is limited to 5 years.
Illustrative python code for extracting stock information
  • The information obtained for each stock (trade date, high, low, close, adjusted close, volume, and symbol) is shown in the table below. Note that the adjusted close is the price adjusted for dividends, par splits, and other corporate actions. Our team uses this adjusted close price for stock prediction and as the benchmark for investment return.
Illustrative extracted stock information from Yahoo finance
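
The extraction step might look roughly like the sketch below. The original code appears only as an image, so the function names and the `Symbol` column are hypothetical, and the `.BK` suffix is how Yahoo Finance is understood to list SET stocks. The call needs network access, and the Yahoo reader in `pandas_datareader` may not work with current library versions.

```python
def to_yahoo_symbol(set_symbol):
    """Map a SET ticker to its assumed Yahoo Finance form, e.g. 'RS' -> 'RS.BK'."""
    return f"{set_symbol.upper()}.BK"


def fetch_prices(set_symbols):
    """Download daily prices per symbol and stack them into one DataFrame.

    No start or end date is passed, mirroring the article, which reports
    receiving roughly five years of history.
    """
    # Imported lazily: these packages and a network connection are only
    # needed when data is actually downloaded.
    import pandas as pd
    from pandas_datareader import data as pdr

    frames = []
    for symbol in set_symbols:
        frame = pdr.DataReader(to_yahoo_symbol(symbol), data_source="yahoo")
        frame["Symbol"] = symbol  # keep the ticker next to each row
        frames.append(frame)
    return pd.concat(frames)


print(to_yahoo_symbol("rs"))  # RS.BK
```

Calling `fetch_prices(["RS", "STA", "ACE"])` would then return one long table with a `Symbol` column.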

Step 2 | Cleansing & transformation

  • We scanned through the stock information and found the dataset to be of good quality (e.g. no missing data). Therefore, we did not perform any data cleansing on this dataset.

Step 3 | Feature engineering

We define the base features which are Open, Adj Close, High, Low, and Volume.

We create technical indicators as additional features, as below:

  • Average volume (20)


  • (x) represents the number of days. For example, SMA(10) is the average over the previous 10 days.

For the detailed calculation of each technical indicator, you can refer to the link below.

  • We generate technical indicators using the code below. Note that we show only the code for calculating %return, SMA, and EMA as examples. Further detail is in the provided Colab link.
Illustrative python code for generating stock indicators
  • We use the base features and technical indicators as input to the Time Series Prediction model.
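
Since the original code is shown only as an image, here is a hedged sketch of how %return, SMA, and EMA features could be generated with pandas. The column names, the default window lengths, and the tiny demo series are assumptions.

```python
import pandas as pd


def add_indicators(prices, sma_window=10, ema_span=25):
    """Return a copy of `prices` with %return, SMA, and EMA columns added."""
    out = prices.copy()
    close = out["Adj Close"]
    # Day-over-day percentage return.
    out["return_pct"] = close.pct_change() * 100
    # Simple moving average: unweighted mean over the last `sma_window` rows.
    out[f"SMA({sma_window})"] = close.rolling(sma_window).mean()
    # Exponential moving average: more weight on recent rows.
    out[f"EMA({ema_span})"] = close.ewm(span=ema_span, adjust=False).mean()
    return out


# Tiny made-up series so that the result is easy to check by eye.
demo = pd.DataFrame({"Adj Close": [10.0, 10.5, 10.2, 10.8, 11.0]})
features = add_indicators(demo, sma_window=3, ema_span=3)
print(features.round(3))
```

The first rows of the SMA column are empty until a full window is available; such rows need to be dropped or filled before training.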

Step 4 | Prediction model

We use XGBoost for Time Series Prediction and perform the following steps.

I. Define training and testing datasets

To develop the model, we split the dataset into a training dataset (1 Jun 2018 to 21 Apr 2021) and a testing dataset (5 May 2021 to 2 Jun 2021). We use the training dataset to fit models that predict the price 2 and 8 days ahead.

Remark | In the end we need to predict the price on 14 June 2021, which is 8 working days after the latest date in the dataset, 2 June 2021.
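
One way to set up the 2-day and 8-day targets and the date-based split is sketched below. The helper names, the target column names, and the synthetic business-day calendar (which ignores Thai market holidays) are assumptions; only the horizons and split dates come from the text.

```python
import pandas as pd


def add_targets(frame, horizons=(2, 8), price_col="Adj Close"):
    """Add one target column per horizon: the price `h` trading rows ahead."""
    out = frame.copy()
    for h in horizons:
        # shift(-h) pulls the future price back onto today's row.
        out[f"target_{h}d"] = out[price_col].shift(-h)
    return out


def split_by_date(frame, train_end, test_start):
    """Chronological split on a DatetimeIndex: no shuffling, no look-ahead."""
    return frame.loc[:train_end], frame.loc[test_start:]


# Synthetic prices on weekdays only.
dates = pd.bdate_range("2021-04-01", "2021-06-02")
demo = pd.DataFrame(
    {"Adj Close": [float(i) for i in range(len(dates))]}, index=dates
)

labelled = add_targets(demo)
train, test = split_by_date(labelled, "2021-04-21", "2021-05-05")
print(len(train), len(test))
```

The last `h` rows have no target yet, so they are excluded from training and serve only as inputs for the final forecast.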

II. Setting hyperparameters

We set the XGBoost hyperparameters to eta = 0.3, subsample = 0.7, n_estimators = 150, and min_child_weight = 5. These settings are based on our tests of model performance.
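
In code, these settings could be collected as shown below. The values come from the text; using the scikit-learn style `XGBRegressor` with a squared-error objective is an assumption, since the article does not say which XGBoost interface was used. In that interface `eta` is exposed as `learning_rate`.

```python
XGB_PARAMS = {
    "learning_rate": 0.3,    # called ETA in the text
    "subsample": 0.7,        # fraction of rows sampled for each tree
    "n_estimators": 150,     # number of boosting rounds
    "min_child_weight": 5,   # minimum sum of instance weight in a leaf
}


def build_model(params=XGB_PARAMS):
    """Create an untrained regressor; requires the xgboost package."""
    # Imported lazily so the settings can be inspected without xgboost installed.
    from xgboost import XGBRegressor

    return XGBRegressor(objective="reg:squarederror", **params)


print(XGB_PARAMS)
```

One such model would be fitted per horizon, for example one for the 2-day target and one for the 8-day target.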

For further detail of XGboost: XGboost

III. Evaluating model performance

Accuracy is measured on the direction of the stock price movement: we compare the sign of the predicted change with the sign of the actual change. For example, if we predict that the stock price goes up and the actual price goes down, the prediction is counted as incorrect. The accuracy on the training set is 0.9, while on the testing set it is 0.7. The gap suggests that our model is somewhat overfitted, but we accept it because we consider an accuracy of 0.7 to still be high.
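
The directional accuracy described above can be written in a few lines. The function name and the sample numbers are made up for illustration.

```python
def direction_accuracy(last_prices, predicted, actual):
    """Share of cases where the predicted and actual moves have the same sign."""

    def sign(x):
        return (x > 0) - (x < 0)

    hits = sum(
        sign(pred - base) == sign(act - base)
        for base, pred, act in zip(last_prices, predicted, actual)
    )
    return hits / len(last_prices)


last_prices = [10.0, 10.0, 10.0, 10.0]  # price on the day of the forecast
predicted = [11.0, 9.0, 10.5, 9.5]      # model output
actual = [10.5, 9.5, 9.8, 9.0]          # what happened
print(direction_accuracy(last_prices, predicted, actual))  # 0.75
```

Three of the four predicted directions match, so the accuracy is 0.75; the size of the error does not matter for this metric, only the sign.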

Illustrative of model accuracy — samples from testing datasets (1 May 2021–2 Jun 2021)

Based on the illustration of model accuracy above, we can see that the direction is predicted accurately for the majority of dates and stocks in the testing dataset.

IV. Deployment model

Illustrative of stock price prediction line (orange line) against the actual price (blue line)

When deploying the model, we predict the target stock price on 7 and 14 June 2021, which are the buy and sell dates, respectively. The difference between the two predictions is the expected price change of the stock. Note that we retrain our model on the dataset from 1 January 2018 to 2 June 2021.

V. Adjust tick sizes

The SET market applies tick sizes, the minimum price movement at each market price level. We adjust for this price interval when calculating the predicted stock price.

Illustrative of tick sizes and python code for adjusting price interval
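
A minimal sketch of such an adjustment follows. The tick table is an assumption based on the commonly cited SET schedule and should be checked against the exchange's current rules. Rounding down to the nearest tick and the names `tick_size` and `snap_to_tick` are also assumptions, since the original code is shown only as an image.

```python
from decimal import Decimal, ROUND_FLOOR

# (upper price bound, tick size) pairs; assumed SET schedule, verify before use.
TICK_TABLE = [
    (Decimal("2"), Decimal("0.01")),
    (Decimal("5"), Decimal("0.02")),
    (Decimal("10"), Decimal("0.05")),
    (Decimal("25"), Decimal("0.10")),
    (Decimal("100"), Decimal("0.25")),
    (Decimal("200"), Decimal("0.50")),
    (Decimal("400"), Decimal("1.00")),
]
TOP_TICK = Decimal("2.00")  # assumed tick for prices of Baht 400 and above


def tick_size(price):
    """Look up the tick size for a price level."""
    for upper, tick in TICK_TABLE:
        if price < upper:
            return tick
    return TOP_TICK


def snap_to_tick(price):
    """Round a predicted price down to the nearest valid tick."""
    # Decimal avoids binary floating-point surprises such as 0.1 + 0.2.
    exact = Decimal(str(price))
    tick = tick_size(exact)
    steps = (exact / tick).to_integral_value(rounding=ROUND_FLOOR)
    return steps * tick


print(snap_to_tick(27.38))  # 27.25
print(snap_to_tick(150.7))  # 150.50
```

Rounding down is the conservative choice for a predicted sell price; rounding to the nearest tick would be an equally plausible reading of the original.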

VI. Consideration of money management for profit calculation

As mentioned in the money management section at the beginning of this article, we calculate the predicted profit amount instead of the stock’s price change, so that we can see the absolute return on investment relative to the initial budget of THB 20,000.

Step 5 | Result and visualization

According to our stock selection model (described in the first part), we have summarized the 12 stocks with the highest target profit. We rank each stock on each factor (i.e. profit, %win/loss, %ROE change, %Div change, and %NPM change), assigning scores of 1, 2, or 3, where 1 is the best and 3 is the worst. Then, we calculate the final score. The 3 stocks with the lowest scores, which go to the final round of stock selection, are RS, STA, and ACE.

Remark: the profit shown in the above table takes cash management into account. In addition, the prices of the 12 stocks on the list are quite far from Baht 200, so we consider it very unlikely that these stocks will reach Baht 200.

Final round — Stock selection

In the final round of stock selection, RS is best in class in every aspect, as follows.

  • RS has the best score (rank no. 1) in the previous round of stock selection.

Finally, our team picks RS Public Company Limited (RS) as the best stock to buy on 7 June and sell on 14 June at the ATO price.

Our Conclusion

AI is coming into our lives. It might help you create a better world and give you more time to do what you like. However, you should know how to use it properly. AI models are being developed every day, so whichever model works today might not work in the future.

As retail investors, we don’t recommend relying only on the result of the AI prediction model when investing in stocks; study further to understand the fundamentals of each stock. If the fundamentals of the stock are good, there will be less chance for you to lose in this game.

We think that there are opportunities for further enhancement such as trying alternative AI models, deploying AI models to a chatbot for providing daily price prediction, or adjusting models more for longer-term prediction.

We hope you enjoy reading this article. This might provide you some ideas on how to apply AI in real work (e.g. stock prediction), and how to combine investment strategy with an AI prediction model.



Project Manager & Author: Pree Preechaborisutkul | ML/Dev: Vittavat Phucharoen, Phil, and Noi | Analyst: March, Kid, and Chie

