【Application(1)】The investing strategy of Benjamin Graham—the founder of securities analysis.

TEJ 台灣經濟新報
TEJ-API Financial Data Analysis
10 min readApr 19, 2021

--

Using Python to fulfill the investing strategy of Benjamin Graham meanwhile backtesting it in the Taiwan stock market.

Preface — who is Benjamin Graham?

We will introduce the defensive investing strategy which is created by Benjamin Graham — master of value investing. Benjamin Graham is the recognized pioneer of securities analysis. His first private fund was founded in 1923, and its performance was very good in his first shot. In 1925, it was liquidated and dissolved due to the disagreement of the partners. In 1926, he established a joint venture with his friend Graham Joint Account. By the beginning of 1929, the capital scale had grown from $450,000 to $2.5 million (non-new investors). Overnight, Graham’s name became the darling of Wall Street. The owners of many listed companies hoped that Graham would be responsible for their funds, but they all be declined because Graham believed that the stock market had soared excessively.

In 1934, Graham and David L. Dodd co-authored the book “Security Analysis”, which became the first ancestors of securities analysis. Before Graham, securities analysis could not be regarded as a science. This book nowadays has not been out of print and is one of the textbooks for university teaching securities analysis.

Contemporary well-known fund managers such as Warren Buffett, John Neff, Tom Knapp, et al., are Graham’s students. Therefore, as long as fund managers in Wall Street advertise value investment, we can say that they are all Graham’s disciples and grandchildren.

This article might be hard to totally understand. However, don’t worry about it because we have provided you with the full source code and contact information at the bottom of the article. If you have any problems while this article, please leave a message below or email us!

✨ Highlights of this article ✨

  • 🌟 Master strategy/ adjusted strategy intro
  • 🌟 Adjusted strategy construct & performance display

🚪Links related to this article🚪

🗻Master Strategy 🗻

Defensive investment strategy

1. Select the companies whose yearly revenue is more than 100 million, or yearly revenue is more than 50 million for public utilities.
2. Liquid ratio is higher than 200% and the long-term liabilities do not exceed net current assets.
3. Select the companies which have earnings every year over the past 10 years.
4. Select the companies that have paid cash dividends every year over the past 20 years.
5. Select the companies whose 3-year average EPS grow at least 1/3 over the past 10 years.
6. Stock price ÷ 3-year average EPS <15
7. Price to book values <1.5
8. Portfolio contains 10–13 stocks.

⬇️For the change in time and location, we made an adjustment to the above conditions.⬇️

Adjusted Strategy

1. Select the companies whose yearly revenue is more than the yearly average revenue of all listed companies.
2. Select the companies who have earnings every year over the past 5 years.
3. Select the companies that have paid cash dividends every year over the past 2 years.
4. Liquid ratio is higher than 200%.
5. Net current assets - long-term liabilities > 0
6. The absolute value of (recent 3-year average of net income -a recent 5-year average of net income)/recent 5-year average of net income > 0.33
7. PER1(calculated by the most recent 3-year average of EPS) ≤ recent 3-year average of PER.
8. PER(calculated by the sum of recent 4-quarter EPS) ✖️ PBR ≤ recent 3-year average of PER*PBR.

Using the components of the Taiwan 50 index as an example

Taiwan 50 index includes the top 50 market value listed companies in Taiwan.

Construction step:

  • 1st: import required packages.
  • 2nd: data extraction.
  • 3rd: construct master strategy.
  • 4th: backtesting strategy performance.
  • 5th: master strategy vs Taiwan 50 index.

First step — Import packages

import tejapi 
import pandas as pd
import numpy as np
tejapi.ApiConfig.api_key = 'your_key'
import datetime
import matplotlib.pyplot as plt

Second step — Data extraction

First, we could get constitute stocks of Taiwan 50 index from TW50.csv. However, the format of constitute stocks contains both numbers and characters(1101 台泥), so we could use the second line of code can help us get the numbers only.

Grabbing individual stock many times is faster than grabbing multi-stocks at one time. We recommend you use for-loop while grabbing multi-stocks.

The database we use in this episode includes IFRS以合併為主簡表(累計)-全產業(TWN/AIM1A)、上市(櫃)股價報酬(日)-報酬率 (TWN/APRCD2)和上市(櫃)未調整股價(年) (TWN/APRCY). The trial account has a limit on extracting data, so if you want to access data without limit, please check it on TEJ E Shop🎁

stk_info = pd.read_csv('TW50.csv',engine='python')
stk_nums = stk_info['成份股'].apply(lambda x: str(x).split(' ')[0])
# 撈取財務資料
zz = pd.DataFrame()
for code in stk_nums:
zz = zz.append(tejapi.get('TWN/AIM1A'
,coid=code
,paginate=True,chinese_column_name=True
,opts= {'pivot':True}
)).reset_index(drop=True)
print(code)

Third step — Construct the master strategy

Let’s start it 👷👷

✅ 1. Select the companies whose yearly revenue is more than the yearly average revenue of all listed companies.

The purpose of this condition is to kick out the bottom 50% of profitable companies, and the components of Taiwan 50 index have included the top 50 market value listed companies. Therefore, the profitability of most of them is quite stable, so we can ignore the first condition🏄🏄~

✅ 2. Select the companies who have earnings every year over the past 5 years.

  • The function of np.where() is similar to the if-else statement, but it enhances the readability of the codes
    - np.where(z1[‘常續性稅後淨利’]>0,1,0): Be in line with z1[‘常續性稅後淨利’]>0, then output 1, otherwise 0.
  • rolling is an in-build function of pandas, and the number in parentheses can adjust the size of the windows.
    - z1[‘earning’].rolling(5).sum(): sum of recent 5-year of z1[‘earning’].

First, if the z1[‘常續性稅後淨利’]is greater than 0, then rolling sum of the earnings. Then, choose the one whose rolling sum is equal to 5.

# 條件2:過去5年皆有盈餘
z1[‘earning’] = np.where(z1[‘常續性稅後淨利’]>0,1,0)
z1[‘earning_continue’] = z1[‘earning’].rolling(5).sum()
z1[‘condition_2’] = np.where(z1[‘earning_continue’]==5,1,0)

✅ 3. Select the companies that have paid cash dividends every year over the past 2 years.

The process is the same as step 2. We aim to choose the one whose rolling sum is equal to 2.

# 條件3:過去2年皆有支付現金股利
z1[‘cash_dividend’] = np.where(z1[‘普通股每股現金股利(盈餘及公積)’]>0,1,0)
z1[‘cash_dividends’] = z1[‘cash_dividend’].rolling(2).sum()
z1[‘condition_3’] = np.where(z1[‘cash_dividends’]==2,1,0)

✅ 4. Liquid ratio is higher than 200%.

  • Liquid ratio = Current assets/ Current liabilities.

We don’t have to calculate this value because TEJ already has an in-build liquid ratio.💪💪

# 條件4:流動比率>200%
z1['condition_4'] = np.where(z1['流動比率']>200,1,0)

✅ 5. Net current assets - long-term liabilities > 0

  • Net current assets = Current assets -Current liabilities
  • Non- current liabilities include long-term liabilities
# 條件5:淨流動資產-長期負債>0
z1['condition_5'] = np.where((z1['流動資產']-z1['流動負債']-z1['非流動負債'])>0,1,0)

✅ 6. The absolute value of (recent 3-year average of net income -a recent 5-year average of net income)/recent 5-year average of net income > 0.33

- z1[‘歸屬母公司淨利(損)’].rolling(3).mean():recent 3-year average of net income
- z1[‘歸屬母公司淨利(損)’].rolling(5).mean():recent 5-year average of net income

# 條件6:abs【(近3年平均稅後淨利-近5年平均稅後淨利)/近5年平均稅後淨利】 >0.33
z1[‘近3年平均稅後淨利’] = z1[‘歸屬母公司淨利(損)’].rolling(3).mean()
z1[‘近5年平均稅後淨利’] = z1[‘歸屬母公司淨利(損)’].rolling(5).mean()
z1[‘condition_6’] = np.where(abs((z1[‘近3年平均稅後淨利’]-z1[‘近5年平均稅後淨利’])/z1[‘近5年平均稅後淨利’])>0.33,1,0)

✅ 7. PER1(calculated by recent 3-year average of EPS) ≤ recent 3-year average of PER.

- z1[‘每股盈餘’].rolling(3).mean():recent 3-year average of EPS
- z1[‘當季季底P/E’].rolling(3).mean():recent 3-year average of PER.
- z1[‘close’]/z1[‘近3年平均EPS’]:PER1(calculated by recent 3-year average of EPS)

# 條件7:PER (當年年底收盤價/近3年平均每股盈餘) <= 近3年PER之平均
z1[‘近3年平均EPS’] = z1[‘每股盈餘’].rolling(3).mean()
z1[‘近3年平均PER’] = z1[‘當季季底P/E’].rolling(3).mean()
z1[‘PER1’] = z1[‘close’]/z1[‘近3年平均EPS’]
z1[‘condition_7’] = np.where(z1[‘PER1’]<=z1[‘近3年平均PER’],1,0)

✅ 8. PER(calculated by the sum of recent 4-quarter EPS) ✖️ PBR ≤ recent 3-year average of PER*PBR.

In the database of IFRS以合併為主簡表(累計)-全產業(TWN/AIM1A), the values of EPS reported at the Q4 is equal to EPS(Q1+Q2+Q3+Q4). The PER TEJ provides is also calculated by the cumulative EPS, so we can use in-build PER.

# 條件8:PER(以近4季每股盈餘計算)*PBR <= 近三年PER*PBR平均
z1['PEPB'] = z1['當季季底P/E']*z1['當季季底P/B']
z1['mean_PEPB_3'] = z1['PEPB'].rolling(3).mean()
z1['condition_8'] = np.where(z1['PEPB']<=z1['mean_PEPB_3'],1,0)

✅ 9. Scoring

# 計算總分
z1['score'] = z1['condition_2']+z1['condition_3']+z1['condition_4']+z1['condition_5']+z1['condition_6']+z1['condition_7']+z1['condition_8']
master strategy construction

Fourth step — Backtesting strategy performance

(It may be the most complex part of this article, so while reading this part please keep focusing on it!! 👀 📖 👍)

We use the score to sort the data and separate the data into five portfolios by quintiles, then we will backtest the performance of five portfolios respectively. For example, the top 20% of the score will be the first group, 21% to 40% will become the second group……

quintiles

Explanation of code 📚

date: Financial report date
buy_date: Assume 12/13 is T, then buy date is T+90.
sell_date: Sell date is buy_date+365.(holding period 1 year)
pf_H
: Storing stock code of each portfolio.
data: Extracting yearly return of multi-stocks(pf_H), and the date set between buy_date and sell_date.
q1_ret: yearly return of multi-stocks(pf_H)
tw0050: Extracting yearly return of 台灣50指數(TRI50), and the date set between buy_date and sell_date.
Date/Buy_date/Sell_date/date of yearly return -explanation

Assume the portfolio is equal-weighted, then we can calculate the weighted-average return:

The number of stock in portfolio: len(pf_H)
weight : w = 1/len(pf_H)
weighted-average return: sum(w*q1_ret)
transition fee + tax : (0.1425*2*len(pf_H) + 0.3*1)

The return of the five portfolios and Taiwan 50 index.

Fifth step — Master strategy vs The Taiwan 50 index

  • 💻Calculating the cumulative return of the five portfolios and the Taiwan 50 index.💻

Taiwan 50 index was published after 2002, so we give it a nan at the first row(date:2000–12–31).

Because the holding period is 1 year, the most recent data is the report of 2020Q4. Therefore, the buy date is 2021–03–31 and we will hold it till 2022–03–31. However, 2022–03–31 is in the future, we can’t get the close price at that date now. Therefore, we have to kick out the return calculated on 2020Q4.

#計算累積報酬率#
cum_ret = return_[['p1_return','p2_return','p3_return','p4_return','p5_return','twn50_return']].astype(float).apply(lambda x:(x*0.01+1).cumprod(),axis=0).reset_index(drop=True)
cum_ret['Date'] = return_['Date']

#剔除2020-12-31#
cum_ret = cum_ret[:len(cum_ret)-1]
#欄位排序#
cum_ret = cum_ret[['Date','p1_return','p2_return','p3_return','p4_return','p5_return','twn50_return']]cum_ret
the cumulative return of the five portfolios and the Taiwan 50 index.
  • 📈Data visualization 📈
plt.style.use('seaborn')
plt.figure(figsize=(10,5))
plt.xticks(rotation = 90)
plt.title('master invest strategy',fontsize = 20)
date = cum_ret['Date']
plt.plot(date,cum_ret.p1_return,color ='red',label='p1_return')
plt.plot(date,cum_ret.p2_return,color ='orange',label='p2_return')
plt.plot(date,cum_ret.p3_return,color ='blue',label='p3_return')
plt.plot(date,cum_ret.p4_return,color ='purple',label='p4_return')
plt.plot(date,cum_ret.p5_return,color ='green',label='p5_return')
plt.plot(date,cum_ret.twn50_return,color = 'black',label='twn50_return')
plt.legend(fontsize = 15)
Master strategy vs the Taiwan 50 index

The result shows that the cumulative return of the highest score group (P1) is outperformed others🚀, which is nearly 5 times higher than the benchmark — Taiwan 50 index(black line).

  • 📝Performance/ statistic indicator 📝
Ratio = pd.DataFrame()
for col in cum_ret.columns[1:]:
##年化報酬率
cagr = (cum_ret[col].values[-1]) ** (1/len(cum_ret)) -1
##年化標準差
std = return_[col][:len(return_)-1].astype(float).std()
##Sharpe Ratio(假設無風險利率為1%)
sharpe_ratio = (cagr-0.01)/(std*0.01)
##最大回撤
roll_max = cum_ret[col].cummax()
monthly_dd =cum_ret[col]/roll_max - 1.0
max_dd = monthly_dd.cummin()
##表格
ratio = np.reshape(np.round(np.array([100*cagr, std, sharpe_ratio, 100*max_dd.values[-1]]),2),(1,4))
Ratio = Ratio.append(pd.DataFrame(index=[col],
columns=['年化報酬率(%)', '年化標準差(%)', '夏普比率', '期間最大回撤(%)'],
data = ratio))
Ratio.T
Performance/ statistic indicator

Again, the result shows that the annual return🚀, Sharpe ratio🚀, and max drop-down of the highest-score group (P1) all beat others, but also has the highest volatility.

  • 🆕P1 portfolio at 2020Q4 🆕
stk_ranking = result[result['財報年月']== '2020-12-01'].sort_values(by='score',ascending=False).reset_index(drop=True)first_group = round(len(stk_ranking)*(0.2))
stk_ranking = stk_ranking.loc[0:first_group]['公司代碼'].tolist()
# 回台灣 50成分股查詢 P1組合的名稱
stk_info['stk_num'] = stk_info['成份股'].apply(lambda x: str(x).split(' ')[0])
stk_info['stk_cname'] = stk_info['成份股'].apply(lambda x: str(x).split(' ')[1])
stk_info['成份股'][stk_info['stk_num'].isin(stk_ranking)].to_list()
P1 portfolio at 2020Q4

Conclusion

In this article, we use Python to fulfill the investing strategy of Benjamin Graham meanwhile backtesting it in the Taiwan stock market and further display the result of the strategy by table and plots. After seeing the result, isn't it very touched?😆😆~

This article contains much content, which is worth it for readers to savor and digest. According to the results of our backtest, if we have started to invest in the highest-scoring group since 2000, its cumulative return will exceed 20 times ❗️️️️ ❗️️️️

To make a successful backtesting, we still need to pay attention to things such as the data quality, the length of data, whether there are bugs in your program, whether too much transaction costs are ignored, etc. As long as there is a mistake in these problems, it will cause the distortion of the backtesting and lose money!! As a result, you have to pay attention to our backtesting result again️️️️️️️️️️️ every time❗️️️️ ❗️️️️ ❗️️️️ ❗️️️️ ❗️️️️

Finally, if you like this topic, please click 👏 below, giving us more support and encouragement. Then, we will still work on financial data analysis and applications in the following articles, please look forward to them❗️❗️ Besides, if you have any questions or suggestions, please leave a message or email us, we will try our best to reply to you.👍👍

Question:
Where could I get the data with a stable, high quality, and the time length is sufficient? TEJ API is your best choice!!

* Links related to this article again!💪💪

🌟If you have any question and difficulty, do not hesitate to contact us: Contact Information🌟

Source Code🚪:

--

--

TEJ 台灣經濟新報
TEJ-API Financial Data Analysis

TEJ 為台灣本土第一大財經資訊公司,成立於 1990 年,提供金融市場基本分析所需資訊,以及信用風險、法遵科技、資產評價、量化分析及 ESG 等解決方案及顧問服務。鑒於財務金融領域日趨多元與複雜,TEJ 結合實務與學術界的精英人才,致力於開發機器學習、人工智慧 AI 及自然語言處理 NLP 等新技術,持續提供創新服務