This is How I Implemented Benjamin Graham’s Teachings into an Automated Investing Strategy in Python
More than ever before, I’ve been speaking with developers who’ve gotten interested in investing. Last year, everyone had that friend who was showing off the incredible gains on their Bitcoin wallet. Even if the idea of cryptocurrencies didn’t appeal to everyone, it certainly got some heads turned towards all the money in their savings accounts. With the way things are these days, the growth on many savings accounts isn’t even beating inflation, and a lot of people have come around to realizing that maybe they should be making some real returns on that money instead.
If you’re anything like me and the other developers I talk to, though, just throwing your whole account into something like the S&P 500 isn’t quite enough for you. You want an investment profile that’s truly your own — and you probably think putting your programming skills to use is a good way to figure out what to put into your portfolio. In this article, I’ll be going over an example of how you can transform a set of investment principles into Python code that will invest your money for you. We’ll be using Alpaca, a commission-free brokerage, as the Alpaca platform provides an API that will make this about as easy as it gets. I’ll be going over the principles and the code piece-by-piece, but if you want to just get to the finished result, you can check it out right now on our GitHub. (If you’re a Quantopian veteran, the structure of the code should be familiar to you.)
The man who our investment script is going to take after is none other than Benjamin Graham. Known to this day in the world of finance as “the father of value investing,” Benjamin Graham’s influence on investors is everywhere. He was Warren Buffett’s mentor, and his book, The Intelligent Investor, is still probably the most widely-recommended published work for people who want to do more with their money.
In order to follow Graham’s suggestions, though, you’ve got to do your own research on company assets and debts, reading balance sheets and investor reports to make your picks. If you’re anything like me, you don’t have room in your life to pick up a second job as a financial analyst, studying quarterly reports and translating what you see into trades. That’s where we’ll be putting our programming skills to use.
Fortunately, the businesses that surround the market are constantly evolving and presenting new opportunities to everyday investors. Alpaca is a new brokerage that offers an easy-to-use API and commission-free trades. When you combine Alpaca’s platform with free data sources and a little bit of Python, you can create a script that will handle the dirty work of researching company fundamentals for you.
For this script, let’s say we want to translate into code some of what Cabot Wealth Network identified as the key criteria Graham used to choose companies to invest in. This is only an example to get you started. If you decide that you want to commit your own money to an automated trading strategy, you should test the performance of its principles yourself first. Fortunately, Alpaca provides a “paper trading” platform, so you can see this script at work without having to put your real account balance on the line.
- Find companies with a low debt load. Graham’s advice: find companies with a total debt to current asset ratio of less than 1.1. Total debt and current assets are both reported every quarter on company balance sheets, and they can both be read from the IEX financials endpoint.
- Confirm the low debt load using the “Current Ratio.” Companies should also have a current asset to current debt ratio over 1.5. Current debt is composed of debts that are due within one year, and we don’t want to have to fear a near-term liquidation of a majority of the company’s assets. This information is accessible from the same endpoint as above.
- Avoid high-risk companies by finding companies that have had consistently positive earnings per share over the last year. This will indicate that the portion of the company an investor owns, however small that portion is, is stable enough to be a solid investment. We can access the last two years of earnings reports using IEX’s earnings endpoint.
- Find stable bargains, not companies that the market already expects explosive growth from. We can partially evaluate how overvalued a company is by looking at its price-earnings (PE) ratio. Filtering out companies with PE ratios over 9.0 will eliminate high-growth companies, the prices of shares for which are prone to be subject more to the whims of market speculators than what they report on their balance sheets. IEX helpfully calculates the PE ratio for us and reports it in their key company stats endpoint.
- Validate a company’s valuation by checking that they don’t have an overinflated price-to-book (PB) ratio. High PB ratios indicate that the market is placing a very high value on the potential associated with intangible assets like R&D efforts or brand recognition. These factors can be very speculative, and thus we want to filter out companies with PB ratios over 1.2. IEX also reports this factor in the key company stats endpoint, listed as
- Gravitate towards dividends. Our algorithm will be investing in undervalued stocks, but there is no guarantee on how long it will take the market to follow our logic (if it will at all). Dividends can help make up for the time spent waiting for valuation increases. Strategizing around dividends can be very complex, but for the purpose of this script, we’ll just make sure that each company has a dividend yield over 1%. More detailed information is available about dividends through the IEX dividends endpoint, but we can get the dividend yield from key company stats.
- We also need to figure out how much of our money to put into each company. Since Cabot’s list didn’t really touch on this, I’ll be using a rule that attempts to find a balance between market capitalization and sector presence. I’ll be approaching diversification as follows: we assign an equal amount of our portfolio to each sector we want to invest in, and we divide up each sector’s portion among the companies we like in it based on their market caps. This means that Fortune 50 tech giants won’t swallow up our portfolio and leave smaller markets out to dry, but it also means that we run the risk of keeping too much of our money in a stagnant sector. There are boundless ways to go about diversification, and I recommend you do some thinking (and testing) of your own to determine which sits best with you. For this example, though, let’s just give this way a shot.
In sum, these criteria aren’t meant to find the next “double your money in a month” speculative trade; they’re meant to find stable investment opportunities where you’re confident that your money will grow in good hands. And now that you know what we’ll be looking for, let’s get to the implementation.
The first step in running our algorithm is getting the data we need to evaluate stocks. For that we’ll use the iexfinance library, which is quite easy to use. Let’s say you want to get the financials data for a group of stocks. You could run this code:
>>> from iexfinance import Stock
>>> stock_batch = Stock(['AAPL', 'TSLA', 'MSFT'])
>>> stock_batch.get_financials()
And you’d get a response in JSON format with a whole lot of balance sheet information from the last few quarters for those three companies. We’ll be using the get_financials() method in our script, along with get_key_stats() and get_earnings(), all of which retrieve information from different IEX endpoints.
Pandas is an open source library which is widely used by people practicing data science. Covering all the ways it can be used goes way beyond the scope of this article, but I’ll give a basic example. Pandas maintains data structures called dataframes for us, which can have labeled rows and columns, like a database table. Information from dataframes can be accessed in a similar manner to a dictionary. Let’s say we have a dataframe full of fundamental data we’ve extracted called fundamental_df. It has one row for each stock, and has several columns, one of which is named market_cap. If you wanted to add that column up, you might do this:
total_market_cap = 0
for stock_symbol in symbol_batch:
    # Rows are labeled by stock symbol, so look up the row first, then the column.
    total_market_cap += fundamental_df.loc[stock_symbol, 'market_cap']
But the main reason that Pandas is useful is that it gives us access to lots of cleaner ways to accomplish similar tasks. For example, the above could be written as simply as this:
total_market_cap = fundamental_df['market_cap'].sum()
Now let’s say you had another column, pe_ratio, and you wanted to add up the market caps of all companies that have a PE ratio of over 15. That could be written like this:
high_pe = fundamental_df['pe_ratio'] > 15
high_pe_market_cap = fundamental_df[high_pe]['market_cap'].sum()
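If you’d like to try those two snippets end to end, here is a small self-contained sketch. The three tickers and all of their figures are made up purely for illustration; only the column names market_cap and pe_ratio come from the example above.

```python
import pandas as pd

# Hypothetical fundamentals; the tickers and figures are invented for illustration.
fundamental_df = pd.DataFrame.from_dict(
    {
        "AAA": {"market_cap": 100.0, "pe_ratio": 8.0},
        "BBB": {"market_cap": 250.0, "pe_ratio": 22.0},
        "CCC": {"market_cap": 50.0, "pe_ratio": 16.5},
    },
    orient="index",  # each dictionary key becomes a row label
)

# Sum a whole column.
total_market_cap = fundamental_df["market_cap"].sum()

# Build a boolean mask, then sum only the rows where the mask is True.
high_pe = fundamental_df["pe_ratio"] > 15
high_pe_market_cap = fundamental_df.loc[high_pe, "market_cap"].sum()

print(total_market_cap)    # 400.0
print(high_pe_market_cap)  # 300.0
```

The mask is just a column of True/False values, one per row, which is why it can be reused for any number of follow-up queries.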
While writing dataframe code can be challenging sometimes, it shouldn’t wind up being very hard to read once you understand basic syntax like that. So even if you’ve not already mastered manipulating data with pandas, you should be able to follow along as I use it in the script.
pylivetrader is an open-source library built to make connecting to brokerage APIs and submitting trade orders through them easy. It’s designed to be mostly compatible with Quantopian scripts, so if you’re a Quantopian user, it should look familiar to you. Here are some example methods you can define that pylivetrader will hit automatically when it runs your script:
def initialize(context):
    # This method will be called every time the script is run.
    # From here, we can schedule when other methods will be called.
    # context is simply an object containing useful information passed around to the different methods pylivetrader accesses.
    pass

def before_trading_start(context, data):
    # This method will be run every day before market open as long as the script is running.
    pass

def handle_data(context, data):
    # This method is called whenever there's new data coming in - once a minute, when the market's open.
    pass
All we’ll really be needing to get our script started is initialize, and here’s what it looks like for our script. (If you’d rather see the code in your own IDE, you can grab the code in full on our GitHub.)
To kick things off, we’ll be doing from pylivetrader import * to get access to all the features it provides, like order submission. We then go ahead and put the sectors that we’re interested in trading into the context object, which just means that we can access them later in other functions. For the sake of simplicity, we want our algorithm to rebalance on the first of the month every three months. schedule_function lets us tell pylivetrader that we’re going to want one of our methods run eventually, and we tell it that we want it to try to run at the first of every month. You can see Quantopian’s documentation on how to configure the timing of your functions here.
The function we’re scheduling is called try_rebalance. It’s a simple function that sees if it’s time to shake up our portfolio by checking the
The math here is pretty easy to follow: we run our algorithm’s actual logic, stored in rebalance, once every three months. Let’s take a look at the first of those functions, where we’ll be grabbing all our data and figuring out which stocks we want to buy. There’s a lot going on here, and I’ll break it down one piece at a time.
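Before the walkthrough, the every-third-month check in try_rebalance can be sketched in isolation. This is only an illustration of the counting logic, not the script’s actual code: the months_until_rebalance attribute, the REBALANCE_INTERVAL_MONTHS constant, and the stand-in rebalance body are all assumptions of mine, and a SimpleNamespace takes the place of pylivetrader’s context object.

```python
from types import SimpleNamespace

REBALANCE_INTERVAL_MONTHS = 3  # assumed constant: rebalance once every three months


def rebalance(context, data):
    # Stand-in for the real trading logic; here it only records that it ran.
    context.rebalance_count += 1


def try_rebalance(context, data):
    # Meant to be called once a month by the scheduler.
    if context.months_until_rebalance <= 0:
        rebalance(context, data)
        context.months_until_rebalance = REBALANCE_INTERVAL_MONTHS
    context.months_until_rebalance -= 1


# Simulate twelve monthly calls with a stand-in context object.
context = SimpleNamespace(months_until_rebalance=0, rebalance_count=0)
for _ in range(12):
    try_rebalance(context, data=None)

print(context.rebalance_count)  # 4
```

Starting the counter at zero means the very first scheduled call rebalances right away, and the next ones follow in months four, seven, and ten.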
Here’s where we start the real work. We’re going to be processing each sector separately, since, as I mentioned above, we want to eventually weight our investments accordingly. To do that, we’ll first need to get the data for the stocks in each sector. I’ll go over build_sector_fundamentals below, but for now, just think of it as some magic that gives us back a dataframe that has all the information we’ll need for the sector we’re in, sorted by stock. Similarly, filter_fundamental_df is a bit of magic that will remove the stocks that don’t meet all of our criteria. We’ll come back to these.
With all our information put together, we’re going to figure out how much of each stock we want to buy. We calculate a stock’s sector contribution by comparing its market cap against the total market cap of the other stocks in that sector we want to buy, and we store that in a new column on the dataframe, sector_contributions. If you add all the sector contributions for every stock in a given sector together, you’ll get 1, so context.total_sector_contributions will be 1/len(context.sectors), assuming we order at least one stock from every sector. Once that’s sorted out, we merge all our dataframes together using pd.concat so we can easily query all the data at once later.
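To make the arithmetic concrete, here is a rough sketch of the weighting on made-up data. The sector names, tickers, market caps, and the sector_contribution, sector, and weight column names are all assumptions for illustration. The final step, dividing each contribution by the number of sectors that have at least one pick, is my reading of “an equal amount of our portfolio to each sector” rather than a copy of the script.

```python
import pandas as pd

# Hypothetical, already-filtered fundamentals for two sectors.
sector_dfs = {
    "Technology": pd.DataFrame({"market_cap": [300.0, 100.0]}, index=["AAA", "BBB"]),
    "Energy": pd.DataFrame({"market_cap": [50.0, 150.0]}, index=["CCC", "DDD"]),
}

for sector, sector_df in sector_dfs.items():
    # Each stock's share of the combined market cap of the picks in its sector.
    sector_df["sector_contribution"] = (
        sector_df["market_cap"] / sector_df["market_cap"].sum()
    )
    sector_df["sector"] = sector

# Merge the per-sector frames so everything can be queried at once.
target_df = pd.concat(list(sector_dfs.values()))

# Every sector gets an equal slice of the portfolio, split by contribution.
sectors_with_picks = target_df["sector"].nunique()
target_df["weight"] = target_df["sector_contribution"] / sectors_with_picks

print(target_df[["sector", "sector_contribution", "weight"]])
```

With these numbers the four weights come out to 0.375, 0.125, 0.125, and 0.375, so the whole portfolio is allocated and each sector holds exactly half of it.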
Let’s put off the data filtering a little longer and look at how we compose our orders once we’ve got enough information to choose and weight our positions. We’ll do this in the rebalance function.
We might have some existing positions that we no longer wish to keep. Maybe a company experienced some great growth and its PE ratio rose over 9.0, and it’s time to take our profit. We see which positions we have in our portfolio by looking through context.portfolio.positions, which is created for us by pylivetrader. For those we don’t want around anymore, we call order_target_percent. This is a method pylivetrader provides us, and it will set our portfolio’s total investment in a given security to the percent we tell it to. The percent is given as a decimal fraction, so 0.1 means 10%. So, if we have $50,000 and no AAPL stock, and we say order_target_percent(symbol('AAPL'), 0.1), we’ll wind up buying roughly $5,000 of AAPL. When we tell it to order “zero percent” of something, we’re telling it to liquidate our existing position.
Once our assets are freed up, we look through the stocks we actually do want to buy, and we submit buy orders for each, again using order_target_percent. This time, we tell it to determine the percentage of our portfolio to invest based on get_weight. As discussed above, this is going to go through and create a weight based on the stock’s size within its own sector. With that, pylivetrader’s job is done: the stocks are bought and sold every few months, and you get to watch the script’s performance in real time on Alpaca’s dashboard.
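The sell-then-buy ordering described above can be sketched without a brokerage connection. In this illustration, plan_orders is a hypothetical helper of mine that only returns the targets the script would pass along; the holdings and weights are invented. Targets are decimal fractions of portfolio value, which is the convention order_target_percent uses.

```python
def plan_orders(current_positions, target_weights):
    """Return (symbol, target_fraction) pairs: liquidations first, then buys.

    Targets are fractions of portfolio value, so 0.1 means 10%.
    """
    orders = []
    # Liquidate anything we hold that is no longer on the target list.
    for symbol in current_positions:
        if symbol not in target_weights:
            orders.append((symbol, 0.0))
    # Then set every wanted position to its target weight.
    for symbol, weight in target_weights.items():
        orders.append((symbol, weight))
    return orders


held = ["AAA", "ZZZ"]                    # hypothetical current holdings
targets = {"AAA": 0.375, "BBB": 0.625}   # hypothetical target weights

for symbol, fraction in plan_orders(held, targets):
    # In the live script, this is where order_target_percent would be called.
    print(symbol, fraction)
```

Putting the liquidations first matters because the buy orders are sized against the portfolio’s total value, and the cash from the sales is what funds them.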
Of course, I still haven’t gone over the “magic” I mentioned earlier. Let’s take a look at how those methods filter all the stocks out there down to just the few we want.
Broadly, this code accesses an IEX endpoint that provides us the list of stocks in a given sector, then breaks those lists down into chunks of 100, as IEX won’t allow you to ask for information on more stocks than that in a single query. We have to extend iexfinance a bit with the SectorCollection class to access that endpoint, as that library is more focused on the Stocks endpoint, but it’s not a huge chore. After that, we hit the IEX endpoints we need (financials, quote, stats, and earnings) for each batch of stocks, and we wrap it all up in a dictionary. At the end, we transform that dictionary into a dataframe using the pd.DataFrame.from_dict method, which helpfully saves us the hassle of having to declare the dataframe format ourselves.
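The batching and the dictionary-to-dataframe step look roughly like the sketch below. To keep it runnable offline, fetch_batch is a hypothetical stand-in that returns fake numbers instead of calling IEX, and the tickers are made up; chunk and BATCH_SIZE are likewise names I chose for the example.

```python
import pandas as pd

BATCH_SIZE = 100  # the per-request symbol limit mentioned above


def chunk(symbols, size=BATCH_SIZE):
    """Split a list of symbols into consecutive batches of at most `size`."""
    return [symbols[i:i + size] for i in range(0, len(symbols), size)]


def fetch_batch(batch):
    # Hypothetical stand-in for the IEX calls: returns fake data per symbol.
    return {symbol: {"market_cap": 1.0, "pe_ratio": 8.0} for symbol in batch}


symbols = [f"SYM{i}" for i in range(250)]  # made-up tickers

fundamentals = {}
for batch in chunk(symbols):
    fundamentals.update(fetch_batch(batch))

# orient="index" makes each dictionary key (a symbol) a row label.
fundamental_df = pd.DataFrame.from_dict(fundamentals, orient="index")

print(len(chunk(symbols)), fundamental_df.shape)  # 3 (250, 2)
```

The column names are picked up from the inner dictionaries, which is the “no need to declare the format” convenience mentioned above.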
We’re almost done, but first, let’s take a look at this method’s helper functions that validate the data.
This block of code is a little lengthy but pretty straightforward: we make sure that all our data is present, save it in a dictionary, and give it back to the caller. I will note that eps_good is doing some work that could be done as part of the dataframe filtering, but I felt that it was cleaner to just handle the earnings information separately rather than add a column for every quarter to the dataframe. We make sure that a company hasn’t had any negative earnings per share reports in the last year, and if it clears that check, we move forward with our analysis of it. (If the validation methods return False, it discards the stock and doesn’t even add it to the dataframe we’ll filter.)
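As a rough idea of what a check like eps_good involves, here is a minimal sketch. It assumes the earnings data arrives as a list of per-quarter dictionaries, most recent first, with the reported figure under an actualEPS key. Both the shape and the key name are assumptions on my part, so adjust them to whatever your data source actually returns.

```python
def eps_good(earnings_reports):
    """Return True only if the last four quarterly reports exist and none is negative."""
    if len(earnings_reports) < 4:
        return False  # not enough history to judge the last year
    for report in earnings_reports[:4]:
        eps = report.get("actualEPS")  # assumed field name
        if eps is None or eps < 0:
            return False  # missing data or a losing quarter disqualifies the stock
    return True


# Invented figures for two hypothetical companies.
steady = [{"actualEPS": 0.42}, {"actualEPS": 0.10}, {"actualEPS": 0.35}, {"actualEPS": 1.30}]
shaky = [{"actualEPS": 0.42}, {"actualEPS": -0.05}, {"actualEPS": 0.30}, {"actualEPS": 1.30}]

print(eps_good(steady), eps_good(shaky))  # True False
```

Treating missing data as a failure keeps incomplete records out of the dataframe, which matches the discard-on-False behavior described above.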
Finally, let’s bring it back around, and we’ll encode the rules I discussed earlier in the form of a dataframe filter.
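A minimal sketch of such a filter, using the thresholds from the list of criteria earlier, might look like this. The column names (total_debt, current_assets, current_debt, pe_ratio, pb_ratio, dividend_yield) are hypothetical, the dividend yield is assumed to be expressed in percent, and the two example companies are invented.

```python
import pandas as pd


def filter_fundamental_df(fundamental_df):
    """Keep only the rows that pass every threshold from the criteria list."""
    passes = (
        # Low debt load: total debt to current assets under 1.1.
        (fundamental_df["total_debt"] / fundamental_df["current_assets"] < 1.1)
        # Current ratio: current assets to current debt over 1.5.
        & (fundamental_df["current_assets"] / fundamental_df["current_debt"] > 1.5)
        # Bargain valuation: PE under 9.0 and PB under 1.2.
        & (fundamental_df["pe_ratio"] < 9.0)
        & (fundamental_df["pb_ratio"] < 1.2)
        # Dividend yield over 1% (assumed to be stored in percent).
        & (fundamental_df["dividend_yield"] > 1.0)
    )
    return fundamental_df[passes]


example_df = pd.DataFrame.from_dict(
    {
        "AAA": {"total_debt": 50.0, "current_assets": 100.0, "current_debt": 40.0,
                "pe_ratio": 7.5, "pb_ratio": 1.0, "dividend_yield": 2.5},
        "BBB": {"total_debt": 50.0, "current_assets": 100.0, "current_debt": 40.0,
                "pe_ratio": 30.0, "pb_ratio": 1.0, "dividend_yield": 2.5},
    },
    orient="index",
)

print(list(filter_fundamental_df(example_df).index))  # ['AAA']
```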
This returns a filtered dataframe containing only the rows for stocks that meet the criteria. And there you have it! Put all those methods together, and you’re ready to go. Again, you can find a full copy of the script on our GitHub here if you don’t want to piece it together yourself.
Running Our Script
Of course, in order to run the script and actually trade, you’ll need an account with Alpaca. Alpaca provides a commission-free brokerage platform specifically for investors who want to execute their trades through an API. Once you’ve registered your account with them, you’ll be given access to a paper trading API key. (In the paper trading environment, you get to let your algorithm play around with however much fake money you want to give yourself.) When you’ve got that, you’re ready to run pylivetrader. Follow the instructions on its GitHub readme to give it your API keys, and once it’s set up, run it in your terminal like so:
pylivetrader run GrahamFundamentals.py
Feel free to give the algorithm a try in the paper trade environment, and if you’re feeling adventurous, tweak some aspects of it so that they’re more to your liking and see if it performs any better. Once you’ve funded your Alpaca account, you’ll be able to write scripts that invest with your real account balance using a separate set of API keys.
I hope this has helped to give you some idea of how to put your programming skills to work in the stock market. Best of luck, and happy investing!