Extracting Fundamental Stock Data from EDGAR using our favorite language: Common Lisp (Part 1)

Muro
4 min read · Mar 1, 2024

Lately, I’ve been getting into fundamental analysis in an attempt to pick better stocks for my portfolio. I’m really drawn to the idea of value investing. I found some websites that offer fundamental data for a monthly fee (somewhere between 40 and 100 USD). But honestly, I don’t want to pay for something when I know I can get it for free and, more importantly, legally. So I’ve been looking for ways to get this info without spending money. Remember, a dollar saved is a dollar earned.

Fortunately, the SEC provides free access to fundamental data through EDGAR. Companies file their financial statements there in a structured format known as XBRL, and the SEC serves that data as JSON, which makes EDGAR a go-to resource for anyone looking to dive into financial details without the extra cost.

Alright, enough talk, let’s get down to business. Time to boot up Sly (or Slime) and hop into our REPL.

We’re going to lean on Dexador for fetching the data with GET requests and cl-json for handling the JSON data. So, let’s kick things off by pulling in these two libraries from Quicklisp:

(ql:quickload '(:dexador :cl-json))

When using EDGAR, there’s a simple rule: we can make up to 10 requests per second. This keeps things fair so everyone can access the info they need without overloading the system. We also need to tell EDGAR who we are by sending a User-Agent request header that contains our email address. So, let’s define a variable for our request headers:

(defparameter *sec-headers*
  '(("User-Agent" . "example@email.com")))
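The 10 requests per second limit is easy to exceed once we start looping over tickers, so here is a minimal throttle sketch. The names `*min-interval*`, `*last-request-time*`, `seconds-since` and `throttle` are my own and are not part of Dexador or EDGAR; it assumes a single thread and simply sleeps between calls.

```lisp
;; Hypothetical helper: keeps consecutive calls at least 1/10 s apart.
(defparameter *min-interval* 1/10
  "Minimum number of seconds between two requests (10 requests per second).")

(defvar *last-request-time* nil
  "GET-INTERNAL-REAL-TIME at the previous request, or NIL before the first one.")

(defun seconds-since (start)
  "Seconds elapsed since START, an earlier GET-INTERNAL-REAL-TIME value."
  (/ (- (get-internal-real-time) start)
     internal-time-units-per-second))

(defun throttle ()
  "Sleep just long enough that calls are at least *MIN-INTERVAL* seconds apart."
  (when *last-request-time*
    (let ((elapsed (seconds-since *last-request-time*)))
      (when (< elapsed *min-interval*)
        (sleep (- *min-interval* elapsed)))))
  (setf *last-request-time* (get-internal-real-time)))
```

Calling `(throttle)` right before each `dex:get` should be enough for a script like ours; anything multi-threaded would need a lock around it.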

Peeking into the EDGAR API documentation (https://www.sec.gov/edgar/sec-api-documentation), it’s clear we need the CIK number of the company to tailor our requests like so:

https://data.sec.gov/submissions/CIK##########.json

And here’s a little nugget of knowledge: the CIK number must be a 10-digit affair, padded with zeros where needed.
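To make the padding rule concrete, here is a small sketch that builds the submissions URL from a numeric CIK. The function name `submissions-url` is hypothetical, and 320193 (Apple’s CIK) is just a sample value.

```lisp
;; Hypothetical helper: ~10,'0d prints an integer at least 10 characters wide,
;; padded on the left with the character 0.
(defun submissions-url (cik)
  "Return the EDGAR submissions URL for the integer CIK."
  (format nil "https://data.sec.gov/submissions/CIK~10,'0d.json" cik))

(submissions-url 320193)
;; => "https://data.sec.gov/submissions/CIK0000320193.json"
```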

But then comes the million-dollar question: how do we snag the company’s CIK number? No sweat, the SEC has our backs. It publishes a JSON file that lists the company ticker, the company name and, you guessed it, the CIK number:

https://www.sec.gov/files/company_tickers.json

Let’s create a hash table that maps each ticker to its company name and CIK number.

(defvar *company-info* (make-hash-table :test 'equal))

And now let’s make a GET request to https://www.sec.gov/files/company_tickers.json to fetch the information. The tickers become the keys of the *company-info* hash table, and each value is an association list (alist) holding the company name and the CIK number. Note that we are using our headers in the request.

(defun read-tickers ()
  "Get list of companies from EDGAR and store them in the *company-info* hash table"
  (let* ((tickers (dex:get "https://www.sec.gov/files/company_tickers.json"
                           :headers *sec-headers*))
         (response (cl-json:decode-json-from-string tickers)))
    ;; Each item is (index-key . fields), where fields is an alist for one company
    (dolist (item response)
      (let* ((ticker (cdr (assoc :TICKER (cdr item))))
             (cik (cdr (assoc :CIK--STR (cdr item))))
             (company (cdr (assoc :TITLE (cdr item))))
             (value (list (cons 'company company) (cons 'cik cik))))
        (setf (gethash ticker *company-info*) value)))))
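If the :CIK--STR keyword looks odd, it comes from how cl-json turns JSON keys into Lisp keywords: with its default settings the underscore in cik_str becomes a double hyphen. The sketch below decodes a one-entry sample I typed by hand in the same shape as the real file (the variable names are hypothetical), so we can poke at the structure without hitting the network.

```lisp
(ql:quickload :cl-json)

;; Hand-written sample in the shape of company_tickers.json (one entry only).
(defparameter *sample-json*
  "{\"0\":{\"cik_str\":320193,\"ticker\":\"AAPL\",\"title\":\"Apple Inc.\"}}")

(defparameter *sample-decoded*
  (cl-json:decode-json-from-string *sample-json*))

;; The outer object becomes an alist with one (key . fields) pair per company,
;; which is why read-tickers takes the CDR of each item before calling ASSOC.
(let ((fields (cdr (first *sample-decoded*))))
  (list (cdr (assoc :ticker fields))
        (cdr (assoc :cik--str fields))
        (cdr (assoc :title fields))))
;; => ("AAPL" 320193 "Apple Inc.")
```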

Now let’s populate our hash table by running our read-tickers function:

;;; Populate hash table
(read-tickers)

So now, if we inspect our *company-info* hash-table we can see that it has been filled with company data:
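Besides the Sly inspector, a couple of standard functions are enough for a quick look from the REPL. The sketch below uses a stand-in table, `*demo-info*` (a hypothetical name), with two hand-typed entries in the shape read-tickers builds, so it runs without a network call; on the real table the count will be in the thousands.

```lisp
;; Stand-in table with two hand-typed entries (not fetched from EDGAR).
(defparameter *demo-info* (make-hash-table :test 'equal))

(setf (gethash "AAPL" *demo-info*) '((company . "Apple Inc.") (cik . 320193))
      (gethash "MSFT" *demo-info*) '((company . "MICROSOFT CORP") (cik . 789019)))

(hash-table-count *demo-info*)   ; => 2
(gethash "AAPL" *demo-info*)     ; => ((COMPANY . "Apple Inc.") (CIK . 320193)), T

;; Print every entry; on the real table you would probably stop after a few.
(maphash (lambda (ticker info)
           (format t "~a -> ~s~%" ticker info))
         *demo-info*)
```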

That is great, but remember, the CIK number must be 10 digits long, padded with zeros if shorter, so we need a function to pad the CIK numbers with zeros:

(defun pad-cik (cik)
  "Pad CIK number so it is always 10 digits long"
  (format nil "~10,'0d" cik))

Now let’s write a utility function to get the padded CIK number for a given ticker:

(defun get-stock-cik (ticker)
  "Get CIK number for a given ticker"
  (let ((data (gethash ticker *company-info*)))
    (pad-cik (cdr (assoc 'cik data)))))

Let’s test our function:
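A quick REPL check could look like the sketch below. It repeats the two definitions in compact form and fills in a single hand-typed entry for Apple (CIK 320193), so it runs on its own without calling EDGAR; with the real table loaded, only the last line is needed.

```lisp
(defvar *company-info* (make-hash-table :test 'equal))

;; One hand-typed entry so the example runs without a network call.
(setf (gethash "AAPL" *company-info*)
      '((company . "Apple Inc.") (cik . 320193)))

(defun pad-cik (cik)
  (format nil "~10,'0d" cik))

(defun get-stock-cik (ticker)
  (pad-cik (cdr (assoc 'cik (gethash ticker *company-info*)))))

(get-stock-cik "AAPL")   ; => "0000320193"
```

One thing to keep in mind: for a ticker that is not in the table, the lookup yields NIL and the function still returns a string instead of signalling an error, so checking the gethash result first may be worth it.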

Since we’ve already tucked away the company name in our hash table, how about we write a function to retrieve the company name using its ticker symbol?

(defun get-company-name (ticker)
  "Get company name for a given ticker"
  (let ((data (gethash ticker *company-info*)))
    (cdr (assoc 'company data))))

Let’s test it:
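Here is a sketch of that check, again with one hand-typed entry so it runs without the network. It also shows a small gotcha: the table was created with :test 'equal, which compares strings case-sensitively, so a lowercase ticker finds nothing unless we upcase it first.

```lisp
(defvar *company-info* (make-hash-table :test 'equal))

;; One hand-typed entry standing in for the data fetched by read-tickers.
(setf (gethash "AAPL" *company-info*)
      '((company . "Apple Inc.") (cik . 320193)))

(defun get-company-name (ticker)
  (cdr (assoc 'company (gethash ticker *company-info*))))

(get-company-name "AAPL")                  ; => "Apple Inc."
(get-company-name "aapl")                  ; => NIL (key lookup is case-sensitive)
(get-company-name (string-upcase "aapl"))  ; => "Apple Inc."
```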

Okay, I think we are ready to start pulling data from EDGAR. But let’s save that adventure for our next post. See you next time!

PS. I get it, I get it, Python could handle this, but let’s be honest, tackling it in Common Lisp just adds an extra layer of fun!
