“Hack” Robinhood.com for Any Security’s Historical Price Data

Andrew Zhu
May 15 · 5 min read

In the previous short article Track Dogecoin Real-Time Price with Python, I used Python’s requests and BeautifulSoup packages to scrape the page HTML and grab the real-time price of Dogecoin (or of any other crypto traded on Robinhood.com).

My Dogecoin holding, like the coin itself, is a joke. The main purpose was not trading but to get my hands dirty and see how I could use Python to scrape the web with a minimal number of lines of code, and it seems to work pretty well.

The next question follows: how can I get historical price information at daily or even hourly granularity for the past 5 years? There are some existing solutions that seem easy. But …

Scraping the web is easy, but there are four potential problems.

  1. The performance is not very good. To get one price number, you need to request the whole web page and parse the HTML, which is complete overkill and will cause trouble if you are working with 1,000+ securities and hope to capture prices every second.
  2. The web structure is not stable. A website may redesign its UI yearly or even monthly. A change to the HTML structure or a CSS class name will break your soup “find” patterns.
  3. Many websites implement anti-scraping mechanisms to prevent high-frequency HTTP requests, and may even block the requesting IP address.
  4. Nowadays, many websites are built with Ajax, and their content is rendered in real time from asynchronous JavaScript requests. It is not easy to grab real-time data unless you switch to a real web browser like Chrome (or a headless browser driven by a tool like Puppeteer, but that is another story).

You can also try existing Python packages that provide historical price info, such as cryptocompare. But you need to pay for unlimited usage.

There should be other ways to access historical price information. Or, just for fun.

I found that I can check any crypto or stock price on robinhood.com without signing in with a registered account. So, the website must be using an anonymous AJAX request to update the price on its page.

I opened the web page in incognito mode:

https://robinhood.com/crypto/DOGE

I pressed F12, inspected the network traffic, and found the web page repeatedly requesting data from this URL.

https://api.robinhood.com/marketdata/forex/quotes/1ef78e1b-049b-4f12-90e5-555dcf2fe204/

Hm… the price data must be sourced from this API URL, and the GUID 1ef78e1b-049b-4f12-90e5-555dcf2fe204 must represent Dogecoin. I sent a request to this URL, hoping to get some data back.

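import requests

# Quote endpoint observed in the browser's network tab
url = "https://api.robinhood.com/marketdata/forex/quotes/1ef78e1b-049b-4f12-90e5-555dcf2fe204/"
response = requests.get(url)  # plain request, no extra headers yet
print(response.text)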

Unluckily, nothing was returned. There must be something missing in the request.

Usually, a website uses a referer or a custom authorization id to make sure its APIs are called only from the original website, which protects its servers and saves cost. Since the Robinhood web page is open to anonymous access, there must be an open referer or auth id somewhere on the web page.

Back in the Chrome inspection view, I took a close look at the HTTP request headers. There are lots of properties.

Among those, three properties captured my attention immediately. I guessed that these could be the “secret” keys used by the web page API call.

referer: https://robinhood.com/
origin: https://robinhood.com
authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9....

And I can easily mimic a similar HTTP request with Python’s requests package.

import requests
url = "https://api.robinhood.com/marketdata/forex/quotes/1ef78e1b-049b-4f12-90e5-555dcf2fe204/"
auth_id = "Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJleHAiOjE2MjExMTc3MTMsInRva2VuIjoiaG5VZTZyanM1VG9VRjhJcXNPTXYwSm44YVZEYVJIIiwi...wMbv3Xl5Mfdei4pIsft_n99njvW0Use1cvKZFlPB52j5dFlEB3QG9VWASC0YDsnhuZWJLEuZ9GLAMKsDjo2k0e6KU2U7WHyxc9XWGw"
headers_obj = {
    "authorization": auth_id,
    "origin": "https://robinhood.com",
    "referer": "https://robinhood.com/",
}
s = requests.Session()
s.headers.update(headers_obj)
print(s.get(url).text)

Bingo! It works!

{"ask_price":"0.527115","bid_price":"0.524981","mark_price":"0.526048","high_price":"0.555427","low_price":"0.481254","open_price":"0.525359","symbol":"DOGEUSD","id":"1ef78e1b-049b-4f12-90e5-555dcf2fe204","volume":"0.000000"}
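Every price in this response arrives as a string. A minimal sketch of turning it into numbers follows, using the sample output above as input so it runs without any network call. The helper name parse_quote and the choice of Decimal are my own assumptions, not part of the Robinhood API.

```python
import json
from decimal import Decimal

# Sample response copied verbatim from the output above.
sample = (
    '{"ask_price":"0.527115","bid_price":"0.524981","mark_price":"0.526048",'
    '"high_price":"0.555427","low_price":"0.481254","open_price":"0.525359",'
    '"symbol":"DOGEUSD","id":"1ef78e1b-049b-4f12-90e5-555dcf2fe204",'
    '"volume":"0.000000"}'
)


def parse_quote(text):
    """Hypothetical helper: convert every *_price field to a Decimal."""
    raw = json.loads(text)
    quote = {"symbol": raw["symbol"]}
    for key, value in raw.items():
        if key.endswith("_price"):
            quote[key] = Decimal(value)
    return quote


quote = parse_quote(sample)
# Difference between the ask and bid prices
spread = quote["ask_price"] - quote["bid_price"]
print(quote["symbol"], quote["mark_price"], spread)
```

Decimal keeps the six decimal places exactly as the API sent them; float would also work if small rounding differences are acceptable.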

Note that you need to replace the auth_id with your own if you want to give the above code a try.

Soon, I found that the result returned from the above code is the real-time trading price, not historical price info.

I continued poking around the page, clicked the weekly and monthly trend views, and watched what changed in the inspection view.

Soon, I found that some other URLs were being requested.

https://api.robinhood.com/marketdata/forex/historicals/1ef78e1b-049b-4f12-90e5-555dcf2fe204/?bounds=24_7&interval=hour&span=week

From the URL, we can see there are query strings like “bounds”, “interval” and “span”. This is obviously the API used by the web page! Without waiting a second, I updated the above Python code a bit and sent an HTTP request to this URL with the customized header options.

(I changed “hour” to “day”.)

import requests
url = "https://api.robinhood.com/marketdata/forex/historicals/1ef78e1b-049b-4f12-90e5-555dcf2fe204/?bounds=24_7&interval=day&span=week"
auth_id = "Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9...fLgydIhYcyuo70HphiQYR2R6P7O4S5pdIPGZ9DnlF9BuuK_U2z8dEoS4sgvSkpG-MTz86tgRYNeTTD5wZqaXLvrrhXGHjFgQaArZjA"
headers_obj = {
    "authorization": auth_id,
    "origin": "https://robinhood.com",
    "referer": "https://robinhood.com/",
}
s = requests.Session()
s.headers.update(headers_obj)
print(s.get(url).text)

Wow, it works. It returned the full set of daily Dogecoin prices for the previous week.

{"data_points":[{"begins_at":"2021-05-08T00:00:00Z","open_price":"0.692208","close_price":"0.639380","high_price":"0.743600","low_price":"0.602447","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-09T00:00:00Z","open_price":"0.639009","close_price":"0.572042","high_price":"0.716530","low_price":"0.383239","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-10T00:00:00Z","open_price":"0.572042","close_price":"0.450941","high_price":"0.574535","low_price":"0.415286","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-11T00:00:00Z","open_price":"0.450941","close_price":"0.494802","high_price":"0.550857","low_price":"0.442105","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-12T00:00:00Z","open_price":"0.495290","close_price":"0.392290","high_price":"0.526713","low_price":"0.385556","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-13T00:00:00Z","open_price":"0.392290","close_price":"0.490564","high_price":"0.522086","low_price":"0.352147","volume":0,"session":"reg","interpolated":false},{"begins_at":"2021-05-14T00:00:00Z","open_price":"0.490724","close_price":"0.560759","high_price":"0.595950","low_price":"0.463283","volume":0,"session":"reg","interpolated":false}],"bounds":"24_7","interval":"day","span":"week","symbol":"DOGEUSD","id":"1ef78e1b-049b-4f12-90e5-555dcf2fe204","open_price":null,"open_time":null,"previous_close_price":null,"previous_close_time":null}
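The raw JSON is hard to read on one line. Here is a rough sketch of flattening "data_points" into plain Python rows, using two of the seven points from the output above as sample input. The function name to_rows and the row layout are my own choices for illustration.

```python
import json
from datetime import datetime
from decimal import Decimal

# Two data points copied from the response above; the real payload has seven.
sample = """{"data_points":[
 {"begins_at":"2021-05-08T00:00:00Z","open_price":"0.692208","close_price":"0.639380",
  "high_price":"0.743600","low_price":"0.602447","volume":0,"session":"reg","interpolated":false},
 {"begins_at":"2021-05-09T00:00:00Z","open_price":"0.639009","close_price":"0.572042",
  "high_price":"0.716530","low_price":"0.383239","volume":0,"session":"reg","interpolated":false}
],"symbol":"DOGEUSD"}"""


def to_rows(text):
    """Hypothetical helper: one dict per data point, with typed values."""
    payload = json.loads(text)
    rows = []
    for point in payload["data_points"]:
        rows.append({
            "symbol": payload["symbol"],
            # Timestamps in the sample end with a literal "Z" (UTC)
            "begins_at": datetime.strptime(point["begins_at"], "%Y-%m-%dT%H:%M:%SZ"),
            "open": Decimal(point["open_price"]),
            "close": Decimal(point["close_price"]),
            "high": Decimal(point["high_price"]),
            "low": Decimal(point["low_price"]),
        })
    return rows


rows = to_rows(sample)
for row in rows:
    print(row["begins_at"].date(), row["close"])
```

From here the rows can be handed to csv.DictWriter or a DataFrame, whichever you prefer.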

I then tested other query string options and found that you can change the parameters to get any combination of data.

interval: hour, day, week
span: week, month, year, 5year
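Rather than editing the URL by hand for each combination, the query string can be built from these parameters. This is only a sketch: the function name historicals_url is made up, and the allowed values are simply the ones listed above, not an official list from Robinhood.

```python
from urllib.parse import urlencode

BASE = "https://api.robinhood.com/marketdata/forex/historicals/"
DOGE_ID = "1ef78e1b-049b-4f12-90e5-555dcf2fe204"

# Values observed while testing; other values may or may not be accepted.
INTERVALS = {"hour", "day", "week"}
SPANS = {"week", "month", "year", "5year"}


def historicals_url(instrument_id, interval, span, bounds="24_7"):
    """Hypothetical helper: assemble the historicals URL for one instrument."""
    if interval not in INTERVALS:
        raise ValueError(f"unexpected interval: {interval!r}")
    if span not in SPANS:
        raise ValueError(f"unexpected span: {span!r}")
    query = urlencode({"bounds": bounds, "interval": interval, "span": span})
    return f"{BASE}{instrument_id}/?{query}"


print(historicals_url(DOGE_ID, "hour", "week"))
```

The printed URL is the same one the browser requested for the weekly view.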

Please note that the auth id will expire in about 30 minutes, so try not to build a program that continually calls the API to update your data. Even so, if you use the “hour” + “5year” combination, you get hourly price data for the past 5 years in just one API call for any stock or crypto.

Next, you can leverage the web scraping solution I used in Track Dogecoin Real-Time Price with Python to update the price in real time, with no auth id and no timeout problem.

With an understanding of how the Robinhood.com API calls work, you can build a database to hold security price data, and a stock/crypto price crawler to keep your own price database up to date with the market.

Maybe you can build your own price analyzer with your own model. Potentially earn a fortune, if you don't lose everything :D

I did not actually “hack” any Robinhood.com account or do anything illegal. I am only sharing my practice and thought process as I tried to uncover the hidden Robinhood APIs to get historical security prices programmatically. I hope this is helpful to anyone who wants to grab data from the public web with their own hands in Python.
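The auth id shown earlier starts with eyJ0eXAiOiJKV1Qi…, which looks like a JSON Web Token, and the visible start of its payload appears to carry an "exp" field. Assuming that holds for your token too, the sketch below reads the expiry time without calling the API. It uses a made-up, unsigned demo token, and the helper names are hypothetical.

```python
import base64
import json
import time


def jwt_expiry(bearer_value):
    """Return the `exp` claim (Unix seconds) from a 'Bearer <jwt>' string.

    The signature is NOT verified; this only reads the payload segment.
    """
    token = bearer_value.split(" ", 1)[-1]
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore the padding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def make_demo_token(exp):
    """Build an unsigned, made-up token purely for this demonstration."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return "Bearer " + ".".join([encode({"typ": "JWT"}), encode({"exp": exp}), "sig"])


demo = make_demo_token(1621117713)
exp = jwt_expiry(demo)
print("expires at", exp, "| seconds left:", exp - int(time.time()))
```

A negative "seconds left" means the token has expired and a fresh one has to be copied from the browser.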
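As a starting point for such a database, here is a small sketch using Python's built-in sqlite3 module. The table name, the column layout, and the upsert helper are my own assumptions, and the two rows are copied from the sample output earlier in the article.

```python
import sqlite3

# (symbol, begins_at, open_price, close_price) taken from the sample output.
points = [
    ("DOGEUSD", "2021-05-08T00:00:00Z", "0.692208", "0.639380"),
    ("DOGEUSD", "2021-05-09T00:00:00Z", "0.639009", "0.572042"),
]

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute(
    """CREATE TABLE IF NOT EXISTS prices (
           symbol      TEXT NOT NULL,
           begins_at   TEXT NOT NULL,
           open_price  TEXT NOT NULL,
           close_price TEXT NOT NULL,
           PRIMARY KEY (symbol, begins_at)
       )"""
)


def upsert(conn, rows):
    """Hypothetical helper: insert rows, overwriting any with the same key."""
    conn.executemany("INSERT OR REPLACE INTO prices VALUES (?, ?, ?, ?)", rows)
    conn.commit()


upsert(conn, points)
upsert(conn, points)  # running the crawler twice does not duplicate rows
count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(count)
```

The primary key on (symbol, begins_at) is what lets a crawler rerun safely over a time range it has already fetched.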

CodeX

Everything connected with Tech & Code
