Reverse engineering Google Finance charts
In this article, we will talk about how to reverse engineer Google Finance charts and parse them using Ruby on Rails.
When you search Google for something like "Bitcoin price" or "bitcoin vs dollar", you will notice a chart and very rich financial data. The original source of this data is Google Finance.
What is Google Finance?
Google Finance is a website focusing on business news and financial information hosted by Google.
We will ignore all the other data and focus on parsing the chart only, as the extraction of the other elements has been covered by other SerpApi blog posts.
Basically, every chart or graph consists of two important parts (x-axis and y-axis).
The x-axis is a horizontal line and the y-axis is a vertical line.
Now we just need to understand the numbers in the Google Finance chart. The y-axis represents the price, and the x-axis represents the time.
In the screenshot above, for example, the price is 56,854.90 at 8:05.
Now we will find the chart CSS class:
In this example, we will take the attribute jsdata, but we should note that the value of this attribute changes with every search.
So, using a regex, we will extract the last element inside jsdata="Wplt6c;_;AWRM64", which means the element we want is the part after the final semicolon.
1- This is the regex that we used to search the page source for the chart data.
2- This is the raw HTML page that the regex searches.
3- This is the result, which is the captured group containing the chart JSON data.
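The jsdata step described above can be sketched in plain Ruby. This is a minimal sketch, not the exact regex from the screenshot: the HTML fragment, the pattern, and the helper name extract_jsdata_id are all assumptions made for illustration, and real Google markup is much larger and changes often.

```ruby
# Hypothetical helper: returns the last ";"-separated element of the
# first jsdata attribute found in the page source, or nil if none matches.
def extract_jsdata_id(page_source)
  # [^"]* is greedy, so the captured group is whatever follows the
  # final semicolon inside the attribute value.
  pattern = /jsdata="[^"]*;([^";]+)"/
  match = page_source.match(pattern)
  match && match[1]
end

# Made-up fragment that reuses the attribute value from the article.
html = '<div class="chart" jsdata="Wplt6c;_;AWRM64"></div>'
puts extract_jsdata_id(html)
```

Because the attribute value changes with every search, the pattern matches on its shape rather than on a fixed string.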
After formatting the JSON, we need to look carefully at what is inside it.
1- This number represents the price.
2- This one represents the time.
Putting everything together
Now we have to use the dig method to extract the JSON data we need, which consists of the x-axis and y-axis arrays (time and price).
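As a rough illustration of dig, the sketch below parses a small JSON string and pulls out the time and price arrays. The payload shape and the key names ("chart", "series", "points") are invented for this example and do not reflect the real Google Finance structure; only the use of dig itself is the point.

```ruby
require "json"

# Invented payload: each point is [time_in_minutes, price].
raw = '{"chart": {"series": [{"points": [[27094085, 56854.9], [27094090, 56860.1]]}]}}'
parsed = JSON.parse(raw)

# dig walks nested hashes and arrays and returns nil when a key or
# index is missing, instead of raising NoMethodError on nil.
points = parsed.dig("chart", "series", 0, "points") || []

times  = points.map { |time, _price| time }   # x-axis values
prices = points.map { |_time, price| price }  # y-axis values

puts times.inspect
puts prices.inspect
```

Falling back to an empty array keeps the later mapping steps safe when the expected path is absent.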
Then we will convert the time from minutes to seconds, because Unix time is counted in seconds. We will use this formula to get the Unix time:
unix_time = time * 60
The last thing to do is convert unix_time to a formatted DateTime (UTC, Y-M-D H:M):
data[:time] = Time.at(time * 60).utc.strftime("%Y-%m-%d %H:%M %p")
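To see the conversion end to end, here is a small sketch that applies the same formula and format string to a list of points. The helper name minutes_to_utc and the sample timestamps are assumptions chosen for illustration; the price 56,854.90 is reused from the example earlier in the article.

```ruby
# Hypothetical helper: converts minutes since the Unix epoch into a
# formatted UTC string, using the same format string as the article.
def minutes_to_utc(minutes)
  unix_time = minutes * 60 # minutes -> seconds
  Time.at(unix_time).utc.strftime("%Y-%m-%d %H:%M %p")
end

# Made-up sample points: [time_in_minutes, price].
points = [[27_094_085, 56_854.90], [27_094_090, 56_860.10]]

chart = points.map do |minutes, price|
  { time: minutes_to_utc(minutes), price: price }
end

chart.each { |row| puts "#{row[:time]}  #{row[:price]}" }
```

Calling utc before strftime keeps the output independent of the server's local time zone.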
The final result: