Serverless Chrome automation with GCP

Rémy DAVID
Google Cloud - Community
4 min read · Dec 4, 2018


A Google Data Studio report from the sample gallery

Google Data Studio is an amazing free tool that lets anyone quickly build nice, easy-to-share dashboards out of pretty much any kind of data source. It has one frustrating limitation though: reports are not real-time.

The first reader to load your report gets the latest data, but that data is then cached for up to 12 hours. Even if newer data is available in the data source, subsequent readers will not see it. Only report editors can refresh the cache, by manually clicking the refresh button in the toolbar:

I don’t like you, little manual refresh button

But have you met Puppeteer? From the Chrome DevTools team, it is “a Node library which provides a high-level API to control headless Chrome”. Basically, it lets you do most things you can do manually in the browser from a Node.js app running on a headless server. What if we could just push this picky little manual refresh button from a task scheduler and get rid of the 12-hour cache limitation? Let’s do just that!
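To make the idea concrete, here is a minimal sketch of what "pushing the button" from Node.js could look like with Puppeteer. It is only an illustration under several assumptions that are not from this article: the `refreshReport` helper, the `REPORT_URL` and `REFRESH_SELECTOR` environment variables, the CSS selector for the refresh button, and the settle delay are all hypothetical. It also assumes the browser session is already signed in as a report editor, since only editors can refresh the cache.

```javascript
// Minimal sketch: open a report in headless Chrome and click its refresh button.
// Hypothetical pieces (not from the article): refreshReport, the selector,
// the settle delay, and the REPORT_URL / REFRESH_SELECTOR environment variables.
// Also assumes the session is already authenticated as a report editor.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `browser` is anything exposing Puppeteer's Browser#newPage(), which makes the
// helper easy to exercise with a stub instead of a real Chrome instance.
async function refreshReport(browser, reportUrl, refreshSelector, settleMs = 5000) {
  const page = await browser.newPage();
  try {
    // Load the report and wait until network activity has mostly stopped.
    await page.goto(reportUrl, { waitUntil: 'networkidle2' });
    // Wait for the refresh button to be present, then click it.
    await page.waitForSelector(refreshSelector);
    await page.click(refreshSelector);
    // Give the report some time to reload its data before closing the page.
    await sleep(settleMs);
  } finally {
    await page.close();
  }
}

async function main() {
  // Loaded lazily so the helper above can be used without Puppeteer installed.
  const puppeteer = require('puppeteer'); // npm install puppeteer
  const browser = await puppeteer.launch({ headless: true });
  try {
    await refreshReport(browser, process.env.REPORT_URL, process.env.REFRESH_SELECTOR);
  } finally {
    await browser.close();
  }
}

// Only launch Chrome when a report URL is actually provided.
if (process.env.REPORT_URL) {
  main().catch((err) => {
    console.error(err);
    process.exitCode = 1;
  });
}
```

Run it with something like `REPORT_URL=... REFRESH_SELECTOR=... node refresh.js`, where both values are yours to fill in. Passing the browser in as a parameter keeps the click logic separate from how and where Chrome is launched, which matters once this runs on a server rather than on your laptop.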

HCaS (Headless Chrome as a Service)


Rémy DAVID
Google Cloud - Community

Tech Lead @Veolia, working on BI, analytics and dataviz using GCP. UX lover. Helping resourcing the world through digital transformation. Views are my own.