Hands on with Google Cloud Platform, TensorFlow & BigQuery — early thoughts
I spent two days at Google’s Sydney office this week getting hands-on experience with the Google Cloud Platform (GCP) tool set.
I was given the opportunity to use GCP tools to query billion-row data sets, set up virtual servers and create machine learning models.
We were encouraged to try … test … trial GCP to our hearts’ content.
At the end of the two days, I walked away impressed and excited. I think Google really has something here.
It’s much cleaner and easier to use than its predecessor, App Engine. App Engine didn’t work that well, if we’re being honest, and Google will be the first to tell you that.
The Google Cloud Platform has clearly been rethought and redeveloped — complete with data analysis, compute power, machine learning, storage, data processing and more.
Here are a few of my early thoughts on the GCP platform:
- The first thing that struck me was that GCP feels truly elastic — elastic computing, elastic data analysis, elastic machine learning and elastic data processing. It has been built so that you get the resources you need for only the seconds you need them. For example, we created three virtual machines to run a machine learning library. After using them for a couple of minutes and getting our result, we shut them down again. They ran for about 175 seconds in total, and Google told us we’d only be billed for those seconds.
- GCP makes machine learning easy for data scientists — I was able to create and run machine learning scripts in minutes. This was a really exciting thing to watch. It points to a future where data scientists spend less time on mechanics and more time on the interpretation and business application of insights.
- BigQuery handled big data processing with ease. I was able to analyze 20 billion rows of Wikipedia data in about 17 seconds. Yep — amazing. It took me longer to write the single line of SQL than it took BigQuery to run it, which speaks for itself.
- Ease of use was clearly a focus for GCP’s builders — GCP felt miles ahead in this area, although a few improvements will be needed (see below). For the most part it was easy to use with a guide sheet in hand. The interface was clean, projects were easy to organize and navigation through the GCP tool set was straightforward.
- While the interface was easy to use, making certain tools work end-to-end required jumping around in a clunky way. In one example, a data processing task required us to create several virtual machines, switch screens, manually copy each machine’s IP address, paste them into a different screen one by one, go back to the original screen to approve the new servers and finally open a command line to add the new machines to a different process area.
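For a sense of scale on the BigQuery numbers above, 20 billion rows in 17 seconds works out to over a billion rows scanned per second. A back-of-envelope sketch — the query shown is a hypothetical example of the kind of one-liner we ran, and the exact table name and columns are assumptions, not the query from the session:

```python
# Hypothetical example of the kind of one-line BigQuery SQL we ran
# against a public Wikipedia dataset (table and column names assumed).
QUERY = """
SELECT language, COUNT(*) AS row_count
FROM `bigquery-public-data.samples.wikipedia`
GROUP BY language
"""

# Back-of-envelope throughput from the numbers reported in the console.
rows = 20_000_000_000      # ~20 billion rows scanned
seconds = 17               # wall-clock query time
throughput = rows / seconds
print(f"{throughput:,.0f} rows/sec")  # roughly 1.18 billion rows per second
```

That throughput figure is why the query felt instantaneous: BigQuery fans the scan out across many workers, so the wall-clock time barely grows with the row count.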
Overall, Google has created a great platform in GCP, and it’s ahead of the field.
While using the GCP tool set, I kept thinking about how I might perform the same tasks with the IBM BlueMix/Watson suite as well as Amazon’s platform.
The difference came down to ease of use and elasticity. GCP felt well ahead of both IBM and Amazon when it came to creating machines, conducting data analysis and running machine learning scripts. In terms of elasticity, GCP lets you run virtual machines for seconds at a time, which is much more elastic than Amazon’s model of paying for machines by the hour.
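The billing difference is easy to quantify. A minimal sketch of per-second versus hourly billing for our 175-second machines — the hourly rate below is a placeholder for illustration, not an actual GCP or AWS price:

```python
import math

# Placeholder rate for one VM; not a real quote from either provider.
PRICE_PER_HOUR = 0.10
runtime_seconds = 175          # how long our machine learning VMs actually ran

# Per-second billing: pay only for the seconds used.
per_second_cost = PRICE_PER_HOUR / 3600 * runtime_seconds

# Hourly billing: a partial hour rounds up to a full hour.
per_hour_cost = PRICE_PER_HOUR * math.ceil(runtime_seconds / 3600)

print(f"per-second billing: ${per_second_cost:.4f}")  # about half a cent
print(f"hourly billing:     ${per_hour_cost:.2f}")    # a full hour's charge
```

At these assumed rates the hourly model charges roughly twenty times more for the same 175 seconds of work, and the gap widens the shorter the job.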
Another attendee at the workshop, also comparing tools, summed it up perfectly when he said “The other tools have a high barrier to use, with this I can just make it work.”
His words echoed the same sentiment I took away from my time in Sydney. GCP was built to ‘just work’. And while there were improvements needed for various processes, Google has put out a great product and platform.
I’m excited to start throwing real data into GCP and trialling the platform in the wild. Look out for more from me as I try BigQuery and TensorFlow on real-world projects in the coming weeks.