How Salesforce and Google Cloud Platform are Simplifying Big Data

By Julien Sauvage

To see a demo of Google Cloud Dataflow and Wave Analytics in action, check out this video.

Earlier in May, we announced Salesforce Wave Analytics for Big Data. Leading innovators joined our Analytics Cloud ecosystem to extend the use cases for big data to the business user. Among those companies was Google, which is announcing the general availability of Google Cloud Dataflow, a fully managed data processing service on Google Cloud Platform. We are excited to be part of this major announcement.

But what does that mean exactly from a technology point of view? And why does it matter to the business user?

You’ve heard it many times: Big Data is awesome and can drive amazing results for many business functions, from streamlining operations to influencing product development. Yet a traditional data architecture for processing big data can still be very complex and require multiple technology layers:

  • a data storage and management layer;
  • a data processing layer (sometimes separate);
  • an ETL and data integration layer that extracts, transforms and loads this data;
  • a data quality layer that deals with missing values, abnormal data values and duplicates;
  • and a data preparation layer that aggregates and/or enriches that data.

Simply put, it can be a very fragmented and complex architecture that’s hard to set up and hard to maintain.
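To make the data quality and data preparation layers listed above more concrete, here is a minimal sketch in plain Python. It is only an illustration: the event records, the field names (`event_id`, `region`, `clicks`) and the cleaning rules are assumptions made up for this example, and it does not use Cloud Dataflow or Wave.

```python
from collections import defaultdict

# Hypothetical raw ad events; the field names are illustrative only.
RAW_EVENTS = [
    {"event_id": "e1", "region": "EMEA", "clicks": 3},
    {"event_id": "e1", "region": "EMEA", "clicks": 3},   # duplicate record
    {"event_id": "e2", "region": None, "clicks": 5},     # missing value
    {"event_id": "e3", "region": "APAC", "clicks": -4},  # abnormal value
    {"event_id": "e4", "region": "APAC", "clicks": 7},
    {"event_id": "e5", "region": "EMEA", "clicks": 2},
]


def clean(events):
    """Data quality step: skip duplicates, missing values and abnormal values."""
    seen_ids = set()
    for event in events:
        if event["event_id"] in seen_ids:
            continue  # already processed this event id
        seen_ids.add(event["event_id"])
        if event["region"] is None or event["clicks"] < 0:
            continue  # assumed rule: drop incomplete or negative records
        yield event


def aggregate(events):
    """Data preparation step: total clicks per region."""
    totals = defaultdict(int)
    for event in events:
        totals[event["region"]] += event["clicks"]
    return dict(totals)


clicks_by_region = aggregate(clean(RAW_EVENTS))
print(clicks_by_region)
```

In a traditional architecture, each of these steps would often live in a separate tool or layer, which is where much of the setup and maintenance burden comes from.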

If there’s one thing to remember about the combination of Google Cloud Dataflow and the Wave Analytics Platform, it’s that we remove all that complexity.

Google Cloud Dataflow is a simple, flexible, and powerful tool that can perform data processing tasks of any size. It overcomes complexity at both the architecture and programming levels. And it automates data processing in both batch and stream mode.

First, the architecture is simplified. Big Data from various sources can be combined with customer data from Salesforce. There’s no need for separate ETL or data preparation tools; Cloud Dataflow takes care of that. There’s no need to manage distributed systems.

Second, there’s no need to use old-school MapReduce, a complex programming framework for processing large datasets that’s also well known for its latency limitations and occasional lack of performance. Cloud Dataflow is designed to handle very large datasets and complex workflows, and to do so simply. Once data is processed and enriched, it can be pushed to Wave with a native connector (this connector was built by another of our great partners, SpringML; see their blog here).

Now that our big data is in Wave, it’s ready for the business user! Any user can freely explore and drill through that rich data, run various visualizations, and consume dashboards to get insights on any device. Faceted and multi-dimensional dashboards can be built in minutes. Any business user can drill down into their dimension or measure of choice.

And because Wave Analytics is native to Salesforce, users can use Chatter directly from inside Wave so that their teams can collaborate on insights from anywhere. Just like Google Cloud Dataflow, Wave Analytics removes the complexity that is inherent to traditional, legacy BI and analytics solutions.

Need an example? Let’s imagine a media company, called Acme Media, that uses Salesforce as its global CRM solution. When a sales rep closes an opportunity with an advertiser, the status of that opportunity is updated to “Closed” in Salesforce. The deal is then entered as an order in DoubleClick, Google’s ad technology for managing digital advertising. DoubleClick then proceeds to serve the ads.

But that sales rep needs visibility into ad performance. With Google Cloud Dataflow, Acme Media can process very large datasets in near real-time, so they can be aggregated, transformed and enriched on the fly. Then that data will be sent natively to Wave Analytics, where that sales rep can analyze, explore, and consume it as a set of drillable dashboards.

So now our sales rep constantly has access to the latest information, on any device. For example, he might notice that the ad is performing poorly in a particular demographic segment or geographic region, and he may want to modify the ad immediately from his mobile phone. By accessing fresh data and analyzing it at the point of decision, every sales rep at Acme can now take appropriate steps at any time to better target their audience.
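As a rough illustration of the kind of aggregation and enrichment described in the Acme Media scenario, the sketch below rolls up ad events by region, joins them with the matching opportunity, and flags regions with a low click-through rate. Everything in it is hypothetical: the record layouts, the `order-1` identifier, the advertiser name and the 1% threshold are assumptions for the example. It is plain Python, not the Cloud Dataflow SDK or the Wave connector.

```python
from collections import defaultdict

# Hypothetical closed opportunities from the CRM, keyed by order id.
OPPORTUNITIES = {
    "order-1": {"advertiser": "Example Advertiser", "owner": "rep-42"},
}

# Hypothetical ad-serving events; the field names are illustrative only.
AD_EVENTS = [
    {"order_id": "order-1", "region": "West", "impressions": 1000, "clicks": 30},
    {"order_id": "order-1", "region": "West", "impressions": 1000, "clicks": 20},
    {"order_id": "order-1", "region": "East", "impressions": 2000, "clicks": 4},
]

CTR_THRESHOLD = 0.01  # assumed cutoff for "performing poorly"


def build_rows(events, opportunities, threshold=CTR_THRESHOLD):
    """Aggregate events per (order, region) and enrich with opportunity data."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for event in events:
        key = (event["order_id"], event["region"])
        totals[key]["impressions"] += event["impressions"]
        totals[key]["clicks"] += event["clicks"]

    rows = []
    for (order_id, region), total in sorted(totals.items()):
        impressions = total["impressions"]
        ctr = total["clicks"] / impressions if impressions else 0.0
        rows.append({
            "order_id": order_id,
            "region": region,
            "advertiser": opportunities[order_id]["advertiser"],
            "owner": opportunities[order_id]["owner"],
            "ctr": ctr,
            "underperforming": ctr < threshold,
        })
    return rows


rows = build_rows(AD_EVENTS, OPPORTUNITIES)
for row in rows:
    print(row)
```

Rows of this shape, one per order and region with a performance flag, are the sort of dataset a dashboard could let the sales rep drill into by segment.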

With Google Cloud Dataflow and Wave Analytics, this is now simple: business users can easily analyze massive amounts of customer data and turn Big Data into actionable insights.