Transform Your Data Strategy with the Power of Salesforce Data Cloud’s Zero Copy Integration to Amazon Redshift

Rajkumar Irudayaraj
7 min read · Aug 27, 2024

We are excited to announce a major milestone in our vision to make Salesforce Data Cloud an open and extensible platform for connecting and unifying customer data. Our Bring Your Own Lake (BYOL) with Zero Copy integration and Data Sharing (Data-Out) to Amazon Redshift is now generally available! This completes the bi-directional capability along with Zero Copy Data Federation to Amazon Redshift (Data-In), allowing our customers to seamlessly connect and unlock their data across AWS regions without copying or replicating it. With simple point-and-click, metadata-driven experiences, businesses can now integrate their data into their processes more efficiently. This enhancement strengthens the Einstein 1 Platform, enabling richer customer 360 profiles and more powerful use cases for personalization, automation, analytics, predictive insights, and generative AI.

Zero Copy is an approach to data integration that lets you connect to data stored in external platforms without traditional ETL processes to move or copy the data. Instead, you connect and use the data in place, unlocking its value immediately with access to fresh, accurate data when you need it. This Zero Copy approach fosters a modern data integration strategy, giving customers control to manage and govern their data securely while still leveraging the Salesforce platform to enhance customer engagement.

BYOL with Zero Copy provides secure, bi-directional data integration in two parts:

  1. BYOL Data Federation (data-in): Easily connect to data stored in Amazon Redshift with a simple point-and-click interface. Create External Data Lake Objects and integrate them into the Customer 360 Data Model. This allows Data Cloud capabilities to enhance customer experiences on the Einstein 1 Platform.
  2. BYOL Data Sharing (data-out): Effortlessly share Salesforce data with Amazon Redshift through Data Shares, allowing customers to maximize their existing data investments.

With the Zero Copy integration architecture in Data Cloud, we’ve established a robust foundation using open standards like the Iceberg table format and SQL, enabling rapid expansion of BYOL capabilities to other data platforms.

Salesforce Data Cloud’s Zero Copy Data Sharing leverages AWS Glue Data Catalog, paving the path to support multi-engine views and integrating with Amazon Redshift Spectrum and AWS Lake Formation. This integration allows seamless querying of auto-mounted AWS Glue Data Catalog tables with IAM credentials and uses Lake Formation for permissions and access control on AWS Glue Data Catalog Views.
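To make the querying side a little more concrete, here is a minimal Python sketch that builds a query against an auto-mounted AWS Glue Data Catalog object using Redshift's three-part `awsdatacatalog.<database>.<table>` notation. The database name `datacloud_share_link`, the view name `unified_individual__dlm`, and the column names are hypothetical placeholders, not names that Data Cloud is known to generate. The helper only builds SQL text and does not connect to Redshift.

```python
import re

# Accept only plain SQL identifiers so the generated text stays predictable.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_catalog_query(database: str, table: str, columns: list[str], limit: int = 10) -> str:
    """Build a SELECT against an auto-mounted Glue Data Catalog object.

    All names passed in are placeholders chosen by the caller.
    """
    if not columns:
        raise ValueError("at least one column is required")
    if limit <= 0:
        raise ValueError("limit must be positive")
    column_list = ", ".join(_check_identifier(c) for c in columns)
    return (
        f"SELECT {column_list} "
        f"FROM awsdatacatalog.{_check_identifier(database)}.{_check_identifier(table)} "
        f"LIMIT {int(limit)};"
    )


if __name__ == "__main__":
    # Hypothetical database, view, and column names for illustration only.
    print(build_catalog_query("datacloud_share_link", "unified_individual__dlm",
                              ["individual_id", "first_name"]))
```

You could paste the resulting statement into the Redshift query editor, assuming your IAM identity has been granted access to the shared objects through Lake Formation.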

Overcoming ETL Challenges with Zero Copy Integration

Challenges of Traditional ETL Processes

Traditional ETL (Extract, Transform, Load) processes are often complex and resource-intensive, requiring skilled personnel to build and maintain intricate pipelines. This complexity leads to high costs and significant operational overhead. Additionally, traditional ETL processes involve copying data, which can cause delays and result in outdated information, making it difficult for organizations to make timely, data-driven decisions. The processes also raise concerns about data governance, security, and compliance, especially when data is frequently moved or copied. Furthermore, scaling traditional ETL to accommodate growing data volumes and changing business needs is challenging, often necessitating extensive redesigns and limiting flexibility.

Advantages of Zero Copy Integration with Salesforce Data Cloud

Zero copy integration overcomes these challenges by eliminating the need to copy data, thus ensuring real-time data freshness and reducing the risk of delays and inaccuracies. With Salesforce Data Cloud’s open and extensible architecture, zero copy integration allows organizations to access and analyze data seamlessly, breaking down silos and unlocking valuable insights. This approach enhances security and governance by maintaining strict control over data access, meeting regulatory requirements, and safeguarding against security threats. The simplified setup of zero copy integration also reduces reliance on IT resources, accelerating time-to-value and enabling businesses to focus on innovation and strategic initiatives. Additionally, the integration with platforms like Amazon Redshift provides managed services that allow customers to prioritize analytics and insights over technical maintenance.

Empowering Teams to Drive Business Value

Zero copy integration bridges the gap between data and operational insights, allowing businesses to quickly adapt to market changes and customer preferences. By merging CRM and non-CRM data, marketers can enhance segmentation and personalize campaigns, data scientists can build advanced AI and machine learning models, and analysts can combine multiple data sources to gain a comprehensive view of customer behavior. Use cases like targeted promotions, customer satisfaction analysis, and predictive modeling demonstrate how zero copy integration can drive more precise marketing strategies and improved customer engagement. By transitioning from traditional ETL to zero copy integration, organizations streamline operations, reduce costs, and enhance their ability to derive actionable insights from their data.

How Zero Copy Data Sharing Works

With zero copy data sharing (Data Out), you can effortlessly share Salesforce Data Cloud objects with Amazon Redshift. We handle all the complexities, including the automatic creation of AWS Glue views in Lake Formation using Iceberg as the standard format, allowing the Data Cloud objects to be shared directly into your AWS account’s catalog. These objects are then mounted as specific views in your Redshift cluster, enabling you to run queries seamlessly.

Salesforce Data Cloud offers a simple point-and-click interface to share data with a customer’s AWS account. Using the AWS Lake Formation Console, customers can easily accept the data share, create resource links, and mount Salesforce Data Cloud objects as data catalog views. This setup allows for seamless querying of live, harmonized, and unified data in Amazon Redshift.
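The flow above is console-driven, but if you prefer to script the resource link step, one possible shape is sketched below. It assumes the shared database is exposed to your account through Lake Formation and that you create the link with the AWS Glue `CreateDatabase` API by pointing `TargetDatabase` at the sharing account's catalog. The link name, account ID, and database name are hypothetical placeholders. The function only builds the request payload; the actual `boto3` call is left as a comment so nothing is sent to AWS.

```python
def resource_link_request(link_name: str, owner_account_id: str, shared_database: str) -> dict:
    """Build the payload for a Glue CreateDatabase call that creates a resource link.

    owner_account_id is the 12-digit AWS account that owns the shared database
    (a placeholder here; use the value shown in your accepted data share).
    """
    if not (owner_account_id.isascii() and owner_account_id.isdigit()
            and len(owner_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    if not link_name or not shared_database:
        raise ValueError("link_name and shared_database must be non-empty")
    return {
        "DatabaseInput": {
            "Name": link_name,
            # A database with TargetDatabase set acts as a resource link.
            "TargetDatabase": {
                "CatalogId": owner_account_id,
                "DatabaseName": shared_database,
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    request = resource_link_request("datacloud_share_link", "111122223333", "datacloud_shared_db")
    print(request)
    # To send it (requires boto3 and suitable permissions):
    #   import boto3
    #   boto3.client("glue").create_database(**request)
```

Keeping the payload construction separate from the API call makes it easy to review what will be created before you run it against your account.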

Zero Copy Data Sharing is supported with both Amazon Redshift Serverless and provisioned RA3 clusters. Data can be shared with a Redshift Serverless or provisioned cluster in the same Region, or with a Redshift Serverless cluster in a different Region.
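The support rules in the paragraph above can be written down as a small check. This is only a restatement of that paragraph in Python, with hypothetical names (`Target`, `is_share_supported`); it is not an official compatibility matrix, so please confirm current availability in the product documentation.

```python
from enum import Enum


class Target(Enum):
    """Redshift deployment types mentioned in this post."""
    SERVERLESS = "serverless"
    PROVISIONED_RA3 = "provisioned_ra3"


def is_share_supported(target: Target, same_region: bool) -> bool:
    """Return True if the post describes this combination as supported."""
    if same_region:
        # Both Serverless and provisioned RA3 are supported in the same Region.
        return True
    # Across Regions, the post only mentions Redshift Serverless.
    return target is Target.SERVERLESS


if __name__ == "__main__":
    for target in Target:
        for same_region in (True, False):
            print(target.value, "same_region" if same_region else "cross_region",
                  is_share_supported(target, same_region))
```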

A Data Analyst’s Journey: Leveraging Zero Copy Data Sharing to Determine Customer Churn Propensity

In our previous blog post, we showed how data federation gives Salesforce Data Cloud access to data in Amazon Redshift. Now let’s explore how a data analyst can perform customer churn analysis by accessing Salesforce Data Cloud data shares in Amazon Redshift.

The Data Architect first creates a data share target for Amazon Redshift, and then creates a data share containing the pertinent data objects from Salesforce Data Cloud, using a point-and-click experience to share the data with the customer’s AWS account. The AWS Lake Formation Console then lets the customer accept the data share, which establishes a secure connection for sharing the data.

The data analyst can now run customer churn analysis using data from both Salesforce Data Cloud and Amazon Redshift. They can then train and run an ML model in Redshift ML to produce churn predictions, which can be used to determine the next steps towards ensuring customer delight!
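As a hedged sketch of what the Redshift ML step might look like, the Python helper below assembles a `CREATE MODEL` statement whose training query joins a shared Data Cloud view with a local Redshift table. Every table, column, model, function, and bucket name here is a hypothetical placeholder, and the sketch assumes your cluster has a default IAM role attached and that the shared view can be used in a training query in your setup. The helper only produces SQL text for you to review and run yourself.

```python
def build_create_model_sql(model_name: str, training_query: str, target_column: str,
                           function_name: str, s3_bucket: str) -> str:
    """Assemble a Redshift ML CREATE MODEL statement as text.

    IAM_ROLE default assumes a default IAM role is attached to the cluster.
    """
    query = training_query.strip().rstrip(";").strip()
    if not query:
        raise ValueError("training_query must not be empty")
    return (
        f"CREATE MODEL {model_name}\n"
        f"FROM ({query})\n"
        f"TARGET {target_column}\n"
        f"FUNCTION {function_name}\n"
        f"IAM_ROLE default\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


# Hypothetical training query: a shared Data Cloud view joined to a local table.
TRAINING_QUERY = """
SELECT a.tenure_months, a.open_cases, u.monthly_spend, u.churned
FROM awsdatacatalog.datacloud_share_link.account_summary__dlm AS a
JOIN public.usage_history AS u ON u.account_id = a.account_id
"""

if __name__ == "__main__":
    print(build_create_model_sql(
        model_name="customer_churn_model",
        training_query=TRAINING_QUERY,
        target_column="churned",
        function_name="predict_churn",
        s3_bucket="example-redshift-ml-bucket",
    ))
```

Once a model like this finishes training, the analyst would call the generated function (here, the hypothetical `predict_churn`) in a regular `SELECT` to score current customers.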


Conclusion

The groundbreaking Zero Copy integration between Salesforce Data Cloud and Amazon Redshift eliminates the complexities of ETL processes. This secure and flexible bidirectional integration enhances customer experiences and drives success. With a 360-degree view of customers enriched by Salesforce Data Cloud Zero Copy, teams gain invaluable insights, personalize interactions, and deliver greater value across all touchpoints.

Get Started Today!

Contact your Salesforce or Amazon sales team today to learn how you can start leveraging Salesforce Data Cloud’s Zero Copy integration. Continue your learning journey:

About the Author: Rajkumar Irudayaraj is a Senior Director of Product at Salesforce Data Cloud. He has over 20 years of experience in data platforms and services, with a passion for delivering data-powered experiences to customers. He spearheads initiatives focused on BI and AI, including Zero Copy Data Sharing (BYOL), Calculated Insights, Data Actions, and Data Cloud SQL/Query engines, delivering impactful insights and contextual data seamlessly into the flow of work.

About the co-author: Sriram Sethuraman is a Sr. Manager of Product Management. He has been building products for over 10 years using big data technologies. In his current role at Salesforce, Sriram works on Salesforce Data Cloud, building zero copy data integration solutions with various external data lake partners as well as near-real-time streaming solutions.
