4 Considerations for Salesforce CDP Data Ingestion

Gina Nichols
Salesforce Architects
May 27, 2022


Customer Data Platforms (CDPs) bring together customer data from multiple systems to create a single centralized repository that can provide insights into customer interactions across the organization. A key step in this process is data unification, in which fragments of information about a customer that are stored in various systems are merged together into a unified individual — one coherent profile of that customer. The success of this data unification is highly dependent on the quality of the data. Incorrectly mapped fields or missing data will make it difficult (or in some cases, impossible) to unify fragments from disparate systems, resulting in multiple unified individuals for the same customer in the CDP. This post covers considerations to keep in mind as you begin the process of gathering data for Salesforce CDP ingestion and unification.

Salesforce CDP — Data Drives Value

1. What type of data do I need?

Figure: Salesforce CDP data ingestion to drive action

Follow these steps to help you decide the types of data you need:

  1. First, determine the types of data you want to use for data unification and customer segmentation. As you do so, consider which use cases need to be real time. An anti-pattern is to assume that you should automate everything out of Salesforce CDP. Salesforce CDP doesn’t support real-time ingestion unless you use the Ingestion API or real-time activation, so it’s very important to determine why and when you need real-time automation. For example, if you want to send an email as soon as a purchase is made, it is better to send the transaction data directly to Marketing Cloud than to send it to CDP first and expect it to reach Marketing Cloud immediately.
  2. Next, determine what values you want to use for segmentation. Examples include purchase amounts, frequently purchased products, product last purchased, and so on.
  3. Finally, identify the fields you want to use for personalization. For example, if you’re a retailer, you may want to personalize content in marketing campaigns based on your customer’s favorite store.
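
As a rough illustration of step 2, the sketch below derives three candidate segmentation values (total purchase amount, most frequently purchased product, and product last purchased) from a small list of transactions. The record layout (`customer_id`, `product`, `amount`, `purchased_at`) and the function name `segmentation_values` are assumptions made for this example, not Salesforce CDP objects or APIs.

```python
from collections import Counter
from datetime import date

# Hypothetical transaction records; field names are illustrative only.
transactions = [
    {"customer_id": "C1", "product": "Boots", "amount": 120.0, "purchased_at": date(2022, 1, 5)},
    {"customer_id": "C1", "product": "Socks", "amount": 10.0, "purchased_at": date(2022, 2, 1)},
    {"customer_id": "C1", "product": "Socks", "amount": 12.0, "purchased_at": date(2022, 3, 9)},
    {"customer_id": "C2", "product": "Hat", "amount": 25.0, "purchased_at": date(2022, 2, 14)},
]


def segmentation_values(rows, customer_id):
    """Summarize one customer's purchases into candidate segmentation values."""
    mine = [r for r in rows if r["customer_id"] == customer_id]
    if not mine:
        return None  # no purchases on record for this customer
    counts = Counter(r["product"] for r in mine)
    latest = max(mine, key=lambda r: r["purchased_at"])
    return {
        "total_purchase_amount": sum(r["amount"] for r in mine),
        "most_frequent_product": counts.most_common(1)[0][0],
        "last_purchased_product": latest["product"],
    }
```

Deciding on values like these before ingestion makes it clearer which source fields have to be present and clean in the files you prepare.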

2. How do I prepare my data?

Once you’ve decided what data you need, you’re ready to start preparing it:

  1. First, pre-scrub (cleanse) your data before ingestion. Identify fields that have poor data quality and address those issues before continuing.
  2. Next, de-dupe data prior to ingestion to reduce redundant data.
  3. Then populate and verify the quality of the required fields that are used in the starter reconciliation rules during the identity resolution process.
    a. Fuzzy name and normalized email address requires first name, last name, and email address.
    b. Fuzzy name and normalized phone requires first name, last name, and phone number.
    c. Fuzzy name and normalized address requires first name, last name, address line 1, state, and country.
  4. Finally, prepare your files, keeping in mind Supported File Formats and Delimiters.
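
The preparation steps above can be sketched as a small pre-ingestion script. This is a minimal example under stated assumptions: the field names, the rule labels in `RULE_REQUIRED_FIELDS`, and the choice of email as the de-dupe key are all hypothetical, and the cleanup shown here is simple pre-scrubbing, not the normalization that Salesforce CDP applies during identity resolution.

```python
import re

# Required fields per starter reconciliation rule, as listed in step 3.
# The rule labels and field names are illustrative, not CDP API names.
RULE_REQUIRED_FIELDS = {
    "fuzzy_name_email": ["first_name", "last_name", "email"],
    "fuzzy_name_phone": ["first_name", "last_name", "phone"],
    "fuzzy_name_address": ["first_name", "last_name", "address_line_1", "state", "country"],
}


def cleanse(record):
    """Trim whitespace, lowercase the email, and keep only digits in the phone.

    Assumes every value is a string or None.
    """
    cleaned = {k: (v or "").strip() for k, v in record.items()}
    cleaned["email"] = cleaned.get("email", "").lower()
    cleaned["phone"] = re.sub(r"\D", "", cleaned.get("phone", ""))
    return cleaned


def dedupe(records, key="email"):
    """Keep the first record seen for each non-empty key value."""
    seen, unique = set(), []
    for rec in records:
        value = rec.get(key)
        if value and value in seen:
            continue  # duplicate of an earlier record
        if value:
            seen.add(value)
        unique.append(rec)
    return unique


def usable_rules(record):
    """Return the rules whose required fields are all populated."""
    return [
        rule for rule, fields in RULE_REQUIRED_FIELDS.items()
        if all(record.get(f) for f in fields)
    ]
```

Running `usable_rules` over a cleansed file gives a quick view of how many records can participate in each rule, which helps you spot fields that need attention before ingestion.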

3. What about keys?

Identify the enterprise-wide unique identifier that you plan to use to create your unified profiles. This is the ID that all profiles will be resolved against. This is important because Salesforce CDP will not create an immutable enterprise-wide unique identifier for you. Examples of unique identifiers include the Global (MDM) ID, Salesforce ID, or Loyalty ID.

It’s also important to have a primary key defined for each data source object. The primary key lets Salesforce CDP uniquely identify a record. Salesforce CDP doesn’t automatically generate primary keys; it expects these keys to be present in the source data as an attribute. For objects in the source data that don’t have an established primary key attribute, you can use formula fields to create one and set it as the primary key for that object.
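
To illustrate the idea of deriving a primary key when none exists, here is a minimal Python sketch that builds a composite key from several attributes and checks whether it is actually unique. It is not formula field syntax and runs outside Salesforce CDP; the row layout (`order_id`, `line_no`, `sku`) and both function names are hypothetical.

```python
from collections import Counter


def composite_key(row, fields, sep="|"):
    """Join the chosen field values into a single key string."""
    return sep.join(str(row[f]) for f in fields)


def duplicate_keys(rows, fields):
    """Return the composite keys that occur more than once."""
    counts = Counter(composite_key(r, fields) for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical order-line rows with no single-column primary key.
order_lines = [
    {"order_id": "1001", "line_no": 1, "sku": "BOOT-9"},
    {"order_id": "1001", "line_no": 2, "sku": "SOCK-M"},
    {"order_id": "1002", "line_no": 1, "sku": "HAT-L"},
]
```

Checking `duplicate_keys(order_lines, ["order_id", "line_no"])` before ingestion confirms that the chosen combination identifies each record; a non-empty result means the candidate key needs another attribute.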

4. What else should I consider while setting up the ingestion data streams?

  • Set up appropriate refresh schedules. Note: By default a full refresh for data extensions occurs once every 24 hours. In setup, the “hourly” schedule frequency refers to how often Salesforce CDP looks for incremental data from Marketing Cloud, but Marketing Cloud sends a full refresh only once per 24 hours — as a result it can take up to 24 hours for the initial data to appear in Salesforce CDP.
  • During AWS S3 setup, select the option to refresh only new files. Initial files should include the header so that the system can learn the file; you can turn this off in subsequent imports. Additionally, set up alerts for missing files so that you are notified if an expected file is absent during a scheduled data stream refresh.
  • Validate data types during stream setup so that the right data type is selected. Salesforce CDP supports only three data types when ingesting the data source object: text, number, and date. Make sure to review the data types that are suggested for each field and edit them if necessary during setup. Data types can’t be changed after the Data Source Object (DSO) is created.
  • By default, ingested dates/times are interpreted as being in UTC. Salesforce CDP localizes all dates/timestamps using the timezone configured for the org. If source data is not in UTC, the recommended approach is to include the time zone in the ingested data, and make sure that the right date format is selected in the data stream configuration.
  • When you bring in profile records during ingestion, keep in mind that they must be overwritten in their entirety. This means that each time any value in the record needs to be updated, the entire record must be sent again. For example, the Lifetime Value field could change regularly for a customer, so you don’t want to bring that in with the primary profile information like their name and date of birth, which never change. A best practice is to split static and dynamic values into separate data streams.
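
Two of the points above, including the time zone in ingested timestamps and splitting static from dynamic values, can be sketched in a few lines of Python. This is only an illustration under assumptions: the field groupings in `STATIC_FIELDS` and `DYNAMIC_FIELDS` are hypothetical, the fixed UTC offset ignores daylight saving changes, and you should confirm that the timestamp text you emit matches the date format selected in your data stream configuration.

```python
from datetime import datetime, timedelta, timezone

# Assumed field groupings for this example; adjust to your own profile model.
STATIC_FIELDS = ["customer_id", "first_name", "last_name", "date_of_birth"]
DYNAMIC_FIELDS = ["customer_id", "lifetime_value", "updated_at"]


def with_offset(naive_local, utc_offset_hours):
    """Attach a fixed UTC offset to a naive timestamp and return ISO 8601 text."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=tz).isoformat()


def split_record(record):
    """Split one profile record into static and dynamic rows sharing the same key."""
    static_row = {f: record[f] for f in STATIC_FIELDS}
    dynamic_row = {f: record[f] for f in DYNAMIC_FIELDS}
    return static_row, dynamic_row
```

With this split, a change to `lifetime_value` only requires resending the small dynamic row, while the static row can be ingested once and left alone. Both rows carry `customer_id` so they can be related after ingestion.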

Conclusion

Key considerations for Salesforce CDP implementations include identifying the data you need, preparing your data for ingestion, and establishing keys for your data. Follow the recommendations in this post to improve the quality of data you use for unification and prepare for a smoother, more successful implementation of Salesforce CDP.

Gina Nichols is a Director on the Data Cloud product team at Salesforce. She is also an award-winning co-author (STC Chicago).