Salesforce Data Cloud Utility and Ingestion API UI Setup Instructions

Step-by-step instructions on how to set up and configure the Lightweight - Data Cloud Utility and Ingestion API UI

Justus van den Berg
Feb 29, 2024

Introduction

This article is a step-by-step guide on how to set up the “Lightweight — Data Cloud Utility” based on a sample scenario.
The aim is to combine the steps from multiple previous articles into a single guide that describes the setup from start to finish.
There will still be a reference for setting up the Auth Provider, as that is a separate, stand-alone topic that is simply too big to cover here.

All package and app related components are installed on the org that will connect to Data Cloud. This is referred to as the “local” org.
All Data Cloud related setup, like the Ingestion API and Connected Apps, is done on the org that hosts Data Cloud. This is referred to as the “Data Cloud” org.

Install packages

Install the (dependent) packages in the below order on the local org. It is quite a few packages; a consolidated, stable version will be released in the future.

01) Lightweight — Apex Test Util v2 — (Blog 01 / Blog 02)
02) Lightweight — LWC Util — (Blog)
03) Lightweight — REST Util
04) Lightweight — JSON Util
05) Lightweight — Auth Provider Util — (Blog)
06) Lightweight — Data Cloud Auth Provider — (Blog)
07) Lightweight — Salesforce Auth Provider
08) Lightweight — Data Cloud Util — (Blog 01 / Blog 02)

Packages 05, 06 and 07 are optional but recommended as they are part of the overall framework. You can use the below sf commands to install the packages using the Salesforce CLI.

See “Install packages from a URL” for more details on installing the packages or use the scripts below if you plan on using the CLI.

MANAGED Packages Install Script

# Managed package - Lightweight - Apex Unit Test Util v2@2.4.0-1
sf package install --package "04tP3000000M6OXIA0" -w 30 --target-org "Org"

# Managed package - Lightweight - Apex LWC Util@0.6.0-1
sf package install --package "04tP3000000T7ZBIA0" -w 30 --target-org "Org"

# Managed package - Lightweight - JSON Util@0.7.0-1
sf package install --package "04tP3000000T7m5IAC" -w 30 --target-org "Org"

# Managed package - Lightweight - REST Util@0.11.0-1
sf package install --package "04tP3000000M6gHIAS" -w 30 --target-org "Org"

# Managed package - Lightweight - Data Cloud Util@0.8.0-1
sf package install --package "04tP3000000TKD3IAO" -w 30 --target-org "Org"

## OPTIONAL BUT RECOMMENDED

## Managed package - Lightweight - Auth Provider Util v2@0.12.0-1
sf package install --package "04tP3000000MVUzIAO" -w 30 --target-org "Org"

## Managed package - Lightweight - Data Cloud Auth Provider@0.5.0-1
sf package install --package "04tP3000000M6y1IAC" -w 30 --target-org "Org"

## Managed package - Lightweight - Salesforce Auth Provider@0.5.0-1
sf package install --package "04tP3000000MCLtIAO" -w 30 --target-org "Org"

UNLOCKED Packages Install Script

# Unlocked Package - Lightweight - Apex Unit Test Util v2@2.4.0-1
sf package install --package "04tP3000000M6Q9IAK" -w 30 --target-org "Org"

# Unlocked Package - Lightweight - Apex LWC Util@0.6.0-1
sf package install --package "04tP3000000T7cPIAS" -w 30 --target-org "Org"

# Unlocked Package - Lightweight - JSON Util@0.7.0-1
sf package install --package "04tP3000000T7nhIAC" -w 30 --target-org "Org"

# Unlocked Package - Lightweight - REST Util@0.11.0-1
sf package install --package "04tP3000000M6htIAC" -w 30 --target-org "Org"

# Unlocked Package - Lightweight - Data Cloud Util@0.8.0-1
sf package install --package "04tP3000000TK9pIAG" -w 30 --target-org "Org"

## OPTIONAL BUT RECOMMENDED

## Unlocked Package - Lightweight - Auth Provider Util v2@0.12.0-1
sf package install --package "04tP3000000MW1FIAW" -w 30 --target-org "Org"

## Unlocked Package - Lightweight - Data Cloud Auth Provider@0.5.0-1
sf package install --package "04tP3000000M6zdIAC" -w 30 --target-org "Org"

## Unlocked Package - Lightweight - Salesforce Auth Provider@0.5.0-1
sf package install --package "04tP3000000MCNVIA4" -w 30 --target-org "Org"

Assign Permission Sets

Once the packages are installed, assign the following permission sets to the integration user / the user that will be using the UI Application:

  • Lightweight — Apex Unit Test Util v2
  • Lightweight — LWC Util
  • Lightweight — JSON Util
  • Lightweight — REST Util
  • Lightweight — Data Cloud Util
  • Lightweight — Data Cloud Util UI
  • Lightweight — Auth Provider Util
  • Lightweight — Data Cloud Auth Provider
  • Lightweight — Salesforce Auth Provider

You can use the below sf commands to assign the permissions using the Salesforce CLI.

## Assign required permission sets
sf org assign permset --name "Lightweight_Apex_Unit_Test_Util_v2"
sf org assign permset --name "Lightweight_LWC_Util"
sf org assign permset --name "Lightweight_JSON_Util"
sf org assign permset --name "Lightweight_REST_Util"
sf org assign permset --name "Lightweight_Data_Cloud_Util"
sf org assign permset --name "Lightweight_Data_Cloud_Util_UI"

## OPTIONAL
sf org assign permset --name "Lightweight_Auth_Provider_Util"
sf org assign permset --name "Lightweight_Data_Cloud_Auth_Provider"
sf org assign permset --name "Lightweight_Salesforce_Auth_Provider"

Set Up the External Orgs and Auth Provider

The first setup step is to configure the connected apps on the (external) Data Cloud org(s) and to set up the Auth Provider on the local org that will connect to the external Data Cloud org(s).
The detailed guide on how to do this can be found here:

https://medium.com/@justusvandenberg/connect-to-the-native-salesforce-data-cloud-api-through-named-credentials-using-a-custom-auth-9b900d0fabcf

Create an example sObject

At this stage you should have the following on your local org:

  • A Data Cloud Auth Provider set up
  • An External Credential configured
  • A Named Credential using the External Credential set up and tested

The next step is building a custom Salesforce Object whose data we want to send to Data Cloud. The choice for a custom object is purely for demo purposes. It can be any type of sObject or Apex Data Structure.

I am creating an object called “Smart_Demo__c” with a field for each data type that Data Cloud supports, so you can play around with all of them. I also add a UUID field as the primary key field, and the CreatedDate field will be used as the Event Time field. (A quick way to insert a sample record for this object is sketched after the field list below.)

  • Id
  • Name
  • CreatedDate
  • UUID__c
  • TextField__c
  • NumberField__c
  • DateField__c
  • DateTimeField__c
  • EmailField__c
  • UrlField__c
  • PhoneField__c
  • PercentField__c
  • BooleanField__c
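
If you want a record to experiment with later on (for example when converting sObject data to CSV), a quick way to create one is through Execute Anonymous. This is a minimal sketch, assuming the field API names listed above:

// Minimal sketch (assumption: the field API names listed above): insert one
// Smart_Demo__c test record via Execute Anonymous so there is data to query later.
Smart_Demo__c demo = new Smart_Demo__c(
    // Name = 'Demo 1', // only needed if your Name field is a required text field
    UUID__c          = UUID.randomUUID().toString(), // value for the primary key field
    TextField__c     = 'Example text',
    NumberField__c   = 42,
    DateField__c     = Date.today(),
    DateTimeField__c = Datetime.now(),
    EmailField__c    = 'example@example.com',
    UrlField__c      = 'https://example.com',
    PhoneField__c    = '+31 20 123 4567',
    PercentField__c  = 12.5,
    BooleanField__c  = true
);
insert demo;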

Generate a YAML from an sObject

Now that we have an sObject, we want to create a YAML file that matches its fields so we can create a mapping. We can do this using the YAML tool in the Data Cloud Utility App; a sketch of what the generated file might look like follows the steps below.

  • Open the Data Cloud Utility App and click on the Data Cloud Utility Tab
  • In the component called “Create Data Cloud Ingestion API YAML from sObject”, select the Smart_Demo__c object
  • Select the Id, Name, CreatedDate and all the fields we have created
  • Now press the “Create YAML” Button. The YAML will show in a modal.
  • Press the “Download” button and save the YAML file on your machine.
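
For reference, the generated file is an OpenAPI-style schema in which every selected field gets a name and a data type. A trimmed-down sketch of what the output might look like (the exact field names and types depend on your selection):

openapi: 3.0.3
components:
  schemas:
    Smart_Demo:
      type: object
      properties:
        UUID:
          type: string
        CreatedDate:
          type: string
          format: date-time
        TextField:
          type: string
        NumberField:
          type: number
        DateField:
          type: string
          format: date
        BooleanField:
          type: boolean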

Create a Data Cloud Ingestion API & Data Stream

Now it’s time to switch to your Data Cloud Org: open Data Cloud Setup and go to the Ingestion API section in the menu.

  • Click “New” and name the connector “Smart Demo”
  • In the Schema section of your new connector click “Upload File” and upload the YAML file you created in the previous section.
  • Check the data types and press “Save” when you’re happy
  • Open the Data Cloud App and Click the Data Streams Tab
  • Click “New”, Select Ingestion API as type and click “Next”
  • Select the “Smart Demo” Ingestion API and check the Smart Demo Object
  • Click Next
  • Set the category to “Other”
  • !! Keep in mind the cost implications for the category you select !!
  • Set the Primary Key to the UUID field and the Event Time field to CreatedDate
  • Click “Next”, select an optional data space and press “Deploy”
  • You’ll be redirected to the Data Stream. Copy the Object API Name as we will need this later. Example: Smart_Demo_Smart_Demo_EB5D1D66__dll

Create a Metadata Configuration

We’re done on the Data Cloud Org and we’re going back to the local Org. The next step is creating the metadata configuration records for our new Ingestion API.

  • Go to Setup > Custom Metadata Types > Data Cloud Ingestion API Configuration and click “Manage Records”.
  • Click the “New” button
  • Give the configuration a Label and a Name. The Name is the unique name that is required when you reference this metadata record from Apex (see the sketch after this list)
  • Populate the “Named Credential Name” field with the Named Credential pointing to the native Data Cloud API
  • Specify the connector name, in our case “Smart_Demo”
  • Populate the Target Object Name, in our case also “Smart_Demo”
  • The field “Salesforce Named Credential” is for future use
  • The “sObject Name” field is used to generate mappings. I set the sObject name to the sObject we created earlier: “Smart_Demo__c”
  • To generate Data Cloud queries based on the mapping, you can populate the “Data Lake Object Name” field
(Screenshot: the configuration record with the required details filled in)
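
That unique Name is also the value you would pass in when working with the configuration from Apex. Below is a rough, hypothetical sketch of the standard custom metadata pattern; the API names of the packaged metadata type and its fields are placeholders and will carry the package namespace prefix, so copy the real names from Setup:

// Hypothetical API names for illustration only; look up the real (namespaced)
// names under Setup > Custom Metadata Types before using this pattern.
Data_Cloud_Ingestion_API_Configuration__mdt config =
    Data_Cloud_Ingestion_API_Configuration__mdt.getInstance('Smart_Demo');

// The developer name is the unique Name given to the configuration record
System.debug(config.DeveloperName);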

Now that we have the main configuration record, we’re going to create the mapping records.

  • Go to Setup > Custom Metadata Types > Data Cloud Ingestion API Field Mapping and click “Manage Records”
  • For each field we want to map we’ll have to create a mapping record. These fields will be included in example queries and CSV templates
  • Give your mapping record a label and a unique name. I chose a unique name that includes the order, but that’s not really important; it just has to be unique across your metadata.
  • The Source field is the name of the field on the sObject and the Target field is the name we specified in the YAML file for that same field (the Data Source Object field name)
  • Select the configuration record we just created
  • Lastly, select the field type in Data Cloud. Note that the value “uuidField” is handled as a text field, but in the templates it generates a UUID as test data.
(Screenshot: creating a mapping record between a Salesforce sObject field and a Data Cloud Data Source Object field)

Now that we have set up all our mapping fields, the Data Cloud Ingestion API Configuration record lists a mapping record for each field.

The setup is now complete :-)

Testing the configuration

The first step in testing is validating that all field types are as expected.

  • Open the “Data Cloud Utility” App from the app launcher and go to the “Data Cloud Ingestion” Tab
  • Select the “Smart Demo” Metadata Configuration from the picklist in the “Data Cloud Streaming Ingestion Utility” component.
  • A payload is automatically generated based on the fields and field types defined in the mapping (a sketch of the payload format follows this list)
(Screenshot: auto-generating a sample payload based on the metadata configuration and sending it to the test endpoint or streaming it live to Data Cloud)
  • Now press the “Test” button. Any errors will be shown; if all is OK you can press the “Send” button and start streaming a test record to Data Cloud
  • If there are any errors with the payload format you’ll get an error message listing the issues, which looks something like this:
(Screenshot: a field validation error where an Integer was put in a String field)
  • If all is OK you get a success alert
(Screenshot: success alert when all went well)
  • To quickly get an overview of the mapped fields and the configuration, click the “Show Mapping” button. This gives the full overview of your configuration and the field mapping on a single page.
(Screenshot: full configuration overview)
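
For reference, a streaming Ingestion API payload is a JSON object with a “data” array holding one or more records, keyed by the Data Source Object field names from the YAML. A sketch of what the generated payload for our example might look like (all values are illustrative):

{
  "data": [
    {
      "UUID": "6f1c2a54-1c2e-4a7d-9c1b-2f3e4d5a6b7c",
      "CreatedDate": "2024-02-29T12:00:00.000Z",
      "TextField": "Example text",
      "NumberField": 42,
      "DateField": "2024-02-29",
      "DateTimeField": "2024-02-29T12:00:00.000Z",
      "EmailField": "example@example.com",
      "UrlField": "https://example.com",
      "PhoneField": "+31 20 123 4567",
      "PercentField": 12.5,
      "BooleanField": true
    }
  ]
}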

Testing Bulk uploads

  • Open the “Data Cloud Utility” App and go to the “Data Cloud Ingestion” Tab
  • Select the “Smart Demo” Metadata Configuration from the picklist in the “Data Cloud Bulk Ingestion Utility” component.
  • Click “Create Upsert Job”. A new job is now created.
  • From the jobs list click on the actions arrow and select “Add CSV Data”. This opens a modal with sample data.
    The sample data can be used as a template to load CSV data into Data Cloud (an example of the format follows this list).
  • Press the “Add CSV” button to add the data.
  • Click the actions arrow and select the “Complete Job Action”. If you press refresh, the status will change to “UploadComplete”, then to “InProgress” and eventually to “Done” or “Failed”.
(Screenshot: Bulk Ingestion Utility overview)
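
The sample CSV follows the same convention as the streaming payload: a header row with the Data Source Object field names from the YAML, followed by one row per record. A hedged sketch of what such an upsert file might look like for our example (values are illustrative):

UUID,CreatedDate,TextField,NumberField,DateField,DateTimeField,EmailField,UrlField,PhoneField,PercentField,BooleanField
6f1c2a54-1c2e-4a7d-9c1b-2f3e4d5a6b7c,2024-02-29T12:00:00.000Z,Example one,1,2024-02-29,2024-02-29T12:00:00.000Z,one@example.com,https://example.com,+31 20 123 4567,10.5,true
8a2d3b65-2d3f-5b8e-0d2c-3a4b5c6d7e8f,2024-02-29T13:00:00.000Z,Example two,2,2024-02-29,2024-02-29T13:00:00.000Z,two@example.com,https://example.com,+31 20 765 4321,99.9,false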

Query the test data

Once the bulk job is done we have inserted test data through both streaming and bulk ingestion. It’s now time to run a query against the data lake object (__dll) and check whether everything went through successfully.

  • Open the “Data Cloud Utility” App and go to the “Data Cloud Query” Tab
  • Select the “Smart Demo” Metadata Configuration from the picklist in the “Data Cloud Query Utility” component.
  • A query is automatically generated based on the target fields in the field mappings (an example query is sketched after this list)
  • It does not matter which API version you use; v1 is the default
(Screenshot: running a query against any Data Cloud object, in this example our Smart Demo Data Lake Object)
  • Set the result format to Data Table and press the “Execute Query” button
  • If all went well you’ll get a modal with your freshly inserted data. Press the “Close” button to close the modal
(Screenshot: the Data Cloud query result formatted in a Lightning datatable)
  • Change the result format value to “CSV” and press the “Execute Query” button again.
  • When you view the data as CSV the headers are mapped: they no longer contain the __c suffix but match the Data Source Object field names as specified in the YAML. This is required to match the format that the Ingestion API expects.
  • Click the “Download” button at the bottom and save the CSV file to your computer. We’re going to use this file to delete the test data.
  • I found that the best way of deleting records is to query only the primary key field so you have a single column in your CSV. This seems the least error-prone.
(Screenshot: the Data Cloud query result formatted as CSV with headers that can be used with Bulk Ingestion)
(Screenshot: a Data Cloud query result as raw API response)
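
For reference, the generated query is plain SQL against the Data Lake Object we deployed earlier. A minimal sketch of what it might look like (the DLO name is the Object API Name we copied from the Data Stream; the field API names on the DLO are assumptions and may differ from the target names in the mapping):

SELECT UUID__c, TextField__c, NumberField__c, BooleanField__c
FROM   Smart_Demo_Smart_Demo_EB5D1D66__dll
LIMIT  100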

Deleting the test data

A great advantage of this tool is that it makes it really easy to delete data from Data Cloud with an admin friendly interface.

  • Open the “Data Cloud Utility” App and go to the “Data Cloud Ingestion” Tab
  • Select the “Smart Demo” Metadata Configuration from the picklist in the “Data Cloud Bulk Ingestion Utility” component.
  • Click “Create Delete Job”. A new job is now created.
  • From the jobs list click on the actions arrow and select “Upload CSV”. This opens a modal where you can upload a CSV file.
  • Select the CSV file we downloaded through the Data Cloud Query component (a minimal delete CSV is sketched after this list).
  • On success you will get an alert showing the temporary ContentVersion details: the file is uploaded to Data Cloud and then removed from Salesforce. If there are any errors the file should still be deleted, but it’s worth checking that the deletion completed successfully.
  • Click the actions arrow and select the “Complete Job Action”. If you press refresh, the status will change to “UploadComplete”, then to “InProgress” and eventually to “Done” or “Failed”.
  • Once completed you can go back to the Data Cloud Query component and run the query again; if all went well the test data has been deleted.
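
Following the tip from the query section, a delete CSV can be as small as the primary key header with one key value per row, for example:

UUID
6f1c2a54-1c2e-4a7d-9c1b-2f3e4d5a6b7c
8a2d3b65-2d3f-5b8e-0d2c-3a4b5c6d7e8f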

Converting Salesforce Query data to a Data Cloud CSV format

If you want to load data from one of the Salesforce objects that you have mapped in the custom metadata configuration, you’ll have to convert that data to a CSV format with the correct headers.

The “Data Cloud — sObject to CSV Utility” component will help you do this. In this example we’re going to query the Smart_Demo__c sObject and convert its data to CSV.

  • Open the “Data Cloud Utility” App and go to the “Data Cloud Utility” Tab
  • In the “Data Cloud — sObject to CSV Utility” component, select your sObject from the metadata configuration.
  • Based on the mapping and the sObject specified in the custom metadata, a SOQL query is automatically generated (a sketch follows this list). You can call the Tooling API as well.
  • Once you’re happy with the query you can press the “Execute” button to get the results in a Lightning datatable format.
  • Press the “Close” button and now update the “Result format” value to “CSV” and press the “Execute” button again
  • A CSV formatted file with the correct headers is now created
  • Press the “Download” button and save the file to your computer
  • You are now ready to upsert this file through the Bulk Ingestion Utility
  • You have successfully converted Salesforce Data to Data Cloud CSV ready formatted data based on your mapping
  • Please note that the mapping is not mandatory; it’s simply a tool that can help you configure your Ingestion API between a Salesforce object and a Data Cloud Data Source Object.
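
For reference, the generated SOQL simply selects the mapped source fields from the sObject in the configuration. A sketch of what it might look like for our example, assuming the mapping covers the fields we created earlier:

SELECT Id, Name, CreatedDate, UUID__c, TextField__c, NumberField__c,
       DateField__c, DateTimeField__c, EmailField__c, UrlField__c,
       PhoneField__c, PercentField__c, BooleanField__c
FROM   Smart_Demo__c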

Conclusion

It’s a few packages to install and a little bit of work to complete the initial setup of the named credentials, but once that is all done it’s pretty straightforward to start creating a configuration and matching mapping records.

We have seen the UI utilities in action; by creating a YAML file based on an sObject we were able to create an Ingestion API configuration in a few clicks.

Using the ingestion tools we could very easily test both streaming and bulk data ingestion using auto-generated test data templates. Admins can use these templates to create and validate the format of the data they will load.
We then created even more test data based on records stored in an sObject, by transforming them to a properly formatted CSV file.

Last but definitely not least, we created a bulk delete job to remove all our test data in bulk, all using the UI: making a difficult job easy.

Note on Limits

It’s really nice to have an alternative to the CRM Connector to get data into Data Cloud, but we have to live with the governor limits.

If you create a bulk job system that runs asynchronously, you can handle CSV files up to roughly 10MB; through the UI upload it’s about 5MB. Depending on the number of fields that is a significant amount of data.
The best approach is to play around with some file sizes and see what works best.

There is no daily limit on outbound API calls from Apex, but do keep in mind that Data Cloud charges based on the number of records processed. Always speak to your certified implementation partner to make sure you don’t get any surprises.

Final Note

At the time of writing I am a Salesforce employee; the above article describes my personal views and techniques only. They are in no way, shape or form official advice. It’s purely informative.
Nothing in this article is by definition the view of Salesforce as an organization.

Additional resources

Cover Image generated using Microsoft AI Image Generator
