AWS-AppFlow-Salesforce

Soubhik Khankary
4 min read · Jan 24, 2024


AppFlow integration with the enterprise application Salesforce

About AppFlow:

AppFlow lets you automate data flows between SaaS applications and AWS services with very little code. You can run full or incremental loads depending on the requirement, and apply simple transformations, partitions, and aggregations along the way.

Requirement:

Currently I had a requirement to pull data from Salesforce and push it to Amazon S3 securely. The data has to pass multiple data quality checks and transformation rules before it is loaded into Redshift. The AppFlow jobs also had to be created through IaC (Infrastructure as Code) using either CloudFormation or Terraform (CloudFormation templates, i.e. CFT, are used in this discussion). The AppFlow load must run once daily and perform incremental loads only. (The schedule can be changed based on requirements.)

Implementation:

Step_1:

It is very important to remember that when AppFlow is scheduled for incremental loads, the first run pulls only the last 90 days of records; subsequent runs then continue incrementally.

To pull data from a Salesforce object holding records older than 90 days, we must create an AppFlow job in on-demand mode, which pulls the complete historical data from the source.

Note: whether you need this depends on how far back the data must go for analysis purposes.

Simple diagram: Salesforce → AppFlow → Amazon S3 (diagram image omitted).
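Since this historical pull is a one-off, on-demand run, you can also kick it off from code once the flow exists. Below is a minimal boto3 sketch; the flow name is a hypothetical placeholder matching the naming convention of the template in Step_3.

# Minimal sketch (boto3): start the on-demand historical flow and poll until
# the latest execution finishes. The flow name below is hypothetical.
import time

import boto3

appflow = boto3.client("appflow")
FLOW_NAME = "dev-projectname-appflow-salesforce-objectname-full"  # hypothetical

response = appflow.start_flow(flowName=FLOW_NAME)
print("Started execution:", response.get("executionId"))

while True:
    # Assumes the most recent run is returned first in the execution records.
    page = appflow.describe_flow_execution_records(flowName=FLOW_NAME, maxResults=1)
    executions = page.get("flowExecutions", [])
    status = executions[0]["executionStatus"] if executions else "InProgress"
    if status in ("Successful", "Error", "CancelledExecution"):
        print("Execution finished with status:", status)
        break
    time.sleep(30)  # still running; check again shortly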

Step_2:

Create a connection with the Salesforce application; it is always good to create this connection manually. Generally the connections are created by Salesforce administrators, who authenticate the access by providing an OTP or RSA token.

Go to the AppFlow section of the AWS console and click on the Connections section.

Choose Salesforce in the connectors dropdown.

Click on Create connection. You will then be prompted to enter details such as the username and password for the Salesforce environment.

A Salesforce login page pops up once you click the Connect button. The administrator keys in the username/password along with the RSA token. Once the connection is successfully established you can see it in the Connections tab. The username and password are stored in Secrets Manager automatically. (I will talk about that service in another blog.)
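If you want to sanity-check from code that the connection the administrator created is visible to AppFlow before wiring it into the template in Step_3, a small boto3 sketch like the following works; the profile name here is hypothetical.

# Sketch (boto3): confirm the manually created Salesforce connection exists.
# "salesforce-connection" is a hypothetical profile name.
import boto3

appflow = boto3.client("appflow")

result = appflow.describe_connector_profiles(
    connectorProfileNames=["salesforce-connection"],
    connectorType="Salesforce",
)
for profile in result["connectorProfileDetails"]:
    print(profile["connectorProfileName"], "-", profile["connectionMode"])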

Step_3:

Creating AppFlow flows for multiple objects would be a tough job if done manually, so we use an IaC setup such as a CFT template to make the deployment smoother, faster, and error free. A sample CFT template is below.

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  EnvPrefix:
    Description: "Environment prefix variable (e.g. dev, qa)"
    Type: String
  ScheduleStartTime:
    Description: "Time at which the flow becomes active (used by the scheduled variant)"
    Type: String
  BucketName:
    Description: "Bucket name to land the file"
    Type: String
  ScheduleExpression:
    Description: "Rate at which the incremental workflow has to take place"
    Type: String
  connectionnames:
    Description: "Name of the Salesforce connection created in AppFlow"
    Type: String
Resources:
  AccountFlow:
    Type: AWS::AppFlow::Flow
    Properties:
      FlowName: !Join
        - '-'
        - - !Ref EnvPrefix
          - 'projectname-appflow-salesforce-objectname-full'
      Description: AppFlow flow to import data from Salesforce to S3 for table object_name
      TriggerConfig:
        TriggerType: OnDemand  # one-off historical pull; see the scheduled variant below
      SourceFlowConfig:
        ConnectorType: Salesforce
        ConnectorProfileName: !Ref connectionnames
        SourceConnectorProperties:
          Salesforce:
            Object: Object_name
            EnableDynamicFieldUpdate: false
            IncludeDeletedRecords: true
      DestinationFlowConfigList:
        - ConnectorType: S3
          DestinationConnectorProperties:
            S3:
              BucketName: !Join
                - '-'
                - - !Ref EnvPrefix
                  - !Ref BucketName
              BucketPrefix: projectname/salesforce/object_name/payload/full
              S3OutputFormatConfig:
                FileType: JSON
                AggregationConfig:
                  AggregationType: SingleFile
                PrefixConfig:
                  PrefixFormat: DAY
                  PrefixType: PATH
      Tasks:
        - TaskType: Map_all  # map every source field to the destination as-is
          SourceFields: []
          TaskProperties:
            - Key: EXCLUDE_SOURCE_FIELDS_LIST
              Value: '[]'
          ConnectorOperator:
            Salesforce: NO_OP


Quick short description:

  1. The CFT template version is declared at the top.
  2. The Parameters section holds the values that vary per environment, such as dev or qa.
  3. The source configuration references the connection that was created, along with the object name.
  4. The destination configuration points at the S3 location where the data is landed. The data format could be JSON, CSV, or Parquet, depending on the requirement.
  5. In the CFT shared above, all fields are mapped (TaskType: Map_all).
  6. If the full load has to become a scheduled incremental load, just change the TriggerConfig block as shown below (see the timestamp sketch after the snippet):
TriggerConfig:
  TriggerType: Scheduled
  TriggerProperties:
    DataPullMode: Incremental
    ScheduleExpression: !Ref ScheduleExpression
    ScheduleStartTime: !Ref ScheduleStartTime
    TimeZone: US/Eastern
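One detail worth flagging: in the AWS::AppFlow::Flow resource, ScheduleStartTime takes a numeric epoch timestamp rather than a human-readable date (worth double-checking against the current CloudFormation docs). A small Python sketch to compute one, using an example start time of 02:00 US/Eastern tomorrow:

# Sketch: compute the epoch value to pass as the ScheduleStartTime parameter.
# The chosen start time (tomorrow, 02:00 US/Eastern) is just an example.
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

eastern = ZoneInfo("US/Eastern")
start = (datetime.now(eastern) + timedelta(days=1)).replace(
    hour=2, minute=0, second=0, microsecond=0
)
print(int(start.timestamp()))  # e.g. pass this value for ScheduleStartTime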

Upload the CFT template after making whatever changes your requirement calls for, and you will see the AppFlow job created within a few minutes.
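If you prefer deploying from code rather than the console, here is a minimal boto3 sketch. The stack name, file path, and parameter values are all hypothetical placeholders, and rate(1days) assumes AppFlow's schedule-expression format for a daily pull.

# Sketch (boto3): create the stack from the template above. All names and
# parameter values below are hypothetical placeholders.
import boto3

cfn = boto3.client("cloudformation")
STACK_NAME = "dev-projectname-appflow-salesforce"

with open("appflow_salesforce.yaml") as f:  # the template shown above
    template_body = f.read()

cfn.create_stack(
    StackName=STACK_NAME,
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "EnvPrefix", "ParameterValue": "dev"},
        {"ParameterKey": "BucketName", "ParameterValue": "projectname-landing"},
        {"ParameterKey": "ScheduleExpression", "ParameterValue": "rate(1days)"},
        {"ParameterKey": "ScheduleStartTime", "ParameterValue": "1706072400"},
        {"ParameterKey": "connectionnames", "ParameterValue": "salesforce-connection"},
    ],
)

# Wait for the stack (and therefore the flow) to finish creating.
cfn.get_waiter("stack_create_complete").wait(StackName=STACK_NAME)
print("Stack created; the flow should now be visible in the AppFlow console.")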

Conclusion:

We were able to create flows for multiple Salesforce objects and pull them in using AppFlow, and the level of code and maintenance was quite low. Please let me know if you have any more questions. Thanks!


Soubhik Khankary

Data Engineer by profession, teaching computers with stats, and in love with never-ending math.