AWS-Appflow-Sharepoint

Soubhik Khankary
3 min readJan 25, 2024

--

Appflow integration with enterprise SharePoint application

About Appflow:

Appflow helps you to automate the flow to/from SAAS application into AWS services with a very low code implementation. We could implement full load or incremental load based on the requirement and also we could do simple transformation, partitions and aggregations based on the requirements.

Please follow the post for appflow-salesforce integration details:

Problem Statement:

We had a requirement to pull in the changed/modified files from sharepoint online. Only the modified objects must be detected and pulled into the S3 location at scheduled intervals daily.

Important requirements from the clients were:

  1. Low Code/No Code
  2. Low Maintenance
  3. Easy development and push the jobs to higher environments
  4. Secured way of establishing the connection with source

Solution:

Appflow is the AWS service which fulfilled all our requirements. I will also provide the CFT template which will fulfill the above requirements. I will also walk through the basic steps below :

Step_1: Create Connection with SharePoint online securely

Once we click on Connect button we will be redirected to key in username, password and rsa token. Once all the details are keyed in successfully the connection gets created successfully. Username and password gets stored in Secrets manager and only the users with proper permissions to fetch the credentials when required.

Step_2: Deploy AppFlow jobs scheduled to pull in the history+CDC(Changed Data Capture) data

Once the connection is successfully established it is time to create appflow jobs to pull in the modified documents from the sharepoint locations.

Create the appflow job using the CFT template below and after running it for the first time verify the S3 location for metadata file and document dumps.

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
EnvPrefix:
Description: "Environment Prefix Variable "
Type: String
ScheduleStartTime:
Description: "Time at which the appflow would become active"
Type: String
BucketName:
Description: "Bucket name to land the file"
Type: String
ScheduleExpression:
Description: "Rate at which the incremental workflow has to take place"
Type: String
connectionnames:
Description: "Rate at which the incremental workflow has to take place"
Type: String
entityname:
Description: "entity name"
Type: String
Resources:
spFlow:
Type: AWS::AppFlow::Flow
Properties:
FlowName:
!Join
- '-'
- - !Ref EnvPrefix
- "projectname-appflow-sharepoint-incr"
Description: App Flow to import data from sharepoint to S3
TriggerConfig:
TriggerType: Scheduled
TriggerProperties:
DataPullMode: Incremental
ScheduleExpression: !Ref ScheduleExpression
ScheduleStartTime: !Ref ScheduleStartTime
TimeZone: US/Eastern

SourceFlowConfig:
ApiVersion: v1.0
IncrementalPullConfig:
DatetimeTypeFieldName: lastModifiedDateTime
ConnectorType: CustomConnector
ConnectorProfileName: !Ref connectionnames
SourceConnectorProperties:
CustomConnector:
EntityName: !Ref entityname
CustomProperties:
"subEntities": "[\"put your own drives\"]"
DestinationFlowConfigList:
- ConnectorType: S3
DestinationConnectorProperties:
S3:
BucketName:
!Join
- '-'
- - !Ref EnvPrefix
- !Ref BucketName
BucketPrefix: sharepoint/payload/incr
S3OutputFormatConfig:
FileType: JSON
AggregationConfig:
AggregationType: SingleFile
PrefixConfig:
PrefixFormat: DAY
PrefixType: PATH

Tasks:
- TaskType: Map_all
SourceFields: []
TaskProperties:
- Key: EXCLUDE_SOURCE_FIELDS_LIST
Value: '[]'
ConnectorOperator:
CustomConnector: NO_OP

File Metadata Structure:

Limitation of Appflow with SharePoint Online:

Please let me know if you find the blog useful. The CFT template shared above must be saving your time while doing development. Thanks Again !!!

References Used:

https://docs.aws.amazon.com/appflow/latest/userguide/connectors-microsoft-sharepoint-online.html

--

--

Soubhik Khankary

Data Engineer by job , Teaching computers by stats and love to learn never endless math.