Amazon AppFlow - Enterprise Integration Made Easy

Phanikiran Patruni
5 min read · Dec 27, 2022

Amazon AppFlow is a fully managed integration service that helps us move data between software-as-a-service (SaaS) applications, such as Salesforce, and AWS services, such as Amazon Simple Storage Service (Amazon S3) and Amazon Redshift. We can integrate multiple data systems with AWS services, which lets us combine integration with a serverless architecture and eliminates the need to build new applications just for data transfer and transformation.

AppFlow can connect to multiple data sources outside of AWS and bring that data into the AWS ecosystem. For enterprise organizations running on-premises, SaaS, or hybrid solutions, AppFlow can play an important role in integrating data from on-premises systems to the cloud. AppFlow also gives us the flexibility to integrate applications that are already cloud native.

In this blog, we are going to cover an integration between S3 and an RDS PostgreSQL database, which will highlight several features of AppFlow (integration, transformation, rules, error flow, validation, and filters).

To start, we will need the following in place:

1. AWS Account

2. RDS Postgres database

3. S3

4. Sample data (an example is shown below)
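
For illustration, assume the file in S3 is a simple CSV of customer records. The column names below are hypothetical and only need to match whatever mapping we configure later in the flow:

    customer_id,first_name,last_name,email,amount
    101,Asha,Rao,asha.rao@example.com,250
    102,John,Miller,john.miller@example.com,90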

The high-level architecture shown below depicts how AppFlow pulls data from S3, transforms the data, runs validations, and then ingests the data into PostgreSQL, where it can be consumed by downstream applications. For reference, Salesforce and other systems are also shown to illustrate the different integration scenarios.

High Level Flow

At a high level, configuring this particular flow requires the following AppFlow components:

  1. Connections
  2. Flows
  3. Users

Step 1: Create a Connection to the PostgreSQL Database

The first step is to create a connection (the target connection) to the PostgreSQL database. It is a simple step in which we create the connection by providing the basic inputs (DB host, DB name, user ID, password, etc.). Once tested successfully, the connection can be reused across multiple flows.
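
The same connection can also be created programmatically. Below is a minimal sketch using boto3, assuming the RDS for PostgreSQL connector is exposed through AppFlow's custom connector profile type; the connector label and the profile property names (hostname, port, database) are placeholders and should be checked against the connector's documentation.

    import boto3

    appflow = boto3.client("appflow")

    # Create the target connection (connector profile) for PostgreSQL.
    # Assumption: the PostgreSQL connector uses the CustomConnector profile type;
    # connectorLabel and profileProperties keys below are placeholders.
    appflow.create_connector_profile(
        connectorProfileName="postgres-target",
        connectorType="CustomConnector",
        connectorLabel="AmazonRDSForPostgreSQL",  # assumed label
        connectionMode="Private",
        connectorProfileConfig={
            "connectorProfileProperties": {
                "CustomConnector": {
                    "profileProperties": {
                        "hostname": "mydb.xxxxxx.us-east-1.rds.amazonaws.com",
                        "port": "5432",
                        "database": "appflowdemo",
                    }
                }
            },
            "connectorProfileCredentials": {
                "CustomConnector": {
                    "authenticationType": "BASIC",
                    "basic": {"username": "appflow_user", "password": "********"},
                }
            },
        },
    )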

Step 2: Create the Flow Integration - Source Configuration

To create a flow, we use the "Create flow" option to build the integration flow between S3 and PostgreSQL.

The first configuration is the source configuration. S3 is configured along with the bucket details and the data format preference (CSV or JSON). We have to ensure that the bucket is accessible and that the file placed in the bucket is in CSV format.
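
If you are scripting the flow with boto3 instead of the console, the same source settings map to a sourceFlowConfig block; the bucket name and prefix below are placeholders:

    # Source side of the flow: the S3 bucket, an optional prefix, and the
    # input file type. Bucket name and prefix are placeholders.
    source_flow_config = {
        "connectorType": "S3",
        "sourceConnectorProperties": {
            "S3": {
                "bucketName": "appflow-demo-source-bucket",
                "bucketPrefix": "incoming/customers",
                "s3InputFormatConfig": {"s3InputFileType": "CSV"},
            }
        },
    }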

Step 3: AppFlow Destination Configuration

The destination configuration is an important step, as the available settings differ from one destination system to another.

For this use case we will choose the PostgreSQL option and provide the API version, the PostgreSQL object (schema), and the PostgreSQL subobject (table).

The additional configurations ensure the flow responds appropriately whenever there is an error. For example, we configured this flow to upload error data whenever a record does not match the criteria or there is an unknown error.
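
In the SDK, the destination and its error handling look roughly like the sketch below, again assuming the PostgreSQL connector is surfaced as a custom connector; the entity name, API version, and bucket names are placeholders:

    # Destination side of the flow. entityName is "schema.table"; the error
    # handling block writes failed records to an S3 location instead of
    # failing the whole flow on the first error.
    destination_flow_config_list = [
        {
            "connectorType": "CustomConnector",
            "connectorProfileName": "postgres-target",
            "apiVersion": "1.0",  # assumed; use the version shown in the console
            "destinationConnectorProperties": {
                "CustomConnector": {
                    "entityName": "public.customers",
                    "writeOperationType": "INSERT",
                    "errorHandlingConfig": {
                        "failOnFirstDestinationError": False,
                        "bucketName": "appflow-demo-error-bucket",
                        "bucketPrefix": "errors/customers",
                    },
                }
            },
        }
    ]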

The flow trigger provides the option to run on demand, on a defined schedule, or on an event-based trigger. Event-based triggers are currently limited to certain sources and destinations.
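
For this walkthrough we run the flow on demand, which in the SDK is a one-line trigger configuration:

    # Trigger configuration: on-demand runs. A scheduled trigger would use
    # triggerType "Scheduled" with a scheduleExpression such as "rate(1days)"
    # (check the AppFlow documentation for the exact expression syntax).
    trigger_config = {"triggerType": "OnDemand"}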

Step 4: Source to Destination Mapping

This step takes care of the important mapping exercise between source and target fields. We can map the fields manually, upload a CSV mapping file, or simply pass the data through unchanged.

Data can be inserted, used to update existing records, upserted, or used to delete existing records. Mapping the fields is a straightforward configuration. The target field data can also be modified and validated for data quality, which ensures the data is cleansed before it is inserted into the target system.
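
When scripting the flow, the mappings and validations from this step become the tasks list of the flow. The sketch below ties together the pieces from the earlier snippets (appflow client, source, destination, trigger) and uses the hypothetical CSV columns from the sample data; the exact task properties AppFlow expects (data types, validation action values) should be confirmed against the AppFlow API reference.

    # The PROJECTION filter task tells AppFlow which source fields to read,
    # each Map task copies a field to the destination column, and the Validate
    # task drops records with a null email.
    fields = ["customer_id", "first_name", "last_name", "email", "amount"]

    tasks = [
        {
            "taskType": "Filter",
            "connectorOperator": {"S3": "PROJECTION"},
            "sourceFields": fields,
            "taskProperties": {},
        },
        {
            "taskType": "Validate",
            "connectorOperator": {"S3": "VALIDATE_NON_NULL"},
            "sourceFields": ["email"],
            # Assumed action value; confirm the allowed values in the AppFlow docs.
            "taskProperties": {"VALIDATION_ACTION": "DropRecord"},
        },
    ] + [
        {
            "taskType": "Map",
            "connectorOperator": {"S3": "NO_OP"},
            "sourceFields": [field],
            "destinationField": field,
            "taskProperties": {},
        }
        for field in fields
    ]

    # Tie the pieces together into a flow.
    appflow.create_flow(
        flowName="s3-to-postgres-customers",
        triggerConfig=trigger_config,
        sourceFlowConfig=source_flow_config,
        destinationFlowConfigList=destination_flow_config_list,
        tasks=tasks,
    )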

Step 5: Running the Flow

Once the flow is successfully created, it can be run using the "Run flow" option, and upon completion the status (success/failure) is displayed.
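
The equivalent SDK calls start the flow on demand and then list its execution records (using the appflow client created in Step 1):

    # Kick off an on-demand run and check the execution status afterwards.
    appflow.start_flow(flowName="s3-to-postgres-customers")

    records = appflow.describe_flow_execution_records(flowName="s3-to-postgres-customers")
    for execution in records["flowExecutions"]:
        print(execution["executionId"], execution["executionStatus"])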

Filter options can also be added, which ensure that only the data that passes the filter is inserted into the target database.
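
In the SDK, such a filter is just another task. The sketch below keeps only records whose hypothetical "amount" column is greater than 100; the data type and value formats are assumptions, so verify them against the AppFlow documentation. The task could be appended to the tasks list before create_flow, or applied later with update_flow.

    # Filter task: only records with amount > 100 reach the destination.
    filter_task = {
        "taskType": "Filter",
        "connectorOperator": {"S3": "GREATER_THAN"},
        "sourceFields": ["amount"],
        "taskProperties": {"DATA_TYPE": "number", "VALUE": "100"},
    }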

Upon successful completion of the flow, the data is transformed and ingested into the target database. All of this configuration takes very little time, which helps bring solutions to market at an accelerated pace. AppFlow has multiple connectors that can be used to connect to SaaS systems such as Salesforce, GitHub, GitLab, Google Analytics, Facebook Page Insights, Microsoft Teams, SendGrid, ServiceNow, etc. AppFlow also gives us the option of developing custom connectors that can be used for data integration, transformation, and ingestion.

As enterprise organizations go through mergers and acquisitions, integration becomes an integral part of the architecture of many systems. AppFlow can play an important role in these integrations, and being a fully managed service, it can significantly reduce the time to market for any integration.

Thank you for taking the time to go through this blog. Please feel free to provide your comments and suggestions.

Phanikiran Patruni

Phanikiran is a recognized AWS Ambassador. He is a Cloud & Integration Architect, an AWS Certified Solutions Architect Professional, and a Certified DevOps Professional.