Transitioning from Export to Data Lake to Azure Synapse Link: What You Need to Know

Nadeem Khan (NK) · Published in LearnWithNK · 7 min read · Sep 3, 2024

As businesses increasingly rely on data-driven insights, choosing the right data storage and processing solutions is crucial. For those using Dynamics 365 Finance and Operations (FnO), Export to Data Lake has been a popular alternative to the traditional Bring Your Own Database (BYOD) approach. However, with its deprecation set for November 2024, it’s time to consider next-generation options like Azure Synapse Link and Link to Fabric.

In this blog, we’ll focus on Azure Synapse Link, which not only addresses the limitations of Export to Data Lake but also enhances performance, security, and integration, making it an excellent choice for staying competitive. We’ll dive into the key differences so you can smoothly transition and make the most of what Azure Synapse Link has to offer.

Storage Strategy for Incremental Data

Azure Synapse Link

Azure Synapse Link stores incremental data in timestamped folders. Each folder name is a UTC timestamp formatted as yyyy-MM-ddTHH:mm:ss.fffZ. Within each timestamped folder, a subfolder is created for every table that received changes during the "Time Interval" configured in the Azure Synapse Link profile.

The timestamped folders enable downstream services to read data from any table without being impacted by Synapse Link's write operations on the same table.

It uses Append Only mode and stores data in CSV.
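Because downstream jobs typically need to process these timestamped folders in order, a reader can sort them by parsing the documented timestamp format. This is a minimal sketch (the folder names below are hypothetical values, following the format described above):

```python
from datetime import datetime

# yyyy-MM-ddTHH:mm:ss.fffZ (UTC), as written by Azure Synapse Link
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

def sort_incremental_folders(folder_names):
    """Return timestamped folder names in chronological order,
    so a downstream job can process the oldest interval first."""
    return sorted(folder_names,
                  key=lambda name: datetime.strptime(name, TIMESTAMP_FORMAT))

# Hypothetical folder names for illustration.
folders = [
    "2024-09-03T10:15:00.000Z",
    "2024-09-03T09:45:00.000Z",
    "2024-09-03T10:00:00.000Z",
]
print(sort_incremental_folders(folders)[0])  # oldest interval first
```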

Export to Data Lake

Export to Data Lake stores incremental data in the Changefeed folder, which follows a flat structure and is not timestamped. Within the Changefeed folder, a subfolder is created for each table to store that table's data.

Since data is always written to the same folder, downstream read operations can be impacted by Export to Data Lake's write operations. The only practical mitigation is to add retry logic to the downstream read operation.

It also uses Append Only mode and stores data in CSV.
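The retry workaround mentioned above can be as simple as wrapping the read in a loop. A minimal sketch (the reader function and exception type are placeholders for whatever your downstream job actually uses):

```python
import time

def read_with_retry(read_fn, attempts=3, delay_seconds=2):
    """Retry a downstream read that may fail while Export to Data Lake
    is writing to the same folder. read_fn is any zero-argument callable
    that performs the read."""
    for attempt in range(1, attempts + 1):
        try:
            return read_fn()
        except IOError:
            if attempt == attempts:
                raise  # give up after the final attempt
            time.sleep(delay_seconds)

# Illustration: a read that fails once (writer holds the file), then succeeds.
calls = {"n": 0}
def flaky_read():
    calls["n"] += 1
    if calls["n"] < 2:
        raise IOError("file locked by writer")
    return "data"

print(read_with_retry(flaky_read, delay_seconds=0))  # data
```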

Surrogate Key

Azure Synapse Link

Azure Synapse Link adds an “Id” column as a surrogate key, which contains the GUID value of RecId.

Export to Data Lake

Export to Data Lake adds “_SysRowId” as a surrogate key, which holds the same value as RecId.

When exporting data via Azure Synapse Link, any ‘Id’ column from a D365 table will be renamed to ‘fno_id’. For example, the ‘Id’ column from the DataArea table will be exported as ‘fno_id’.
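For a downstream consumer that handles tables from both exports, the rename is easy to account for in code. A toy sketch of the mapping described above (the function name is my own, not part of either service):

```python
def normalize_columns(columns):
    """Mimic Azure Synapse Link's rename: a source column literally named
    'Id' is exported as 'fno_id'; all other columns keep their names.
    (Sketch only -- the real rename happens inside the export service.)"""
    return ["fno_id" if c == "Id" else c for c in columns]

print(normalize_columns(["Id", "Name", "RecId"]))  # ['fno_id', 'Name', 'RecId']
```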

Metadata fields

There are a total of 12 metadata fields in Azure Synapse Link. Of those 12, we will focus on the ones that serve the same purpose as their Export to Data Lake counterparts.

Azure Synapse Link

createdon: when the record is created in the source system (D365 FnO)

modifiedon: when the record is updated in the source system (D365 FnO)

SinkCreatedOn: when the record is created in the target system (Azure Data Lake Gen2)

SinkModifiedOn: when the record is updated in the target system (Azure Data Lake Gen2)

Export to Data Lake

LastProcessedChangeDatetime: when the record is created or updated in the source system (D365 FnO)

DataLakeModifiedDatetime: when the record is created or updated in the target system (Azure Data Lake Gen2)
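For migration code that translates between the two sets of metadata fields, the correspondence above boils down to a many-to-one mapping. A small sketch (the helper name is my own):

```python
# Field correspondence as described above. Export to Data Lake collapses
# create/update into a single field on each side, so the mapping is
# many-to-one.
FIELD_MAP = {
    "createdon": "LastProcessedChangeDatetime",
    "modifiedon": "LastProcessedChangeDatetime",
    "SinkCreatedOn": "DataLakeModifiedDatetime",
    "SinkModifiedOn": "DataLakeModifiedDatetime",
}

def equivalent_field(synapse_link_field):
    """Translate an Azure Synapse Link metadata field name to its
    closest Export to Data Lake counterpart."""
    return FIELD_MAP[synapse_link_field]

print(equivalent_field("modifiedon"))  # LastProcessedChangeDatetime
```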

Because a single field covers both cases, LastProcessedChangeDatetime and DataLakeModifiedDatetime are updated every time a record changes in the source and target systems, respectively.

Data Flow

Azure Synapse Link

When changes are made in D365 FnO, they’re initially logged in a timestamped incremental folder. At this point, Synapse Link comes into play, creating views over this incremental data with a consistent naming convention, dbo.<table_name>_partitioned. These views act as a temporary staging area.

At the interval you’ve set in the Synapse Link profile, Delta Lake conversion jobs pick up these changes, store them in Delta Lake format tables, and perform “In-Place” updates to the corresponding table in the “deltalake” folder. Once the data is safely captured in Delta Lake, Synapse Link wraps things up by deleting the incremental folder.

To learn more about how data flows from D365 to a data lake in Delta Parquet format via Azure Synapse Link, check out this blog.

Querying dbo.<table_name>_partitioned might result in an error since this view serves as a staging area specifically for Delta Lake conversion jobs.

Export to Data Lake

When changes are made in D365 FnO, they are initially logged in the Changefeed folder. From there, Export to Data Lake picks up these changes, stores them in CSV format tables, and performs “In-Place” updates to the corresponding table in the “Tables” folder.

How are Updates and Deletes stored within Incremental Folder?

Azure Synapse Link

Updates: For every update to a row in a table, a new row is appended to the table’s folder in the timestamped folder with an updated sysrowversion and versionnumber.

Deletes: For every delete in a table, a new row is appended to the table’s folder in the timestamped folder with all fields set to NULL except the Synapse metadata fields, and IsDelete is set to 1.

Export to Data Lake

Updates: For every update to a row in a table, a new row is appended to the table’s folder in the Changefeed folder with DML_Action set to "AFTER_UPDATE".

Deletes: For every delete in a table, a new row is appended to the table’s folder in the Changefeed folder with DML_Action set to "DELETE".
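Since both services append a new row per change, a downstream consumer reading the incremental folder directly must pick the latest version of each record. A minimal sketch of that deduplication, using the version columns described above for Azure Synapse Link (the sample rows are hypothetical):

```python
def latest_per_record(rows):
    """From append-only incremental rows, keep only the row with the
    highest versionnumber for each RecId."""
    latest = {}
    for row in rows:
        rec_id = row["RecId"]
        if (rec_id not in latest
                or row["versionnumber"] > latest[rec_id]["versionnumber"]):
            latest[rec_id] = row
    return latest

# Hypothetical incremental rows: RecId 1 was updated, RecId 2 was deleted.
rows = [
    {"RecId": 1, "versionnumber": 1, "Amount": 100, "IsDelete": None},
    {"RecId": 1, "versionnumber": 2, "Amount": 150, "IsDelete": None},
    {"RecId": 2, "versionnumber": 1, "Amount": None, "IsDelete": 1},
]
print(latest_per_record(rows)[1]["Amount"])  # 150
```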

How are Updates and Deletes captured in the Regular table folder?

Azure Synapse Link

Updates: For every update to a row, the existing row in a table is overwritten with the latest value for that row from the incremental folder.

Deletes: For every delete, the existing row is updated with all fields set to NULL except the Synapse metadata fields, and IsDelete is set to 1.

Export to Data Lake

Updates: For every update to a row, the existing row in a table is overwritten with the latest value for that row from the Changefeed folder.

Deletes: For every delete, the existing row will be deleted.
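The key difference above is soft delete (Azure Synapse Link) versus hard delete (Export to Data Lake). A toy sketch of how the two semantics play out when incremental changes are applied to the regular table, with the snapshot modeled as a dict keyed by RecId (names and shapes are my own, for illustration only):

```python
def apply_changes(snapshot, changes, hard_delete):
    """Apply incremental rows to the regular-table snapshot.
    hard_delete=True mimics Export to Data Lake (row removed);
    hard_delete=False mimics Azure Synapse Link (row kept with
    nulled fields and IsDelete=1)."""
    for row in changes:
        rec_id = row["RecId"]
        if row.get("IsDelete") == 1:
            if hard_delete:
                snapshot.pop(rec_id, None)   # row disappears entirely
            else:
                snapshot[rec_id] = row       # nulled row with IsDelete flag kept
        else:
            snapshot[rec_id] = row           # in-place overwrite with latest values
    return snapshot

snap = {1: {"RecId": 1, "Amount": 100}}
changes = [{"RecId": 1, "IsDelete": 1}]
print(apply_changes(dict(snap), changes, hard_delete=True))   # {}
print(apply_changes(dict(snap), changes, hard_delete=False))  # {1: {'RecId': 1, 'IsDelete': 1}}
```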

Security

Azure Synapse Link

Identity Access Management: Azure Synapse Link uses a Managed Service Identity to handle authorization and authentication, providing the initial layer of security. Additionally, we can control user access within the organization by configuring the appropriate Role-Based Access Control (RBAC).

Data In-Transit Security: The Managed Service Identity establishes a secure channel between D365 FnO and Azure Data Lake Gen2 for data transfer.

Data At-Rest Security: Azure Synapse Link can connect to the Data Lake with restricted network settings. This enables us to either block all public access and rely solely on private endpoints for communication or allow access through a Virtual Network (VNet) and specific IP addresses.

Export to Data Lake

Identity Access Management: Export to Data Lake utilizes a Service Principal to manage authorization and authentication for communication, establishing the initial layer of security. Additionally, we can control user access within the organization by setting up the appropriate Role-Based Access Control (RBAC).

Data In-Transit Security: The data transferred between D365 FnO and Azure Data Lake Gen2 via Export to Data Lake travels over the Microsoft Network Backbone. While this provides some security, it’s not fully secure since the same network is used for other customers’ transactions between D365 FnO and Azure Data Lake Gen2.

Data At-Rest Security: Export to Data Lake can only connect to the Data Lake through a public network or a broad range of IPs within the D365 IP range. This limitation prevents us from implementing Network Restrictions or using private endpoints for secure data access. Consequently, the Data Lake used by Export to Data Lake is either exposed or restricted to a large range of D365 IPs.

In short, Export to Data Lake offers limited security controls, while Azure Synapse Link is designed with stronger measures, making it the clearly more secure option.

Data Access

Azure Synapse Link

Azure Synapse Link creates a Lake Database on top of the exported data. To access data from the Lake Database, you can use any authentication method except SQL Server Authentication.

However, since the Lake Database is part of the Serverless Pool, you can create a Serverless SQL Database and access the data from the Lake Database using any authentication method supported by traditional SQL. Use the following query to access the Lake Database data from the Serverless SQL Database:

SELECT * FROM <lake-database-name>.<schema>.<table_name>

Export to Data Lake

To access data exported by Export to Data Lake, you can create either a View or an External Table in the Synapse Serverless SQL Database. Once the View or External Table is set up, you can use any type of authentication supported by traditional SQL databases, such as SQL Authentication, Active Directory Authentication, Service Principal-based Authentication, or Managed Service Identity-based authentication.

If you found this guide helpful, don’t forget to clap 👏 and share it with your network! Have questions or experiences with D365 F&O data exports? Drop a comment below — I’d love to hear your thoughts and help clarify any confusion. Stay tuned for more in-depth blogs on Azure Synapse Link, where we’ll dive deeper into its inner workings and share key insights — especially if you’re migrating from Export to Data Lake. For the latest updates and expert tips on Microsoft’s evolving data solutions, follow me on Linkedin, Github and Medium.

Thanks for reading. Happy Learning.


Nadeem Khan (NK) is a Lead Technical Architect specializing in Data Lakehouse Solutions with Azure Synapse, Python, and Azure tools. Passionate about data optimization and mentoring.