Data Migration Framework & Best Practices

Cevvavijay
10 min read · Oct 7, 2023

Introduction

Data migration is a necessary investment for any organisation that is maturing into a data-driven one. As the volume, variety, and velocity of data continue to increase, you’ll likely find that your old systems, software, and databases are no longer suitable for your needs. In such situations, migrating the data to a more suitable platform is one of the options available to the organisation.

Among the many definitions of data migration, the ones that I found most useful are as follows:

1. The process of moving data from one location, format, or application to another is called data migration. Generally, this is the result of introducing a new system to house data. — NetApp

2. It is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another. — Wikipedia

What Drives Data Migration?

The driver behind a data migration project is usually an application migration or consolidation in which new applications replace or augment legacy systems, and both share the same dataset.

Most business use cases for such migrations aim to take advantage of recent technological developments that help process data faster, more cheaply, and more efficiently. For example, a data migration can start when a firm moves from on-premises infrastructure and applications to cloud-based storage to optimize or transform its business.

A few business drivers are listed below:

· Modern applications offer far greater functionality in streamlining workflows, improving the customer experience, and providing opportunities for business growth.

· The decision to expand selected lines of business into e-trading channels or other distribution models.

· Ease of product design and development supported by advancement in technology. Sometimes, this is the only option to gain the agility required to respond to changing market conditions.

· Simplify data systems by consolidating them on a single platform and enable a single view of the business.

· Simplification of a complex data landscape that has built up through frequent mergers and acquisitions.

The Framework:

The diagram below provides an overarching framework for approaching a data platform migration.

Classification and Business Case driving Data Migration Activities

The classification of a data migration project depends on whether the underlying technology changes, while the business case is the need that drives the project.

Data Migration Classification

Symmetric Data Migration — Movement of data from an application/storage to the same application/storage but on a newer version.

These kinds of migration improve infrastructure capability. They ensure that new functionality critical for business growth is in place.

These migrations are usually low-complexity projects because they involve the transfer of like-for-like datasets from one application to another.

When migrating from a much older version of the application to the latest one, the complexity can be moderately high, for example, SQL Server 2008 to SQL Server 2022.

Examples of symmetric data migration are SQL Server migrations, core banking application upgrades, file system upgrades, ETL tool version upgrades, database upgrades, etc.

Asymmetric Data Migration — In this type of migration project, we move data from an existing application/storage to a new application/storage. With such projects, organizations generally move from an existing technology vendor to a new one. The objective behind such projects is usually to reduce operational and license costs. Sometimes, the reason can also be to utilize additional functionality available in the technology provided by competing vendors.

Such projects are highly complex because they require the data to be transformed so that it fits into the new container.

A few examples of asymmetric data migration are the movement of core banking applications from Temenos to SAP, movement of data from one cloud provider to another, and movement of data from a traditional on-premises data warehouse to a PaaS-based data warehouse on the cloud, etc.
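To make the transformation step concrete, here is a minimal sketch of reshaping one legacy record so that it fits a new target schema. The field names, the mapping, and the source date format are all hypothetical and are only there for illustration; a real project would derive them from the source and target data models.

```python
from datetime import datetime

# Hypothetical mapping from legacy column names to target field names.
FIELD_MAP = {
    "CUST_ID": "customer_id",
    "CUST_NM": "full_name",
    "OPEN_DT": "opened_on",
}


def transform_record(source_row: dict) -> dict:
    """Rename fields and normalise formats so the row fits the target schema."""
    # Keep only mapped fields; unmapped legacy columns are dropped.
    target = {FIELD_MAP[k]: v for k, v in source_row.items() if k in FIELD_MAP}
    target["customer_id"] = int(target["customer_id"])
    target["full_name"] = target["full_name"].strip().title()
    # Assumption: the source stores dates as DD/MM/YYYY, the target as ISO 8601.
    target["opened_on"] = (
        datetime.strptime(target["opened_on"], "%d/%m/%Y").date().isoformat()
    )
    return target


legacy_row = {
    "CUST_ID": "00042",
    "CUST_NM": "  jane DOE ",
    "OPEN_DT": "07/10/2023",
    "FILLER": "x",
}
print(transform_record(legacy_row))
```

In a symmetric migration this step is often close to a straight copy, which is one reason those projects tend to be simpler.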

Business Case driving Data Migration Activities

The need to transfer and convert data can arise from different business requirements, and the approach taken changes based on those needs. The four needs that drive migration projects are listed below.

Application Migration — The process of migrating an application program from one environment to another. It may include moving the entire application from an on-premises IT center to the cloud, moving between clouds, or simply moving the underlying data to a new form of application hosted by a software provider.

Platform Migrations — The process of moving data, applications, or other business elements from an existing platform to another. For example, DataStage to Talend migration, PHP to .NET migration, etc.

Database Migrations — The process of moving data from one or more source databases to one or more target databases, for example from an on-premises data center to a cloud or from one cloud to another.

Storage Migrations — The process of moving data off existing arrays into more modern ones that enable other systems to access it. This can deliver faster performance and cost-effective scaling, and provides data management features such as cloning, snapshots, backup, and disaster recovery.

Approaches to Data Migration

We can approach a data migration project in one of two ways:

Big Bang Data Migration

In a big bang scenario, you move all data assets from source to target in one go and within a relatively short time window. While the migration is in progress, the systems are down and unavailable to users, so it is typically executed during a holiday or weekend when customers presumably don’t use the application. Such migrations are less costly, less complex, and take less time, because all activities happen once. However, they carry a high risk of expensive failure that might lead to undesirable application downtime. This approach is best suited for symmetric data migration.

Agile Data Migration

In this scenario, we break the entire process into sub-migrations, each with its own goals, timelines, scope, and quality checks, and usually delivered using agile project methodology. This approach is best suited for larger applications with independent modules & highly critical applications.

Agile migration involves running the old and new systems in parallel and transferring data incrementally. As a result, the application stays available throughout the migration, which protects the customer experience.

On the other hand, the iterative strategy takes a lot longer and adds complexity to the project. The project team must track the data elements not migrated yet and ensure that users can switch between two systems to access the required information.
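As a rough illustration of incremental transfer and of tracking what has not been migrated yet, the sketch below copies rows in small batches between two in-memory SQLite databases. The `customers` table, the batch size, and the use of the highest migrated `id` as a watermark are assumptions made for this example; the watermark only works if source ids always increase and rows are not updated after they are copied.

```python
import sqlite3


def setup(n_rows: int = 5):
    """Create a hypothetical source with n_rows customers and an empty target."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, n_rows + 1)],
    )
    target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    return source, target


def migrate_batch(source, target, batch_size: int) -> int:
    """Copy the next batch of unmigrated rows and return how many were moved."""
    # The highest id already present in the target acts as the watermark.
    last_id = target.execute(
        "SELECT COALESCE(MAX(id), 0) FROM customers"
    ).fetchone()[0]
    rows = source.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()
    with target:  # commits the batch, or rolls it back if an insert fails
        target.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return len(rows)


def pending(source, target) -> int:
    """Number of source rows that are not yet in the target."""
    src = source.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    tgt = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    return src - tgt


source, target = setup(n_rows=5)
while migrate_batch(source, target, batch_size=2):
    print("rows still pending:", pending(source, target))
```

Because each batch is committed on its own, a failed run can be resumed from the watermark instead of starting over.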

Another approach is to keep the old application entirely operational until the end of the migration. As a result, your clients use the old system as usual and switch to the new application only when the complete migration is successful.

This approach is less prone to unexpected failures and does not require application downtime. These advantages come with a few shortfalls: the project is more expensive, takes more time, needs extra effort, and requires resources to keep two systems running in parallel.

Data Migration Project Phases

A data migration project goes through the same phases as any other IT project.

The complexity and duration of these phases will vary based on the classification, business case, and approach taken for the project. The details of each project phase are as follows:

Project scoping & estimation

Every data migration activity requires a clear understanding of the project scope to be delivered, required capabilities and skill sets for successful delivery, and impacted stakeholders while the migration is ongoing.

Scoping of the project depends on the classification of, and the need behind, the migration project. The classification helps define its complexity, whereas the business case helps determine the required skill sets. It also helps outline project tasks and activities, which in turn improves project cost estimation.

Defining the approach during this phase helps plan rollout activities required during the deployment phase.

A few business considerations that would impact the migration cost are as follows:

  • What will be the impact on the BAU (business-as-usual) process?
  • Workaround solutions to mitigate the identified impact
  • Pre-migration activities on the source systems
  • Post-migration testing
  • Impact on applications that consume data from the target system
  • Approach to migrating consuming applications from the source to the target system
  • Definition of the during- and post-migration processes required to support BAU activities

Profiling of Source & Target Containers

During this phase, we perform a high-level analysis of the source & target containers to understand the complexity involved in migration activities. It will further define the depth of the capabilities required to migrate data. This phase also indicates any technology-related roadblocks that one can encounter. The duration and depth of this analysis will entirely depend on the classification of the migration type.
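A high-level profile can be as simple as row counts plus null and distinct counts per column. The sketch below shows one way to collect them, using SQLite purely as a stand-in for the real source or target container; the `accounts` table and its contents are made up for the example.

```python
import sqlite3


def profile_table(conn, table: str) -> dict:
    """Return the row count plus null and distinct counts for every column.

    Table and column names are interpolated into the SQL text, so this
    sketch should only be used with trusted identifiers.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    profile = {"row_count": total, "columns": {}}
    for col in columns:
        nulls, distinct = conn.execute(
            f"SELECT SUM({col} IS NULL), COUNT(DISTINCT {col}) FROM {table}"
        ).fetchone()
        # SUM returns NULL on an empty table, so fall back to 0.
        profile["columns"][col] = {"nulls": nulls or 0, "distinct": distinct}
    return profile


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, email TEXT, branch TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "a@x.com", "north"), (2, None, "north"), (3, "c@x.com", "south")],
)
print(profile_table(conn, "accounts"))
```

Running the same profile against both containers gives an early view of columns that are sparsely populated or that hold fewer distinct values than expected.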

Migration Solution

In this phase, the project team drafts a detailed migration approach and shares it with the project sponsors and stakeholders. The migration approach covers the activities listed below (these may vary from project to project).

Technology

  • How will the data be extracted from the source? (Volumes, history, current and post-migration data)
  • When will the data be extracted? (Timings)
  • Infrastructure requirements (where)
  • Security requirements (infrastructure and PII data elements)
  • Technology complexities and workaround solutions
  • Automation capabilities to migrate the data and test it post-migration
  • Refined cost estimation for infrastructure, migration, and post-migration activities
  • Tools required for migration (cost estimation)
  • Third-party accelerators required for migration
  • Approach to testing the data between the source and target containers
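One way the security requirement for PII data elements might be handled, for instance when extracted data is loaded into a non-production environment for test runs, is to replace PII values with keyed hashes. The sketch below is only an illustration: the `PII_FIELDS` set, the record layout, and the secret key are hypothetical, and whether pseudonymisation is acceptable at all depends on the organisation's own security and compliance rules.

```python
import hashlib
import hmac

# Hypothetical list of fields classified as PII for this example.
PII_FIELDS = {"email", "phone"}


def mask_pii(record: dict, secret: bytes) -> dict:
    """Return a copy of the record with PII values replaced by keyed hashes."""
    masked = dict(record)  # leave the caller's record untouched
    for field in PII_FIELDS.intersection(record):
        if record[field] is None:
            continue  # nothing to mask
        value = str(record[field]).encode("utf-8")
        # HMAC-SHA256 gives a stable pseudonym, so joins on the masked
        # value still work, while the original value is not exposed.
        masked[field] = hmac.new(secret, value, hashlib.sha256).hexdigest()
    return masked


row = {"id": 7, "email": "jane@example.com", "phone": None, "branch": "north"}
print(mask_pii(row, secret=b"replace-with-a-managed-secret"))
```

The same input always maps to the same pseudonym for a given key, which keeps referential checks between tables possible during testing.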

People — Skills required for migration.

Project Planning

A key element of project execution is to define the project management methodology and the roles and responsibilities to perform different tasks in the project.

  • Project development and implementation methodologies
  • Reporting structure
  • Roles and responsibilities
  • Cost management structure
  • Escalation procedure
  • Project ceremonies
  • The approach to identify & resolve roadblocks
  • Stakeholder reporting structure and frequencies
  • Communication with the external application owner
  • Change management approach

Testing Approaches

Following is the list of tests performed during the project execution:

  • Business User testing
  • Consuming application testing
  • Migrated data testing approach
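For the migrated data test, a common starting point is to reconcile row counts and a content checksum between the source and target containers. The following is a minimal sketch under simplifying assumptions: both sides are SQLite databases with an identical, hypothetical `accounts` table, and both return values in the same types and column order. Real migrations between different technologies usually need values normalised before hashing.

```python
import hashlib
import sqlite3


def make_db(rows):
    """Build a hypothetical accounts table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", rows)
    return conn


def table_fingerprint(conn, table: str, key: str):
    """Return (row_count, SHA-256 hex digest) over all rows ordered by key."""
    digest = hashlib.sha256()
    count = 0
    # Ordering by the key makes the digest independent of storage order.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def reconcile(source, target, table: str, key: str) -> dict:
    """Compare row counts and content digests between source and target."""
    src_count, src_hash = table_fingerprint(source, table, key)
    tgt_count, tgt_hash = table_fingerprint(target, table, key)
    return {
        "counts_match": src_count == tgt_count,
        "content_matches": src_hash == tgt_hash,
    }


source = make_db([(1, 100.0), (2, 250.5)])
target = make_db([(1, 100.0), (2, 250.0)])  # one balance differs
print(reconcile(source, target, "accounts", "id"))
```

A count match with a content mismatch, as in this example, points to rows that were altered in transit rather than dropped.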

Post Migration Activities

Once the migration is complete, the project team performs activities that fall into three buckets. We define these buckets based on the classification, business case, and approach used in the project.

  • Parallel runs between source & target containers
  • Business review and sign-off to decommission the source container
  • Archival of the source container for a specific period.

Best Practices

Moving critical information is a delicate task and should be treated as such. Here are some best practices to ensure your migration project goes smoothly.

  • Create and Follow a Migration Plan — Prepare a concrete plan for what data requires migration, where it will go, and how it will get there. The plan should also set parameters for access permissions, outline each step, and state who will be involved. Also, consider the potential system downtime and technical or compatibility issues. Data integrity and protection should also feature prominently in your data migration plan.
  • Build Strong, Allied Teams for Data Migration — People who understand the existing system and its processes are critical for a successful migration. Access to them during the planning and testing phases ensures that we completely understand the data and the processes surrounding it. Business subject matter experts who use the data will be critical to sign off on the migrated data assets.
  • Choosing the correct accelerators for Data Migration — It is critical to select the right tool sets for the migration activity as this can significantly reduce the overall migration cost, timelines & complexity. It will also enable a systematic, streamlined & repeatable approach that helps reduce manual errors and therefore increase the overall quality of migrated data.
  • Test & Validate Migrated Data — After a successful migration, verify that everything is where it should be. Part of testing and validation is creating an automatic retention policy to prevent data leakage. Also, clean up stale data and double-check permissions. Back up the legacy systems so that you still have access to the data if the system goes offline.
  • Communication — Data migration usually involves a host of different teams, often with competing priorities, such as users, new system providers, old system experts, migration experts, and project managers. Encourage these teams to communicate and share data and system issues, and to create a central repository for data quality. This improves visibility and avoids a silo mentality.
  • Adoption — You would need people to agree to turn off the old systems. Therefore, identify these people and bring them with you. Decide what documents they will sign off on and ensure that these documents are easy to comprehend.
  • Audit and Document Processes — One thing that your compliance team will appreciate is complete documentation. Depending on your industry, regulators may require proof that you have taken adequate or reasonable care of sensitive data like financial or healthcare information. Auditing the process will ensure that you have done everything correctly and will help define the scope for improvement.

Frequently Asked Questions

How do I determine which data migration approach is best suited for my project?

What are the common challenges faced during the data migration process, and how can they be mitigated?

Are there any specific tools or technologies recommended for implementing a data migration framework effectively?

References

https://www.netapp.com/data-management/what-is-data-migration/#:~:text=Data%20migration%20is%20the%20process,or%20location%20for%20the%20data.

https://en.wikipedia.org/wiki/Data_migration

https://www.talend.com/resources/understanding-data-migration-strategies-best-practices/

https://www.datamigrationpro.com/data-migration-testing-strategy

https://www.altexsoft.com/blog/data-migration/

https://cloud.google.com/architecture/database-migration-concepts-principles-part-1#:~:text=Database%20migration%20is%20the%20process,restructured%2C%20in%20the%20target%20databases.

https://www.varonis.com/blog/data-migration

https://www.cio.com/article/219638/data-gravity-and-what-it-means-for-enterprise-data-analytics-and-ai-architectures.html

https://www.integrate.io/blog/7-data-migration-best-practices/
