Five Lessons Learned From A Large Data Migration Project

PROCON IT Technology Blog
4 min read · Oct 26, 2023

From December 2021 until May 2023, PROCON IT was part of a large data migration project. Overall, the project has been running for more than three years, and up to a hundred developers from various contractors and subcontractors have been involved in it. In this blog post, our Data Engineer Andreas Gyrock writes about five lessons he learned that help ensure successful progress in such a large project.

Lesson 1: Structure teams reasonably and give them clear responsibilities

If a developer has to keep track of the work of too many other developers on a daily basis, they will easily become overwhelmed. Therefore, a reasonable team size is key. Jeff Bezos’ two-pizza rule seems to work fine here, which means no more than ten people per team. Distributing responsibilities so that each team can work largely in isolation eases the everyday work of every single developer. But, of course, parts of the code will always be used by multiple teams, which brings me to the next point.

Lesson 2: Define requirements clearly

The properties of the most fundamental functionalities in particular should be assessed carefully before they are implemented. An example of this would be an API through which developers request data from the database. While it is no problem if the implementation of the API changes at a later stage of the project, its formal properties, such as the accepted input arguments or the format of the returned data, should ideally be defined once, right at the beginning of the project. However, this requires a complete picture of the project’s goals. Late changes to such formal properties due to new requirements can affect dozens of developers and slow down the progress of the project dramatically. If such a late change is nevertheless inevitable, the next point becomes crucial.
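To make the distinction between formal properties and implementation more concrete, here is a minimal Python sketch. All names in it (Record, DataSource, InMemorySource, fetch) are hypothetical and not taken from the project; the sketch only illustrates the idea of fixing the accepted arguments and the return format early while leaving the implementation free to change.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical return format, agreed on once at the start of the project."""
    record_id: int
    payload: dict


class DataSource(ABC):
    """Hypothetical contract that every team codes against."""

    @abstractmethod
    def fetch(self, table: str, limit: int = 100) -> list[Record]:
        """Return at most `limit` records from `table`."""


class InMemorySource(DataSource):
    """One possible implementation; it can be replaced without touching callers."""

    def __init__(self, tables: dict[str, list[dict]]):
        self._tables = tables

    def fetch(self, table: str, limit: int = 100) -> list[Record]:
        # Unknown tables yield an empty list instead of raising an error.
        rows = self._tables.get(table, [])[:limit]
        return [Record(record_id=i, payload=row) for i, row in enumerate(rows)]


# Callers depend only on the signature of fetch() and on the Record format.
source = InMemorySource({"customers": [{"name": "Ada"}, {"name": "Grace"}]})
records = source.fetch("customers", limit=1)
```

As long as the signature of fetch() and the fields of Record stay the same, a later switch to a different backend should not require changes in the code of other teams.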

Lesson 3: Put effort into communication

It is important that developers working on the same parts of the code regularly inform each other about their progress and impediments to make sure that everybody is still working in the same direction. Also, sometimes one developer’s sudden inspiration helps solve a problem on which others would have spent days. Communication becomes even more important when project-wide decisions have to be made. A lack of inter-team alignment can easily lead to a situation where one team’s code changes contradict another team’s requirements.

Lesson 4: Implement mechanisms to ensure code quality and correctness

When a developer adds code to the code base, several requirements should be fulfilled. First, the changes should be easily traceable for other developers. Version control tools like Git are the standard approach to enable this. Second, both the new and the existing code should work correctly. A CI pipeline that runs (unit) tests automatically, a required level of test coverage, and rules requiring that the code be reviewed and tested by developers other than the author can help to achieve this goal. Third, the code style should follow consistent rules. Therefore, establishing and automatically enforcing project-wide style guidelines (which in turn can follow general guidelines such as PEP 8 for Python) is important. Many of these mechanisms also facilitate the aforementioned communication about the code.
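As an illustration of the kind of test a CI pipeline could run on every change, here is a small sketch using Python's built-in unittest module. The function normalize_customer and its fields are hypothetical and only stand in for a typical transformation step in a migration.

```python
import unittest


def normalize_customer(row: dict) -> dict:
    """Hypothetical migration step: trim names and lowercase e-mail addresses."""
    return {
        "name": row["name"].strip(),
        "email": row["email"].strip().lower(),
    }


class NormalizeCustomerTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        row = {"name": "  Ada Lovelace ", "email": " Ada@Example.COM "}
        expected = {"name": "Ada Lovelace", "email": "ada@example.com"}
        self.assertEqual(normalize_customer(row), expected)

    def test_missing_field_raises(self):
        # A row without an e-mail address should fail loudly, not silently.
        with self.assertRaises(KeyError):
            normalize_customer({"name": "Ada"})


# Run the tests programmatically; a CI job would more typically call
# `python -m unittest` on the test modules instead.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeCustomerTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a later change breaks the transformation, such a test fails in the pipeline before the change reaches other teams.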

Lesson 5: Invest in a stable development environment

In a data migration project, developers usually have to work with the affected data frequently. For some functionalities, good test data can be enough to test locally. For functionalities that involve interaction with the database, however, it is essential that developers can work in an environment where the data is highly available and consistent. They also need enough compute resources in that environment to test their code in a reasonable time. Depending on the size of the project, setting up system administration teams that exclusively take care of the infrastructure (including the availability of data and compute resources) will pay off hugely in the long run, because the alternative is that dozens of developers regularly sit in front of their computers waiting for the environment to work again.
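For the simpler case of local testing with good test data, a sketch could look like the following. It uses an in-memory SQLite database from Python's standard library as a stand-in; the table names, columns, and rows are purely illustrative assumptions, and the real project's database is not described in this post.

```python
import sqlite3

# Hypothetical test data; table and column names are illustrative only.
TEST_ROWS = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]


def make_test_db() -> sqlite3.Connection:
    """Create an in-memory database seeded with a small, known data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE legacy_customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO legacy_customers VALUES (?, ?)", TEST_ROWS)
    conn.commit()
    return conn


def migrate(conn: sqlite3.Connection) -> None:
    """Copy all rows from the legacy table into the new table in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO customers (id, name) SELECT id, name FROM legacy_customers"
        )


def count(conn: sqlite3.Connection, table: str) -> int:
    # The table name comes from trusted test code, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = make_test_db()
migrate(conn)
migrated = count(conn, "customers")
```

A check like this runs in milliseconds on a laptop, but it cannot replace tests against the real environment for functionality that depends on actual data volumes or database behavior.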

Conclusion

We hope you can take something useful from these lessons. Of course, every project is different and will have its own unique challenges, but I believe that these points are relevant in some form in nearly every large data migration endeavor.

If your company could use support with a project like this one, or if you are based in the Munich area and would like to work on one, please don’t hesitate to reach out! You can find more information about us at https://www.procon-it.de/. Unfortunately, the website is in German only at this point, but you can also contact us in English by writing to anfrage@procon-it.de.

PROCON IT Technology Blog

Consulting Company from Munich, Germany. We focus on Business Consulting, Data Intelligence, Digital Solutions and SAP. Both hiring and looking for clients!