Data Platform Modernization (Part 1)

Thirumalvalavan Venkatesan
3 min read · Nov 12, 2020


I have seen a lot of content on data platform modernization recently, but most of it does not address the practical issues that customers face when moving data and data processing workloads to the cloud.

Let me start with the pre-migration activities and assessments that need to be done.

  1. Do not embark on this journey as a purely technology project. It also requires changes to people, processes, and the operating model. This deserves a separate article, which I will write at a later point.
  2. Create a vision for the migration and define your objectives and goals: cost, newer capabilities (real-time features, etc.), flexibility, agility, and performance improvements (reducing batch SLAs, etc.). This step is important because it is how you will measure the success of the project or program.
  3. Finalize the target platform, and do not evaluate it against just the next 1–2 years. Choose the platform with the next 5 years in mind. You need this broader horizon even though technologists talk about technology changing every few years: in the data world, unlike the app world, it is difficult to move between platforms. Look at the vision of each platform and do not rely only on the “analyst ratings”.
  4. Perform a proof of concept with measurable KPIs for comparison. Choose an end-to-end use case with medium-to-complex data engineering that covers all 4 layers of the architecture (ingestion, data storage, data governance, and consumption). I will share some of the metrics that are required (bare minimum) for various cloud modernization journeys.
  5. Creating the business case is a big issue in the cloud world, particularly for serverless services like BigQuery or partially serverless services like Snowflake, because their cost model is “pay-as-you-go”. It is hard to map Teradata/Netezza compute machines to these services, all the more so because you are no longer investing in infrastructure or appliances; you are investing in services whose cost is based on usage. I have created a simple mapping to calculate and project the cost, based on our experience working on various migration projects. BigQuery and other MPP databases offer a lot of flexibility in the costing model, which we will talk about in a separate article.
  6. Define the target operating model and plan for change management. This is a very important one, as there will be a cultural shift both for business end users and for the internal infrastructure, procurement, and contract services teams. Business and change management consultants need to be engaged to redefine the operating model, including training sessions that help the internal teams understand the cloud world.
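
To make the usage-based cost model from point 5 a little more concrete, here is a minimal sketch of how a per-workload cost projection could be set up. It is not the mapping mentioned above; the `Workload` fields, the `project_monthly_cost` function, and the rates are all hypothetical placeholders, and real pricing depends on the service, region, and pricing plan you choose.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical usage profile of one legacy workload."""
    name: str
    tb_scanned_per_month: float  # data read by queries each month
    tb_stored: float             # data kept at rest


def project_monthly_cost(workloads, price_per_tb_scanned, price_per_tb_stored):
    """Return {workload name: projected monthly cost} for a usage-based model."""
    return {
        w.name: w.tb_scanned_per_month * price_per_tb_scanned
        + w.tb_stored * price_per_tb_stored
        for w in workloads
    }


workloads = [
    Workload("daily_sales_batch", tb_scanned_per_month=40.0, tb_stored=10.0),
    Workload("adhoc_reporting", tb_scanned_per_month=15.0, tb_stored=2.0),
]

# Placeholder rates for illustration only, not real list prices.
costs = project_monthly_cost(
    workloads, price_per_tb_scanned=5.0, price_per_tb_stored=20.0
)
total = sum(costs.values())

for name, cost in costs.items():
    print(f"{name}: {cost:.2f}")
print(f"total: {total:.2f}")
```

The point of keeping the rates as parameters is that the same usage profile can be re-run against different pricing options when building the business case.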

Let us now talk about the actual migration approach. You could do an ROI analysis for each application hosted on the legacy Teradata/Netezza platforms and then choose one of the following options.

  1. Retire
  2. Re-host
  3. Re-factor
  4. Re-engineering

You will find other R’s if you search for migration strategies on Google. To me, these 4 R’s are the important ones for legacy Teradata/Netezza workloads.

A decision tree helps to choose one option over the others in the context of Teradata and Netezza.

Figure: 4 R’s for Data Platform Modernization
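
As a rough illustration of how a per-application decision could be expressed, here is a small sketch. The criteria in the figure are not reproduced here; the `AppAssessment` fields and the order of the checks in `choose_option` are assumptions made purely for illustration, and a real assessment would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class AppAssessment:
    """Hypothetical assessment of one application on the legacy platform."""
    name: str
    still_used: bool                 # does the business still consume its output?
    roi_positive: bool               # does migrating it pay back?
    needs_new_capabilities: bool     # e.g. real-time features
    uses_proprietary_features: bool  # e.g. vendor-specific SQL or utilities


def choose_option(app):
    """Map an assessment to one of the 4 R's (illustrative rules only)."""
    if not app.still_used or not app.roi_positive:
        return "Retire"
    if app.needs_new_capabilities:
        return "Re-engineering"
    if app.uses_proprietary_features:
        return "Re-factor"
    return "Re-host"


apps = [
    AppAssessment("old_campaign_mart", False, False, False, False),
    AppAssessment("finance_reporting", True, True, False, False),
    AppAssessment("etl_with_vendor_sql", True, True, False, True),
    AppAssessment("fraud_scoring", True, True, True, True),
]

decisions = {app.name: choose_option(app) for app in apps}
for name, option in decisions.items():
    print(f"{name}: {option}")
```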

In the next article, we will get into the details of these 4 R’s, along with the options and tools for migrating data, compute, and the other tooling tied to the legacy platforms.
