Published in

data-ops

Your Cloud Migration is Actually an Agility Initiative

No matter what industry you are in, a competitor or market trend threatens to disrupt your business model. Organizations that thrive amid these challenges will be those that best adapt to change. Business agility is the capacity to rapidly respond to constantly evolving threats and opportunities. It is the one capability that can help a business survive and thrive no matter what problem it faces. Agility helps enterprises create and sustain competitive advantage. Like everyone else, agile companies make mistakes, but their mistakes are less costly because of their ability to quickly change course. Whatever the challenge, agile businesses keep iterating on responses until they find an approach that works. Many companies understand the importance of business agility and are willing to make significant investments to improve it. Cloud computing has become one of these investments, even though it doesn’t always deliver increased agility.

Business Agility and Cloud Migration

In a 2017 cloud computing study conducted by Harvard Business Review (HBR), companies cited “business agility/flexibility” as the main reason behind their adoption of cloud systems or a hybrid cloud architecture. Companies that moved to the cloud listed “increased collaboration” and “business agility/flexibility” as the top two benefits of their cloud migration initiative. Cloud computing offers many advantages, yet not every cloud migration delivers on its promises. A study by IHS Markit found that 74% of the companies surveyed moved a cloud-based app back on-premises after failing to realize the anticipated benefits. In our experience, many companies do not fully understand the ways that cloud computing impacts business agility.

Returning to the HBR study, companies cited “IT implementation time” as the leading hindrance of their legacy, on-prem systems. When seeking greater agility, the IT team needs to be able to quickly enhance or update a system. If IT execution keeps pace with the flow of ideas, then new requests and ideas keep flowing.

When reflecting on their legacy systems, respondents did not mention the usual list of cloud selling points: cost, scalability, maintenance, backups, reliability, and mobility. Cloud migration as an IT operations project is not that exciting. IT spending varies by industry, but let’s say, for example, that it is 4% of an enterprise’s budget. If you can reduce that by 25%, you have saved 1% of the total budget. Maybe you’ll get a nice email from your boss.
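The arithmetic behind that example can be checked in a few lines. This is only a sketch of the calculation in the paragraph above; the function name `budget_share_saved` is ours, and the 4% and 25% figures are the illustrative values from the text, not survey data.

```python
def budget_share_saved(it_share: float, it_reduction: float) -> float:
    """Return the fraction of the total budget saved by cutting IT spending."""
    return it_share * it_reduction


# Figures from the example above: IT is 4% of the budget, and it is cut by 25%.
saved = budget_share_saved(it_share=0.04, it_reduction=0.25)
print(f"Share of total budget saved: {saved:.1%}")
```

Even a large cut to IT spending moves the overall budget very little, which is why the cost argument alone rarely excites anyone.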

On the other hand, cloud migration as a means to streamline IT implementation time directly impacts business agility. Companies that wish to deploy analytics or add new services need to quickly update their IT systems. When IT agility drives business agility, it’s a game-changer that can make or break a company. Anything that improves business agility can stimulate the type of transformational change that establishes market leadership and gets people promoted. Let’s explore the critical relationship between IT implementation time and business agility from a data-industry perspective.

Figure 1: Cycle time is the elapsed time between the proposal of a new idea and the deployment of analytics.

The Impact of Cycle Time on Business Agility

In data analytics, the equivalent of “IT implementation time” is a concept called “cycle time.” We define cycle time as the period elapsed between the proposal of a new idea (or a new question) and the deployment of finished analytics (Figure 1). The time that the data team waits for access to a new data set lengthens cycle time. The fire drill that results when data operations go offline (putting new development on hold until the problem is solved) affects cycle time. The time required to build a development environment for a new data project impacts cycle time. Everything that delays cycle time interferes with agility.

Consider an example data team producing analytics for a VP of Marketing. Imagine the VP has an idea to feed user search history to the algorithm that displays product offers to potential customers. A data-analytics team with an average cycle time of ten weeks comes back more than two months later with a prototype model. As we all know, “version 1.0” of a great application doesn’t always work as hoped. Perhaps the VP then asks to add both search history and customer ratings into the algorithm. After another ten weeks, the data team delivers the second iteration. At this pace, the VP can only propose about five ideas per year.

What happens when analytics cycle time is reduced, for example, through DataOps automation, to a single day? Now the VP can make several requests per week. The marketing team can brainstorm ideas. Working closely together, marketing and data analytics iterate toward more effective algorithms. With enough iterations, they’ll eventually stumble onto a blockbuster idea. We’ve seen it happen many times.
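To make the comparison concrete, here is a back-of-the-envelope sketch of how many idea-to-deployment iterations fit in a year at each cycle time. The 250-working-day year and the function name `iterations_per_year` are our assumptions for illustration; the ten-week and one-day cycle times come from the example above.

```python
WORKING_DAYS_PER_YEAR = 250  # assumption: roughly 50 five-day working weeks


def iterations_per_year(cycle_time_days: float) -> int:
    """Number of complete idea-to-deployment cycles that fit in one year."""
    if cycle_time_days <= 0:
        raise ValueError("cycle time must be positive")
    return int(WORKING_DAYS_PER_YEAR // cycle_time_days)


ten_week_team = iterations_per_year(cycle_time_days=10 * 5)  # ten working weeks
one_day_team = iterations_per_year(cycle_time_days=1)

print(f"Ten-week cycle time: {ten_week_team} iterations per year")
print(f"One-day cycle time:  {one_day_team} iterations per year")
```

Under these assumptions the one-day team gets fifty times as many chances to find the idea that works.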

Lengthy cycle time in an analytics organization cuts two ways. It interferes with an organization’s ability to understand the external environment and slows its response to threats and opportunities. When DataOps methods, like Agile Development, DevOps and lean manufacturing, are applied to the end-to-end data-analytics lifecycle, the organization starts to understand and address all of the factors that lengthen cycle time.

Obstacles to Analytics Agility

In a DataOps enterprise, data teams work hand-in-hand with their users like a well-oiled machine, fielding new idea proposals, implementing them rapidly and quickly iterating toward higher-quality models and analytics. The experience of a non-DataOps enterprise is quite the opposite. Data teams are interrupted continuously by data and analytics errors. Data scientists spend 75% of their time massaging data and executing manual steps. Slow and error-prone development disappoints and frustrates data team members and stakeholders. Below are some common challenges that impact data-analytics cycle time (Figure 2):

Figure 2: Factors that derail the dev team and lengthen analytics cycle time.

These factors and others interfere with the development and delivery of new analytics. Slow response time ripples through all of the interactions that the data team conducts with colleagues, customers and stakeholders. It interferes with data-driven decision making. It blocks creativity and innovation. Companies cannot maximize their business agility unless they establish a robust, repeatable set of processes and workflows that control the factors driving lengthy cycle time.

The Myth of Cloud Migration Agility

Companies that initiate cloud migrations, but do not address the factors that lengthen analytics cycle time, will not achieve their full potential in terms of agility. Migrating from an on-prem, proprietary database to a cloud database may produce cost, scalability, flexibility, and maintenance benefits. However, the cloud initiative will not deliver agility if the data scientists, analysts and engineers are constantly yanked from development projects in order to fix broken reports and manage data errors. If the impact review board requires four weeks to approve a change, that is four weeks whether the analytics run on-prem or in the cloud.

The “lift and shift” approach to cloud migration assumes that a company moves its current workflows from on-prem to the cloud. We would only expect agility to improve for those areas in which the on-prem tools are a bottleneck. Generally speaking, the major bottlenecks in data-analytics workflows tend to be related to people — either an outstanding individual contributor who gets pulled into every critical project or a group like IT, tasked with provisioning a new development system or a data set.
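A simple model shows why lift and shift moves the needle so little. In the sketch below, cycle time is the sum of sequential steps, and the migration only speeds up steps where the on-prem tool itself is the bottleneck. The four-week impact review comes from the text above; every other duration, the step names, and the 5x tool speedup are hypothetical numbers chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    name: str
    days: float
    tool_bound: bool  # True only when the on-prem tool itself is the bottleneck


def cycle_time(steps):
    """Total elapsed days, assuming the steps run one after another."""
    return sum(step.days for step in steps)


def lift_and_shift(steps, tool_speedup):
    """Speed up tool-bound steps only; people-bound steps are unchanged."""
    return [
        replace(step, days=step.days / tool_speedup) if step.tool_bound else step
        for step in steps
    ]


on_prem = [
    Step("wait for access to a new data set", 10, tool_bound=False),
    Step("IT provisions a development system", 10, tool_bound=False),
    Step("build and run the analytics", 5, tool_bound=True),
    Step("impact review board approval", 28, tool_bound=False),  # four weeks
    Step("manual test and deployment", 7, tool_bound=False),
]
cloud = lift_and_shift(on_prem, tool_speedup=5)

print(f"On-prem cycle time: {cycle_time(on_prem)} days")
print(f"After lift and shift: {cycle_time(cloud)} days")
```

With these made-up numbers, a fivefold faster toolchain trims the cycle from 60 days to 56, because the waiting and approval steps dominate.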

Data organizations are managed by very talented people who are sometimes bewitched by conventional wisdom. One common misconception is that a company’s applications and data are its most valuable assets. Companies that hold this belief take a data-centric or application-centric view of their cloud initiative and perhaps their entire data operations. A manager with a data-centric perspective will phrase the goals of a cloud initiative around applications and data, focusing on architecture, application KPIs, migration plans, and refactoring. These things are important, but if the team does not improve cycle time, a cloud initiative will fail to improve business agility.

Sometimes companies say “people are our greatest asset.” It is nice when a company values its employees. Our concern with this approach is that it often devolves into an addiction to heroism. Heroism is a brute-force tactic that throws people at problems. When the purchasing system goes offline three days before Black Friday, the company dispatches its heroes to work 24x7 to fix the problem. Heroes are the enterprise’s star employees who swoop in and save the day. Heroes are a precious resource, and when they become a bottleneck, a company can’t grow, evolve and achieve its potential. Heroism accepts unplanned work as a way of life. Also, heroes are not a company asset. The company does not own them. Heroes will eventually burn out and leave the company. It’s no fun to work the long and unpredictable hours of a hero. DataOps has helped many managers understand how modern workflow methods and automation eliminate the need for heroes.

Instead of relying on heroism, wouldn’t you rather have a robust, repeatable process for producing and maintaining your data analytics development and operations pipelines? A process-oriented approach to data analytics frees the data team from the bondage of heroism, and it enables them to apply their considerable brainpower to the enterprise’s top and bottom-line challenges. Your workflows and methodologies can make it possible to eliminate errors, shorten cycle time and improve collaboration. With workflow automation, your entry-level college grad can deploy a data science model with the same efficiency as your ten-year veteran employee.

An Enterprise’s True Assets

If an enterprise strives for business agility, then its most valuable assets are the business processes, methodologies and workflows that enable it to rapidly respond to change. People come and go, but workflows remain. The cloud is a powerful tool, but it serves the enterprise’s workflows, not the other way around. Some may find it ironic that focusing on workflows instead of data and tools is the optimal approach to data monetization using cloud technology.

People remember Henry Ford as a manufacturing innovator who invented the assembly line. He actually didn’t set out to revolutionize manufacturing. Ford started with a marketing requirement. He knew that he could sell millions of cars if he could offer them for $500. When he challenged his engineers to build a $500 car, they had to entirely rethink how to build cars. There was only one way to do it — mass production.

It’s time for data organizations to rethink how analytics are built with business agility as the goal. Data teams need to implement the “mass production” of the big data and analytics industry.

Translating Business Agility into Concrete Goals

We recommend phrasing the goals of your business-agility initiative in concrete terms that minimize data-analytics cycle time. In addition to the usual array of application and performance goals, managers must craft goals that specifically address analytics agility. For example:

  1. Fewer than one data operations error per year
  2. Spin up or tear down analytics development environments in one day
  3. Push a button and fully test and deploy completed analytics in less than 2 hours

These goals all relate to minimizing cycle time. With reduced errors, your development resources won’t be constantly interrupted and distracted from their highest priority work. With fast creation of development environments, your data scientists will start work immediately on a new project — no more waiting for data, impact review, or anything else. With fast deployment (continuous deployment), your most junior data analyst will deploy with the same speed and confidence as your senior staff members. All of the collective knowledge and expertise from the data team migrates into the automated workflow system — the company’s true asset. The result is a robust and repeatable process that produces error-free analytics extremely efficiently. The system produces rapid, high-quality results today and five years from now, when your team may have grown or changed considerably.
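Goals phrased this way are easy to check automatically. The sketch below encodes the three example goals as thresholds and reports which ones a team is missing. The class and field names are ours, we interpret "one day" as 24 hours, and the sample measurements are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgilityMetrics:
    data_ops_errors_per_year: float
    environment_spin_up_hours: float
    test_and_deploy_hours: float


def unmet_goals(metrics: AgilityMetrics) -> list:
    """Return a description of each example goal the metrics fail to meet."""
    unmet = []
    if metrics.data_ops_errors_per_year >= 1:
        unmet.append("fewer than one data operations error per year")
    if metrics.environment_spin_up_hours > 24:  # assumption: one day = 24 hours
        unmet.append("development environment ready within one day")
    if metrics.test_and_deploy_hours >= 2:
        unmet.append("test and deploy in less than 2 hours")
    return unmet


# Hypothetical measurements for a team that has automated deployment only.
current = AgilityMetrics(
    data_ops_errors_per_year=12,
    environment_spin_up_hours=240,
    test_and_deploy_hours=1.5,
)
for goal in unmet_goals(current):
    print(f"Not yet met: {goal}")
```

Tracking the numbers this way keeps the initiative focused on cycle time rather than on which tools were migrated.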

You may find that your agility initiative needs to include some cloud capabilities. The cloud is very good at certain things that contribute to business agility, such as scaling up/down application infrastructure in response to demand. With project objectives that specifically address agility, you can be sure that you won’t be moving error-prone and inefficient workflows to the cloud. With workflows redesigned to improve analytics efficiency and cycle time, your agility initiative can most effectively utilize cloud capabilities. With business agility as your primary objective, you can better decide which applications should migrate to the cloud and which should remain on-prem.

The DataOps Cloud Migration

A cloud migration project itself can benefit from a DataOps approach. DataOps is a methodology that applies Agile Development, DevOps and lean manufacturing to analytics to maximize business agility. DataKitchen provides a DataOps Platform that incorporates these principles and automates workflows alongside your on-prem or cloud toolchain. It can help your data organization virtually eliminate errors, minimize cycle time, and enable seamless collaboration of data team members and their stakeholders. DataKitchen is particularly strong in managing data pipelines that span multiple data centers. It can also effectively modularize a toolchain so data pipelines can be migrated one processing stage or tool at a time. DataKitchen enables enterprises to get the most from a hybrid cloud or multi-cloud initiative.

Mitigate Risk with Parallel Data Pipelines

Enterprises may mitigate risk by instantiating cloud data operations while running their legacy on-prem data pipelines in parallel, comparing results after each processing operation (Figure 3). Analytics often embed underlying business logic that people have a hard time recreating from scratch. DataOps testing will help the team find the resulting discrepancies between the two pipelines and address them before they appear in your critical analytics. DataKitchen brings transparency into on-prem and cloud data pipelines, giving the data team a unified view of all of its end-to-end pipelines: the data analytics version of “a single pane of glass.”

With parallel cloud and on-prem data pipelines, the implementation team can decide whether to cut over to cloud operations one processing stage at a time or all at once, depending on a project’s use case and risk profile.
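The stage-by-stage comparison can be sketched generically. The code below is not DataKitchen's implementation; it is a minimal illustration in which each pipeline is a list of functions, both pipelines run on the same input, and the outputs are compared after every stage. The stage functions and sample rows are hypothetical, and the cloud version deliberately omits a filter to mimic business logic that was lost in the rewrite.

```python
from typing import Callable, List, Optional, Sequence

Rows = List[dict]
Stage = Callable[[Rows], Rows]


def first_divergence(
    legacy: Sequence[Stage], cloud: Sequence[Stage], rows: Rows
) -> Optional[int]:
    """Return the index of the first stage whose outputs differ, else None.

    Stages must return new lists rather than mutate their input.
    """
    if len(legacy) != len(cloud):
        raise ValueError("pipelines must have the same number of stages")
    legacy_rows, cloud_rows = rows, rows
    for index, (legacy_stage, cloud_stage) in enumerate(zip(legacy, cloud)):
        legacy_rows = legacy_stage(legacy_rows)
        cloud_rows = cloud_stage(cloud_rows)
        if legacy_rows != cloud_rows:
            return index
    return None


def drop_missing_amounts(rows: Rows) -> Rows:
    return [r for r in rows if r["amount"] is not None]


def keep_all_rows(rows: Rows) -> Rows:
    return list(rows)  # the re-implementation forgot the legacy filter


def add_shipping(rows: Rows) -> Rows:
    return [{**r, "total": r["amount"] + 5} for r in rows]


sample = [{"id": 1, "amount": 100}, {"id": 2, "amount": None}]
legacy_pipeline = [drop_missing_amounts, add_shipping]
cloud_pipeline = [keep_all_rows, add_shipping]

stage = first_divergence(legacy_pipeline, cloud_pipeline, sample)
print(f"Pipelines first diverge at stage index: {stage}")
```

Comparing after each stage, rather than only at the end, points the team at the exact step where the forgotten logic lives.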

Figure 3: Enterprises may mitigate risk by running cloud and legacy on-prem data pipelines in parallel, comparing results after each processing operation.

Migrate with Confidence

DataOps testing helps ensure that a migration proceeds robustly. In one case that we encountered, a company moved its data operations to the cloud only to hear reports from users that the data looked wrong. They found that their 6-billion-row database had only partly transferred. DataOps prescribes testing inputs, outputs and business logic at each stage of processing. It verifies code and configuration files by automating impact review. It also verifies streaming data with data tests, including location balance, historical balance and statistical process control. In this case, a simple location balance test, like a row-count test, would have easily identified the problem of the missing data. The data team would have received an alert and remediated the issue before it affected user reports.
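Two of these data tests are simple enough to sketch. The first is a row-count location balance check between source and target. The second is one common reading of statistical process control: flag a value that falls outside three standard deviations of its recent history. The function names, the partial-transfer count, and the history values are hypothetical; only the 6-billion-row source size comes from the story above.

```python
import statistics


def row_counts_match(source_rows: int, target_rows: int) -> bool:
    """Location balance: the target should hold every row the source had."""
    return source_rows == target_rows


def within_control_limits(history, value, sigmas: float = 3.0) -> bool:
    """Statistical process control: is value within mean +/- sigmas * stdev?"""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # needs at least two history points
    return abs(value - mean) <= sigmas * stdev


# Hypothetical partial transfer of the 6-billion-row database.
if not row_counts_match(source_rows=6_000_000_000, target_rows=4_100_000_000):
    print("ALERT: row counts differ between on-prem source and cloud target")

# Hypothetical daily row counts from earlier runs of the same feed.
daily_rows = [1000, 1020, 980, 1010, 990]
for todays_rows in (1005, 600):
    status = "ok" if within_control_limits(daily_rows, todays_rows) else "ALERT"
    print(f"{todays_rows} rows today: {status}")
```

Checks like these run on every pipeline execution, so the alert fires before users ever open the report.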

Keep Teams Coordinated

DataKitchen automates workflows that enable your data teams to collaborate more efficiently. If you have an on-prem team and a cloud team that need to stay coordinated, the groups will be able to continue making changes without interfering with each other’s productivity. Imagine that you are delivering a dashboard to the CEO. Some of it comes from the on-prem team and some from the cloud team. The teams may be in different locations with minimal communication flowing back and forth. These two teams want to work independently, but their work has to come together seamlessly. Task coordination occurs intrinsically when teams use DataKitchen, enabling the groups to strike the right balance between centralization and freedom. Figure 4 shows a cloud team and an on-prem team each managing their own data pipelines. DataKitchen enables each team to manage its local data pipeline and toolchain. The output of each group merges together under the control of a top-level pipeline. The model also applies to multi-cloud implementations.

Figure 4: DataKitchen enables teams using different toolchains to stay coordinated.

Development and Test Environment Agility

Analysts need a controlled environment for their experiments, and data engineers need a place to develop outside of production. Instead of having to talk to the hyper-dominant database salesperson for more licenses or go through a long procurement process to buy new hardware, the cloud offers the ability to turn on compute, storage and software tools with an API command. You can create on-the-fly development and testing environments that accurately reflect production and have test data that is accurate, secure, and scrubbed. A key feature that enables agility is the ability to spin up hardware and software infrastructure on demand. A common misconception is that on-demand infrastructure is all that you need. Don’t be misled by the hype! Cloud infrastructure is an essential ingredient of agility, but you need to implement the ideas in the DataOps Manifesto, backed by a technical platform like DataKitchen, to maximize agility.

Conclusion

Companies choose to migrate applications to the cloud in order to be more agile, but there is more to business agility than on-demand infrastructure. Business agility derives from agile workflows. In data analytics this means implementing DataOps to reduce errors, minimize cycle time and improve collaboration within the data organization. DataKitchen works alongside your toolchain and automates your workflows to bring the benefits of DataOps to your analytics teams. With robust and efficient workflows, you’ll maximize your company’s business agility, whether on-prem, in the cloud, or a mix of both.

Originally published at https://blog.datakitchen.io.
