Navigating the Unchecked AI Wave: Balancing Innovation with Resilience and Operational Efficiency

Image source: https://agi-sphere.com/install-llama-mac/

As GenAI gains traction, it has sparked a revival of innovation appetite that seems unprecedented, save for a few cycles like the dot-com era two decades ago. However, based on my observation, this is shaping up like previous cycles: another untamed race focused on getting things to work, with little regard for longevity and operations. Unsurprisingly, AI initiatives tend to get a free pass under the 'exploration' tag, thanks to the fear of being left out. History has taught us more than once that taming untamed innovation is difficult, if not impossible; it will continue to exist and haunt the enterprise for years to come, accruing tech debt, security risks, and escalating operating costs.

I see three different ways applications are built in an enterprise:

  1. Build to Excite: These are applications created to impress users with innovative features or designs, aiming to generate excitement and engagement. They are meant to be dismantled after a short lifespan but can sometimes survive well beyond their intended lifetime. I believe most of the GenAI experiments you are building today will live longer than you initially planned for.
  2. Build for Resiliency: This involves creating applications that can withstand challenges like heavy usage or technical failures, ensuring they remain stable and reliable.
  3. Build to Operate: These applications are designed to be easily managed and maintained by the operations team, focusing on efficiency and simplicity in day-to-day operations.

In essence, most enterprise applications are built with resiliency and operational efficiency as the core focus. It's important for AI application development to follow the same approach: strike a balance between creating exciting applications that capture users' attention and ensuring they are resilient and efficient to operate, delivering maximum business value with minimal disruption. Leveraging Platform Engineering and the existing application lifecycle practices already established in most enterprises can be the solution. Regardless of how eager enterprises are to move quickly on their AI initiatives, investing time upfront to integrate these initiatives within the established framework of Platform Engineering will yield significant benefits in the long run.

Within platform engineering, many enterprises have honed (or are currently refining) their ability to build applications that prioritize resilience and operational efficiency. They achieve this through well-established practices: developer self-service for essential tools and environments, comprehensive documentation, collaboration through dedicated platforms, and secure, efficient application delivery pipelines, often referred to as Secure Software Supply Chains. This framework typically revolves around Internal Developer Platforms (IDPs) and Trusted Software Supply Chain mechanisms.

However, introducing AI initiatives and addressing the need to create exciting applications brings additional considerations: deploying new sets of developer tools tailored to AI development, accommodating diverse infrastructure requirements, and extending support to new user profiles such as data scientists and data engineers. Moreover, these adaptations must occur at an accelerated pace to meet the escalating demand for innovation and functionality. In short, while organizations have mastered the fundamentals of building resilient, efficient applications through platform engineering, integrating AI and catering to excitement demand a strategic expansion of capabilities, infrastructure, and support mechanisms, executed swiftly enough to remain competitive and relevant in today's dynamic market.
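To make the Trusted Software Supply Chain idea a little more concrete, here is a minimal sketch, in plain Python using only the standard library, of the kind of provenance check a delivery pipeline might run before promoting an AI artifact (say, a model file) to production. The manifest format and signing key are hypothetical illustrations for this post, not any specific product's API; real pipelines would use a proper signing service and secrets manager.

```python
import hashlib
import hmac
import json
from pathlib import Path

# Hypothetical manifest: maps artifact names to expected SHA-256 digests,
# signed by the build stage so later stages can verify provenance.
SIGNING_KEY = b"pipeline-secret"  # in practice, fetched from a secrets manager


def sign_manifest(manifest: dict) -> str:
    """Produce an HMAC signature over a canonical form of the manifest."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def verify_artifact(path: Path, manifest: dict, signature: str) -> bool:
    # 1. Check the manifest itself has not been tampered with.
    if not hmac.compare_digest(sign_manifest(manifest), signature):
        return False
    # 2. Check the artifact's digest matches what the build stage recorded.
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest.get(path.name) == digest


if __name__ == "__main__":
    model = Path("model.bin")
    model.write_bytes(b"weights...")  # stand-in for a real model artifact
    manifest = {"model.bin": hashlib.sha256(model.read_bytes()).hexdigest()}
    signature = sign_manifest(manifest)
    print("verified:", verify_artifact(model, manifest, signature))
```

The point is not the specific mechanism but the discipline: every stage of the pipeline can prove where an artifact came from before it moves forward, which is exactly the property AI artifacts need as much as any other deliverable.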

Now, let's talk briefly about using AI code generators to make IT work faster. Honestly, developers and operators will eventually find ways to leverage them whether or not the enterprise offers them, so it's important to be careful about how they're introduced into the enterprise toolset. If you decide to adopt AI code generators, choose one backed by a trustworthy company, so you know where the recommendations come from and how reliable the training data is. While open source Large Language Models (LLMs) might seem like a good choice because they're good at generating code, they may not offer the same level of transparency and data quality. Using open source LLMs for code generation could become a weak point in the company's security posture, so I strongly advise against going down that path.

Returning to our main focus, how can we tackle this issue effectively? One way is to thoroughly understand the goals and expected outcomes of AI projects across the entire organization, gathering input from various stakeholders such as executive leadership, finance, sales, services, and operations. Armed with this insight, you can determine whether the AI initiative is a short-term experimental project aimed at generating excitement and testing ideas, or if it’s a long-term investment with the potential to create significant business value over time.

If it’s the latter, then integrating AI into your Platform Engineering practice may be the best course of action. This ensures that all AI initiatives adhere to established enterprise standards from the outset, promoting resilience and operational efficiency over the long term. However, you may still want to allow for some flexibility to foster rapid experimentation outside of the Platform Engineering framework, addressing the organization’s appetite for AI innovation. This can be achieved by setting specific timeframes and guidelines for experimentation, with a clear process in place to integrate successful experiments into the Platform Engineering practice within a defined timeline.
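As an illustration of that 'experiment with an expiry date' idea, here is a minimal sketch in Python of a hypothetical experiment registry that flags GenAI experiments due either for graduation into the Platform Engineering practice or for retirement. The field names and the 90-day window are assumptions for illustration, not a prescribed policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy: every experiment must graduate into the
# Platform Engineering practice or be retired within 90 days.
EXPERIMENT_WINDOW = timedelta(days=90)


@dataclass
class Experiment:
    name: str
    started: date
    status: str = "exploring"  # exploring | graduated | retired

    def review(self, today: date) -> str:
        if self.status != "exploring":
            return f"{self.name}: {self.status}"
        if today - self.started > EXPERIMENT_WINDOW:
            return f"{self.name}: overdue, graduate to platform or retire"
        remaining = (self.started + EXPERIMENT_WINDOW - today).days
        return f"{self.name}: {remaining} days left in exploration window"


if __name__ == "__main__":
    portfolio = [
        Experiment("support-chatbot", date(2024, 1, 10)),
        Experiment("contract-summarizer", date(2024, 4, 2)),
    ]
    for exp in portfolio:
        print(exp.review(today=date(2024, 5, 1)))
```

Even a simple registry like this forces the conversation that matters: nothing stays in 'exploration' indefinitely, which is precisely how build-to-excite applications end up outliving their intended lifespan.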

There are several approaches you can take to conduct the information-gathering exercises with different stakeholders, and to document and track their input. One method I recommend exploring is the workshop framework outlined in our book ‘Technology Operating Models for Cloud and Edge’. In this book, we propose utilizing Open Practice Library techniques to facilitate workshops that effectively gather the required information. Additionally, we offer a template for your convenience.

In summary, as we explore the realm of enterprise application development, we encounter three distinct paths: 'Build to Excite,' where innovation takes center stage; 'Build for Resiliency,' prioritizing steadfast stability; and 'Build to Operate,' focusing on seamless day-to-day functionality. Throughout this journey, Platform Engineering and established practices serve as our guiding compass, ensuring our creations meet the enterprise's standards for reliability and operational efficiency. Integrating AI into our processes brings both excitement and challenges: the promise of increased productivity beckons, yet maintaining transparency and data integrity remains paramount. Amid this complexity, workshops built on Open Practice Library techniques provide invaluable space for collaboration and discovery. Ultimately, our mission is one of balance: navigating the dynamic landscape of innovation and stability to unlock enduring value in our enterprise AI endeavors.

I look forward to hearing your thoughts. Have a great weekend!
