Navigating risks in Gen AI projects: A Project Manager’s Nightmare

Shobhna Lenka
Thomson Reuters Labs
May 28, 2024


Introduction:

Thomson Reuters (TR) Labs is at the forefront of Generative Artificial Intelligence (GenAI) technology, focusing on tasks such as document drafting, information distillation, and multilingual summary generation. The team includes experts in machine learning, software development, and cloud engineering, all working together to advance AI innovation. Capitalizing on the potential of Large Language Models (LLMs) like ChatGPT, TR Labs has introduced Open Arena, an internal platform that allows users to experiment with LLM applications in a secure, user-friendly environment. Open Arena offers Model-as-a-Service for incorporating LLMs into products and solutions, featuring an intuitive interface built around pre-set tiles, each providing a ready-made interaction tailored to specific user requirements. You can find detailed information about “How TR Labs developed Open Arena, in under 6 weeks” in this Learning Blog.

Open Arena Overview

TR Labs has worked on many use cases, including supporting our global news agency, Reuters, by developing the groundbreaking Reuters Open Arena platform, aimed at fostering creativity and exploration. We initially focused on a single tile, Azure OpenAI, but later expanded to cater to diverse needs by introducing multiple tiles tailored to specific scenarios, including language-specific translation and editing assistance. The platform has also explored new approaches to storytelling by altering vocabulary.

Our collaboration extended beyond the newsroom as we joined forces with the Sales and Marketing team at Reuters, using technologies such as Amazon Textract and Azure OpenAI models to extract key information from contracts for better client understanding.

Additionally, TR Labs supported the Customer Support and Services (CSS) team in the LATAM region by developing a Spanish-language customer support system that uses AI to respond to customer queries. The system, particularly the AI project for TAP customers, successfully handled over 300 inquiries in its first week via WhatsApp integration.

CSS LATAM Open Arena Overview

To promote diversity and inclusion, TR Labs created a chatbot for Black History Month that was hosted on TR’s Atrium page. The chatbot provided users with information about significant Black historical figures, using AI to pull data from reliable sources, thus promoting education and awareness during February.

TR Labs team collaborating with Reuters, CSS LATAM, and BHM

As pioneers in the field, we are committed to continuous innovation and refinement of our methodologies to stay ahead of the curve. However, with innovation comes risk.

In this blog, we will delve into the key risks associated with GenAI projects, drawing from our experiences and insights gained along the way.

Key risks associated with GenAI projects include:

I. Competing Priorities in a Sea of Use Cases — The multitude of potential use cases presents significant risk, fueled by the hype around GenAI and early, tantalizing successes. Allocating resources among numerous projects becomes challenging, potentially leading to inefficiency and suboptimal outcomes. There is also a risk of scope creep, where projects expand beyond their initial plan, causing delays. Scope creep is often detected during project reviews, milestone assessments, or regular progress updates. Warning signs include features added without proper evaluation, changing requirements that force adjustments to project scope, unclear objectives, continuous iteration without clear boundaries, and a lack of stakeholder involvement or feedback loops, all of which raise quality concerns.

II. Lack of Formalized Structure for Intake Requests — Without a formalized intake process, requests can come in from various channels and in different formats, leading to confusion and inefficiency. These challenges are further amplified by the team’s size, capacity, skills, and the sources of the requests. A small team might struggle with a high volume of requests, leading to inefficiencies, while a large team could face coordination issues. The team’s capacity, or bandwidth to take on new tasks, is crucial, as overloading can lead to burnout and decreased productivity. The diversity of skills within the team is vital to handling a wide range of requests, and without a clear understanding of these skills, there’s a risk of misassigning tasks. Requests can come from various sources, both internal and external, and tracking and prioritizing these without a formalized process can lead to oversights and dissatisfaction among stakeholders. A structured intake process ensures that all requests are evaluated against the organization’s strategic objectives. Without this, there is a risk that the team might spend time and resources on tasks that do not align with the overall goals. Moreover, important requests may be overlooked, potentially impacting project outcomes. Resource misallocation and decreased stakeholder satisfaction further compound these risks.
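
To make this concrete, below is a minimal sketch of what a structured intake record with a simple prioritization score might look like. The fields, example sources, and weights are illustrative assumptions, not a prescribed TR Labs process.

```python
from dataclasses import dataclass

# Illustrative intake record: field names and weights are assumptions
# for this sketch, not a prescribed process.
@dataclass
class IntakeRequest:
    title: str
    source: str            # e.g. "Reuters newsroom", "CSS LATAM"
    strategic_fit: int     # 1-5: alignment with organizational objectives
    estimated_effort: int  # 1-5: 5 = largest effort
    urgency: int           # 1-5: 5 = most urgent

    def priority_score(self) -> float:
        # Weight strategic alignment highest so off-strategy work
        # naturally falls to the bottom of the queue.
        return (0.5 * self.strategic_fit
                + 0.3 * self.urgency
                + 0.2 * (6 - self.estimated_effort))

requests = [
    IntakeRequest("Contract summarization tile", "Sales & Marketing", 5, 3, 4),
    IntakeRequest("Ad-hoc demo for offsite", "Internal", 2, 2, 5),
]

# Rank every incoming request on the same scale, whatever channel it came from.
for r in sorted(requests, key=lambda r: r.priority_score(), reverse=True):
    print(f"{r.priority_score():.1f}  {r.title} ({r.source})")
```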

III. No Proper Scale Testing — Scale testing is a critical aspect of deploying generative AI solutions, particularly when they are expected to handle significant load. As load increases, the performance of the AI solution may degrade, leading to slower response times and a poor user experience. If the system has not been adequately tested for scale, this degradation can come as a surprise, hurting user satisfaction and the overall success of the project. In extreme cases, an AI solution that has not been properly scale-tested can fail outright under heavy load, leading to downtime, data loss, and other serious consequences. Without scale testing it is also difficult to predict how much computational resource (processing power, memory) the solution will need as load grows; this can lead to over-provisioning, which inflates costs, or under-provisioning, which causes performance issues. Performance bottlenecks may likewise remain undiscovered until they cause problems in a live environment.
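
As a starting point, a simple stepped load test can reveal where latency begins to degrade before real users find out. This is a minimal sketch assuming a hypothetical internal endpoint; the URL, payload, and concurrency levels are placeholders, and a real harness would add warm-up, error tracking, and longer runs.

```python
import asyncio
import statistics
import time

import httpx  # third-party async HTTP client: pip install httpx

# Hypothetical endpoint and payload -- substitute your own service.
URL = "https://example.internal/api/generate"
PAYLOAD = {"prompt": "Summarize this contract clause."}

async def one_request(client: httpx.AsyncClient) -> float:
    # Time a single round trip to the service.
    start = time.perf_counter()
    await client.post(URL, json=PAYLOAD, timeout=30.0)
    return time.perf_counter() - start

async def load_test(concurrency: int, total: int) -> None:
    async with httpx.AsyncClient() as client:
        sem = asyncio.Semaphore(concurrency)

        async def bounded() -> float:
            async with sem:
                return await one_request(client)

        latencies = await asyncio.gather(*(bounded() for _ in range(total)))
    latencies.sort()
    print(f"concurrency={concurrency} "
          f"p50={statistics.median(latencies):.2f}s "
          f"p95={latencies[int(0.95 * len(latencies))]:.2f}s")

# Step up the load and watch where latency starts to degrade.
for c in (5, 20, 50):
    asyncio.run(load_test(concurrency=c, total=100))
```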

IV. Exposing an API for External Platform Integration — Exposing an API for integration with third-party platforms can enhance the reach and functionality of AI solutions. However, this approach comes with its own set of risks, such as security vulnerabilities, compatibility issues, data privacy concerns, and increased maintenance. Relying on a third-party platform also means that your service depends on the availability and performance of that platform; any issues there can impact your service.
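
Two of the cheapest defenses when exposing an API are key-based authentication and per-key rate limiting. The sketch below shows both using FastAPI; the key store, limits, and route are assumptions for illustration, and a production service would back the rate limiter with a shared store (e.g. Redis) and keep keys in a secrets manager.

```python
import time
from collections import defaultdict

from fastapi import FastAPI, Header, HTTPException  # pip install fastapi

app = FastAPI()

API_KEYS = {"partner-key-123"}       # hypothetical key store for the sketch
WINDOW_SECONDS, MAX_CALLS = 60, 30   # illustrative rate limit
calls: dict[str, list[float]] = defaultdict(list)

@app.post("/v1/generate")
async def generate(prompt: str, x_api_key: str = Header(...)):
    # Reject unknown callers before doing any model work.
    if x_api_key not in API_KEYS:
        raise HTTPException(status_code=401, detail="invalid API key")

    # Naive sliding-window rate limit per key; in-memory, so it only
    # protects a single process -- use a shared store in production.
    now = time.time()
    recent = [t for t in calls[x_api_key] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_CALLS:
        raise HTTPException(status_code=429, detail="rate limit exceeded")
    calls[x_api_key] = recent + [now]

    # ... invoke the model here; never echo internal errors to the caller.
    return {"completion": f"(model output for: {prompt[:40]})"}
```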

V. Customer Dependency in User Acceptance Testing (UAT) — User Acceptance Testing is a critical phase where the intended users of the system validate the system against their requirements. However, this phase heavily relies on the customer’s availability, understanding, and effort. Dependency on customers for UAT in Generative AI projects entails various risks. Firstly, customers may face availability constraints due to their business commitments, potentially causing delays in project timelines. Additionally, customers’ limited understanding of the technical intricacies of the system may lead to miscommunication or misunderstandings regarding its functionality. Moreover, conducting thorough UAT demands a significant investment of time and effort from customers, and their lack of full commitment may compromise the quality of testing and feedback. Furthermore, the subjective nature of UAT, reliant on customers’ personal biases and evolving requirements, introduces uncertainty into the evaluation process.

VI. Resource Constraints — Resource constraints in GenAI projects present challenges primarily in terms of specialized expertise and time-intensive preprocessing efforts. These projects demand professionals with advanced skills in machine learning, data science, and software engineering to effectively develop and deploy AI models. Additionally, the data annotation, cleaning, and formatting processes are labor-intensive and crucial for model training. Inadequate access to skilled personnel and insufficient time for preprocessing can lead to delays and quality issues, jeopardizing project timelines and success.

VII. Mishandling Sensitive Data — Projects that involve creating deepfake videos, or hosting a chatbot that surfaces website links of uncertain origin, carry substantial risks related to mishandling sensitive data. Deepfake videos often require access to extensive personal data, including images and videos of individuals. Their manipulative nature means they can distort or misrepresent individuals and events, spreading misinformation and undermining viewers’ trust. (See our recent post on DeepFakes for more on this.) Similarly, hosting unverified website links exposes users to potential misinformation or malicious content. These outcomes can damage the organization’s reputation and compromise the integrity of the initiative at hand.

VIII. Transition from Experimentation to Production — This encapsulates the significant risk involved in moving from the controlled environment of AI experimentation to the unpredictable realm of production. During experimentation, models are often tested with curated data, but the shift to production introduces complexities due to real-world variables that can affect performance. Challenges such as scalability issues arise when models grapple with the scale and variability of production data. The integration of AI models with existing systems can require substantial infrastructure changes, and data inconsistencies and quality issues often surface in the production environment. These factors collectively contribute to the complexities and risks involved in the transition from AI experimentation to production.
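
One lightweight guardrail for this transition is to compare simple statistics of live inputs against what the model saw during experimentation, and alert when they diverge. The statistic and thresholds below are assumptions for the sketch; real deployments typically track several features and use dedicated drift-monitoring tooling.

```python
import statistics

# Illustrative guardrail: compare a basic statistic of live inputs against
# the range seen during experimentation, and alert before quality degrades.
CURATED_MEAN_LEN = 850   # assumed mean document length (tokens) in curated test data
TOLERANCE = 0.5          # assumed threshold: alert if the live mean drifts >50%

def check_input_drift(live_doc_lengths: list[int]) -> None:
    live_mean = statistics.mean(live_doc_lengths)
    drift = abs(live_mean - CURATED_MEAN_LEN) / CURATED_MEAN_LEN
    if drift > TOLERANCE:
        # In a real system this would page the on-call or open a ticket.
        print(f"ALERT: live inputs drifted {drift:.0%} from curated data")
    else:
        print(f"OK: drift {drift:.0%} within tolerance")

# Production inputs are far shorter than the curated set -- the alert fires.
check_input_drift([120, 90, 200, 60, 45])
```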

IX. Maturity of AI — The maturity of AI can pose significant risks in AI projects due to several factors. For starters, these technologies might have technical limitations that could affect their performance and accuracy. They might also lack standardization, which can make it difficult to use them effectively. Data challenges can further complicate matters, as issues with the data can lead to the AI not working as expected. In addition to these technical issues, there are also ethical and legal concerns to consider. Without clear guidelines and compliance frameworks, there’s a risk that AI could be used in ways that are unethical or violate laws. This risk is heightened when the technology is immature, and these frameworks aren’t well-established. Another challenge is the skill gap. As AI technologies advance, it can be difficult to find or train personnel who are proficient in the latest developments. This issue is compounded when the systems are immature and lack long-term support and maintenance, which can complicate updates and fixes. Finally, investment risks are higher due to the uncertainty of the technology’s success and longevity, and integration issues can arise, causing operational disruptions.

X. Impact of Capital (CapEx) and Operational (OpEx) Expenditures — GenAI projects carry significant financial risk on both fronts. CapEx includes the substantial upfront costs of acquiring the necessary hardware, software, and infrastructure, with the risk that investments become obsolete quickly due to rapid technological advancement. OpEx covers the ongoing expenses for data processing, cloud services, software updates, and specialized personnel. These operational costs can escalate unpredictably as AI systems scale, and continuous monitoring and retraining add to the financial burden.
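
A back-of-the-envelope cost model makes the CapEx/OpEx split easier to reason about, and shows how inference costs scale with volume while fixed costs dominate early on. Every figure below is a made-up assumption for illustration; substitute your own prices and volumes.

```python
# All figures are made-up assumptions to illustrate how OpEx scales
# with usage; substitute your own numbers.
CAPEX_UPFRONT = 50_000        # hardware / setup (USD), amortized over 3 years
MONTHS_AMORTIZED = 36

PRICE_PER_1K_TOKENS = 0.002   # hypothetical model API price (USD)
TOKENS_PER_REQUEST = 1_500    # prompt + completion, illustrative
CLOUD_BASE_PER_MONTH = 1_200  # hosting, monitoring, storage
STAFF_PER_MONTH = 15_000      # fraction of specialized personnel cost

def monthly_cost(requests_per_month: int) -> float:
    # Variable inference cost grows linearly with request volume.
    inference = requests_per_month * TOKENS_PER_REQUEST / 1_000 * PRICE_PER_1K_TOKENS
    # Spread the upfront CapEx across the amortization window.
    capex_share = CAPEX_UPFRONT / MONTHS_AMORTIZED
    return inference + CLOUD_BASE_PER_MONTH + STAFF_PER_MONTH + capex_share

# Fixed costs dominate at low volume; inference takes over as usage scales.
for volume in (10_000, 100_000, 1_000_000):
    print(f"{volume:>9,} requests/month -> ${monthly_cost(volume):,.0f}")
```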

XI. Managing and Setting Expectations — Managing and setting the right expectations is a significant risk in AI projects. Overly optimistic projections can lead to unrealistic expectations among stakeholders, resulting in disappointment and loss of trust if AI systems don’t perform as anticipated. Misunderstandings about the time, effort, and data required for AI development can misalign expectations regarding project timelines and deliverables, causing friction and potential delays.

The road ahead:

GenAI offers tremendous potential for innovation and progress across various fields. However, navigating the associated risks requires a multi-pronged approach. By prioritizing responsible development, promoting transparency, and fostering collaboration, we can harness the power of GenAI for a safer and more equitable future. By understanding these risks and implementing strategies to mitigate them, project managers can lead successful GenAI projects. As with any technology, the key is to approach it with a clear understanding of its capabilities and limitations, and a commitment to ethical and responsible use.
