Generative AI: The Business Use Case & The Expectations

Arvind Dutt
5 min read · Nov 27, 2023


[Cover photo: Beauty of Uttarakhand]

Generative AI, yes, the talk of the town, which somehow lands up in every conversation, be it in the boardroom, the cafeteria or some casual interaction.

Everyone wants to harness its power, but the challenge remains the same: ‘How?’ and ‘Where is that perfect use case?’ I mean, yes, you can do a lot of interactive work with it, like chatting, explaining code, writing a small snippet or generating content, and most commonly it is being used for POCs. But a full-scale implementation is a different ball game altogether.

I recently got a chance to work on a large-scale implementation where we were tasked with leveraging GCP’s Vertex AI for extracting, transforming and generating information from the given data. And I am going to be very honest with you: the moment I started working on it, the first thing that came to my mind was to go through some of the posts on Medium and figure out, from the experiences of fellow enthusiasts, how they had been handling things. But hard luck, I could not find posts where any such details were mentioned. Don’t get me wrong, there were amazing articles about GenAI’s applications, but none of them talked about the challenges in real projects. There were posts about the technical understanding of LLMs and abstract ideas about models and projects, but they were not what I was looking for at that moment. So either I could go through all those posts and then formulate my plan (which I was sure I would be able to do, but time was of the essence), or I could continue to try and experiment till I succeeded.

The sole reason my first approach did not work was the novelty of the technology. And I am sure that anyone who has been in the industry for over a decade will agree the same problem existed at the advent of Big Data processing (Hadoop, Storm, Spark, Sqoop, etc.), Data Lakes, Neural Networks and even Cloud Services. In fact, this is always true whenever a new technology surfaces: until it is fully adopted, you will find bits and pieces of valuable information scattered across multiple places.

At that time, I decided that once I was in a comfortable position I would lay down my experience, some basic technical and architectural design, and the challenges I faced during the course of the project, so that anyone new can understand what to expect when it comes to implementing a GenAI project.

This will be a 3-part series where I will take you through the multiple aspects of the implementation. I will start with the ‘Business Use Case’, then move to the ‘Architecture and Technical Design’, and finally to the ‘Challenges and My Experience’.

The Use Case:

Now, I can’t mention the exact use case or the customer here, but I will take you as close as I can and try to keep the essence of the project intact. Along the way there will be some assumptions (minor ones), and you can use your imagination to weave around the missing links, if any.

When I was introduced to the business team, the business requirements were as follows:

“The data points will be coming in real time, and you need to use Vertex AI to extract, interpret and generate results which will be sent to the users (actual customers). The results from the Generative AI need to be in a specific format and should always follow that same output format.”

Along with this, I was provided with some documentation about the business and the output format in which the final results were expected. There was a set of mandatory fields and non-mandatory fields which needed to be part of every result. The result was to be a JSON document that would be consumed by an API.
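To make that contract a little more concrete, here is a minimal sketch of the kind of shape check the consuming API would rely on. The field names (customer_id, summary, recommendations, and so on) are purely hypothetical stand-ins, not the real schema, which I can’t share.

```python
import json

# Hypothetical field names; the real contract had its own mandatory/optional sets.
MANDATORY_FIELDS = {"customer_id", "summary", "recommendations"}
OPTIONAL_FIELDS = {"sentiment", "follow_up_date"}

def validate_result(raw_output: str) -> dict:
    """Parse the model output and ensure every mandatory field is present."""
    result = json.loads(raw_output)  # raises ValueError if the output is not valid JSON
    missing = MANDATORY_FIELDS - result.keys()
    if missing:
        raise ValueError(f"Missing mandatory fields: {sorted(missing)}")
    unexpected = result.keys() - (MANDATORY_FIELDS | OPTIONAL_FIELDS)
    if unexpected:
        raise ValueError(f"Unexpected fields: {sorted(unexpected)}")
    return result
```

The real contract was nested and richer than this, but the idea is the same: validate the generated document before handing it to the consuming API.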

Looks very simple, right? Simply create a project, consume the real-time data coming in via requests, process it using GenAI and send the output to the API. Believe me, it was not.

Well, for starters, the business had very high expectations from Generative AI. Everyone was of the opinion that this entire thing would be a walk in the park and would be over in a month’s time: since ‘AI’ is being used, we don’t have to code or do anything. This is one of the most common misconceptions. In a traditional ML project, a team of 3–4 members would get at least 2 months (of course, depending on the type of project); here, just because the name ‘AI’ was attached, the time allotted to the project shrank significantly.

Second, no one had any idea about the scalability of the project. How many requests would the model be able to handle? What would the test plan be? The only input I had was that the entire process for a single request should complete in under 10 seconds: receive the request, pre-process it, extract some information from the data point, transform and generate the results using Vertex AI, and send back the response.
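To give a feel for what such a budget means, here is a rough, purely illustrative way of instrumenting the stages. The stage functions are placeholders passed in by the caller, not the project’s actual code.

```python
import time

TOTAL_BUDGET_SECONDS = 10.0  # end-to-end limit for one request

def timed(stage_name, fn, timings, *args, **kwargs):
    """Run one pipeline stage and record how long it took."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    timings[stage_name] = time.perf_counter() - start
    return result

def handle_request(request, preprocess, extract, generate_with_vertex, build_response):
    timings = {}
    data = timed("preprocess", preprocess, timings, request)
    features = timed("extract", extract, timings, data)
    generated = timed("generate", generate_with_vertex, timings, features)
    response = timed("respond", build_response, timings, generated)
    total = sum(timings.values())
    if total > TOTAL_BUDGET_SECONDS:
        # In a real system this would feed monitoring/alerting rather than a print.
        print(f"Budget exceeded: {total:.1f}s, breakdown: {timings}")
    return response
```

A breakdown like this is what lets you see early on whether the Vertex AI call alone is eating most of the 10 seconds.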

Next up was the accuracy of results. In a traditional approach we can use F1 scores, confusion matrices, etc. to get some quantifiable outcomes, but what needs to be done in a scenario like this?
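Just to make the question concrete: one naive way to get a number is field-level exact match against a small, human-labelled golden set. The fields below are hypothetical again, and this is purely an illustration of the kind of quantification one might reach for, not a description of how the project was evaluated.

```python
def field_accuracy(predictions, golden):
    """Fraction of (record, field) pairs where the generated value matches the labelled one."""
    correct, total = 0, 0
    for pred, gold in zip(predictions, golden):
        for field, expected in gold.items():
            total += 1
            if pred.get(field) == expected:
                correct += 1
    return correct / total if total else 0.0

# Two hypothetical labelled records
golden = [{"customer_id": "C1", "sentiment": "positive"},
          {"customer_id": "C2", "sentiment": "negative"}]
predictions = [{"customer_id": "C1", "sentiment": "positive"},
               {"customer_id": "C2", "sentiment": "neutral"}]
print(field_accuracy(predictions, golden))  # 0.75
```

Exact match quickly breaks down for free-text fields, which is exactly why the question is harder than it looks.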

Model tuning: is that even an option here? What if Text Bison is not giving the expected results? What needs to be done when the model hallucinates? And since the output was a complex nested JSON, how do you design a prompt that wraps everything up as per business expectations?
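For illustration, here is the general shape of a prompt-and-call that asks text-bison for strictly formatted JSON. The schema, project and location are hypothetical placeholders, and the calls follow the Vertex AI Python SDK as it stood at the time of writing; treat it as a sketch, not the project’s actual prompt.

```python
import json
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholders
model = TextGenerationModel.from_pretrained("text-bison")

PROMPT_TEMPLATE = """You are a data extraction assistant.
Read the input below and respond with ONLY a JSON object, no extra text,
matching exactly this structure (keys here are hypothetical examples):
{{"customer_id": string, "summary": string,
  "recommendations": [{{"title": string, "reason": string}}]}}
If a value is unknown, use null. Do not invent values.

Input:
{data_point}
"""

def generate_result(data_point: str) -> dict:
    prompt = PROMPT_TEMPLATE.format(data_point=data_point)
    # Low temperature keeps the output format more stable across calls.
    response = model.predict(prompt, temperature=0.0, max_output_tokens=1024)
    return json.loads(response.text)  # fails loudly if the model drifts away from JSON
```

Pinning the temperature low and parsing strictly reduces format drift, but it does not eliminate hallucinations; that gap between ‘mostly correct’ and ‘always correct’ is exactly the kind of problem the rest of this series is about.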

And last but not least: what about the pipelines for the flow of data? Scheduling the runs, the cost of each run, and how to optimize the code so that we get the maximum benefit.

The entire project could be broken into 3 modules. Each module had a particular function and decoded a certain type of request coming from the source system. The outputs of these modules were also different, and everything needed to run in parallel. Since requests could arrive simultaneously and synchronously for each module, the system had to be proficient in handling them, and the same was expected of Vertex AI.

My job here was not only to architect the solution but also to explain it and give realistic projections to the business and project teams, right from the selection of technologies for developing the solution to performing a test for each task using generative approaches.

In my next post I will talk about the architecture and technical design of this implementation. We will look at solutions to the problems mentioned above and at other alternatives we can use. This series is based mostly on GCP, and since most of the services have equivalents on other clouds, you can pick similar options if they fit your requirements.

Until next time!!


Arvind Dutt

I am Arvind, currently working as an Architect and leading AI-ML-DE programs.