Making GPT from Cool to Useful — Part 2: Try It Yourself

keira · Published in Ekohe · 8 min read · Feb 26, 2024

The core elements to consider for a successful pilot

Previously in our series, we discussed WHY understanding and utilizing LLMs can open new and exciting opportunities for the future of business, and explored the practical implications of LLMs in the real world, beyond their recreational uses.

Over the past several months, the Ekohe data science team has successfully built several LLM-powered features for our clients, custom developed to meet their unique business use cases. We built a RAG (Retrieval Augmented Generation) structure back in March 2023, before RAG was even a widely known term.

These LLM features span different sectors and differ in many ways, providing us with diverse development experience and challenges. Even so, there are several common threads that we would like to share in this part of the series. One of the most common questions we face, both from clients and within our team, is HOW to get started.

Identify a Use Case

To encourage wider adoption of the technology, it's crucial that the first LLM feature be a successful pilot. Developing an effective use case is therefore a vital first step, but a challenging one: a good use case highlights the technology's potential while minimizing its limitations, increasing the chances that the LLM feature succeeds.

For all of the things that LLMs and GPT services claim to do, we argue that LLMs mainly assist in two ways: making work easier (more accessible) and faster. They do not necessarily improve the quality of the work, nor are they particularly effective at making work more consistent or reliable. (In later articles, we will discuss the challenges of evaluating LLM results.) These are the areas to avoid when conceiving appropriate use cases.

Below, we list a few areas where LLMs can be applied in your own business setting, whether to optimize day-to-day tasks for better efficiency or to tap into under-utilized data for new business opportunities. It's worth noting that LLMs and GPT agents may not fully automate much of this work. In fact, in our practice we find it works better to keep humans in charge. Still, the comprehensive context and the boost in productivity that LLM features provide during the work process are invaluable.

For day-to-day tasks, the areas you can explore are:

  • Tasks that involve heavy language processing, for example reading notes or reports, sometimes in different languages, to produce summaries or reach decisions.
  • Repetitive, recurring, and standardized language-related tasks, such as writing monthly reports in a similar format, or code review.
  • Exploration of a new or unfamiliar knowledge domain, such as a new business direction that is just getting started, where you don't want to commit a full workforce but still expect relatively good results.
  • Operational tasks (hiring, onboarding, training, etc.) that rely heavily on corporate documents and guidelines.

For data assets whose value is untapped or under-developed, LLMs and GPT agents offer exciting new opportunities to maximize their value. These include:

  • Corporate knowledge bases: made easier to search, reference, benchmark, and manage. Role and permission management can also be purposefully included.
  • Code repositories, design/development specifications, and documentation.
  • News, reviews, and social media comments: screened and analyzed more easily for deeper market research or customer understanding.
  • Text data embedded in other media: audio, video, and images such as scanned PDFs, all of which can now be used, since they can be effectively converted with different representation models.
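To make the knowledge-base idea above concrete, here is a minimal retrieval sketch. A real system would use an embedding model from your provider of choice; as a stand-in, this sketch scores documents with a simple bag-of-words cosine similarity, and the documents and query are made up for illustration.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query: str, docs: dict[str, str], top_k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and return the best matches.
    q = embed(query)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])), reverse=True)
    return ranked[:top_k]

# Hypothetical corporate documents.
docs = {
    "onboarding.md": "steps for onboarding a new employee and training schedule",
    "expenses.md": "how to file travel expenses and reimbursement policy",
    "hiring.md": "interview process and hiring guidelines for managers",
}
print(search("how do I onboard a new hire", docs))
```

The retrieved documents would then be passed to the LLM as context, which is the core of a RAG setup.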

Even though there are many areas where we can apply the LLM revolution, it's often better to start small. Find one type of data that is critical to the business, or identify a task that is particularly slow or painful, and you'll have a good place to start!

Decompose the Use Case

This may come as a surprise, but which tasks we set LLMs to perform makes a real difference in the quality of the results, which in turn affects our confidence in the LLM feature we set out to build.

A common concern we hear from clients is that they are hesitant to adopt LLMs in business applications because the results are either not good enough or hard to validate. Indeed, this can happen when we rely solely on LLMs or GPT agents for every step of a use case and base our judgement only on the final output.

Most business use cases involve various types of data, multiple reasoning steps, and different decisions to reach a final output. It's risky to assign everything to LLMs. Instead, we should break the use case down into separate tasks and create a task pipeline where LLMs, machine learning models, rule-based systems, and even the human workforce are each responsible for the tasks they specialize in.
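The decomposition idea can be sketched as a small pipeline: rules handle what rules do well, an LLM handles the language-heavy step, and anything low-confidence is routed to a human. The `llm_summarize` function here is a stub standing in for a real model call, and the ticket fields are invented for illustration.

```python
def rule_based_filter(ticket: dict) -> bool:
    # Cheap, deterministic triage: drop empty or out-of-scope tickets.
    return bool(ticket["text"].strip()) and ticket["lang"] in {"en", "fr"}

def llm_summarize(text: str) -> tuple[str, float]:
    # Stub for an LLM call; returns (summary, confidence).
    return text[:40], 0.9 if len(text) > 20 else 0.4

def pipeline(tickets: list[dict]) -> dict:
    out = {"summaries": [], "needs_human": [], "dropped": []}
    for t in tickets:
        if not rule_based_filter(t):
            out["dropped"].append(t["id"])
            continue
        summary, confidence = llm_summarize(t["text"])
        if confidence < 0.7:
            out["needs_human"].append(t["id"])  # human stays in charge
        else:
            out["summaries"].append((t["id"], summary))
    return out

result = pipeline([
    {"id": 1, "text": "Customer reports the export button fails on large files", "lang": "en"},
    {"id": 2, "text": "ok", "lang": "en"},
    {"id": 3, "text": "hallo", "lang": "de"},
])
print(result)
```

The point is not this particular routing logic but the shape: each component only handles the step it is good at, and the final output is easier to trust because each step can be validated on its own.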

This is one of Ekohe's secret sauces for making our LLM features more likely to succeed: not using the most advanced model for every single use case, but being precise about the steps where we apply LLMs, as well as other technologies or tools.

Refine the Questions to Ask

Even when we have landed on the right tasks, the questions we ask the models, and how we ask them, can also affect the quality of the results. This is the realm of what you may know as "prompt engineering". There are many resources for learning prompting techniques, but here are a couple of guidelines our team uses.

There are many complex techniques to consider, but simply put, a prompt has five key elements, listed below in order of importance.

  • Task — the specific task you need the LLM to perform, or the question you need it to answer
  • Context — external information that helps the model answer better; the key to RAG (Retrieval Augmented Generation)
  • Input Data — the input you want a response for
  • Output Indicator — the type/format of the output
  • Examples — examples showing the model the desired output for a given input
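The five elements above can be assembled mechanically, which keeps prompts consistent across calls. A minimal sketch (the labels and ordering are one reasonable convention, not a standard):

```python
def build_prompt(task, context=None, examples=None, input_data=None, output_indicator=None):
    # Assemble the five prompt elements into one string,
    # with context placed first so the model sees it early.
    parts = []
    if context:
        parts.append(f"Context:\n{context}")
    parts.append(f"Task: {task}")
    if examples:
        parts.extend(f"Example:\n{ex}" for ex in examples)
    if input_data:
        parts.append(f"Input:\n{input_data}")
    if output_indicator:
        parts.append(f"Output format: {output_indicator}")
    return "\n\n".join(parts)

prompt = build_prompt(
    task="Summarize the article in 2-3 sentences.",
    context="The article covers Q3 sales performance in APAC.",
    input_data="<article text here>",
    output_indicator="plain text, no bullet points",
)
print(prompt)
```

A template like this also makes it easy to A/B test prompt variants, since each element can be swapped independently.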

The task is the most important element, with the greatest effect on the quality of the results. Being specific is GOLD. Use the right verb to describe the task, and be precise to steer the output. For example, if you want a short summary, avoid the phrase "Make it short"; instead, tell the model to "Summarize the article in 2–3 sentences". Try to replace any "Do not" with its positive opposite, so the model sticks better to your instructions.

The right set of context is also crucial, as many current models have a limited context window, and the attention they give to the beginning, middle, and end of the context varies. In our experience, context mentioned at the start of a prompt tends to carry the most weight when the model answers a question. Additionally, whether the context is relevant enough for the model to provide answers is also crucial for a high-quality LLM output, a question that extends into the field of RAG.
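Since the window is limited and placement matters, one simple tactic is to order retrieved passages by relevance and truncate to a budget, so the most useful context sits at the front of the prompt. A sketch, with a crude word-count token estimate and made-up, pre-scored passages:

```python
def pack_context(passages: list[tuple[float, str]], budget: int) -> str:
    # passages: (relevance_score, text); highest relevance goes first
    # so the most useful context lands at the start of the prompt.
    packed, used = [], 0
    for score, text in sorted(passages, key=lambda p: p[0], reverse=True):
        cost = len(text.split())  # crude stand-in for a token count
        if used + cost > budget:
            continue
        packed.append(text)
        used += cost
    return "\n".join(packed)

passages = [
    (0.9, "Q3 APAC revenue grew 12 percent year over year."),
    (0.4, "The company was founded in 2009."),
    (0.8, "Growth was driven by enterprise subscriptions."),
]
print(pack_context(passages, budget=15))
```

A production system would use the model's actual tokenizer for the budget, but the shape is the same: relevance first, hard cap second.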

The design of the task pipeline and the quality of the prompt directly affect the quality of the outputs, and there are many more factors to consider for different use cases. For the sake of length, we'll stop here and continue with today's topic. If you need more information, reach out to us about a specific use case. Later, when we delve into specific successful practices, we will share more on this topic!

Envision an LLM Feature Interface

Now that you've identified the right use case and have a better idea of the tasks LLMs will be responsible for, the next step is to make sure the LLM application can be embraced by a wider range of people through a sensible user interaction design.

Since ChatGPT is the most famous tool in the LLM scene, it's easy to assume that a chatbot is the most natural interface for an LLM feature, but it's not always the most useful one. Much of the time, when we hold little knowledge about the data we aim to learn from, we don't know what to ask in the first place. In fact, there are many other user experiences we can explore to deliver the LLM magic.

One of our core beliefs in designing the user experience of such features is to put users in the driver's seat and assist them with LLM capabilities as much as we can.

For business applications, the use cases are usually pre-defined, so the interface need not be as open-ended as a chatbot. Guide the user to choose and click rather than type when gathering queries, and suggest certain words when typing is necessary, so that users' intentions align with instructions the model can perceive. These design considerations not only give users clearer, easier ways to use the LLM tool, but also improve the quality and consistency of the model outputs, which promotes future use and forms a virtuous cycle.
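"Choose and click" can be as simple as mapping a small set of pre-defined UI actions to vetted prompt templates, so the model always receives instructions it was tested against. The actions and templates below are purely illustrative:

```python
# Pre-defined actions the UI exposes as buttons; each maps to a vetted
# prompt template rather than free-form user text.
TEMPLATES = {
    "summarize": "Summarize the following document in 2-3 sentences:\n{doc}",
    "action_items": "List the action items in the following document as bullet points:\n{doc}",
    "translate_fr": "Translate the following document into French:\n{doc}",
}

def prompt_for(action: str, doc: str) -> str:
    # Reject anything outside the vetted set instead of guessing.
    if action not in TEMPLATES:
        raise ValueError(f"Unknown action: {action}")
    return TEMPLATES[action].format(doc=doc)

print(prompt_for("summarize", "Quarterly review notes..."))
```

Because the instruction half of every prompt is fixed, output quality and consistency become much easier to evaluate than with free-form chat input.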

There are other possibilities to explore as well. For instance, we can design interfaces that let users confirm intermediate results while an LLM agent processes a task, let them see and adjust the final prompts being used, or even suggest possible tasks while users type in descriptive language. All of these make it easier to validate the final output and thus enhance users' confidence in the results.

Such interfaces need to be carefully designed so that users are not overwhelmed by the additional information and only make the necessary decisions during the process. From our perspective, the key is to make every effort to build TRUST between users and LLM applications (or any other new technology), so that the results are used confidently and become truly impactful to the business.

There you have them: the four steps to help you shape your LLM vision and make it more tangible and concrete. You may discover that your initial vision has evolved along the way, and that it's not quite the same mountain you set out to hike. Regardless, I hope you still enjoy the view!

Have fun exploring!
