I’ve spent the last 6 months getting the most out of 🧠 LLMs … here are the 🔑 key insights💡

Juraj Bezdek
4 min read · Sep 17, 2023


I’m Juraj Bezdek, and I’ve been delving into the world of Natural Language Processing (NLP) for the past 8 years.

Like many others, I’ve been really impressed by what ChatGPT can do. For the last 6 months, I’ve been diving deep into it, trying to figure out how to solve even the toughest problems using this tool.

I signed up for GPT-4 as soon as I could, but I didn’t get access before it was made available to everyone. So I’ve been working on getting similar results using the ChatGPT API. Along the way, I’ve picked up a bunch of useful tricks and had some “Aha!” moments.

In the process, I've developed a popular library, LangChain Decorators ✨, as well as PromptWatch.io, which lets you tweak and replay any of the prompt requests in the chain.

Here, I want to share some of the insights I’ve gained while exploring 🧠 LLMs:

  1. Divide and conquer
    “Divide and conquer” is a well-established problem-solving technique that has been known since ancient times: break a complex problem or task down into smaller, more manageable components. It applies just as naturally to large language models (LLMs). This approach has gained significant popularity within various Agent architectures, but it can be used in a wide range of applications.
    However, it does come with drawbacks. Implementing this strategy typically costs more, since it requires generating more prompts, which consume additional tokens. It also increases latency, because you have to execute multiple prompts in sequence to address each subtask. Furthermore, it doesn’t lend itself well to streaming, as some of the prompts are part of the LLM’s internal “thinking” process, and only the last prompt can be streamed in real time. (See the first sketch after this list for what this can look like in code.)
  2. Adapting the problem as a Chat conversation
    The latest GPT models are designed for chat interactions, and to get the best results, it’s helpful to structure your problem in a way that allows the language model to engage naturally. Instead of listing out instructions, you can frame your request as a Question-and-Answer pair, followed by a follow-up question or clarification.
    Including a follow-up question can be especially valuable when you need high-quality responses.
    However, it’s important to keep in mind that most LLMs are trained to be quite submissive due to RLHF. Even a minor indication of a mistake can send the model into “Apologize for the confusion” mode. To avoid this, it helps to give a hint about which specific aspect you’d like the model to focus on. (The second sketch after this list shows one way to structure such a conversation.)
  3. OpenAI’s functions are a game changer
    OpenAI's functions are basically what powers ChatGPT Plus plugins. The most impressive and useful applications of LLMs I have seen all involve generating data in specific formats that can be interpreted by traditional code.
    Before the introduction of functions, people used LLMs to generate JSON or other structured formats directly. The drawback of that approach was that it required the LLM to perform two tasks simultaneously: solving the problem at hand and formatting the output correctly. This divided attention often led to issues.
    Fortunately, OpenAI has introduced new models that are specifically fine-tuned to streamline the generation of JSON format outputs through the functions interface.
    This not only lets you work with shorter, more straightforward prompts; it also conserves valuable tokens and, most importantly, improves the precision and reliability of the generated output. (See the third sketch after this list.)
  4. You need to remind LLM what the goal is
    LLMs are highly sensitive to the placement of instructions. While initially including the goal in the system message is a good baseline, it’s common for the model to lose track of it when there’s a lot of other context information or a lengthy message history in your prompt.
    In such cases, it’s beneficial to place a short, fixed message at the end to remind the LLM of what we want it to do. Here’s a handy trick: you can insert an arbitrary function message in this spot to simulate a function output. This works well because using an Assistant/User message here would disrupt the natural flow of the chat conversation, while a trailing System message can override the original instructions. (The fourth sketch after this list shows this trick.)
  5. Ask LLM to write a reasoning before giving you the final answer
    This is a big one.
    Asking the LLM to explain its reasoning before providing the final answer is a game-changer if you prioritize quality over speed. Imagine you are given a dilemma with options A, B, and C.
    If it’s an important decision, you’d probably “think about it” first. Humans can think by internally verbalizing their reasoning process: you use your inner voice to weigh the options.
    But LLMs don’t have an inner voice. They output what they “think” right away. If you ask them for the answer, they’ll give it to you (without thinking). If you want the LLM to think before acting, you need to ask it to write out its reasoning before writing the final answer. (See the fifth sketch after this list.)
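
To make point 1 concrete, here is a minimal sketch of the divide-and-conquer pattern. It uses the pre-1.0 `openai` Python SDK; the subtask split, the prompts, and the `ask` helper are my own illustrative choices, not a fixed recipe:

```python
import openai  # pre-1.0 SDK; reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    """Send a single-turn prompt to the chat API and return the text."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

article = "..."  # the input you want to process

# Each subtask gets its own small, focused prompt.
facts = ask(f"List the key factual claims in this article:\n\n{article}")
tone = ask(f"Describe the tone of this article in one sentence:\n\n{article}")

# The final step combines the intermediate results.
summary = ask(
    "Write a short, neutral summary based on these notes.\n\n"
    f"Key claims:\n{facts}\n\nTone:\n{tone}"
)
print(summary)
```

Note that the three calls run strictly in sequence, which is exactly the cost and latency trade-off described above, and only the final call could be streamed to the user.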
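
For point 2, this is roughly what framing the problem as a conversation can look like. The seeded assistant turn and the wording of the follow-up are my assumptions; the key idea is the Question/Answer/follow-up shape:

```python
import openai

messages = [
    {"role": "system", "content": "You are a senior Python reviewer."},
    # The task framed as a question rather than a list of instructions.
    {"role": "user", "content": "Can you review this function?\n\ndef add(a, b): return a + b"},
    # A short seeded assistant answer keeps the dialogue flowing naturally.
    {"role": "assistant", "content": "Sure. It works, but it lacks type hints and a docstring."},
    # The follow-up points at the aspect we care about, without implying
    # a mistake (which tends to trigger "Apologize for the confusion" mode).
    {"role": "user", "content": "Good start. Could you focus specifically on edge cases with non-numeric inputs?"},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])
```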
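
For point 3, here is a small function-calling sketch. The `record_sentiment` schema is hypothetical; the `functions` and `function_call` parameters and the response shape are the standard pre-1.0 OpenAI chat API:

```python
import json
import openai

# A hypothetical function schema; the model fills in the arguments as JSON.
functions = [
    {
        "name": "record_sentiment",
        "description": "Record the sentiment of a customer review.",
        "parameters": {
            "type": "object",
            "properties": {
                "sentiment": {
                    "type": "string",
                    "enum": ["positive", "neutral", "negative"],
                },
                "confidence": {"type": "number", "description": "0.0 to 1.0"},
            },
            "required": ["sentiment", "confidence"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",  # a functions-capable model
    messages=[{"role": "user", "content": "Review: 'Arrived late and the box was crushed.'"}],
    functions=functions,
    function_call={"name": "record_sentiment"},  # force the structured output
)

# The arguments come back as a JSON string, ready for traditional code.
args = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
print(args["sentiment"], args["confidence"])
```

Notice that the prompt itself stays short: the formatting job has moved out of the prompt and into the schema.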
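
For point 4, this is a sketch of the simulated-function-output reminder. The `goal_reminder` name is made up (the function doesn't need to exist), and it's worth testing how well this works on your own prompts:

```python
import openai

messages = [
    {"role": "system", "content": "Extract action items from the meeting transcript."},
    # ... imagine a long transcript / message history here, where the
    # original goal easily gets lost ...
    {"role": "user", "content": "<long transcript>"},
    # The trick: a simulated function output restates the goal at the end,
    # without disrupting the user/assistant flow or overriding the system message.
    {
        "role": "function",
        "name": "goal_reminder",
        "content": "Reminder: respond ONLY with a bullet list of action items.",
    },
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613", messages=messages)
print(response["choices"][0]["message"]["content"])
```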
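
And for point 5, a sketch of “reasoning first, answer last”. The marker convention (`FINAL ANSWER:`) and the example dilemma are mine; the point is that the model generates its reasoning tokens before it commits to a decision:

```python
import openai

prompt = (
    "Should we cache this endpoint at the CDN (A), at the application layer (B), "
    "or not at all (C)? Data may be up to 5 minutes stale; responses are per-user.\n\n"
    "First write out your step-by-step reasoning. Then, on the last line, "
    "write 'FINAL ANSWER: <A|B|C>'."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
text = response["choices"][0]["message"]["content"]

# Everything before the marker is the "thinking out loud" part;
# the marker line carries the actual decision.
final = text.rsplit("FINAL ANSWER:", 1)[-1].strip()
print(final)
```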

Those are just some of the handy tricks that come to mind right now. I'll go into further detail with examples for each of them in the future, as well as bring more hints, so hit the follow button if you're interested.
