Getting started with Watsonx.ai III

Nathalia Trazzi
6 min read · May 23, 2024


This article is Part III of the series Getting Started with Watsonx.ai.

> Getting started with Watsonx.ai series: https://medium.com/@nathalia.trazzi/list/watsonxai-series-english-08947bbd62f2

In this article, two prompts that perform essentially the same task will be tested, each using a different decoding method.

The decoding methods and other parameters were explained in Getting started with Watsonx.ai Part II: https://medium.com/@nathalia.trazzi/getting-started-with-watsonx-ai-ii-b9fbfec80825

Getting started with Greedy Decoding

In this example, the content generated will be a chocolate cake recipe.

Prompt Structure: Freeform

Prompt Type: Zero-shot prompting

Selected Foundation Model: llama-2-70b-chat
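If you prefer to reproduce this outside the Prompt Lab UI, below is a minimal sketch of the same greedy-decoding call using the ibm-watsonx-ai Python SDK. This is my own addition to the article's UI-based walkthrough; the endpoint URL, API key, and project ID are placeholders you would replace with your own.

```python
# Minimal sketch: greedy decoding with the ibm-watsonx-ai Python SDK.
# The URL, API key, and project ID below are placeholders, not real values.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # your region's endpoint
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

model = ModelInference(
    model_id="meta-llama/llama-2-70b-chat",
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
    params={
        # Greedy decoding always picks the single most likely next token,
        # so the same prompt yields the same output every time.
        GenParams.DECODING_METHOD: "greedy",
        GenParams.MAX_NEW_TOKENS: 500,
    },
)

print(model.generate_text(prompt="Write a chocolate cake recipe."))
```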

Note that the model stopped generating content because the sentence ended. The token expenditure was 11 tokens for the prompt and 414 for the generated output.

Tokens are like the small pieces of a puzzle that make up text. They can be words, parts of words, or even individual letters. These pieces are what the language model works with to understand and generate text.

As a rule of thumb, every 700 words correspond to approximately 1,000 tokens (the exact ratio depends on the model and its tokenizer).
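To get a feel for how text splits into tokens, here is a small illustration. It uses the openly downloadable GPT-2 tokenizer from Hugging Face's transformers library as a stand-in, since the Llama 2 tokenizer requires accepting a license; the exact pieces differ between models, but the idea is the same.

```python
# Illustration only: token boundaries and counts vary by model.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer

text = "Write a chocolate cake recipe."
tokens = tokenizer.tokenize(text)

print(tokens)       # the word pieces the model actually sees
print(len(tokens))  # typically somewhat more than the word count
```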

The model did not hallucinate and produced a well-structured response, which is expected when greedy decoding is used. The output looks good; it is a solid chocolate cake recipe.

Getting started with Sampling Decoding

With the Sampling decoding method, the same example as before will be used, but with some of the parameter changes that Sampling allows.

Temperature: 1.63

Increasing the temperature makes the model more creative, which is beneficial for content generation tasks, but it may not be as suitable for tasks that demand high precision.

Top P: 1

Top P (nucleus sampling) works together with Top K: it restricts sampling to the smallest set of tokens whose cumulative probability reaches the chosen value, so the next word is drawn only from that high-confidence set.

When Top P = 1, no additional filtering is applied: the model may sample from the entire distribution that remains after Top K is applied.

Top K: 50

Top K complements the temperature parameter: it controls the size of the vocabulary from which the model may choose, keeping only the top K most likely tokens at each step.

In the example below, the Top P, Top K, and temperature parameters will be modified to explore Sampling decoding.

This example contains instructions and a short article about tomatoes, along with a question about their origin.

By changing some parameters again, a better result is obtained.
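For reference, the three sampling parameters above map directly onto the SDK's generation options. The sketch below reuses the `model` object from the greedy example and a simplified stand-in prompt; the random seed is my own addition, included only to make sampled runs reproducible.

```python
# Sketch: the same call switched to sampling decoding.
# `model` is the ModelInference object from the greedy example.
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

sampling_params = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.TEMPERATURE: 1.63,  # higher temperature -> more creative output
    GenParams.TOP_K: 50,          # keep only the 50 most likely tokens per step
    GenParams.TOP_P: 1,           # cumulative-probability cutoff; 1 = no extra filtering
    GenParams.MAX_NEW_TOKENS: 500,
    GenParams.RANDOM_SEED: 42,    # optional: makes sampled runs repeatable
}

print(model.generate_text(
    prompt="Where do tomatoes originally come from?",
    params=sampling_params,
))
```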

Using the sampling decoding method brings many advantages, which is precisely what the examples here demonstrate. It can achieve great results, but it is important to know when to use it and how to control its parameters well, or you will get unexpected results, especially when these models are used for business purposes. No one wants the end user to receive strange responses.

Remember: for content generation tasks, use the sampling method, but always keep an eye on the control parameters to ensure good output from the model. For tasks that require more predictability and less creativity, use the greedy method. Of course, you can control everything with the parameters: keeping the temperature low and paying close attention to Top P and Top K will yield safer outputs, as long as you write a good prompt. But again, you need to know how to use them.

Setting the “Stopping Criteria”

Controlling the stopping criteria means telling the model when it should stop. To do this, simply use special characters such as “.”, “,”, or “;”, or even whole words. You can set as many stopping criteria as you deem necessary.

In the following image, the stopping criteria control does not seem good enough: it cuts off the recipe in the middle.

Controlling the Stopping Criteria with Minimum and Maximum Tokens

Note that in the image below, the stopping criteria are controlled and a minimum number of tokens to be generated is defined. Even with a stopping criterion of the character “.”, the model will only stop generating content once it reaches the minimum number of tokens.

This allows for greater control over where and when the model stops generating content, avoiding responses that are incomplete, overly long, or excessively repetitive (which can happen if the prompt is not well written and sampling is used with poorly configured parameters).
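In SDK terms, these stopping controls are just more generation parameters. The sketch below (again reusing the `model` object from earlier) encodes the behavior just described: the “.” stop sequence is ignored until the minimum token count is reached, and the maximum acts as a hard ceiling either way.

```python
# Sketch: stop sequences combined with minimum/maximum token limits.
# `model` is the ModelInference object from the greedy example.
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

stopping_params = {
    GenParams.DECODING_METHOD: "greedy",
    GenParams.STOP_SEQUENCES: ["."],  # stop at the first period...
    GenParams.MIN_NEW_TOKENS: 200,    # ...but only after 200 tokens exist
    GenParams.MAX_NEW_TOKENS: 400,    # hard ceiling regardless of stop sequences
}

print(model.generate_text(
    prompt="Write a chocolate cake recipe.",
    params=stopping_params,
))
```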

GenAI Techniques

Another important point to be addressed in this article is the techniques of generative artificial intelligence.

The image below illustrates what a zero-shot prompt, a few-shot prompt, and even more involved approaches such as fine-tuning and model creation entail. Many foundation models, such as Meta’s Llama, require a prompt written with examples and expected outputs for you to obtain the best responses the model can offer.
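To make the contrast concrete, here is a hypothetical few-shot prompt: two worked examples precede the input we actually want answered. The Review/Sentiment format is an assumption for demonstration, not something mandated by Llama 2; `model` and `GenParams` come from the earlier sketches.

```python
# Hypothetical few-shot prompt: the examples teach the model the expected format.
few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: The cake was moist and delicious.
Sentiment: Positive

Review: The frosting tasted like plastic.
Sentiment: Negative

Review: Best dessert I have had all year.
Sentiment:"""

print(model.generate_text(
    prompt=few_shot_prompt,
    params={
        GenParams.DECODING_METHOD: "greedy",  # precise task -> greedy fits well
        GenParams.MAX_NEW_TOKENS: 5,
        GenParams.STOP_SEQUENCES: ["\n"],     # stop after the one-word answer
    },
))
```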

Still, the quickest way to write a good prompt, besides choosing the right decoding method (and knowing how to make good use of its parameters), is to read the model’s documentation to understand how best to structure it. All models available on Watsonx.ai have documentation that can be viewed when selecting the model you will use. Another good way to view documentation is to visit the provider’s Foundation Model website.

Saving your current work on Watsonx.ai

Watsonx.ai allows you to save your current prompt to work on it later. To do this, simply click the floppy disk icon located in the second top menu of the interface, as illustrated in the image below:

When you click on the floppy disk button, you have the following options:

Prompt template: To save the current prompt to continue working on it later.

To save a prompt template, choose a name and the type of task being performed, and provide a description. Then click Save.

Upon saving the prompt, you can already see that it now lives within the project. Autosave can be enabled so you don’t have to worry about losing changes, or disabled if you prefer.

Note that the prompt has been saved.

In this example, it may not make much difference; after all, it’s a simple prompt asking the model to write a recipe. However, with a large, structured prompt, this makes a big difference.

Other options available when clicking the floppy disk button are:

Prompt Session: This option allows you to save the history and data of your current session.

Notebooks: The entire prompt will be saved as a notebook in .ipynb format.

To view what is saved in the project, simply access it through the name of the project created earlier.

And there are the assets inside the project.

To access what has been saved, simply click on the name of the asset.

And here it is: the asset saved in the project.

This article finishes here. The next one will be a bit more technical (although you don’t need advanced programming knowledge; a basic understanding is enough to follow it). Proceed to Getting started with Watsonx.ai IV: https://medium.com/@nathalia.trazzi/getting-started-with-watsonx-ai-iv-015f14db5236
