Code Llama: quick start guide and prompt engineering

Jeremy K
𝐀𝐈 𝐦𝐨𝐧𝐤𝐬.𝐢𝐨
7 min read · Sep 5, 2023


Meta AI recently introduced Code Llama, a refined version of Llama 2 tailored to assist with code-related tasks such as writing, testing, explaining, or completing code segments.

Because it is integrated into the Hugging Face transformers framework, getting started with Code Llama is straightforward.

This article will delve into the practical aspects of setting up Code Llama and utilizing prompts to achieve desired results.

Introduction to Code Llama

In essence, Code Llama is an iteration of Llama 2 trained on a vast dataset comprising 500 billion tokens of code. It comes in three flavors: a base model, a Python specialist (trained on 100 billion additional Python tokens), and an instruction fine-tuned version that can understand natural-language instructions.

The model is available in three sizes, with 7, 13, or 34 billion parameters, to cater to different use cases.

Code Llama supports multiple programming languages, including Python, C++, Java, PHP, C#, TypeScript, and Bash.

Source: https://huggingface.co/blog/codellama

Running Code Llama

Installation

To utilize Code Llama, you must first ensure that you have the latest version of the transformers package installed.

# Install the latest version of transformers, along with accelerate
pip install git+https://github.com/huggingface/transformers.git@main accelerate

First script

Then, you can execute the introductory script provided. This script loads the 7b-hf model, tailored for infilling and code completion tasks. It initiates a Python function called “fibonacci” and prompts the model to complete the code based solely on the function name.

from transformers import AutoTokenizer
import transformers
import torch

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
pipeline = transformers.pipeline(
    "text-generation",
    model="codellama/CodeLlama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

sequences = pipeline(
    'def fibonacci(',
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=100,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

It yields this outcome, demonstrating Code Llama’s ability to complete code.

Result: def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
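
Beyond completing code at the end of a prompt, the base model can also infill: given code with a gap in the middle, it predicts the missing span. Here is a minimal sketch using the <FILL_ME> placeholder supported by the Code Llama tokenizer in transformers (the function being completed is just an example):

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# <FILL_ME> marks the span the model should fill in
prompt = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(inputs["input_ids"], max_new_tokens=128)
# decode only the newly generated tokens, then splice them into the prompt
filling = tokenizer.batch_decode(output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
print(prompt.replace("<FILL_ME>", filling))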

Saving the model locally

Although the model downloaded via Hugging Face is stored in ~/.cache/huggingface/hub, saving it locally can be advantageous for potential deployment on another system. The following code snippets illustrate the process:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Download (or reuse the cached copy of) the tokenizer and model,
# then write both to local directories
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
tokenizer.save_pretrained('CodeLlama-7b-tokenizer')
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
model.save_pretrained('CodeLlama-7b-model')

Then, loading the local version can be done as follows:

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("./CodeLlama-7b-tokenizer")
model = AutoModelForCausalLM.from_pretrained("./CodeLlama-7b-model")

It’s worth noting that the 7b model requires substantial hardware resources: its full-precision weights alone occupy roughly 26 GB.
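
If your hardware is limited, loading the weights in half precision roughly halves that footprint. A minimal sketch, reusing the local directory saved above:

import torch
from transformers import AutoModelForCausalLM

# float16 uses ~2 bytes per parameter, so the 7b model needs
# roughly 13 GB of memory instead of ~26 GB in full precision
model = AutoModelForCausalLM.from_pretrained(
    "./CodeLlama-7b-model",
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)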

When deploying the model, ensure compliance with Meta’s licensing and acceptable use policy.

Prompt engineering with Code Llama

Similar to Llama 2, Code Llama is available as a chat version, which simplifies its integration into Gradio apps.
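
As a rough illustration (a minimal sketch, not an official example), wiring the instruct model into a Gradio app takes only a few lines:

import gradio as gr
import torch
import transformers

pipe = transformers.pipeline(
    "text-generation",
    model="codellama/CodeLlama-7b-Instruct-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

def answer(message):
    # wrap the user message in the Llama 2 instruction template
    prompt = f"[INST] {message} [/INST]"
    output = pipe(prompt, max_new_tokens=256)[0]["generated_text"]
    # the pipeline echoes the prompt, so keep only the reply
    return output[len(prompt):]

gr.Interface(fn=answer, inputs="text", outputs="text").launch()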

Code completion examples

The code completion playground (13B model) is available here: https://huggingface.co/spaces/codellama/codellama-playground

  • Sorting a dictionary by value

Let’s take the example of sorting a dictionary by ascending values. Initially, we provide only a function name and a brief comment explaining its intended purpose.

The result may not precisely match our expectations, as it returns a list of tuples instead of a dictionary, as displayed below:

$ python3
>>> d = {'Cat': 1, 'Dog': 10, 'Tiger': 7, 'Frog': 0}
>>> def sort(d):
...     #sort a dict by value ascendingly
...     return sorted(d.items(), key=lambda x: x[1])
...
>>> sorted_dict = sort(d)
>>> print(sorted_dict)
[('Frog', 0), ('Cat', 1), ('Tiger', 7), ('Dog', 10)]

However, by refining the prompt and providing a more detailed comment, we can achieve the desired outcome.

>>> d = {'Cat': 1, 'Dog': 10, 'Tiger': 7, 'Frog': 0}
>>> def sort(d):
...     #sort a dict by value in ascending order and return a dict
...     return {k: v for k, v in sorted(d.items(), key=lambda item: item[1])}
...
>>> sorted_dict = sort(d)
>>> print(sorted_dict)
{'Frog': 0, 'Cat': 1, 'Tiger': 7, 'Dog': 10}
  • Converting a tensor into an array

As we can see, Code Llama easily understands what we are expecting and even provides other functions related to what we want to do.
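
For instance, when asked to convert a PyTorch tensor into a NumPy array, it produces a conversion along these lines (a representative completion, not a verbatim transcript):

import torch
import numpy as np

def tensor_to_array(t: torch.Tensor) -> np.ndarray:
    # move to CPU in case the tensor lives on a GPU, and detach it
    # from the autograd graph before converting
    return t.detach().cpu().numpy()

print(tensor_to_array(torch.tensor([[1.0, 2.0], [3.0, 4.0]])))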

Prompt engineering with the chat version

Code Llama Instruct allows the user to chat with the model and ask any type of question. A playground with the 13B model is available here: https://huggingface.co/spaces/codellama/codellama-13b-chat
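
Note that if you run Code Llama Instruct locally rather than in the playground, it expects the same chat template as Llama 2. Here is a sketch of how to build such a prompt (the system and user strings are only examples); the result can be fed to the same text-generation pipeline used earlier:

# Llama 2 / Code Llama Instruct template: an optional system prompt
# between <<SYS>> tags, inside the [INST] ... [/INST] block
system = "Answer with Python code only."
user = "Write a function that returns the n-th Fibonacci number."
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"
print(prompt)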

Let’s ask the model what types of tasks it can perform and explore them one by one.

  • Code generation (see the sketch after this list)
  • Code optimization
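
For code generation, a plain natural-language request is enough. The snippet below is illustrative: the comment shows a prompt one might type, and the function is representative of what the model returns (actual output will vary):

# Prompt: "Write a Python function that checks whether a string is a palindrome."
def is_palindrome(s: str) -> bool:
    # normalize case and ignore non-alphanumeric characters
    cleaned = "".join(c.lower() for c in s if c.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome("A man, a plan, a canal: Panama"))  # True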

For code optimization, we can put Code Llama to the test. Here, we attempt to optimize a piece of code found online that sorts an array in ascending order.

arr = [5, 2, 8, 7, 1];
temp = 0;

#Sort the array in ascending order
for i in range(0, len(arr)):
    for j in range(i+1, len(arr)):
        if(arr[i] > arr[j]):
            temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;

While the result is decent, there’s room for improvement since a call to the sort function is sufficient to attain the same outcome.

>>> arr = [5, 2, 8, 7, 1];  
>>> arr.sort()
>>> print(arr)
[1, 2, 5, 7, 8]
  • Code debugging

Debugging is a critical aspect of coding. Code Llama proves its worth by helping resolve issues, as demonstrated below when dealing with Meta AI’s latest model, Seamless.

Using the answer solved my problem!

  • Code refactoring

Refactoring code involves reorganizing it to make it easier to read, without altering its behavior.
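
As a constructed illustration, asking Code Llama to refactor deeply nested conditionals typically yields a flattened version:

# before: nested conditionals that are hard to follow
def grade(score):
    if score >= 90:
        return "A"
    else:
        if score >= 80:
            return "B"
        else:
            if score >= 70:
                return "C"
            else:
                return "F"

# after: the flattened, early-return style a refactoring pass suggests
def grade(score):
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"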

  • Code documentation

Adding comments to code is essential for clarity and collaboration.
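
A constructed example of what this looks like: given a bare function, the model adds a docstring in a standard format:

# before: an undocumented helper
def transfer(source, target, amount):
    source.balance -= amount
    target.balance += amount

# after: the kind of documented version Code Llama produces
def transfer(source, target, amount):
    """Move `amount` from one account to another.

    Args:
        source: account to debit
        target: account to credit
        amount: sum to transfer; assumed to be positive
    """
    source.balance -= amount
    target.balance += amount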

  • Code testing

Code Llama can generate tests for your code, simplifying the testing process. Below is an example with dummy code.
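
A sketch of that workflow, with a hypothetical dummy function and the kind of pytest tests the model generates for it:

import pytest

# dummy function handed to Code Llama
def divide(a, b):
    if b == 0:
        raise ValueError("division by zero")
    return a / b

# representative tests generated for it
def test_divide_returns_quotient():
    assert divide(10, 2) == 5

def test_divide_by_zero_raises():
    with pytest.raises(ValueError):
        divide(10, 0)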

  • Code review

For code security, we can ask Code Llama to review code snippets and suggest improvements; the one reviewed below is vulnerable to injection.
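
A constructed example of such a review: the first function builds a shell command from user input, which makes it vulnerable to command injection; the second is the safer form a review typically suggests:

import subprocess

# vulnerable: user input is interpolated into a shell command
def ping(host):
    subprocess.run("ping -c 1 " + host, shell=True)

# safer: pass an argument list and avoid the shell entirely
def ping_safe(host):
    subprocess.run(["ping", "-c", "1", host], check=True)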

  • Code deobfuscation

Obfuscating code is a common way to make it difficult to reverse engineer, and it can be done easily with various online tools.

Obfuscation of a simple Hello World function in JavaScript:

// Function that writes Hello World in the console
function hi() {
    console.log("Hello World!");
}
hi();

Obfuscated version:

(function(_0x3964a5,_0x451bb7){var _0x3a7f60=_0x3433,_0x4a67d7=_0x3964a5();while(!![]){try{var _0x615014=parseInt(_0x3a7f60(0x145))/0x1+parseInt(_0x3a7f60(0x13f))/0x2*(parseInt(_0x3a7f60(0x149))/0x3)+-parseInt(_0x3a7f60(0x147))/0x4+parseInt(_0x3a7f60(0x141))/0x5+-parseInt(_0x3a7f60(0x140))/0x6*(-parseInt(_0x3a7f60(0x148))/0x7)+parseInt(_0x3a7f60(0x13e))/0x8*(-parseInt(_0x3a7f60(0x142))/0x9)+parseInt(_0x3a7f60(0x13d))/0xa*(-parseInt(_0x3a7f60(0x144))/0xb);if(_0x615014===_0x451bb7)break;else _0x4a67d7'push';}catch(_0x5d4d95){_0x4a67d7'push';}}}(_0x33e8,0x72367));function hi(){var _0x36a6d0=_0x3433;console_0x36a6d0(0x146);}hi();function _0x3433(_0x160f95,_0x23bc4c){var _0x33e8fc=_0x33e8();return _0x3433=function(_0x34330e,_0x2149d7){_0x34330e=_0x34330e-0x13d;var _0x231475=_0x33e8fc[_0x34330e];return _0x231475;},_0x3433(_0x160f95,_0x23bc4c);}function _0x33e8(){var _0x1e9f95=['73251rqFDYw','log','269804tkXjoo','819chWzPO','137589MePSzl','560lcpmOW','8KqSlob','22YeQfvj','2526GkLoFg','57095ZGERoE','922338TqGFIR','Hello\x20World!','132CSoWjo'];_0x33e8=function(){return _0x1e9f95;};return _0x33e8();}

If we ask Code Llama to deobfuscate it, it takes two steps to yield a positive outcome:

This is not exactly the original function, but we can at least get an idea of what the code intends to do.

Using the 34b chat model

For those seeking even more power and capabilities, the 34B chat model is available on the Hugging Face website: https://huggingface.co/chat

Select the codellama/CodeLlama-34b-Instruct-hf model and then start prompting it.

Conclusion

Code Llama is a versatile AI model with significant potential in the coding realm.

Whether you aim to streamline your coding tasks, enhance code quality, or simply learn more about programming, Code Llama offers valuable capabilities.

I gave it a try multiple times, and it often yielded better results compared to searching on regular search engines. In my opinion, Code Llama is a valuable tool for developers, and it can potentially make their work more efficient.

If this article has been informative, please show your support by clapping for this story or leaving a comment below.

#AI #CodeLlama #LLM
