Getting Started with Llama 2: Accessing the Llama-2-70b Model, Obtaining a Hugging Face API Token, and Running the Model with Petals

Yash Ambekar
4 min read · Jul 21, 2023


Introduction

Welcome to the exciting world of AI and machine learning! In this blog post, we will guide you through the process of creating an account on Meta, accessing the Llama2 model, setting up an account on Hugging Face to obtain an API token, and understanding the role of Petals. We will then provide a step-by-step guide on how to run the Llama2 model in Google Colab using Petals and Transformers. Let’s dive in!

Step 1: Requesting Llama 2 Access from Meta

  1. Visit Meta’s Llama website at ai.meta.com/llama and click the button to request access to the model.
  2. Fill in the required information, including your name, email address, and organization. Use the same email address that you plan to use for your Hugging Face account.
  3. Accept the Llama 2 license terms and submit the form; Meta will notify you by email once access is granted.

Step 2: Accessing the Llama2 Model

  1. Once Meta approves your request, you can use the Llama 2 weights, including the 70B chat variant used in this post.
  2. The model is also hosted on the Hugging Face Hub as meta-llama/Llama-2-70b-chat-hf. That repository is gated, so you will also need to accept the license on its model page, using the Hugging Face account we set up in the next step.
  3. Review the model card and documentation for details such as the available sizes and the chat prompt format.

Step 3: Creating an Account on Hugging Face

  1. Open a new tab and visit the Hugging Face website at huggingface.co.
  2. Click on the “Sign In” button located at the top right corner of the page.
  3. If you don’t have an account, click on the “Sign Up” button and follow the registration process.
  4. Once you have created your Hugging Face account, log in to the platform.

Step 4: Obtaining the Hugging Face API Token

  1. After logging in to Hugging Face, click on your profile picture at the top right corner of the page.
  2. Select “Settings” from the dropdown menu and open the “Access Tokens” section.
  3. Click on the “New token” button.
  4. Give your token a name, choose the “read” role, and generate the token.
  5. Copy the generated API token. This token will be used to authenticate your requests to the Hugging Face Hub, including downloads of the gated Llama 2 files.
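If you prefer to do this from Python instead of the command line used later on, you can authenticate with the same token via the huggingface_hub library. A minimal sketch (the token string below is a placeholder for the token you just copied):

from huggingface_hub import login

# Log in with your Hugging Face access token (placeholder value shown here)
login(token="hf_xxxxxxxxxxxxxxxxxxxx")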

Understanding Petals:

Petals is an open-source system for running large language models collaboratively, BitTorrent-style: the model’s transformer blocks are hosted by volunteers across the Internet, while your client only runs the small input and output layers locally. On top of this, Petals exposes a Transformers-like API, so you can load a distributed model and generate text, answer questions, or perform other natural language processing tasks much as you would with a regular Hugging Face model.

https://github.com/bigscience-workshop/petals

Getting started with Petals

This section will guide you through the basics of Petals, a system for inference and fine-tuning of 100B+ language models without the need for high-end GPUs. With Petals, you pool compute resources with other people over the Internet and can run large language models such as LLaMA, Guanaco, or BLOOM right from your desktop computer or Google Colab.
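The same package also lets you contribute compute to the public swarm rather than only consume it. The Petals repository documents a server mode for this; a minimal sketch of the command (the model name is a placeholder; check the Petals README for the currently supported models and flags):

python -m petals.cli.run_server MODEL_NAME_HERE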

Step 5: Running the Llama2 Model in Google Colab using Petals and Transformers

  1. Download this Colab notebook.
  2. Open Google Colab at colab.research.google.com.
  3. Create a new Python notebook or open an existing one.

Let’s get started! First, install the Petals package:

%pip install petals

Log in to Hugging Face with the API token from Step 4 so that the gated Llama 2 files can be downloaded (replace YOUR_TOKEN_HERE with your token):

!huggingface-cli login --token YOUR_TOKEN_HERE

Loading the distributed model 🚀:

import torch
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "meta-llama/Llama-2-70b-chat-hf"
# You could also use "meta-llama/Llama-2-70b-hf", "enoch/llama-65b-hf", or
# "bigscience/bloom" - basically, any Hugging Face Hub repo with a supported model architecture

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, add_bos_token=False)

# Only the small client-side parts (embeddings and the LM head) are loaded locally;
# the transformer blocks themselves run on remote servers in the public Petals swarm.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)
model = model.cuda()  # move the local parts to the GPU (use a GPU runtime in Colab)

Generating text:

inputs = tokenizer('generate a poem on indian food', return_tensors="pt")["input_ids"].cuda()
outputs = model.generate(inputs, max_new_tokens=100)  # greedy decoding by default
print(tokenizer.decode(outputs[0]))

Output:

generate a poem on indian food.

Indian food, a symphony of flavors,
A feast for the senses, a culinary delight.
From spicy curries to creamy kormas,
A journey of taste, a treat for the palate.

The aroma of basmati, a fragrant delight,
The flavors of cardamom, a sweet and savory blend.
The richness of ghee, a buttery
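The generation above stops after 100 new tokens, which is why the poem is cut off mid-line. If you want more varied output, generate() also accepts the usual Hugging Face sampling arguments such as do_sample, temperature, and top_p, which Petals supports. A small sketch reusing the model, tokenizer, and inputs from above (the sampling values are only illustrative):

# Sampling-based generation, reusing the objects defined above
outputs = model.generate(
    inputs,
    max_new_tokens=100,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.7,  # lower values give more focused text
    top_p=0.9,        # nucleus sampling
)
print(tokenizer.decode(outputs[0]))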

Original Petals Colab Notebook

Thank you for joining me on this exciting journey exploring the incredible capabilities of Llama2 and Petals. If you’re as fascinated by language processing as I am, I invite you to follow me on my social media page. Let’s connect, share ideas, and contribute to the advancement of this amazing technology together. Stay updated with the latest developments and join the conversation. Together, we can unlock the full potential of language models! #Llama2 #Petals #LanguageProcessing #JoinTheConversation

Don’t forget to follow me on #LinkedIn #Twitter #Instagram for more insights and discussions. Let’s shape the future of AI and language processing!
