How to Use a Hugging Face API Token in Python for AI Applications: Step-by-Step 🐾

Alexander Roman
11 min read · Sep 23, 2023


Learn how to use the Hugging Face Inference API to set up your AI application prototypes 🤗. Computer vision & NLP tasks.

1. INTRODUCTION

Hugging Face’s API token is a useful tool for developing AI applications. It helps with Natural Language Processing and Computer Vision tasks, among others. This article provides a step-by-step guide on obtaining and using an Inference API token from Hugging Face, which is free to use, for tasks such as object detection and translation.

The article is organized into five sections:

  • Section 1: A brief description that motivates this article.
  • Section 2: An explanation of what the Hugging Face Inference API is.
  • Section 3: How to get an API token.
  • Section 4: How to use the Hugging Face Inference API.
  • Section 5: Findings emphasizing the importance of the Inference API, along with ideas for future work.

2. INFERENCE API HUGGING FACE

2.1. What’s an API?

An API, which stands for Application Programming Interface, allows software applications to communicate with each other. In other words, they can ask for information and share data. This helps apps work together and do tasks like sending messages or getting information from the internet.
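As a minimal sketch (the endpoint below is hypothetical, purely for illustration), this is what one application asking another for data over HTTP looks like in Python:

import requests

# Hypothetical endpoint; any public JSON API works the same way.
response = requests.get("https://api.example.com/weather?city=Lima")

# The API answers with structured data that the calling app can use.
print(response.json())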

2.2. How does an ML/DL API work?

Figure 1 illustrates the architecture of an ML/DL API. The flow starts with a request containing the input data (an image, text, voice, etc., depending on your task) sent from a software application to the API. The API then forwards this input data to the ML/DL model application. Once the ML/DL model has processed the input data, it generates predictions and sends a response containing the prediction as output back to the software application.

Figure 1: Architecture of ML/DL model

The Hugging Face Inference API works in exactly this way.

2.3. What is Inference API?

The Hugging Face Inference API is a free service that lets users run models hosted on Hugging Face for many different tasks, as shown in Figure 2. It is well suited for prototyping applications.

Figure 2: Tasks in Hugging Face

To use the Hugging Face Inference API, users only have to send an HTTP request with their input data. The API then returns the predicted output data, as shown in Figure 1.

If you need an inference solution for production, check out Inference Endpoints.

3. OBTAINING HUGGING FACE API TOKEN

To obtain your Hugging Face API token, you first need to create a Hugging Face account.

3.1. Create a Hugging Face account

Step 1: Fill in the fields shown in Figure 3, then click Next.

Figure 3: Form part 1

Step 2: Fill in the following fields and check “I have read and agree with the Terms of Service and the Code of Conduct”, as shown in Figure 4.

  • Username: The name displayed in your Hugging Face account.
  • Full name: Your full name (obviously! 💁).
  • Avatar: Your profile photo (optional).
  • GitHub username: If you have a GitHub account and want others to know it (optional).
  • Homepage: If you have a personal page (it could be your Medium account) and want others to know it (optional).
  • Twitter username: If you have a Twitter account and want others to know it (optional).
  • Research interests: A specific topic, area, or subject you are genuinely curious about (optional).
Figure 4: Form part 2

Step 3: Finally, you need to confirm your account by clicking on the confirmation link sent to your email, as indicated in Figure 5.

The process is then complete, as shown in Figure 6.

Figure 5: Check email view
Figure 6: email address verified

3.2. Obtaining Access Token

Once you’ve completed the previous steps, you are ready to get the token.

Step 1: Click on Settings.

Figure 7: Setting

Step 2: Click on Access Tokens.

Figure 8: Access Tokens

Step 3: Click on New Token.

Figure 9: New Token

Step 4: Edit the following fields, as illustrated in Figure 10, and click on Generate a token.

  • Name: The name of your token.
  • Role: Set this field to read, because you only need to run inference on a model hosted on the Hugging Face Hub.
Figure 10: Create a new access token

Step 5: Finally, you have successfully created your first Access Token. Congratulations 😄!!! You will use it in the next section.

Figure 11: Token created

You can copy your access token by clicking on the icon located to the right.
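A small tip of my own (not required by the tutorial): avoid hard-coding the token in notebooks you plan to share. A minimal sketch using an environment variable, assuming you have set HF_TOKEN beforehand:

import os

# Read the token from an environment variable instead of pasting it
# directly into the notebook; fall back to the placeholder otherwise.
token_access = os.environ.get("HF_TOKEN", "<your token access>")
headers = {"Authorization": f"Bearer {token_access}"}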

4. USING HUGGING FACE API TOKEN

In this section, we are going to code in Python using Google Colab.

4.1. What models will we use?

We will use two models hosted on the Hugging Face Hub: facebook/detr-resnet-50 for the object detection task and Helsinki-NLP/opus-mt-en-es for the English-to-Spanish translation task.

4.2. Object detection task

Step 1: Import the libraries that we will use

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

Why these libraries?

  • json: It parses the predictions (output) returned by the API from JSON into Python objects.
  • requests: It allows us to send the data (input) to the API.
  • time: It helps prevent overloading the Inference API with a large number of requests.
  • cv2: It helps us read the image, draw rectangles on it, and put text on it.
  • from google.colab.patches import cv2_imshow: It helps us display the image, because Google Colab doesn’t support the cv2.imshow function.

Step 2: Download the image in Figure 12 as savanna.jpg.

Figure 12: Input Image

Step 3: Set up our access token.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}
  • Paste your access token in place of <your token access>.

Step 4: Set up the API_URL of the specific model.

Figure 13: Inference API in DETR
  • Copy the API_URL, as shown in Figure 14.
Figure 14: API URL of DETR-RESNET-50
  • Paste the API_URL into our code.
import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

Step 5: Call the API

  • Open Detailed parameters (huggingface.co).
  • Search for the object detection task using F3 (the browser’s find function). Why object detection? Figure 15 shows “object detection” above the model card view.
Figure 15: Call the API
  • Copy the example code starting from def query(filename): and paste it into our Google Colab.

You can change the input image in the example code. In this case, write savanna.jpg to detect objects.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    response = requests.request("POST", API_URL, headers=headers, data=data)

    return json.loads(response.content.decode("utf-8"))


data = query("savanna.jpg")

How can we provide input to the API?

  • Using data = query("<Our input>").
  • query → The function that sends the data to the model and returns the model’s prediction (output). It only accepts an image as input.
  • <Our input> → An image in “.jpg” or “.png” format.

How does the query(filename) function work?

  • with open(filename, "rb") as f: → Reads the image file in binary mode.
  • data = f.read() → Stores the image bytes in the variable called data.
  • response = requests.request("POST", API_URL, headers=headers, data=data) → This is the process from Figure 1: it sends the data to the API (API_URL) using the POST method. The headers carry our authorization, and data is the input.
  • return json.loads(response.content.decode("utf-8")) → Converts the JSON response (output) into Python objects.
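Optionally, since the parsed response is a regular Python object, you can pretty-print it to inspect its structure more comfortably (a small sketch of my own, not part of the original steps):

import json

# Pretty-print the parsed response with indentation for easier reading.
print(json.dumps(data, indent=2))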

Step 6: Handle errors.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))

data = query("savanna.jpg")

The Inference API often returns errors during this process (for example, while the model is still loading). For this reason, the loop keeps retrying the request until it succeeds.
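Note that an infinite retry loop will hang forever on a permanent error, such as an invalid token. As an optional variant of my own (the retry limit and backoff schedule are arbitrary choices, not part of the original tutorial), here is a sketch with a maximum number of attempts and exponential backoff:

def query_with_backoff(filename, max_retries=5):
    with open(filename, "rb") as f:
        data = f.read()

    for attempt in range(max_retries):
        response = requests.request("POST", API_URL, headers=headers, data=data)
        if response.ok:
            return response.json()
        # Wait longer after each failed attempt: 1s, 2s, 4s, 8s, ...
        time.sleep(2 ** attempt)

    raise RuntimeError(f"Request failed after {max_retries} attempts: {response.text}")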

Step 7: Print the output data.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))

data = query("savanna.jpg")

print(data)

The results will be:
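The exact values depend on the model run, but for this image the output is a list of detections, each with a label, a confidence score, and a bounding box (the numbers below are illustrative only):

[
    {
        "score": 0.99,
        "label": "zebra",
        "box": {"xmin": 52, "ymin": 121, "xmax": 345, "ymax": 410}
    },
    ...
]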

Step 8: Show the image without the results

from google.colab.patches import cv2_imshow
import cv2

image = cv2.imread("savanna.jpg")

# Show the image
cv2_imshow(image)

Step 9: Show the image with the results

for result in data:
    box = result["box"]
    xmin = box["xmin"]
    ymin = box["ymin"]
    xmax = box["xmax"]
    ymax = box["ymax"]
    label = result["label"]

    # Draw the bounding box from its top-left to its bottom-right corner.
    cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)

    # Draw the label just above the box.
    cv2.putText(image, label, (xmin, ymin - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)

# Show the image
cv2_imshow(image)
Figure 16: Image with results
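Each detection also includes a confidence score, so as an optional extension of my own (the 0.9 threshold is an arbitrary choice), you can draw only the boxes the model is confident about:

# Keep only detections with a high confidence score.
confident = [result for result in data if result["score"] > 0.9]

image = cv2.imread("savanna.jpg")  # Reload the clean image

for result in confident:
    box = result["box"]
    cv2.rectangle(image, (box["xmin"], box["ymin"]), (box["xmax"], box["ymax"]), (0, 255, 0), 2)

# Show the image
cv2_imshow(image)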

4.3. Translation task

Step 1: Import the libraries that we will use.

import json
import requests
import time

Why these libraries?

  • json: It converts our input into JSON and parses the predictions (output) returned by the API.
  • requests: It allows us to send the data (input) to the API.
  • time: It helps prevent overloading the Inference API with a large number of requests.

Step 2: Set up our access token.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}
  • Paste your access token in place of <your token access>.

Step 3: Set up the API_URL of the specific model.

Figure 17: Inference API OPUS
  • Copy the API_URL, as shown in Figure 18.
Figure 18: API URL
  • Paste the API_URL into our code.
import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

Step 4: Call the API

  • Open Detailed parameters (huggingface.co).
  • Search for the translation task using F3 (the browser’s find function). Why translation? Figure 19 shows “translation” above the model card view.
Figure 19: Translation task
  • Copy the example code from def query(payload): up to (but not including) the #Response comment, and paste it into our Google Colab.

You can change the input text in the example code. In this case, write “Hello I’m in Alexander’s tutorial” to translate it into Spanish.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

How can we provide input to the API?

  • Using data = query({"inputs": "<Our input>"}).
  • query → The function that sends the data to the model and returns the model’s prediction (output). It only accepts a dictionary as input.
  • {"inputs": "<Our input>"} → The dictionary that stores our text input.
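As far as I know, the Inference API for text tasks also accepts a list of strings under "inputs", which would let you translate several sentences in one request. Treat this sketch as an assumption and verify it against the Detailed parameters page:

# Assumption: "inputs" may also be a list of strings for batch translation.
data = query(
    {
        "inputs": [
            "Hello I'm in Alexander's tutorial",
            "The Inference API is great for prototyping",
        ],
    }
)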

How does the query(payload) function work?

  • data = json.dumps(payload) → Converts a Python data structure (in this case, a dictionary) into a JSON-formatted string.
  • response = requests.request("POST", API_URL, headers=headers, data=data) → This is the process from Figure 1: it sends the data to the API (API_URL) using the POST method. The headers carry our authorization, and data is the input.
  • return json.loads(response.content.decode("utf-8")) → Converts the JSON response (output) into Python objects.

Step 5: Handle errors.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

As before, the Inference API often returns errors during this process, so the loop keeps retrying the request until it succeeds.

Step 6: Print the output data.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

print(data)

The results will be:
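The translation pipeline returns a list with a translation_text field. The exact Spanish wording depends on the model, but the output will look something like:

[{'translation_text': "Hola, estoy en el tutorial de Alexander."}]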

5. CONCLUSION AND RECOMMENDATION

The Hugging Face Inference API token is a useful tool for testing pre-trained models and building AI prototypes. As future work, we recommend exploring the different tasks shown in this article and developing a prototype AI application. This will help you gain a better understanding of these technologies and how they can be used in real-world applications.

Great, you made it to the end! Thanks for reading this article :) You can view the full code here: https://colab.research.google.com/drive/1m_rzmAXEScichHqHBGA-IMzyblE4ekE-?usp=sharing.

If you would like to connect with me on LinkedIn, you can find me at: Alexander Daniel Roman Gabriel | LinkedIn

If you’d like to stay updated on when my next article will be published, you can follow me on Instagram: @alexander_romang


Alexander Roman

Machine Learning Engineer. I enjoy discussing MLOps, NLP & chatbots. Follow me at: https://www.linkedin.com/in/alexanderdroman/