How to Use a Hugging Face API Token in Python for AI Applications: Step-by-Step 🐾

Alexander Roman
11 min read · Sep 23, 2023


Learn how to use the Hugging Face Inference API to set up your AI application prototypes 🤗. Computer vision & NLP tasks.

1. INTRODUCTION

Hugging Face’s API token is a useful tool for developing AI applications. It helps with Natural Language Processing and Computer Vision tasks, among others. This article provides a step-by-step guide on obtaining and using an Inference API token from Hugging Face, which is free to use, for tasks such as object detection and translation.

The article is organized into five sections:

  • Section 1: A brief description that motivates this article.
  • Section 2: An explanation of what the Hugging Face Inference API is.
  • Section 3: How to get an API token.
  • Section 4: How to use the Hugging Face Inference API.
  • Section 5: Findings emphasizing the importance of the Inference API, along with ideas for future work.

2. INFERENCE API HUGGING FACE

2.1. What’s an API?

An API, which stands for Application Programming Interface, allows software applications to communicate with each other. In other words, they can ask for information and share data. This helps apps work together and do tasks like sending messages or getting information from the internet.
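As a minimal sketch (the endpoint below is hypothetical, purely for illustration), this is what one application asking another for data over HTTP looks like in Python:

import requests

# Hypothetical endpoint; any public JSON API works the same way.
response = requests.get("https://api.example.com/weather?city=Lima")

# The API answers with structured data that the calling app can use.
print(response.json())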

2.2. How does an ML/DL API work?

Figure 1 illustrates the architecture of an ML/DL API. The flow starts with a request containing the input data (an image, text, voice, etc., depending on your task) sent from a software application to the API. The API then forwards this input data to the ML/DL model application. Once the ML/DL model has processed the input data, it generates predictions and sends a response containing the prediction as output back to the software application.

Figure 1: Architecture of ML/DL model

The Hugging Face Inference API works in exactly this way.

2.3. What is Inference API?

The Hugging Face Inference API is a free service that lets users run models hosted on Hugging Face for many different tasks, as shown in Figure 2. It is well suited for prototyping applications.

Figure 2: Tasks in Hugging Face

To use the Hugging Face Inference API, users only have to send an HTTP request with their input data. The API then returns the predicted output data, as shown in Figure 1.

If you need an inference solution for production, check out Inference Endpoints.

3. OBTAINING HUGGING FACE API TOKEN

To obtain your Hugging Face API token, you first need to create a Hugging Face account.

3.1. Create a Hugging Face account

Step 1: Fill in the fields shown in Figure 3, then click Next.

Figure 3: Form part 1

Step 2: Fill in the following fields and check “I have read and agree with the Terms of Service and the Code of Conduct”, as shown in Figure 4.

  • Username: The name displayed in your Hugging Face account.
  • Full name: Your full name (obviously! 💁).
  • Avatar: Your profile photo (optional).
  • GitHub username: If you have a GitHub account and want others to know it (optional).
  • Homepage: If you have a personal page (it could be your Medium account) and want others to know it (optional).
  • Twitter username: If you have a Twitter account and want others to know it (optional).
  • Research interests: A specific topic, area, or subject you are genuinely curious about (optional).
Figure 4: Form part 2

Step 3: Finally, you need to confirm your account by clicking on the confirmation link sent to your email, as indicated in Figure 5.

The process is then complete, as shown in Figure 6.

Figure 5: Check email view
Figure 6: email address verified

3.2. Obtaining Access Token

Once you’ve completed the previous steps, you are ready to get the token.

Step 1: Click on Settings.

Figure 7: Setting

Step 2: Click on Access Tokens.

Figure 8: Access Tokens

Step 3: Click on New Token.

Figure 9: New Token

Step 4: Edit the following fields, as illustrated in Figure 10, and click on Generate a token.

  • Name: The name of your token.
  • Role: Set this field to read, because you only need to run inference on a model hosted on the Hugging Face Hub.
Figure 10: Create a new access token

Step 5: Finally, you have successfully created your first Access Token. Congratulations 😄!!! You will use it in the next section.

Figure 11: Token created

You can copy your access token by clicking on the icon located to the right.
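A small tip of my own (not required by the tutorial): avoid hard-coding the token in notebooks you plan to share. A minimal sketch using an environment variable, assuming you have set HF_TOKEN beforehand:

import os

# Read the token from an environment variable instead of pasting it
# directly into the notebook; fall back to the placeholder otherwise.
token_access = os.environ.get("HF_TOKEN", "<your token access>")
headers = {"Authorization": f"Bearer {token_access}"}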

4. USING HUGGING FACE API TOKEN

In this section, we are going to code in Python using Google Colab.

4.1. What models will we use?

We will use two models hosted on the Hugging Face Hub: facebook/detr-resnet-50 for the object detection task and Helsinki-NLP/opus-mt-en-es for the English-to-Spanish translation task.

4.2. Object detection task

Step 1: Import the libraries that we will use

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

Why these libraries?

  • json: It parses the predictions (output) returned by the API from JSON into Python objects.
  • requests: It allows us to send the data (input) to the API.
  • time: It helps prevent overloading the Inference API with a large number of requests.
  • cv2: It helps us read the image, draw rectangles on it, and put text on it.
  • from google.colab.patches import cv2_imshow: It helps us display the image, because Google Colab doesn’t support the cv2.imshow function.

Step 2: Download the image in Figure 12 as savanna.jpg.

Figure 12: Input Image

Step 3: Set up our access token.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}
  • Paste your access token in place of <your token access>.

Step 4: Set up the API_URL of the specific model.

Figure 13: Inference API in DETR
  • Copy the API_URL, as shown in Figure 14.
Figure 14: API URL of DETR-RESNET-50
  • Paste the API_URL into our code.
import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

Step 5: Call the API

  • Open Detailed parameters (huggingface.co).
  • Search for the object detection task using F3 (the browser’s find function). Why object detection? Figure 15 shows “object detection” above the model card view.
Figure 15: Call the API
  • Copy the example code starting from def query(filename): and paste it into our Google Colab.

You can change the input image in the example code. In this case, write savanna.jpg to detect objects.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    response = requests.request("POST", API_URL, headers=headers, data=data)

    return json.loads(response.content.decode("utf-8"))


data = query("savanna.jpg")

How can we provide input to the API?

  • Using data = query("<Our input>").
  • query → The function that sends the data to the model and returns the model’s prediction (output). It only accepts an image as input.
  • <Our input> → An image in “.jpg” or “.png” format.

How does the query(filename) function work?

  • with open(filename, "rb") as f: → Reads the image file in binary mode.
  • data = f.read() → Stores the image bytes in the variable called data.
  • response = requests.request("POST", API_URL, headers=headers, data=data) → This is the process from Figure 1: it sends the data to the API (API_URL) using the POST method. The headers carry our authorization, and data is the input.
  • return json.loads(response.content.decode("utf-8")) → Converts the JSON response (output) into Python objects.
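Optionally, since the parsed response is a regular Python object, you can pretty-print it to inspect its structure more comfortably (a small sketch of my own, not part of the original steps):

import json

# Pretty-print the parsed response with indentation for easier reading.
print(json.dumps(data, indent=2))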

Step 6: Handle errors.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))

data = query("savanna.jpg")

The Inference API often returns errors during this process (for example, while the model is still loading). For this reason, the loop keeps retrying the request until it succeeds.
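Note that an infinite retry loop will hang forever on a permanent error, such as an invalid token. As an optional variant of my own (the retry limit and backoff schedule are arbitrary choices, not part of the original tutorial), here is a sketch with a maximum number of attempts and exponential backoff:

def query_with_backoff(filename, max_retries=5):
    with open(filename, "rb") as f:
        data = f.read()

    for attempt in range(max_retries):
        response = requests.request("POST", API_URL, headers=headers, data=data)
        if response.ok:
            return response.json()
        # Wait longer after each failed attempt: 1s, 2s, 4s, 8s, ...
        time.sleep(2 ** attempt)

    raise RuntimeError(f"Request failed after {max_retries} attempts: {response.text}")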

Step 7: Print the output data.

import json
import requests
import time
import cv2
from google.colab.patches import cv2_imshow

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/facebook/detr-resnet-50"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))

data = query("savanna.jpg")

print(data)

The results will be:
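The exact values depend on the model run, but for this image the output is a list of detections, each with a label, a confidence score, and a bounding box (the numbers below are illustrative only):

[
    {
        "score": 0.99,
        "label": "zebra",
        "box": {"xmin": 52, "ymin": 121, "xmax": 345, "ymax": 410}
    },
    ...
]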

Step 8: Show the image without the results

from google.colab.patches import cv2_imshow
import cv2

image = cv2.imread("savanna.jpg")

# Show the image
cv2_imshow(image)

Step 9: Show the image with the results

for result in data:
    box = result["box"]
    xmin = box["xmin"]
    ymin = box["ymin"]
    xmax = box["xmax"]
    ymax = box["ymax"]
    label = result["label"]

    # Draw the bounding box from its top-left to its bottom-right corner.
    cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)

    # Draw the label just above the box.
    cv2.putText(image, label, (xmin, ymin - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)

# Show the image
cv2_imshow(image)
Figure 16: Image with results
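Each detection also includes a confidence score, so as an optional extension of my own (the 0.9 threshold is an arbitrary choice), you can draw only the boxes the model is confident about:

# Keep only detections with a high confidence score.
confident = [result for result in data if result["score"] > 0.9]

image = cv2.imread("savanna.jpg")  # Reload the clean image

for result in confident:
    box = result["box"]
    cv2.rectangle(image, (box["xmin"], box["ymin"]), (box["xmax"], box["ymax"]), (0, 255, 0), 2)

# Show the image
cv2_imshow(image)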

4.3. Translation task

Step 1: Import the libraries that we will use.

import json
import requests
import time

Why these libraries?

  • json: It converts our input into JSON and parses the predictions (output) returned by the API.
  • requests: It allows us to send the data (input) to the API.
  • time: It helps prevent overloading the Inference API with a large number of requests.

Step 2: Set up our access token.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}
  • Paste your access token in place of <your token access>.

Step 3: Set up the API_URL of the specific model.

Figure 17: Inference API OPUS
  • Copy the API_URL, as shown in Figure 18.
Figure 18: API URL
  • Paste the API_URL into our code.
import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

Step 4: Call the API

  • Open Detailed parameters (huggingface.co).
  • Search for the translation task using F3 (the browser’s find function). Why translation? Figure 19 shows “translation” above the model card view.
Figure 19: Translation task
  • Copy the example code from def query(payload): up to (but not including) the #Response comment, and paste it into our Google Colab.

You can change the input text in the example code. In this case, write “Hello I’m in Alexander’s tutorial” to translate it into Spanish.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

How can we provide input to the API?

  • Using data = query({"inputs": "<Our input>"}).
  • query → The function that sends the data to the model and returns the model’s prediction (output). It only accepts a dictionary as input.
  • {"inputs": "<Our input>"} → The dictionary that stores our text input.
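As far as I know, the Inference API for text tasks also accepts a list of strings under "inputs", which would let you translate several sentences in one request. Treat this sketch as an assumption and verify it against the Detailed parameters page:

# Assumption: "inputs" may also be a list of strings for batch translation.
data = query(
    {
        "inputs": [
            "Hello I'm in Alexander's tutorial",
            "The Inference API is great for prototyping",
        ],
    }
)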

How does the query(payload) function work?

  • data = json.dumps(payload) → Converts a Python data structure (in this case, a dictionary) into a JSON-formatted string.
  • response = requests.request("POST", API_URL, headers=headers, data=data) → This is the process from Figure 1: it sends the data to the API (API_URL) using the POST method. The headers carry our authorization, and data is the input.
  • return json.loads(response.content.decode("utf-8")) → Converts the JSON response (output) into Python objects.

Step 5: Handle errors.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

As before, the Inference API often returns errors during this process, so the loop keeps retrying the request until it succeeds.

Step 6: Print the output data.

import json
import requests
import time

token_access = "<your token access>"
headers = {"Authorization": f"Bearer {token_access}"}

API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-en-es"

def query(payload):
    data = json.dumps(payload)

    while True:
        try:
            time.sleep(1)  # Wait between attempts to avoid flooding the API
            response = requests.request("POST", API_URL, headers=headers, data=data)
            # Raise an exception on HTTP errors (e.g., 503 while the model
            # is still loading) so the loop retries them as well.
            response.raise_for_status()
            break
        except Exception:
            continue

    return json.loads(response.content.decode("utf-8"))


data = query(
    {
        "inputs": "Hello I'm in Alexander's tutorial",
    }
)

print(data)

The results will be:
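The translation pipeline returns a list with a translation_text field. The exact Spanish wording depends on the model, but the output will look something like:

[{'translation_text': "Hola, estoy en el tutorial de Alexander."}]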

5. CONCLUSION AND RECOMMENDATION

The Hugging Face Inference API token is a useful tool for testing pre-trained models and building AI prototypes. As future work, we recommend exploring the different tasks shown in this article and developing a prototype AI application. This will help you gain a better understanding of these technologies and how they can be used in real-world applications.

Great, you made it to the end! Thanks for reading this article :) You can view the full code here: https://colab.research.google.com/drive/1m_rzmAXEScichHqHBGA-IMzyblE4ekE-?usp=sharing.

If you would like to connect with me on LinkedIn, you can find me at: Alexander Daniel Roman Gabriel | LinkedIn

If you’d like to stay updated on when my next article will be published, you can follow me on Instagram: @alexander_romang


Alexander Roman

Machine Learning Engineer. I enjoy discussing MLOps, NLP & chatbots. Follow me at: https://www.linkedin.com/in/alexanderdroman/