Photo by Hal Gatewood on Unsplash

Fast Prototyping with Hugging Face’s Inference API 🤗

Build AI-based applications with Hugging Face in a few minutes

Marcello Politi
6 min read · May 4, 2023


Introduction

The boom in Large Language Models has brought many people, not just data scientists, closer to the world of AI. Everyone is using these models in their applications, and you certainly don’t have to be a Machine Learning expert to do so: you just need to invoke APIs backed by pre-trained models and consume their output. So let’s see how to create and use APIs that let you run real-time inference on real data with models created by experts, and that can be integrated into more complex applications.
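As a minimal sketch of what "invoking an API backed by a pre-trained model" can look like, here is a small Python snippet that sends text to Hugging Face's hosted Inference API with the `requests` library. The model name is just an illustrative choice, and the `hf_...` token is a placeholder you would replace with your own access token:

```python
import requests

# Illustrative model choice: a sentiment-analysis model hosted on the Hub.
API_URL = (
    "https://api-inference.huggingface.co/models/"
    "distilbert-base-uncased-finetuned-sst-2-english"
)


def query(payload: dict, token: str):
    """POST the payload to the Inference API and return the parsed JSON."""
    headers = {"Authorization": f"Bearer {token}"}
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()  # surface HTTP errors instead of silent failures
    return response.json()


# Example call (requires a valid Hugging Face access token):
# query({"inputs": "I love fast prototyping!"}, token="hf_...")
```

The same pattern works for any model on the Hub: only the model id in the URL and the shape of the payload change.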

LLMs are huge models with billions of parameters, trained on vast amounts of textual data (e.g., much of the World Wide Web), which they learn to process in order to solve tasks. One of the most curious things about these models is that they exhibit emergent capabilities: skills they were never explicitly trained for. For example, they can solve small math or logic problems without having been specifically trained on them, or interact with the user as in a chat room.

Many organizations are only now approaching the world of AI and want to use Machine Learning capabilities to create products or offer new services. Often to create a valuable
