🤯AI: Embeddings

Zoiner Tejada
2 min read · Mar 16, 2023


Turn text into coordinates.

[This article is a part of the 🤯AI series]

What are embeddings?

An embedding is an information-dense representation of the semantic meaning of a piece of text.

It is simply a vector of floating-point numbers, such that the distance between two embeddings in the vector space correlates with the semantic similarity between the two original inputs.
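
To make "distance" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The function name and the tiny three-dimensional vectors are made up for illustration; real embeddings have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical vectors: v1 and v2 point in similar directions, v3 does not.
v1 = [0.9, 0.1, 0.0]
v2 = [0.8, 0.2, 0.1]
v3 = [0.1, 0.9, 0.3]

print(cosine_similarity(v1, v2))  # the higher of the two scores
print(cosine_similarity(v1, v3))  # the lower of the two scores
```

A score near 1 means the two vectors point the same way, which is how "similar meaning" shows up numerically.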

The power of embeddings

Embeddings convert text to coordinates in such a way that texts with similar meanings land close to each other.

Applications include:

  • Semantic similarity
  • Clustering
  • Semantic search
  • Classification
  • Recommendations
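
As a rough sketch of one of these applications, classification can be as simple as picking the label whose example embeddings sit closest to a new embedding. The helper names, labels, and two-dimensional vectors below are all hypothetical, and nearest-centroid is just one simple approach, not the only one.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def classify(vector, labeled_examples):
    """Return the label whose centroid is nearest (Euclidean distance) to the vector."""
    centroids = {label: centroid(vs) for label, vs in labeled_examples.items()}
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))

# Made-up 2-dimensional "embeddings" standing in for real model output.
examples = {
    "astronomy": [[0.9, 0.1], [0.8, 0.2]],
    "cinema": [[0.1, 0.9], [0.2, 0.7]],
}

label = classify([0.85, 0.25], examples)
print(label)
```

The same nearest-neighbor idea underlies clustering, search, and recommendations: everything reduces to measuring distances between vectors.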

Embeddings in Azure OpenAI

You can use the models in Azure OpenAI to generate the embedding for a given piece of text. In Python, you access this API using the openai library.

Here’s an example call:

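import openai  # legacy (pre-1.0) openai library, configured for your Azure OpenAI resource

# engine is the name of your Azure OpenAI deployment
response = openai.Embedding.create(
    engine="gpt3-similarity-babbage-001",
    input="A neutron star is the collapsed core of a massive supergiant star."
)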

What you get back is a vector of numbers whose length depends on which model you selected for creating the embeddings. In the case of the above code (where we selected babbage-001), the returned vector has 2,048 dimensions:

len(response.data[0].embedding)
2048

Taking a peek at this vector, we get something like this for the above text and model:

[0.012890788726508617,
0.01312282308936119,
-0.006028592120856047,
-0.000760019407607615,
0.022240908816456795,
0.018734613433480263,
-0.01254703477025032,
-0.005061783362179995,
0.009289962239563465,
-0.016311144456267357,
-0.011825150810182095,
...
]

Think of this vector as a point in a highly nuanced coordinate system, where each dimension encodes some kind of meaning. If you encoded another piece of text about neutron stars, it would land closer to the original text than, say, a piece of text about movie stars. 🤯
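
Here is a hedged sketch of that neutron star versus movie star comparison. The closest_text helper and the toy_embed stand-in are hypothetical, and the vectors are made up so the example runs without calling any API. In practice, embed would wrap a call to your embeddings deployment.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def closest_text(query, candidates, embed):
    """Return the candidate whose embedding is most similar to the query's."""
    query_vector = embed(query)
    return max(candidates, key=lambda text: cosine_similarity(query_vector, embed(text)))

# Made-up vectors, NOT real model output; they only illustrate the mechanics.
TOY_VECTORS = {
    "A neutron star is the collapsed core of a massive supergiant star.": [0.9, 0.1, 0.0],
    "Pulsars are rapidly rotating neutron stars.": [0.8, 0.2, 0.1],
    "The movie star walked the red carpet at the premiere.": [0.1, 0.9, 0.3],
}

def toy_embed(text):
    """Hypothetical stand-in for a real embeddings call."""
    return TOY_VECTORS[text]

query = "A neutron star is the collapsed core of a massive supergiant star."
others = [
    "Pulsars are rapidly rotating neutron stars.",
    "The movie star walked the red carpet at the premiere.",
]
print(closest_text(query, others, toy_embed))
```

Passing the embedding function in as a parameter keeps the comparison logic separate from whichever model or deployment produces the vectors.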

Want to try this out? Grab a copy of this notebook and import it into your Azure Machine Learning environment.

The only other things you need are your own deployed endpoint and your Azure OpenAI API key. The standard instructions for both are here: Get started generating text using Azure OpenAI Service
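
Once you have the endpoint and key, one possible way to keep them out of the notebook itself is to read them from environment variables. This is only a sketch: the environment variable names and the azure_openai_settings helper are assumptions, and the default API version shown is just one that was available for Azure OpenAI around the time of writing, so check what your resource supports.

```python
import os

def azure_openai_settings(env=None):
    """Collect the settings the legacy (pre-1.0) openai library expects for Azure."""
    env = os.environ if env is None else env
    return {
        "api_type": "azure",
        # Hypothetical variable names; use whatever your environment defines.
        "api_base": env["AZURE_OPENAI_ENDPOINT"],
        "api_version": env.get("AZURE_OPENAI_API_VERSION", "2022-12-01"),
        "api_key": env["AZURE_OPENAI_API_KEY"],
    }

# With the legacy openai library installed, the settings could be applied like this:
# import openai
# for name, value in azure_openai_settings().items():
#     setattr(openai, name, value)
```

After that, the engine argument in the embeddings call is simply the name you gave your deployment.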


Zoiner Tejada

CEO Solliance | Entrepreneur | Investor | AI Aficionado | Microsoft MVP | Recognized as Microsoft Regional Director | Published Author