Pinecone Scala Client Just Landed 🛸
Everywhere you look, it’s all about text embeddings now! For those unfamiliar with the lingo, text embeddings are numerical vectors representing words (tokens), sentences, or even entire documents. This rich representation captures the meaning of the text (its semantics) in a format that is much easier to work with than raw words, i.e., discrete, syntactic wrappings. That holds especially for neural nets and dedicated stores called vector databases. These vectors, usually consisting of hundreds or even thousands of floating-point numbers, preserve semantic similarity: words with similar meanings, such as a dog and a wolf, are mapped to vectors that are close to each other in a high-dimensional space.
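To make “close to each other” concrete: the standard way to measure it is cosine similarity, i.e., the cosine of the angle between two vectors. A minimal, self-contained sketch with toy values (real embeddings have hundreds of dimensions):
import scala.math.sqrt

// Cosine similarity: values near 1.0 mean the vectors point in the same
// direction (similar meaning), values near 0 mean the texts are unrelated.
def cosineSimilarity(a: Seq[Double], b: Seq[Double]): Double = {
  require(a.size == b.size, "embeddings must have the same dimension")
  val dot = a.zip(b).map { case (x, y) => x * y }.sum
  dot / (sqrt(a.map(x => x * x).sum) * sqrt(b.map(x => x * x).sum))
}

// Toy 4-dimensional "embeddings" (made-up numbers, purely illustrative)
val dog  = Seq(0.9, 0.1, 0.8, 0.2)
val wolf = Seq(0.8, 0.2, 0.9, 0.1)
val car  = Seq(0.1, 0.9, 0.0, 0.7)

println(cosineSimilarity(dog, wolf)) // ~0.99, semantically close
println(cosineSimilarity(dog, car))  // ~0.23, semantically distant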
Vector databases are specifically designed to handle numerical vectors efficiently, allowing for quick storage, search, and retrieval. Besides the most common use case, semantic search (finding the texts/entries most similar to a given one), it is often desirable to compute semantic textual similarity, where all text pairs are “compared” to uncover global overlaps. Another step in this direction is clustering, where various hierarchies and similarity clusters are constructed, for instance, to gain a visual insight into the data.
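To build intuition for what a vector database actually speeds up, here is semantic search reduced to its essence: a naive brute-force scan (reusing the cosineSimilarity helper from above). A real vector database performs the same lookup, but with approximate nearest-neighbor indexes that scale to millions of vectors:
// Naive top-k semantic search: score every stored vector against the query
// and keep the k most similar ones.
def topK(
  query: Seq[Double],
  stored: Map[String, Seq[Double]],
  k: Int
): Seq[(String, Double)] =
  stored.toSeq
    .map { case (id, vec) => id -> cosineSimilarity(query, vec) }
    .sortBy { case (_, score) => -score }
    .take(k)

println(topK(dog, Map("wolf" -> wolf, "car" -> car), k = 1)) // top hit: wolf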
One extremely cool feature that relatively few people, even those who use vector databases on a daily basis, are aware of is aggregation (by simple averaging). Imagine you have a lot of reviews for a certain product. If you want to obtain an overall idea about the product, you can aggregate these reviews by summing all their associated embedding vectors element-wise. The parts of the vectors representing shared features build up, whereas the parts without a clear overlap cancel out towards zero, similar to how correlation works. Then, for the resulting vector to be a valid embedding, we need to normalize it, and potentially also weight the individual vectors by the lengths of their partial texts (which is common if, for instance, a large document is split into uneven chunks).
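Sketched in code (a hypothetical helper, in the spirit of the toy vectors above): sum the embeddings element-wise, weight each by the length of its underlying text, and normalize the result back to unit length:
// Aggregate embeddings via a weighted element-wise sum, then re-normalize
// so the result is a valid (unit-length) embedding again.
def aggregate(embeddings: Seq[Seq[Double]], weights: Seq[Double]): Seq[Double] = {
  val dim = embeddings.head.size
  val summed = embeddings.zip(weights).foldLeft(Seq.fill(dim)(0.0)) {
    case (acc, (vec, w)) => acc.zip(vec).map { case (a, v) => a + w * v }
  }
  val norm = math.sqrt(summed.map(x => x * x).sum)
  summed.map(_ / norm)
}

// Equal weights reduce to a simple average of all review embeddings.
val reviews = Seq(Seq(0.9, 0.1, 0.8, 0.2), Seq(0.8, 0.2, 0.9, 0.1))
val overallProduct = aggregate(reviews, weights = Seq.fill(reviews.size)(1.0))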
All of these use cases, however, are nothing dramatically new. The main reason why the AI hype reached the shores of vector databases is that they can provide a quick memory for autonomous AI agents such as babyAGI and AutoGPT. Any information that an agent acquires while performing its task can be stored and subsequently searched very quickly to provide context for later prompts.
At Cequence, we previously implemented the first Scala client for the OpenAI API. This time we decided to target the most popular vector database, Pinecone, and implemented the very first Scala client for it as well. This is another piece of the puzzle that will benefit the Scala community as well as our contract management tool! Hooray! 🚀
Pinecone’s star is rising. Just last week, the company announced $100M in Series B funding (a lot of dough to play with). Besides Pinecone, other widely used vector databases include Chroma, Weaviate, and Faiss.
The Pinecone Scala client was released on April 27th, 2023 and is completely open-source. The project is available on GitHub at https://github.com/cequence-io/pinecone-scala. Note that if you are reading this post in the future 🤨, the version current at the time of writing (0.0.1) has likely changed. You can check for updates on GitHub or Maven.
To use the library, add the following dependency to your build.sbt:
"io.cequence" %% "pinecone-scala-client" % "0.0.1"
or to pom.xml (if you use Maven):
<dependency>
  <groupId>io.cequence</groupId>
  <artifactId>pinecone-scala-client_2.12</artifactId>
  <version>0.0.1</version>
</dependency>
The currently supported Scala versions are 2.12, 2.13 and 3.
All available endpoints as defined here are provided in two convenient services called PineconeVectorService and PineconeIndexService.
Config ⚙️
Before using the library, set the following environment variables: PINECONE_SCALA_CLIENT_API_KEY and PINECONE_SCALA_CLIENT_ENV. Alternatively, you can provide your own config file and override the configuration options (including timeouts) as specified in the default config pinecone-scala-client.conf. If you don’t have a Pinecone account yet, register here.
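For the custom-config route, a programmatic override could look like the sketch below. Note that the key names inside the parseString block are our assumptions for illustration; consult the default pinecone-scala-client.conf shipped with the library for the actual paths:
import com.typesafe.config.ConfigFactory

// Hypothetical overrides: the key names here are illustrative assumptions,
// check the default pinecone-scala-client.conf for the real ones.
val customConfig = ConfigFactory
  .parseString("""
    pinecone-scala-client {
      apiKey = "your_api_key"
      environment = "your_env"
    }
  """)
  .withFallback(ConfigFactory.load())
Such a config can then be passed to the factories shown in the Usage section below.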
Usage 👨‍🎓
First, you need to provide an implicit execution context as well as an Akka materializer, e.g.:
import akka.actor.ActorSystem
import akka.stream.Materializer
import scala.concurrent.ExecutionContext

implicit val ec = ExecutionContext.global
implicit val materializer = Materializer(ActorSystem())
Ia. Obtaining PineconeIndexService
- Default config (expects the environment variable(s) to be set as defined in the Config section)
val service = PineconeIndexServiceFactory()
- Custom config
val config = ConfigFactory.load("path_to_my_custom_config")
val service = PineconeIndexServiceFactory(config)
- Without config
val service = PineconeIndexServiceFactory(
  apiKey = "your_api_key",
  environment = "your_env" // e.g. "northamerica-northeast1-gcp"
)
Ib. Obtaining PineconeVectorService
Same as with PineconeIndexService, you first need to provide an implicit execution context and an Akka materializer. Then you can obtain the service in one of the following ways:
- Default config (expects the environment variable(s) to be set as defined in the Config section). Note that if an index with the given name does not exist, the factory returns None.
PineconeVectorServiceFactory("index_name").map { serviceOption =>
  val service = serviceOption.getOrElse(
    throw new Exception("Index with a given name does not exist.")
  )
  // do something with the service
}
- Custom config
val config = ConfigFactory.load("path_to_my_custom_config")
PineconeVectorServiceFactory("index_name", config).map { serviceOption =>
  val service = serviceOption.getOrElse(
    throw new Exception("Index with a given name does not exist.")
  )
  // do something with the service
}
- Without config
PineconeVectorServiceFactory(
  apiKey = "your_api_key",
  indexName = "index_name", // e.g. "auto-gpt"
  pineconeIndexService = pineconeIndexService // index service used to find the index host URL
).map { serviceOption =>
  val service = serviceOption.getOrElse(
    throw new Exception("Index with a given name does not exist.")
  )
  // do something with the service
}
II. Calling functions
Once we have created pineconeIndexService and/or pineconeVectorService, we can start actually doing something useful. Since all the calls are asynchronous, they return responses wrapped in Future.
- List indexes
pineconeIndexService.listIndexes.map(
_.foreach(println)
)
- Create an index with default settings
import io.cequence.pineconescala.domain.response.CreateResponse
pineconeIndexService.createIndex(
  name = "auto-gpt-test",
  dimension = 1536
).map {
  case CreateResponse.Created =>
    println("Index successfully created.")
  case CreateResponse.BadRequest =>
    println("Index creation failed: the request exceeds your quota or the index name is invalid.")
  case CreateResponse.AlreadyExists =>
    println("Index with a given name already exists.")
}
- Upsert vectors
import scala.util.Random
import io.cequence.pineconescala.domain.PVector

val dimension = 1536

pineconeVectorService.upsert(
  vectors = Seq(
    PVector(
      id = "666",
      values = Seq.fill(dimension)(Random.nextDouble),
      metadata = Map(
        "is_relevant" -> "not really but for testing it's ok, you know",
        "food_quality" -> "brunches are perfect but don't go there before closing time"
      )
    ),
    PVector(
      id = "777",
      values = Seq.fill(dimension)(Random.nextDouble),
      metadata = Map(
        "is_relevant" -> "very much so",
        "food_quality" -> "burritos are the best!"
      )
    )
  ),
  namespace = "my_namespace"
).map(vectorUpsertedCount =>
  println(s"Upserted $vectorUpsertedCount vectors.")
)
- Query vectors with custom settings
pineconeVectorService.query(
  vector = Seq.fill(1536)(Random.nextDouble), // some values/embeddings
  namespace = "my_namespace",
  settings = QuerySettings(
    topK = 5,
    includeValues = true,
    includeMetadata = true
  )
).map { queryResponse =>
  queryResponse.matches.foreach { matchInfo =>
    println(s"Matched vector id: ${matchInfo.id}")
    println(s"Matched vector values: ${matchInfo.values.take(20).mkString(", ")}..")
    println(s"Matched vector score: ${matchInfo.score}")
    println(s"Matched vector metadata: ${matchInfo.metadata}")
  }
}
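Because every call returns a Future, the individual steps compose with the usual combinators. As a closing sketch (not an official recipe), assuming the pineconeVectorService, imports, and implicits from the examples above are in scope and the index already exists, an upsert followed by a query chains naturally in a for-comprehension:
import scala.concurrent.Future
import scala.util.Random

// Compose upsert and query sequentially; each step runs once the previous
// Future completes successfully.
val pipeline: Future[Unit] = for {
  upsertedCount <- pineconeVectorService.upsert(
    vectors = Seq(
      PVector(
        id = "42",
        values = Seq.fill(1536)(Random.nextDouble),
        metadata = Map("source" -> "demo") // made-up metadata for illustration
      )
    ),
    namespace = "my_namespace"
  )
  queryResponse <- pineconeVectorService.query(
    vector = Seq.fill(1536)(Random.nextDouble),
    namespace = "my_namespace",
    settings = QuerySettings(
      topK = 3,
      includeValues = false,
      includeMetadata = true
    )
  )
} yield println(
  s"Upserted $upsertedCount vectors; best match: ${queryResponse.matches.headOption.map(_.id)}"
)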
We don’t want to pollute this post with an exhaustive list of all the calls/examples (18 in total), so we recommend continuing with the Usage section of the project’s README on GitHub.
In addition, we implemented two ready-to-run demos as separate seed projects:
- Pinecone Scala Demo 🔥 — shows how to use Pinecone vector, index, and collection operations.
- Pinecone + OpenAI Scala Demo 🔥🔥 — shows how to generate and store OpenAI embeddings (using the 1536-dimensional text-embedding-ada-002 model) in Pinecone and query them afterwards. Based on the official tutorial from Pinecone.