Top Truly Free and Open Source Vector Databases 2024

Francesco Cozzolino
3 min read · Feb 7, 2024


This selection is based on two factors: whether each database can be used directly, without registration and without free-but, free-for, or free-until conditions, and whether it runs in a Docker container with little effort.

What is a Vector Database?

[Figure: Vectors need a new kind of database]

How Does a Vector Database Work?

[Figure: How does a vector database work?]
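As a rough illustration of the core idea, a vector database stores embeddings and returns the stored items closest to a query vector. The sketch below does this by brute force with cosine similarity. The ids and three-dimensional vectors are invented for the example, and real engines generally rely on approximate nearest-neighbor indexes rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(store, query, top_k=2):
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(doc_id, cosine_similarity(vec, query)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy "database": id -> embedding (made-up values, for illustration only)
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for doc_id, score in search(store, [1.0, 0.0, 0.0]):
        print(doc_id, round(score, 3))
```

The databases below provide the same store-and-search behavior, plus persistence, metadata filtering, and indexes that avoid a full scan.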

1. Qdrant

Documentation

Qdrant (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage vectors with an additional payload. Qdrant's expanding feature set supports neural network or semantic-based matching, faceted search, and other applications.

  • Open Source & Free Forever
  • Self-Hosted
  • Full-Featured
  • Organized Community Support
  • Learning Resources & Docs

To start Qdrant, pull the image and run it with Docker:

docker pull qdrant/qdrant 
docker run -p 6333:6333 qdrant/qdrant
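Once the container is running, Qdrant can be reached over HTTP on the published port 6333. The following is a minimal sketch using only the Python standard library. It assumes the REST paths for creating a collection, upserting points, and searching as I understand them from the Qdrant API reference, so check the documentation for your version. The collection name `demo`, the 4-dimensional vectors, and the payload values are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6333"  # port published by the docker run command


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request against the Qdrant REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_collection(name, size):
    # Assumed endpoint: PUT /collections/{name}
    body = {"vectors": {"size": size, "distance": "Cosine"}}
    return build_request("PUT", f"/collections/{name}", body)


def upsert_points(name, points):
    # Assumed endpoint: PUT /collections/{name}/points (wait=true blocks until applied)
    return build_request("PUT", f"/collections/{name}/points?wait=true", {"points": points})


def search_points(name, vector, limit=3):
    # Assumed endpoint: POST /collections/{name}/points/search
    return build_request("POST", f"/collections/{name}/points/search", {"vector": vector, "limit": limit})


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    try:
        send(create_collection("demo", 4))
        send(upsert_points("demo", [
            {"id": 1, "vector": [0.1, 0.2, 0.3, 0.4], "payload": {"city": "Rome"}},
            {"id": 2, "vector": [0.9, 0.1, 0.1, 0.2], "payload": {"city": "Oslo"}},
        ]))
        print(send(search_points("demo", [0.1, 0.2, 0.3, 0.4], limit=1)))
    except OSError as exc:  # covers connection errors and HTTP error responses
        print("Qdrant request failed:", exc)
```

Separating request building from sending keeps the payload shapes easy to inspect. In practice the official Python client is likely the more convenient choice.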

2. Chroma

Documentation

Chroma is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.

Chroma gives you the tools to:

  • store embeddings and their metadata
  • embed documents and queries
  • search embeddings
  • work with multimodal data

Chroma prioritizes:

  • simplicity and developer productivity
  • analysis on top of search
  • speed: it also happens to be very quick

Quick how-to

Install

pip install chromadb

Get the Chroma Client

import chromadb
chroma_client = chromadb.Client()

Create a collection

collection = chroma_client.create_collection(name="my_collection")

Add some text documents to the collection

collection.add(
    documents=["This is a document", "This is another document"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2"],
)

Alternatively, you can use Chroma's backend REST API. Its Swagger docs are viewable by running Chroma and navigating to http://localhost:8000/docs

pip install chromadb
chroma run
open http://localhost:8000/docs  # macOS; on Linux, use xdg-open

3. Milvus

Documentation

Milvus was created in 2019 with a singular goal: store, index, and manage massive embedding vectors generated by deep neural networks and other machine learning (ML) models.

As a database specifically designed to handle queries over input vectors, it is capable of indexing vectors at trillion scale. Unlike existing relational databases, which mainly deal with structured data following a pre-defined pattern, Milvus is designed from the ground up to handle embedding vectors converted from unstructured data.

Install Milvus Standalone with Docker

  • Start Milvus

wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh
bash standalone_embed.sh start

  • Connect to Milvus

Please refer to Hello Milvus, then run the example code.

  • Stop Milvus

To stop Milvus standalone, run:

bash standalone_embed.sh stop

To delete data after stopping Milvus, run:

bash standalone_embed.sh delete

How to use it

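from pymilvus import connections

# Connect to the standalone instance on the default Milvus port
connections.connect(
    alias="default",
    uri="http://localhost:19530",  # the URI needs a scheme
    token="root:Milvus",  # default user:password pair
)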
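If the connection fails right after startup, one quick check is whether anything is listening on port 19530 yet. The helper below is a small standard-library sketch. The function name `is_port_open` is hypothetical, and an open port only shows that a process accepts TCP connections, not that Milvus has finished initializing.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    print("Milvus port reachable:", is_port_open("localhost", 19530))
```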
