Top Truly Free and Open Source Vector Databases 2024
This selection is based on factors such as the ability to use each database directly, without registration or "free-but", "free-for", or "free-until" conditions, and the ability to run it in Docker containers effortlessly.
What is a Vector Database?
How Does a Vector Database Work?
1. Qdrant
Qdrant (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage vectors with an additional payload. Qdrant's expanding features allow for all sorts of neural network or semantic-based matching, faceted search, and other applications.
- Open Source & Free Forever
- Self-Hosted
- Full-Featured
- Organized Community Support
- Learning Resources & Docs
To get started with Qdrant, pull the image and run it with Docker:
docker pull qdrant/qdrant
docker run -p 6333:6333 qdrant/qdrant
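Once the container is up, the REST API on port 6333 can be called from any HTTP client. The sketch below uses only the Python standard library to prepare three requests: create a collection, upsert two points with a payload, and search. The collection name "demo", the vectors, and the payload values are made up for illustration, and the endpoint paths reflect Qdrant's REST quickstart as I understand it, so check the API reference for your version before relying on them.

```python
import json
import urllib.request

# Assumption: Qdrant is running locally via the docker command above.
BASE_URL = "http://localhost:6333"


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request against the Qdrant REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


# "demo" is a hypothetical collection name; 4-dimensional vectors keep it small.
create_collection = build_request(
    "PUT",
    "/collections/demo",
    {"vectors": {"size": 4, "distance": "Cosine"}},
)

# wait=true asks the server to apply the write before responding.
upsert_points = build_request(
    "PUT",
    "/collections/demo/points?wait=true",
    {
        "points": [
            {"id": 1, "vector": [0.05, 0.61, 0.76, 0.74], "payload": {"city": "Berlin"}},
            {"id": 2, "vector": [0.19, 0.81, 0.75, 0.11], "payload": {"city": "London"}},
        ]
    },
)

# Ask for the single nearest point, including its payload.
search = build_request(
    "POST",
    "/collections/demo/points/search",
    {"vector": [0.2, 0.1, 0.9, 0.7], "limit": 1, "with_payload": True},
)
```

With the container running, calling send(create_collection), send(upsert_points), and send(search) in that order should return the JSON responses from the server. Nothing is sent until send is called, so the requests can be inspected first.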
2. Chroma
Chroma is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.
Chroma gives you the tools to:
- store embeddings and their metadata
- embed documents and queries
- search embeddings
- work with multimodal data
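To make "store embeddings and their metadata" and "search embeddings" concrete, here is a minimal in-memory sketch in plain Python. It is not how Chroma is implemented: it is a brute-force cosine-similarity search over a dictionary, and the class name TinyVectorStore and its methods are hypothetical names chosen for this illustration. It also assumes that no stored or query vector is all zeros.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical toy store: keeps embeddings and metadata in a dict."""

    def __init__(self):
        self._items = {}  # id -> (embedding, metadata)

    def add(self, item_id, embedding, metadata=None):
        self._items[item_id] = (list(embedding), metadata or {})

    def query(self, embedding, n_results=1):
        # Score every stored embedding against the query (brute force).
        scored = [
            (cosine_similarity(embedding, stored), item_id, metadata)
            for item_id, (stored, metadata) in self._items.items()
        ]
        # Highest similarity first.
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:n_results]


store = TinyVectorStore()
store.add("id1", [1.0, 0.0, 0.0], {"source": "my_source"})
store.add("id2", [0.0, 1.0, 0.0], {"source": "my_source"})

best = store.query([0.9, 0.1, 0.0], n_results=1)
print(best[0][1])  # id of the closest stored embedding
```

A real vector database replaces the linear scan with an approximate nearest-neighbor index so that queries stay fast as the collection grows.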
Chroma prioritizes:
- simplicity and developer productivity
- analysis on top of search
- it also happens to be very quick
Quick how-to
Install
pip install chromadb
Get the Chroma Client
import chromadb
chroma_client = chromadb.Client()
Create a collection
collection = chroma_client.create_collection(name="my_collection")
Add some text documents to the collection
collection.add(
    documents=["This is a document", "This is another document"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2"]
)
Alternatively, you can use Chroma's backend REST API. The Swagger docs are viewable by running Chroma and navigating to http://localhost:8000/docs:
pip install chromadb
chroma run
open http://localhost:8000/docs
3. Milvus
Milvus was created in 2019 with a singular goal: store, index, and manage massive embedding vectors generated by deep neural networks and other machine learning (ML) models.
As a database specifically designed to handle queries over input vectors, it is capable of indexing vectors at trillion scale. Unlike existing relational databases, which mainly deal with structured data following a pre-defined pattern, Milvus is designed from the bottom up to handle embedding vectors converted from unstructured data.
Install Milvus Standalone with Docker
- Start Milvus.
wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh
bash standalone_embed.sh start
- Connect to Milvus
Please refer to Hello Milvus, then run the example code.
- Stop Milvus
To stop Milvus standalone, run:
bash standalone_embed.sh stop
To delete data after stopping Milvus, run:
bash standalone_embed.sh delete
How to use it
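from pymilvus import connections

# Connect to the local standalone instance started above.
# The uri needs an explicit scheme; "root:Milvus" is the default credential pair.
connections.connect(
    alias="default",
    uri="http://localhost:19530",
    token="root:Milvus",
)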
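Milvus can take a little while to accept connections after the start script returns, so it may help to confirm that the port is open before connecting. The helper below is a sketch using only the standard library. The function names port_is_open and wait_for_port are hypothetical, and an open port only shows that something is listening, not that Milvus has finished initializing.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Poll host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, calling wait_for_port("localhost", 19530) before connections.connect lets a script fail early with a clear message when the container is not running.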