Unstructured Data Service
Milvus: A big leap to scalable AI vector database
Hello, Milvus project
The challenge with unstructured data similarity search
The explosion of unstructured data, such as images, videos, sound recordings, and text, calls for effective solutions in computer vision, voice recognition, and natural language processing. Extracting value from unstructured data poses a big challenge for many enterprises.
AI, especially deep learning, has proven to be an effective solution. Vectorizing data features lets you perform content-based search on unstructured data. For example, you can perform content-based image retrieval, including facial recognition and object detection.
The challenge then becomes how to search efficiently among billions of vectors. That’s what Milvus, a database for AI, is designed for.
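To make the problem concrete, here is a minimal sketch of similarity search in plain Python. It is not Milvus code: the function names, the Euclidean distance metric, and the random data are all assumptions chosen for illustration. It compares the query against every stored vector, which is exactly the linear scan that becomes impractical at billions of vectors.

```python
import heapq
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(vectors, query, top_k):
    """Return the top_k (distance, index) pairs closest to query, nearest first.

    Every stored vector is compared against the query, so the cost grows
    linearly with the size of the collection.
    """
    scored = ((l2_distance(v, query), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(top_k, scored)


# Illustrative data: 1,000 random 8-dimensional "feature vectors".
random.seed(0)
dim = 8
vectors = [[random.random() for _ in range(dim)] for _ in range(1000)]

# Use a stored vector as the query, so its nearest neighbor is itself.
query = vectors[42]
results = brute_force_search(vectors, query, top_k=3)
print(results[0])  # (0.0, 42): the query vector matches itself at distance 0
```

Each query here touches all 1,000 vectors. A vector database aims to avoid that full scan through indexing and hardware acceleration.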
What is the Milvus vector database?
Milvus is an open source distributed vector database that provides state-of-the-art similarity search and analysis of feature vectors and unstructured data. Some of its key features are:
- GPU-accelerated search engine
The Milvus vector database is designed for very large vector indexes. Its CPU/GPU heterogeneous computing architecture lets you process data 1000 times faster.
- Intelligent index
With a “Decide Your Own Algorithm” approach, you can embed machine learning and advanced algorithms into the Milvus vector database without the headache of complex data engineering or migrating data between disparate systems. The Milvus vector database is built on optimized indexing algorithms based on quantization, tree-based, and graph-based indexing methods.
- Strong scalability
The data is stored and computed on a distributed architecture. This lets you scale data sizes up and down without redesigning the system.
- High compatibility
The Milvus vector database is compatible with major AI/ML models and programming languages such as C++, Java and Python.
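The quantization-based indexing mentioned in the feature list can be illustrated with a small inverted-list sketch. This is a simplified illustration, not how Milvus implements its indexes: `build_index`, `search`, `n_lists`, and `n_probe` are hypothetical names, and the centroids are picked at random from the data where a real index would typically train them (for example with k-means). The idea is to group vectors by their nearest centroid, then scan only the few groups closest to the query.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance (enough for ranking neighbors)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(vectors, n_lists, rng):
    """Assign every vector to the bucket of its nearest centroid."""
    # Simplification: random data points stand in for trained centroids.
    centroids = rng.sample(vectors, n_lists)
    buckets = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: sq_dist(v, centroids[c]))
        buckets[nearest].append(i)
    return centroids, buckets


def search(vectors, centroids, buckets, query, top_k, n_probe):
    """Scan only the n_probe buckets whose centroids are closest to query."""
    probe = heapq.nsmallest(
        n_probe, range(len(centroids)), key=lambda c: sq_dist(query, centroids[c])
    )
    candidates = [i for c in probe for i in buckets[c]]
    ids = heapq.nsmallest(top_k, candidates, key=lambda i: sq_dist(query, vectors[i]))
    return ids, len(candidates)


rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
centroids, buckets = build_index(vectors, n_lists=16, rng=rng)

query = vectors[7]
ids, scanned = search(vectors, centroids, buckets, query, top_k=5, n_probe=2)
print(ids[0], scanned, len(vectors))
```

Only the vectors in the probed buckets are compared with the query, so the search touches a fraction of the collection. The trade-off is that a true neighbor sitting in an unprobed bucket is missed, which is why such indexes are approximate and expose a parameter like `n_probe` to balance speed against recall.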
Billion-scale similarity search
Follow this link for a step-by-step procedure to run a performance test of similarity search on 100 million vectors (SIFT1B).
If you want, you can also try testing with 1 billion vectors in the Milvus vector database. Here are the hardware requirements.
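For a rough sense of why hardware matters at this scale, the sketch below estimates the raw size of the vectors alone. It assumes 128-dimensional vectors (the dimensionality of SIFT descriptors) and compares two storage choices, 1 byte and 4 bytes (float32) per component. The helper name `raw_size_gb` is hypothetical, and the figures exclude any index or runtime overhead, so treat them as a lower bound, not as the actual requirements.

```python
def raw_size_gb(num_vectors, dim, bytes_per_component):
    """Raw storage for the vectors alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_vectors * dim * bytes_per_component / 1e9


DIM = 128  # SIFT descriptors have 128 dimensions

for num_vectors in (100_000_000, 1_000_000_000):
    as_bytes = raw_size_gb(num_vectors, DIM, 1)   # 1 byte per component
    as_floats = raw_size_gb(num_vectors, DIM, 4)  # assumed float32 storage
    print(f"{num_vectors:>13,} vectors: {as_bytes:6.1f} GB as bytes, "
          f"{as_floats:6.1f} GB as float32")
```

Under these assumptions, 1 billion float32 vectors occupy about 512 GB before any index is built, ten times the roughly 51 GB needed for 100 million.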
Join us
Milvus was recently open sourced. We warmly welcome contributors to join us in reinventing data science!