How to Connect to the IBM watsonx.data Milvus Service Using the Milvus Command Line Interface (CLI)

Prabhu Nair
Milvus Meets Watsonx
6 min read · Aug 4, 2024

Note: This blog post covers how to connect to a Milvus service provisioned in IBM watsonx.data (SaaS) using milvus-cli. The details for Software/CPD will be covered soon.

Milvus is an open-source vector database designed to handle large-scale vector data and perform similarity search efficiently. It excels in applications requiring high-dimensional vector similarity, such as recommendation systems, image search, and natural language processing.

IBM watsonx.data is part of IBM’s suite of AI and data tools, offering capabilities for data integration, analysis, and AI-driven insights. It supports various data types and sources, including structured and unstructured data.

Integrating Milvus with IBM watsonx.data can enable advanced data analytics and AI capabilities on vectorised data.

Note: This is not official IBM documentation.

This blog post explores the Milvus CLI, offering a guide to its commands and functionality to help you use it for vector database management tasks such as creating collections, inserting data, searching, and querying.

Here’s a step-by-step guide to connecting to a Milvus service using milvus_cli in a Python environment.

Prerequisites

Python >= 3.8.5

Install Pymilvus → pip install pymilvus==2.4.3

Set Up Milvus: Ensure that the Milvus service is up and running. Detailed information on provisioning a Milvus service in watsonx.data is available at watsonx.data-milvus.

Obtain Connection Details: Get the necessary connection details for your IBM watsonx.data Milvus service. These typically include the host, the port, and authentication information if required.

Example:

If you have a Milvus service running in watsonx.data, you can get the Milvus host and port from the Infrastructure manager (navigate to Infrastructure manager → click on the Milvus service → read the host and port from the GRPC host).

fig-1

Create a virtual environment with python3 -m venv path/to/venv, then activate it with source path/to/venv/bin/activate.
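Before moving on, it can help to confirm that the environment matches the prerequisites. The snippet below is a minimal sketch that uses only the Python standard library; the helper names (python_ok, installed_version, in_virtualenv) are made up for this post and are not part of pymilvus or milvus-cli.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 8, 5)         # minimum version from the prerequisites
EXPECTED_PYMILVUS = "2.4.3"    # version pinned in the pip command


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:3]) >= MIN_PYTHON


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def in_virtualenv():
    """Inside a venv, sys.prefix points at the environment, not the base install."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Inside a virtual environment:", in_virtualenv())
    print("pymilvus:", installed_version("pymilvus"), "(expected", EXPECTED_PYMILVUS + ")")
    print("milvus-cli:", installed_version("milvus-cli"))
```

If pymilvus or milvus-cli shows up as None, the package is not installed in the environment that is currently active.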

Getting Started with Milvus CLI

Make sure you have milvus_cli installed. You can install it via pip:

pip install milvus-cli

Usage

In the activated Python environment, run the milvus_cli command. If it starts successfully, the banner and version information are displayed as shown below, and you can run any of the supported commands.

milvus_cli

  __  __ _ _                    ____ _     ___
 |  \/  (_) |_   ___   _ ___   / ___| |   |_ _|
 | |\/| | | \ \ / / | | / __| | |   | |    | |
 | |  | | | |\ V /| |_| \__ \ | |___| |___ | |
 |_|  |_|_|_| \_/ \__,_|___/  \____|_____|___|

Milvus cli version: 1.0.0
Pymilvus version: 2.4.3

Milvus-CLI Commands

1. connect → To connect to the Milvus service

Provide the host and port details obtained from fig-1.

Syntax

connect -uri https://host:port -t user:password

Example

milvus_cli > connect -uri https://127.0.0.1:19530 -t userid:password
Connect Milvus successfully.

Note: The above connection works for a Milvus service provisioned in IBM Cloud (SaaS). The connection syntax for Software will be added soon.
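The GRPC host from the Infrastructure manager arrives as a single host:port string, so a small helper can turn it into the values that connect expects. This is a sketch under assumptions: the function names are hypothetical, the sample host and credentials are placeholders, and the pymilvus_client function assumes that pymilvus's MilvusClient accepts the same https URI and user:password token as the CLI does for your service.

```python
def connection_settings(grpc_host, user, password):
    """Turn the 'host:port' GRPC host string into the values connect needs."""
    host, sep, port = grpc_host.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {grpc_host!r}")
    return {"uri": f"https://{host}:{port}", "token": f"{user}:{password}"}


def cli_connect_command(settings):
    """The line to type at the milvus_cli prompt, following the syntax above."""
    return f"connect -uri {settings['uri']} -t {settings['token']}"


def pymilvus_client(settings):
    """Open the same connection from Python (needs pymilvus and a reachable service)."""
    # Imported lazily so the helpers above also work where pymilvus is absent.
    from pymilvus import MilvusClient
    return MilvusClient(uri=settings["uri"], token=settings["token"])


if __name__ == "__main__":
    # Placeholder values only; substitute your own GRPC host and credentials.
    settings = connection_settings("127.0.0.1:19530", "userid", "password")
    print(cli_connect_command(settings))
```

Keeping the parsing separate from the connection call means a malformed host string fails early, before any network request is made.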

2. create database → To create a database in Milvus

Syntax

create database -db databaseName

3. use database → To switch to the created database

Syntax

use database -db databaseName

4. list databases → To list all the databases

Syntax

list databases

5. create collection → To create a collection

Syntax

create collection -c collection_name -f schema-field -p primary-field -a

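Syntax description

To define a field:
-f <fieldname>:<datatype>:<dimofvector/primary_field/desc>

The third component is the dimension for a vector field, primary_field for the primary key, or a description for any other field.

To define multiple fields, repeat the -f option:
-f <fieldname>:<datatype>:<dimofvector/primary_field/desc> -f <fieldname>:<datatype>:<dimofvector/primary_field/desc>

-p Name of the primary field
-a Flag to generate IDs automatically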
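Because the -f values are easy to mistype, here is a small sketch that assembles the create collection line from a list of fields. The helper names are hypothetical, and the example schema is an assumption: the employee collection and its id, vectorfield, and phone fields appear later in this post, but the vector dimension and the data types used here are illustrative only.

```python
def field_spec(name, datatype, detail):
    """One -f value: <fieldname>:<datatype>:<dimofvector/primary_field/desc>."""
    return f"{name}:{datatype}:{detail}"


def create_collection_command(collection, fields, primary_field, auto_id=False):
    """Assemble the create collection line from (name, datatype, detail) tuples."""
    names = [name for name, _, _ in fields]
    if primary_field not in names:
        raise ValueError(f"primary field {primary_field!r} is not among {names}")
    parts = ["create collection", "-c", collection]
    for name, datatype, detail in fields:
        parts += ["-f", field_spec(name, datatype, detail)]
    parts += ["-p", primary_field]
    if auto_id:
        parts.append("-a")  # flag to generate IDs automatically
    return " ".join(parts)


if __name__ == "__main__":
    fields = [
        ("id", "INT64", "primary_field"),
        ("vectorfield", "FLOAT_VECTOR", 4),  # hypothetical dimension
        ("phone", "INT64", "phone"),         # hypothetical type and description
    ]
    print(create_collection_command("employee", fields, "id", auto_id=True))
```

The printed line can be pasted at the milvus_cli prompt once the field definitions match your own schema.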

6. show collection → To show the details of a collection

Syntax

show collection -c collection_name

7. create partition → Creates a partition.

Syntax

create partition -c collection_name -p partition_name -d descr

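8. create index → To create an index

Syntax

create index

The command is interactive and prompts for the collection, field, index name, index type, metric type, and index parameters.

Example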

milvus_cli > create index
Collection name (employee): employee
The name of the field to create an index for (id, vectorfield, phone): vectorfield
Index name: vectorIndex
Index type (FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, RNSG, HNSW, ANNOY, AUTOINDEX, DISKANN, GPU_IVF_FLAT, GPU_IVF_PQ, SPARSE_INVERTED_INDEX, SPARSE_WAND, SCANN, STL_SORT, Trie, INVERTED, ) []: IVF_FLAT
Vector Index metric type (L2, IP, HAMMING, TANIMOTO, COSINE, ) []: L2
Index params nlist: 10
Status(code=0, message=)
Create index successfully!
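For readers who also script against Milvus, the answers given at these prompts correspond to the index parameter dictionary that pymilvus uses. The sketch below is not how milvus_cli is implemented; the function names are hypothetical, and create_index assumes a connection has already been opened with pymilvus (for example through connections.connect). The nlist bounds follow the range documented by Milvus for IVF indexes.

```python
def ivf_flat_index_params(metric_type, nlist):
    """Collect the prompt answers into pymilvus-style index parameters."""
    if not 1 <= nlist <= 65536:
        raise ValueError("nlist must be between 1 and 65536")
    return {
        "index_type": "IVF_FLAT",
        "metric_type": metric_type,
        "params": {"nlist": nlist},
    }


def create_index(collection_name, field_name, index_name, index_params):
    """Create the index through pymilvus; assumes an open default connection."""
    # Imported lazily so the parameter helper works without pymilvus installed.
    from pymilvus import Collection
    Collection(collection_name).create_index(
        field_name, index_params, index_name=index_name
    )


if __name__ == "__main__":
    # Same answers as in the interactive session above.
    print(ivf_flat_index_params("L2", 10))
```

nlist is the number of clusters the vectors are grouped into, and the nprobe value asked for during search controls how many of those clusters are scanned per query.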

9. show index → To show the details of an index

Syntax

show index -c collection_name -in index_name

10. insert → To insert data into a collection

Syntax

insert -c collection_name -p partition_name <file_path>

Example

insert -c employee -p QA /Users/prabhupnair/desktop/Milvus_Testing/bulkinsert/data5.csv
Reading file from local path.
Opening csv file(26076 bytes)...
Reading csv rows... [####################################] 100%
Column names are ['vectorfield', 'phone']
Processed 11 lines.

Inserted successfully.

(insert count: 10, delete count: 0, upsert count: 0, timestamp: 451571464455323651, success count: 10, err count: 0, cost: 0)

Note: If the -a flag was provided when creating the collection, IDs are generated automatically, so the CSV file does not need an id column (the file above contains only vectorfield and phone).
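To try the insert command without real data, a test CSV can be generated with a few lines of Python. This is a sketch under assumptions: the column names match the example above, but the vector dimension, the phone values, and the choice to write each vector as a JSON-style list in a single cell are illustrative and should be adapted to your schema.

```python
import csv
import json
import os
import random
import tempfile


def write_insert_csv(path, rows, dim, seed=0):
    """Write a header plus `rows` random vectors; return the number of lines written."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same file
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["vectorfield", "phone"])
        for i in range(rows):
            vector = [round(rng.random(), 4) for _ in range(dim)]
            # The whole vector goes into one quoted cell, e.g. "[0.1, 0.2]".
            writer.writerow([json.dumps(vector), 1000 + i])
    return rows + 1


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "data.csv")
    lines = write_insert_csv(target, rows=10, dim=4)
    print(f"Wrote {lines} lines to {target}")
```

Ten data rows plus the header give eleven lines, which matches the "Processed 11 lines" and "insert count: 10" figures in the output above.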

11. load collection → To load a collection

Syntax

load collection -c collection_name

12. search → Performs a vector similarity search or hybrid search.

Syntax

milvus_cli > search
Collection name (employee): employee
The vectors of search data (the length of data is number of query (nq), the dim of every vector in data must be equal to vector field’s of collection. You can also import a CSV file without headers): /Users/prabhupnair/desktop/Milvus_Testing/bulkinsert/search1.csv
The vector field used to search of collection (id, vectorfield, phone): vectorfield
Metric type: L2
Search parameter nprobe's value: 10
Groups search by Field []:
The specified number of decimal places of returned distance [-1]:
The max number of returned record, also known as topk: 1
The boolean expression used to filter attribute []:
Timeout []:

Syntax description

Collection name (employee): <Enter the collection name>
The vectors of search data: <provide the file path of search vector data>
The vector field used to search of collection: <Enter the vector field>
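To make the prompts more concrete, the sketch below shows what an L2 search computes when done by brute force: for each query vector (there are nq of them), it ranks the stored vectors by distance and keeps the closest topk. This is a plain-Python illustration with made-up data, not how Milvus performs the search internally, and the distance values Milvus reports may be scaled differently even though the ranking idea is the same.

```python
import math


def l2_search(query_vectors, stored, topk):
    """Return, per query vector, the topk (id, distance) pairs closest by L2."""
    results = []
    for query in query_vectors:
        scored = []
        for entity_id, vector in stored.items():
            if len(vector) != len(query):
                raise ValueError("query and stored vectors must share one dimension")
            scored.append((entity_id, math.dist(query, vector)))
        scored.sort(key=lambda pair: pair[1])  # smallest distance first
        results.append(scored[:topk])
    return results


if __name__ == "__main__":
    stored = {1: [0.0, 0.0], 2: [1.0, 1.0], 3: [3.0, 4.0]}  # id -> vector
    queries = [[0.9, 1.1], [3.0, 3.0]]                      # nq = 2
    for hits in l2_search(queries, stored, topk=1):
        print(hits)
```

An index such as IVF_FLAT avoids comparing against every stored vector by scanning only the clusters selected through nprobe.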

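13. query → Shows query results that match the query expression

Syntax

query

Like search, query is interactive and prompts for the collection name and the query expression.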

14. release collection → To release a collection

Syntax

release collection -c collection_name

15. delete partition → To delete a partition

Syntax

delete partition -c collection_name -p partition_name

16. delete collection → To delete a collection

Syntax

delete collection -c collection_name

Note: The CLI is especially useful for those who prefer a text-based interface. It is not officially recommended by IBM.
