From the Edge to the Cloud and Back Again

Tim Spann
9 min read · Aug 2, 2024


Milvus, Edge AI, Vector Database, MQTT, Kafka, Zilliz Cluster, Python

Not connected to the cloud at all in this one.

In today’s environment, no one has time to wait for AI and similarity-search results to arrive, especially in potentially network-challenged environments. You need CPU, GPU, RAM, storage, LLM inference, and vector database search instantly ready, because in the real world things happen instantly. Your camera is receiving frames that must be processed, checked, analyzed, and reacted to. If there is an obstacle, we must avoid it now, not in five seconds.

Quick Demo

Remember, over 80% of the data in the world is unstructured, and you need to store, search, and process it. You will have text, documents, images, audio, video, logs, sensor readings, and more.

Why Even Use a Vector DB?

  • High-Performance Search
  • CRUD Operations: Just like traditional databases, vector databases allow you to Create, Read, Update, and Delete data.
  • Data Freshness: Vector databases ensure your data remains up-to-date, reflecting the latest information for accurate searches.
  • Persistence: Your data is securely stored and persists even if the system restarts.
  • Availability: Your data is readily accessible for search and retrieval operations.
  • Scalability: Vector databases can handle growing data volumes efficiently.
  • Data Management: Vector databases provide tools to manage your data effectively, including data ingestion, indexing, and querying.
  • Backup and Migration: Create backups of your data for disaster recovery and easily migrate your data between different systems.
  • Cloud or On-Premise Deployment: Vector databases can be deployed easily on various platforms, including cloud and on-premise environments.
  • Observability: Monitor the health and performance of your vector database to ensure optimal operation.
  • Multi-tenancy: Support multiple users or applications accessing the same database instance securely.
  • So Many Indexes: Support for more than 15 index types, including popular ones like Hierarchical Navigable Small Worlds (HNSW), PQ, Binary, Sparse, DiskANN, and GPU indexes.

Edge AI Use Cases

  • Robots
  • Smart Cities
  • Smart Factories
  • Autonomous Cars
  • Automated Retail
  • Smart Home

Local Search on Edge Devices

  • Proprietary Document Search
  • On-Device Object Detection
  • Milvus Lite on Device

Why Even Use a Vector DB on the Edge?

  • Cloud, Docker, Standalone or On-Premise Deployment: Send vectors and other fields to a local, remote, or cloud Milvus.
  • Instant Local Search: access local unstructured data for fast search and local applications.
  • Secure Local Data
  • No Network Necessary: Especially for autonomous robots and vehicles. Make instant local decisions.
  • Local RAG to Supercharge Edge AI: Enhance local image, audio, video, and text data with local LLMs, for example Ollama on a Raspberry Pi, for generative AI at the edge.
  • Local Live Video
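The "local RAG" bullet above can be sketched as a small prompt-assembly step: take the hits returned by a local Milvus search and stitch their text into a grounded prompt for a local LLM. This is a hypothetical sketch, not code from the demos; the hit shape mirrors what pymilvus returns when searching with `output_fields=["text"]`, and the Ollama call in the comment is illustrative.

```python
# Hypothetical "local RAG" step: stitch Milvus search hits into a
# prompt for a local LLM (e.g., one served by Ollama on the device).

def build_rag_prompt(question: str, hits: list) -> str:
    """Combine retrieved text chunks into a single grounded prompt."""
    # Each hit carries its stored fields under "entity" (pymilvus style).
    context = "\n".join(hit["entity"]["text"] for hit in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# On the device, the prompt would then go to a local model, for example:
#   ollama.chat(model="llama3", messages=[{"role": "user", "content": prompt}])
```

Keeping retrieval and generation both on-device is what makes this work with no network at all.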

So let’s build an Edge AI App.

The first step is to pick which edge device we are running on. Hardware is a big decision, based on requirements, budget, and availability.

Since the Olympics are going on, I will show you the Gold, Silver and Bronze level devices you can run (based on TOPS).

NVIDIA AGX Jetson Orin

Gold — NVIDIA Jetson AGX Orin — 275 TOPS, 2048-core, 64 GB RAM

Slack output
Docker Compose with Attu Showing Collection
Orin Output Results from BLiP Image Captioning of Web Camera Image
Milvus on Orin Vector Search with Milvus + Attu Display
Milvus on Zilliz Cloud Data Preview for Orin Edge Output

As you can see, we can run locally with Milvus Lite or Milvus on Docker, or send to the cloud, all by changing just a few parameters.
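Those "few parameters" are essentially the connection URI (plus a token for Zilliz Cloud). A minimal sketch of how an app might switch tiers; the cloud endpoint here is a placeholder you would replace with your own:

```python
def milvus_connection_args(target: str, token: str = "") -> dict:
    """Map a deployment tier to MilvusClient connection kwargs."""
    if target == "lite":
        # A bare file path makes pymilvus use embedded Milvus Lite.
        return {"uri": "edge_demo.db"}
    if target == "docker":
        # Milvus standalone started via Docker Compose on the device.
        return {"uri": "http://localhost:19530"}
    if target == "cloud":
        # Zilliz Cloud: your own endpoint plus an API token.
        return {"uri": "https://YOUR-ENDPOINT.zillizcloud.com", "token": token}
    raise ValueError(f"unknown target: {target}")

# The rest of the application stays the same, e.g.:
#   client = MilvusClient(**milvus_connection_args("lite"))
```

Keeping the choice behind one function means the capture, embedding, and search code never needs to know where the vectors land.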

NVIDIA Jetson Xavier NX

Silver — NVIDIA Jetson Xavier NX, 21 TOPS, 384-core, 8 GB RAM

Despite being a previous-generation NVIDIA edge device, the Xavier NX performs quite well! We run image captioning on web camera images. We have since moved this demo to the new Orin.

Like all of our devices, it can communicate with Milvus Lite, Milvus on Docker, a Milvus cluster in the cloud or on Kubernetes, or a local edge server with ease. We can also upgrade to Zilliz Cloud by just changing the URL and adding a token.

Raspberry Pi 5 + AI Kit for Pose Estimation Demo

Bronze — Raspberry Pi 5, 13 TOPS, 4-core, 8 GB RAM

output to Slack channel
Jupyter Notebook
Zilliz Cloud Display of Collection
Pose Estimation Search Results

We can easily run this on a small inexpensive device and send the results to Slack and Milvus. This makes for easy distributed unstructured data applications.

The source code for all of these applications, and some older ones, is available below.

Here is a sample of what a simple Edge AI application using Milvus Lite and Python can look like.

PYTHON INSTALLATION

We install the Milvus Python SDK and the Milvus Lite package, which also gives us the command-line tool we need for backups.

pip3 install pymilvus
pip3 install milvus-lite

MILVUS-LITE BACKUP / EXPORT

milvus-lite dump -d XavierEdgeAI.db -p /home/nvidia/nvme/AIM-XavierEdgeAI/backup/ -c XavierEdgeAI

Dump collection XavierEdgeAI’s data: 100%|████████████████| 33/33 [00:00<00:00, 188.54it/s]

Dump collection XavierEdgeAI success

Dump collection XavierEdgeAI’s data: 100%|████████████████| 33/33 [00:00<00:00, 127.16it/s]

Milvus-Lite to the Cloud

For many use cases we will want to distribute our local data to another computer, cluster or cloud. We could do that at the same time, in a batch, on a delay or at some other time.

  • Milvus-Lite Dump/Export to Cloud Import at some interval
  • Dual Ingest to local and other location concurrently
  • Switch to Cloud Only
  • Send JSON via Kafka / Pulsar / MQTT
  • Unstructured Data to MinIO, S3 or Cloud Object Storage
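The "dual ingest" option above can be sketched as a local-first write with best-effort cloud replication: the edge write happens first so local search stays instant, and rows that fail to reach the remote cluster are queued for replay. The two client objects stand in for two `MilvusClient` instances (Lite and cloud); the function and names are illustrative, not from the demo code.

```python
def dual_ingest(row, collection, local_client, remote_client, retry_queue):
    """Local-first insert with best-effort remote replication."""
    # Edge search must work immediately, so the local write comes first.
    local_client.insert(collection_name=collection, data=[row])
    try:
        remote_client.insert(collection_name=collection, data=[row])
    except Exception:
        # Network-challenged moment: keep the row to replay later.
        retry_queue.append(row)

def flush_retries(collection, remote_client, retry_queue):
    """Replay queued rows once connectivity returns."""
    while retry_queue:
        remote_client.insert(collection_name=collection, data=[retry_queue.pop(0)])
```

The same shape works for the batch and delayed options: only the trigger for `flush_retries` changes.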

SLIDES

SOURCE

EDGE HARDWARE SPECS

275 TOPS, 2048-core NVIDIA Ampere architecture GPU with 64 Tensor Cores, 12-core Arm® Cortex®-A78AE v8.2 64-bit CPU 3MB L2 + 6MB L3, 2x NVDLA v2, Vision Accelerator 1x PVA v2, 64GB 256-bit LPDDR5 204.8GB/s, 64GB eMMC 5.1

21 TOPS, 384-core NVIDIA Volta™ architecture GPU with 48 Tensor Cores, 6-core NVIDIA Carmel Arm®v8.2 64-bit CPU 6MB L2 + 4MB L3, 8GB 128-bit LPDDR4x 59.7GB/s, 16GB eMMC 5.1

https://www.raspberrypi.com/products/ai-kit/

13 TOPS of inferencing performance, Single-lane PCIe 3.0 connection running at 8Gbps. Broadcom BCM2712 2.4GHz quad-core 64-bit Arm Cortex-A76 CPU, with Cryptographic Extension, 512KB per-core L2 caches, and a 2MB shared L3 cache, 8GB LPDDR4X-4267 SDRAM, VideoCore VII GPU, supporting OpenGL ES 3.1, Vulkan 1.2.

REAL-WORLD EVENTS

Aug 13, 2024: Unstructured Data Meetup NYC

Aug 15, 2024: AI Camp NYC

Sept 24, 2024: Unstructured Data Meetup NYC

WEBINAR

RESOURCES

Milvus Uses Kafka

Star Us On GitHub and Join Our Discord!

If you liked this blog post, consider starring Milvus on GitHub, and feel free to join our Discord! 💙



Tim Spann

Principal Developer Advocate, Zilliz. Milvus, GenAI, Big Data, IoT, Deep Learning, Streaming, Machine Learning, NiFi, Kafka. https://www.datainmotion.dev/