Adding Java to Unstructured AI Pipelines (Java RAG)

Tim Spann

Published in

AIM (AI + Milvus)

4 min read · Oct 1, 2024

Milvus, Java, Ollama, LLama 3.2, ADS-B, Langchain4J, Milvus Java SDK, Vector Database, Open Source

Automatic Dependent Surveillance–Broadcast

App Walk Through

I have been trying out the Milvus Java SDK and LangChain4j as a step toward modernizing some of my older Java applications and migrating them to Milvus and AI-powered designs. Next up will be Spring AI applications, along with some existing ones that work with Apache Pulsar and Apache Kafka.

The first Java application, RagLoader, queries up to 16,384 rows from the liveplanes collection, which we load with our Python notebook (see the previous article). It filters them with the expression "flightidentifier != 'NA'", which removes any planes that did not report a flight identifier; these are most likely not commercial flights.
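To make the effect of that filter concrete, here is a minimal in-memory sketch in plain Java. It is not the Milvus SDK call that RagLoader makes; it only mimics what the expression and the row limit do to a list of rows. The class name, the "icao" field name, and the sample ICAO values are made up for illustration; the flight identifiers come from the example run later in this article.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** In-memory stand-in for the filter "flightidentifier != 'NA'" plus a row limit. */
final class FlightFilterSketch {

    // Row cap quoted in the article.
    static final int MAX_ROWS = 16_384;

    /** Keeps rows whose flightidentifier is not "NA", returning at most maxRows of them. */
    static List<Map<String, Object>> withFlightId(List<Map<String, Object>> rows, int maxRows) {
        return rows.stream()
                .filter(row -> !"NA".equals(row.get("flightidentifier")))
                .limit(maxRows)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample rows; the ICAO values are invented for this sketch.
        List<Map<String, Object>> rows = List.of(
                Map.of("icao", "a1b2c3", "flightidentifier", "AAY1207"),
                Map.of("icao", "d4e5f6", "flightidentifier", "NA"),
                Map.of("icao", "0a1b2c", "flightidentifier", "AAL1164"));

        // Only the two rows with a real flight identifier survive.
        System.out.println(withFlightId(rows, MAX_ROWS));
    }
}
```

In the real application the database evaluates the expression, so only matching rows ever leave Milvus.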

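We iterate through the results:

for (QueryResp.QueryResult result : insertList.getQueryResults()) {
    // result.getEntity() returns this row's fields as a Map<String, Object>
    // build one row for the RAG collection here
}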

For each result we create a new JsonObject to hold some of the metadata fields. These are the ones we may want to filter on later, such as ICAO, altitude, speed, latitude/longitude, and flight ID. We then add the fields the RAG collection needs: id, text, and vector. Finally, we send the rows to an insert, and we now have data for our Java RAG application.
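The shape of one such row can be sketched with plain Java maps. This is only an illustration under stated assumptions: the real code uses a JsonObject and the Milvus insert call, while here a LinkedHashMap stands in for the JSON object. The metadata field names, the nesting under a "metadata" key, and the class name are all hypothetical. The 384 dimension matches the embedding store configured later in this article.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Builds one RAG row (id, text, vector, metadata) from a queried aircraft record. */
final class RagRowSketch {

    // Must match the dimension of the target collection.
    static final int DIMENSION = 384;

    // Hypothetical names for the metadata fields worth keeping for later filtering.
    static final List<String> METADATA_FIELDS =
            List.of("icao", "altitude", "speed", "latitude", "longitude", "flightidentifier");

    static Map<String, Object> toRow(String id, String text, List<Float> vector,
                                     Map<String, Object> plane) {
        if (vector.size() != DIMENSION) {
            throw new IllegalArgumentException(
                    "expected " + DIMENSION + " dimensions, got " + vector.size());
        }
        // Copy only the selected metadata fields that are present on this record.
        Map<String, Object> metadata = new LinkedHashMap<>();
        for (String field : METADATA_FIELDS) {
            if (plane.containsKey(field)) {
                metadata.put(field, plane.get(field));
            }
        }
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("text", text);
        row.put("vector", vector);
        row.put("metadata", metadata); // assumption: metadata nested under one key
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> plane = Map.of("icao", "a1b2c3", "flightidentifier", "AAY1207");
        // Placeholder vector of zeros; a real loader would use an embedding of the text.
        List<Float> vector = Collections.nCopies(DIMENSION, 0.0f);
        Map<String, Object> row = toRow("row-1", "A320 AAY1207", vector, plane);
        System.out.println(row.keySet());
    }
}
```

Checking the vector length before inserting catches a mismatch between the embedding model and the collection early, rather than at insert time.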

Java RAG (LangChain4J)

We will use the llama3.2:3b-instruct-fp16 model, hosted locally by Ollama on my Mac.

First we set up our embedding store:

EmbeddingStore<TextSegment> embeddingStore = MilvusEmbeddingStore.builder()
        .uri("http://MYDOCKERORLOCATION:19530")
        .collectionName("aircraftrag")
        .dimension(384)
        .build();

Next we set up our chat interface.

interface Assistant {
    String chat(String userMessage);
}

We then connect to local Ollama.

ChatLanguageModel chatModel = OllamaChatModel.builder()
        .baseUrl("http://localhost:11434")
        .modelName("llama3.2:3b-instruct-fp16")
        .temperature(0.0)
        .build();

We then enable our service and connect everything together.

Assistant assistant = AiServices.builder(Assistant.class)
        .chatLanguageModel(chatModel)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
        .build();
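Conceptually, the content retriever wired in above finds the stored segments closest to the question and adds them to what the model sees. The following is a rough stdlib-only sketch of that retrieve-then-augment idea, not LangChain4j's actual implementation. Cosine similarity, the prompt wording, the class and method names, and the toy 3-dimensional vectors are all assumptions made for illustration; the real store uses 384-dimensional vectors and Milvus does the search.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Conceptual retrieve-then-augment step over an in-memory list of segments. */
final class RetrievalSketch {

    /** A stored chunk of text with its embedding vector. */
    record Segment(String text, float[] vector) {}

    /** Cosine similarity of two equal-length vectors; 0 if either has zero magnitude. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns the k segments most similar to the query vector, best first. */
    static List<Segment> topK(float[] query, List<Segment> store, int k) {
        return store.stream()
                .sorted(Comparator.comparingDouble(
                        (Segment s) -> cosine(query, s.vector())).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    /** Appends the retrieved text to the question (hypothetical prompt wording). */
    static String augment(String question, List<Segment> context) {
        String facts = context.stream().map(Segment::text).collect(Collectors.joining("\n"));
        return question + "\n\nAnswer using the following information:\n" + facts;
    }

    public static void main(String[] args) {
        // Toy vectors; real ones would come from an embedding model.
        List<Segment> store = List.of(
                new Segment("A320 AAY1207, Allegiant Air", new float[] {0.9f, 0.1f, 0.0f}),
                new Segment("A321 AAL1164, American Airlines", new float[] {0.7f, 0.3f, 0.1f}),
                new Segment("unrelated record", new float[] {0.0f, 0.1f, 0.9f}));
        float[] query = {1.0f, 0.0f, 0.0f};
        System.out.println(augment("Tell me about A320", topK(query, store, 2)));
    }
}
```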

Finally, we receive our answer.

String answer = assistant.chat(prompt);

Here, prompt is the question passed in on the command line.

We run the application via a shell script like so:

mvn compile exec:java -Dexec.mainClass="dev.datainmotion.Aircraft" \
        -Dexec.args="'$1'"
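On the Java side, the main method has to turn its arguments into the prompt string. A minimal sketch of one way to do that follows; it is not necessarily how the Aircraft class does it. The class name, method name, and usage message are hypothetical. Joining all arguments means the prompt still arrives whole if the quoting in the script is dropped and the words come in separately.

```java
/** Turns command-line arguments into a single prompt string. */
final class PromptArgsSketch {

    /** Joins the arguments with spaces and rejects an empty prompt. */
    static String promptFrom(String[] args) {
        String prompt = String.join(" ", args).trim();
        if (prompt.isEmpty()) {
            throw new IllegalArgumentException("Usage: pass the prompt as an argument");
        }
        return prompt;
    }

    public static void main(String[] args) {
        // With the script above, args[0] is the whole quoted prompt.
        System.out.println("Prompt=" + promptFrom(args));
    }
}
```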

We have finally built a RAG application in Java.

Stack

Milvus v2.11 Standalone in Docker
ollama version is 0.3.12
JDK 22 (OpenJDK 64-Bit Server VM Temurin-22.0.2+9)
Apache Maven 3.9.6
langchain4j-ollama 0.35.0
langchain4j-milvus 0.35.0
milvus-sdk-java 2.4.4
slf4j-api 1.7.30
OS name: "mac os x", version: "13.6.7", arch: "aarch64", family: "mac"

Source

GitHub - tspannhw/AIM-Aircraft-J: AI - Milvus - Ollama - LangChain4J - Milvus Java SDK

Data

Example Run

***RAG against Ollama with llama3.2:3b-instruct-fp16
****Prompt=Tell me about A320

****Answer=
Based on the provided information, here are some key details about the two aircraft:

**A320 AAY1207**

* Aircraft type: Airbus A320
* Operator: Allegiant Air (United States)
* Location: 40.285282,-74.290799 (latitude and longitude)
* Altitude:
 + Barometric altitude: 35,000 feet
 + Geodetic altitude: 37,000 feet
* Speed: 510.0 mph
* Image URL: https://airport-data.com/images/aircraft/001/668/001668328.jpg

**A321 AAL1164**

* Aircraft type: Airbus A321
* Operator: American Airlines (United States)
* Location: 40.23347,-74.456281 (latitude and longitude)
* Altitude:
 + Barometric altitude: 9,875 feet
 + Geodetic altitude: 10,200 feet
* Speed: 306.2 mph
* Image URL: https://airport-data.com/images/aircraft/001/600/001600717.jpg

Resources

LangChain4j | LangChain4j

docs.langchain4j.dev

GitHub - milvus-io/milvus-sdk-java: Java SDK for Milvus.

GitHub - tspannhw/AIM-ADS-B: AI + Milvus for ADS-B Aircraft Tracking

Planes

RESOURCES

Unstructured Data Meetup New York

Wednesday, October 23 — 5:30 PM — 8:30 PM

Unstructured Data Meetup New York · Luma

This is an in-person event! Registration is required to get in. Topic: Connecting your unstructured data with…

lu.ma

Written by Tim Spann