Search Images With Vector Database Retrieval Augmented Generation RAG

vici0549
6 min read · Jan 30, 2024


NOTE: As of 6/26/2024 the program is at version 6.0.0. Please see the updated installation instructions on the YouTube channel located at:

https://www.youtube.com/watch?v=J1t95ecV11U

Hello! This short article covers how to create a vector database of images that you can search and filter. It uses a program I created, located at https://github.com/BBC-Esq/ChromaDB-Plugin-for-LM-Studio

First, follow the installation instructions in the GitHub readme. Apologies if the Linux or macOS instructions are inaccurate; I don’t own those platforms, so I can only do my best to support them without being able to troubleshoot.

Once installed, start the program by running “python gui.py” in the terminal/command prompt, and you should see something like this:

V3.4 of my program

The steps for creating a searchable vector database of images are as follows: (1) download a vector model using the “Vector Models” tab; (2) choose the image files by going to the Databases tab and selecting images; (3) within the Databases tab, select the embedding model you want to use by selecting its specific folder; and (4) create the vector database.

(1) Download Vector Model

The User Guide has a good explanation of the different types of vector models so please read that before downloading and selecting a model. Also, on the Vector Models tab there’s useful information for each model to help you choose:

Click the Download Embedding Model button to download a model.

(2) Select Image Files

Within the “Databases” tab you can select one or more image files. For purposes of this tutorial we will not be adding non-images; I’ll do a separate article on creating a database of non-images. Remember, you can click the Add Files button as many times as you need to add images located in multiple directories:

Click the Choose Files button to add images and the Choose Model button to select the model you downloaded.

(3) Choose the Model

After you’ve added images, click the Choose Model button to select the folder containing the model you downloaded. You will see something like this. Click once on a folder and then click the Select Folder button in the lower right. DO NOT double-click into one of the folders and then click “Select Folder”:

Now that you’ve selected the vector model you want to use, let’s focus on the settings for processing the images as well as creating the vector database itself. You always want to select GPU acceleration when creating the vector database and the CPU when querying it. Depending on your setup, “Create Device” will be populated with “cpu”, “cuda”, or “mps”:
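As a rough sketch of how that device choice might be made, here is an illustrative helper. The function and its name are my assumptions, not the program’s actual code; in practice the flags would come from checks like `torch.cuda.is_available()` and `torch.backends.mps.is_available()`:

```python
def pick_create_device(has_cuda: bool, has_mps: bool) -> str:
    """Prefer an NVIDIA GPU (cuda), then Apple silicon (mps), else cpu."""
    if has_cuda:
        return "cuda"
    if has_mps:
        return "mps"
    return "cpu"
```

This ordering matches the usual preference: use GPU acceleration when it is available and fall back to the CPU otherwise.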

The program uses “vision” models to create summaries of the images you selected, and those summaries are put into the vector database to be searchable. The Chunk Size setting should be large enough to encompass an image’s entire summary. If it is, the “Chunk Overlap” setting is unneeded (it matters more for non-images).
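Conceptually, searching those summaries works like any embedding retrieval: each summary is embedded as a vector, and a query vector is compared against them by similarity. Here is a toy pure-Python sketch of the idea; the program itself uses ChromaDB and real embedding models, and the file names and vectors below are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "database": image file -> embedding of its vision-model summary.
# Real embeddings come from the vector model you downloaded in step (1).
database = {
    "park.jpg":   [0.9, 0.1, 0.0],
    "office.jpg": [0.1, 0.9, 0.2],
}

def search(query_embedding, db, top_k=1):
    """Return the top_k image names whose summaries are closest to the query."""
    scored = sorted(db.items(),
                    key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]
```

A query vector close to the “park” summary embedding would return `park.jpg` first; that is the whole trick behind searching images by their text summaries.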

There is no harm in setting the chunk size much larger than what the summaries will actually be; the vector model will simply process the summary just the same. However, you can test the different vision models and see how long their summaries are (and their quality) by going to the Tools Tab, selecting a file, and processing a single image:

Test process a single image to determine vision model quality
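The chunk-splitting behavior described above can be sketched as a simple character-based splitter. This is an illustrative assumption about the mechanics, not the program’s actual implementation:

```python
def split_into_chunks(text, chunk_size, overlap=0):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters. Overlap must be
    smaller than chunk_size or the window would never advance.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap
    return chunks
```

Note that a summary shorter than the chunk size comes back as a single chunk, which is why a generously large chunk size is harmless.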

Both when testing a vision model and when ultimately processing all of the images, the program uses the settings you enter here:

Vision model settings

In the Tools Tab you can see the general memory requirements of the various models and quantizations. After testing them, choose the one that you like best. My personal favorite is cogvlm running in 4-bit; the accuracy of its descriptions is significantly better, but it requires more resources.

(4) Create the Vector Database

Once you’ve selected the vision model settings as well as the vector database creation settings, go back to the “Databases” tab and click Create Vector Database. Within the command prompt/terminal you should see it start processing all of the images:

After that, it will split the summaries into chunks (if any are longer than the maximum chunk size you set, which hopefully they aren’t) and then put them into the vector database. Once everything is done you should see something like this:

That’s it! The vector database has been created!

It’s easy to search the images. Open LM Studio, go to the server tab, and you should see something like this:

LM Studio version 0.2.10

Make sure “Apply Prompt Formatting” is off. Also, on the right side, clear the “User Message Prefix” and “User Message Suffix” boxes, since we’ll be using a prefix/suffix from within my program. The developer of LM Studio said he’s going to fix the bug that requires clearing these boxes even when “Apply Prompt Formatting” is off, but I’ve yet to confirm that he has…
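Under the hood, LM Studio’s local server exposes an OpenAI-compatible chat completions endpoint (by default at http://localhost:1234/v1/chat/completions). Here is a hypothetical sketch of how a program-supplied prefix/suffix might wrap the prompt, which is why LM Studio’s own formatting has to stay off; the function name, prompt layout, and values are my assumptions, not the program’s actual code:

```python
LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio's default local server

def build_rag_payload(question, contexts, prefix="", suffix=""):
    """Build a chat-completions request body: retrieved chunks plus the
    question, wrapped in the model's prefix/suffix by this program, not
    by LM Studio's prompt formatting."""
    context_block = "\n\n".join(contexts)
    prompt = f"{prefix}Context:\n{context_block}\n\nQuestion: {question}{suffix}"
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,
        "stream": False,
    }
```

If LM Studio also applied its own prefix/suffix, the model would see the formatting tokens twice, which is exactly the double-formatting problem the settings above avoid.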

Select a model from within LM Studio. REMEMBER, we are not chatting interactively with a model at length; you simply need a model that can answer a single question from my program. Therefore, you do not need esoteric, creative models, but rather straightforward models that follow instructions well and summarize large amounts of context well. In my program, I’ve included presets for the categories of models that I believe are the best right now:

version 3.4 of program

Here are links to all of the models that I currently find worthy:

For something extremely light, I was amazed that this model worked and would recommend it:

WHICHEVER model you choose, I do not recommend going below the 4_k_m quantization. If your system can’t handle a 4_k_m quantization, choose a model with fewer parameters (e.g. from 12b down to 7b).

Once you load the model into LM Studio, set GPU acceleration within LM Studio and the context length of the model you chose (usually 4096). Then click “Start Server.”

Within my program, you simply need to type a question and click “Submit Questions”; e.g., “Describe all images that depict one or more children playing outside.” If you’ve done everything correctly you should get an output something like this, with citations:

You can experiment with the “Similarity” and “Contexts” settings within the “DATABASE CREATION” settings to fine-tune your results, but that’s basically it!
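Assuming “Similarity” acts as a score cutoff and “Contexts” caps how many chunks are passed to the model (my reading of those settings, not confirmed from the program’s code), the filtering could look like this sketch:

```python
def select_contexts(scored_chunks, similarity_cutoff, max_contexts):
    """Keep chunks at or above the cutoff, best-first, capped at max_contexts.

    scored_chunks: list of (chunk_text, similarity) pairs, higher = closer.
    """
    kept = [(text, score) for text, score in scored_chunks
            if score >= similarity_cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in kept[:max_contexts]]
```

Raising the cutoff trims marginal matches; raising the context count gives the LLM more material to summarize, at the cost of a longer prompt.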

You can also check the “chunks only” checkbox here to see the retrieved chunks themselves without connecting to LM Studio, which is useful for experimenting with different chunk size or similarity settings. Remember, however, that any time you change the chunk size setting you must recreate the database.

For the many other features and uses of my program please check out my github repository at:

https://github.com/BBC-Esq/ChromaDB-Plugin-for-LM-Studio
