AI auto-tagger image indexer: local image search.
Faced with the practical need to find the right photo among thousands of files scattered across folders on my computer and phone, I started looking for solutions. There are plenty of AI services online where you can upload your photos and get convenient text search over them in return, but few people want to hand their entire family or corporate photo archive to third-party platforms with unclear prospects for its use. Since I specialize in fine-tuning and integrating AI models, my solution to this problem is built on AI technology.
A couple of years ago, text generation, and even more so image processing, required serious hardware. Only relatively recently have smaller vision-capable versions of AI models appeared that can run even on a home PC with an 8 GB video card. That much VRAM is certainly not enough for full-size models, and the speed is no match for commercial services, but for our task the processing and indexing time is unlimited. If we sequentially "feed" all our photos to a local AI model, save the descriptions it returns in a database, and then attach search to that database, we get full-fledged text search over our photos, and, most importantly, a local one, without uploading our photos to the network.
How does this work in practice?
1. Put all the images into the images directory and run the indexing script.
2. The AI describes them, tags them, and makes them searchable.
3. To find, for example, a photo of a child's birthday with a cake, type "birthday, cake" into the search.
The pros:
- Works locally! No risk of data leakage, since photos never leave your machine
- No subscriptions, payments, or software purchases required
- You can search photos not only by tags, but also by meaning
The main disadvantage: to keep indexing time acceptable, images have to be downscaled to a small resolution.
(This project is publicly available with open source code at https://github.com/iximy/AI-indexer . Feel free to clone it, test it, and use it for yourself!)
So let’s get started:
Install the dependencies: Python itself, Ollama, and the libraries os, json, ollama, chromadb, PIL (Pillow), and Flask.
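The third-party packages install with pip (os and json are part of the standard library, so nothing extra is needed for them):
pip install ollama chromadb pillow flask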
Download and run our model with the command ollama run llama3.2-vision:11b; you will need about 10 GB of free space.
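Once the download finishes, you can check that the model is available locally:
ollama list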
Now on to the script itself.
Create a script file responsible for recognizing and indexing the photos. Connect the libraries:
import os
import json
import ollama
import chromadb
from chromadb.utils import embedding_functions
from PIL import Image
from ollama import Client
Specify the locations of the photo directory and the database:
IMAGE_FOLDER = "./images" # Folder with our photos
DB_FOLDER = "./chroma_db" # ChromaDB database storage folder
Initialize ChromaDB
chroma_client = chromadb.PersistentClient(path=DB_FOLDER)
collection = chroma_client.get_or_create_collection(
    name="image_tags",
    metadata={"hnsw:space": "cosine"},
    embedding_function=embedding_functions.DefaultEmbeddingFunction()
)
Specify the connection parameters for the Ollama API:
client = Client(
    host='http://localhost:11434',
    headers={'x-some-header': 'some-value'}
)
Generate a text description for each photo; the prompt sent with the request can be edited to suit your needs:
def generate_tags(image_path):
    response = client.chat(
        model="llama3.2-vision:11b",  # Here we specify our AI model
        messages=[
            {"role": "user", "content": "What's in this image? 20-word response", "images": [image_path]}
        ]
    )
    return response["message"]["content"].strip()
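Before indexing the whole folder, it is worth sanity-checking the function on a single file (the path below is just an example; point it at any image you have):
print(generate_tags("./images/test.jpg"))  # should print a short description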
Iterate over all images in the folder, generate descriptions, and save them in ChromaDB:
def index_images():
    for filename in os.listdir(IMAGE_FOLDER):
        if filename.lower().endswith((".jpg", ".png", ".jpeg")):
            image_path = os.path.join(IMAGE_FOLDER, filename)
            tags = generate_tags(image_path)
            collection.add(
                ids=[filename],
                documents=[tags],
                metadatas=[{"filename": filename}]
            )
            print(f"{filename}: {tags}")
Start indexing our photos:
if __name__ == "__main__":
    index_images()
Save the script as ai-indexer.py.
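One caveat if you re-run indexing over the same folder: depending on the chromadb version, collection.add may raise an error or warn for ids that already exist. ChromaDB also provides upsert with the same signature, which overwrites existing entries instead, so a possible drop-in swap is:
collection.upsert(
    ids=[filename],
    documents=[tags],
    metadatas=[{"filename": filename}]
)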
The second script is responsible for the search algorithm and the server side:
Connect the libraries:
from flask import Flask, render_template, request
import chromadb
Specify the path where our database is located:
DB_FOLDER = "./chroma_db"
Set the cut-off threshold for semantic search; lower values make the search stricter.
SCORE_THRESHOLD = 0.5
Connect to ChromaDB:
chroma_client = chromadb.PersistentClient(path=DB_FOLDER)
collection = chroma_client.get_or_create_collection("image_tags")
Initialize Flask, not forgetting to point the static folder at our images so they can be served:
app = Flask(__name__, static_folder='images')
Sequential search: first exact matches, then semantic matches within the threshold set above:
def search_images(query):
    results = collection.query(query_texts=[query], n_results=10)  # Limit the number of photos in the output
    if not results["ids"][0]:
        return []
    exact_matches = []
    semantic_matches = []
    for i, description in enumerate(results["documents"][0]):
        filename = results["metadatas"][0][i]["filename"]
        score = results["distances"][0][i]
        # Check for an exact match
        if query.lower() in description.lower():
            exact_matches.append(filename)
        # Add semantic matches within the previously set threshold
        elif score <= SCORE_THRESHOLD:
            semantic_matches.append((filename, score))
    # Sort so that the most suitable images appear first in the output
    semantic_matches.sort(key=lambda x: x[1])
    # Return exact matches first, then semantic matches
    return exact_matches + [x[0] for x in semantic_matches]
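To pick a sensible SCORE_THRESHOLD for your own archive, it helps to look at the raw cosine distances ChromaDB returns. A small debugging sketch you can run in a Python shell after connecting to the same collection (the query string is just an example):
res = collection.query(query_texts=["birthday cake"], n_results=5)
for doc, dist in zip(res["documents"][0], res["distances"][0]):
    print(f"{dist:.3f}  {doc}")
Distances near 0 are the closest matches; move the threshold to wherever the good results stop.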
Add the server-side handler for the / entry point, which receives the query and sends back the results:
@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        query = request.form["query"].strip()
        if query.lower() == "exit":
            return render_template("index.html", images=[])
        matches = search_images(query)
        return render_template("index.html", images=matches)
    return render_template("index.html", images=[])
if __name__ == "__main__":
    app.run(debug=True)
Save the script as search_server.py.
Create the templates folder and place the web interface file, index.html, there:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Image search</title>
    <style>
        body { font-family: Arial, sans-serif; }
        .container { width: 80%; margin: 0 auto; text-align: center; }
        .images { display: flex; flex-wrap: wrap; justify-content: center; }
        .image-container { margin: 10px; }
        img { width: 200px; height: auto; border-radius: 10px; }
        .form-container { margin-bottom: 20px; }
        input[type="text"] { padding: 10px; width: 300px; font-size: 16px; }
        input[type="submit"] { padding: 10px 20px; font-size: 16px; background-color: #4CAF50; color: white; border: none; cursor: pointer; }
        input[type="submit"]:hover { background-color: #45a049; }
    </style>
</head>
<body>
    <div class="container">
        <h1>Image Search</h1>
        <div class="form-container">
            <form method="POST">
                <input type="text" name="query" placeholder="Enter the search query..." required>
                <input type="submit" value="Search">
            </form>
        </div>
        <div class="images">
            {% if images %}
                {% for image in images %}
                    <div class="image-container">
                        <img src="{{ url_for('static', filename=image) }}" alt="{{ image }}">
                    </div>
                {% endfor %}
            {% else %}
                <p>No images were found for your query.</p>
            {% endif %}
        </div>
    </div>
</body>
</html>
Now create the images folder and put your photos there. I recommend not uploading the originals at full resolution, but first doing a batch resize (keeping the file names) down to 250–300 px on the long side. At that resolution, processing runs at roughly 30–40 seconds per image, so indexing 1000 photos takes about 8–11 hours. Without compression, indexing the same volume of photos can take up to several days.
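If you need a starting point for that resize step, here is a minimal Pillow sketch; the folder names are assumptions, adjust them to your layout (it reads full-size originals from ./originals and writes downscaled copies with the same names into ./images):
import os
from PIL import Image

SRC_FOLDER = "./originals"  # full-resolution photos (assumed location)
DST_FOLDER = "./images"     # downscaled copies for indexing
MAX_SIDE = 300              # target size of the longer side, in pixels

os.makedirs(DST_FOLDER, exist_ok=True)
for filename in os.listdir(SRC_FOLDER):
    if filename.lower().endswith((".jpg", ".jpeg", ".png")):
        with Image.open(os.path.join(SRC_FOLDER, filename)) as img:
            img.thumbnail((MAX_SIDE, MAX_SIDE))  # shrinks in place, keeps aspect ratio
            img.save(os.path.join(DST_FOLDER, filename))  # same name, new folder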
For this post, I used a simple selection of images from the Internet for educational purposes.
Run the script on the command line
python ai-indexer.py
During indexing, a chroma_db directory will be created to hold the database, and the console immediately shows the AI-generated text descriptions of our images.
When the script finishes, start the search server:
python search_server.py
Open http://localhost:5000 in your browser and you will see a simple search form.
Enter test queries one at a time: room, tricycle, warning, blue walls, TV.
As these examples show, the search algorithm does an excellent job of surfacing the right images, and the search itself is quite fast. If necessary, you can adjust the SCORE_THRESHOLD value, which controls the cut-off for semantic search.
Conclusion:
This project demonstrates the potential of local AI vision models in applied tasks: in a few minutes and about 120 lines of code, modern AI tooling lets you stand up a solution to a genuinely complex problem. The approach enables confidential image processing on modest local hardware, making AI vision projects feasible for personal and business tasks without any server capacity. If you have ideas for such uses, I will be glad to discuss them.
This mini-project is just one example of using AI technologies for everyday tasks.
The project goal has been achieved, which means one more item goes into my piggy bank of weekend mini-projects.
The project's Git repository: https://github.com/iximy/AI-indexer