Home Surveillance with LLMs? Ollama using LLaVa 1.6

Balazs Kocsis
Python in Plain English
6 min read · Mar 11, 2024


Maximum privacy by running an LLM on your own hardware

Multimodal Large Language Models can interpret and describe basic actions, relationships, and content in images. We can easily pass frames from a camera feed to one of these models and design a system that continuously describes what it sees.

In this article we will go through a simple demonstration of this, and discuss its results and feasibility.

Code Walk-through
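Before diving in, here is a minimal sketch of the kind of loop we are after: grab a frame from a camera, JPEG-encode it, send it (base64-encoded) to a locally running Ollama server hosting LLaVA, and print the model's description. The model tag `llava`, the prompt text, and the use of OpenCV and the Ollama REST endpoint are assumptions for illustration, not necessarily the exact code discussed in the walk-through.

```python
import base64
import time

import cv2
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
MODEL = "llava"  # assumes `ollama pull llava` has been run (a LLaVA 1.6 tag also works)
PROMPT = "Describe what is happening in this image in one short sentence."


def describe_frame(frame) -> str:
    """Encode a single OpenCV frame as JPEG and ask the local LLaVA model to describe it."""
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("Failed to encode frame as JPEG")
    payload = {
        "model": MODEL,
        "prompt": PROMPT,
        "images": [base64.b64encode(jpeg.tobytes()).decode("utf-8")],
        "stream": False,
    }
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["response"].strip()


def main() -> None:
    cap = cv2.VideoCapture(0)  # first attached camera; replace with an RTSP URL for an IP camera
    if not cap.isOpened():
        raise RuntimeError("Could not open camera")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            print(describe_frame(frame))
            time.sleep(5)  # inference is slow on consumer hardware, so sample every few seconds
    finally:
        cap.release()


if __name__ == "__main__":
    main()
```

Everything here runs on your own machine: the frames never leave localhost, which is where the privacy argument comes from. The sleep interval and the single-sentence prompt are just starting points to keep latency manageable.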

