ESP32+OV7670 — WebSocket Video Camera


For the past few weeks I have been trying to get a first meaningful, clear image out of the OV7670. Today I finally ended up making a mini surveillance video camera. The OV7670 is among the cheapest camera modules available online, so it was my first choice to experiment with. What I really wanted was a small surveillance video camera. The OV7670 has a 0.3 MP resolution, so I was not expecting HD quality :). But if I could make out objects, people or movement at 0.3 MP, that would be good enough for me. I was aware of the many challenges in this project, so let's list them as objectives.


  1. Use cheap camera module.
  2. Ability to connect to WiFi.
  3. Ability to view the video stream from a PC as well as a smartphone.
  4. Display software should be platform independent.
  5. Low power consumption. Battery operable.
  6. Portable.
  7. Easy to configure.


Raspberry Pi

When it comes to camera projects, the Raspberry Pi is usually the first choice. It has lots of RAM and an excellent CPU. However, it requires additional accessories such as an SD card and a WiFi dongle (Raspberry Pi 2) to complete the setup. This adds to the cost. I wanted a cheaper alternative.

Arduino Uno

I experimented with the Arduino Uno. Grabbing a picture out of the hardware is a complex task. Without an Ethernet shield, you first need to write the captured frame to an SD card and transfer it out via the serial port, which is far too slow to make any video out of it. Grabbing an image at 8 MHz is also dicey.


ESP32

The ESP32 looked promising for my use case. Its CPU is fast enough to provide the camera clock (XCLK), a clock signal above 10 MHz. It has enough RAM to hold a full 160x120 frame at 2 bytes per pixel (QQVGA), and it has built-in WiFi.

I came across a project that provides an excellent library for using the OV7670 with the ESP32.

However, it lacks video capability and does not support 320x240 frames due to memory constraints.
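
The memory constraint follows from simple frame-size arithmetic. As a rough illustration, assuming 2 bytes per pixel as in the 160x120x2 figure above (`frameBytes` is a made-up helper name for this sketch, not part of any library):

```javascript
// Bytes needed to hold one frame at 2 bytes per pixel.
// frameBytes is a hypothetical helper for illustration only.
function frameBytes(width, height) {
  const BYTES_PER_PIXEL = 2; // 16 bits per pixel
  return width * height * BYTES_PER_PIXEL;
}

const qqvgaBytes = frameBytes(160, 120); // 38400 bytes
const qvgaBytes = frameBytes(320, 240);  // 153600 bytes, four times QQVGA

console.log(qqvgaBytes, qvgaBytes);
```

A full 320x240 frame therefore needs four times the buffer of a 160x120 frame.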

I am not going into the details of the OV7670+ESP32 circuit interface or the library implementation. I would rather focus on how to make a video stream from still images captured in quick succession.

Video Streaming Prerequisites

  1. Video streaming requires high-speed transfer of the captured image frames.
  2. Data transfer must be asynchronous to achieve low latency.
  3. The data must reach the display device in FIFO order.
  4. The display device must refresh the changed pixels very quickly to avoid flicker.


WebSocket

  1. WebSocket allows data transfer in real time with low overhead.
  2. Once the connection is open, either the client or the server can send a message at any time without waiting for the other.
  3. It supports text as well as binary data transfer. This helps in exchanging both custom commands and binary images.
  4. It also supports secure communication (wss://).

HTML5 Canvas

  1. HTML5 Canvas lets a web page display an image and update it directly through pixel manipulation.
  2. Canvas is an HTML element whose 2D context exposes its pixels as an ImageData object.
  3. Changes made to the ImageData pixels appear on the canvas once the ImageData is written back with putImageData.
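
As a rough illustration of pixel manipulation: ImageData stores its pixels as a flat array of RGBA bytes, 4 per pixel, row by row. The sketch below uses a plain `Uint8ClampedArray` as a stand-in for `ImageData.data` so that it also runs outside a browser; `setPixel` is a hypothetical helper name, not a Canvas API.

```javascript
// Write one opaque pixel into an RGBA buffer laid out like ImageData.data:
// 4 bytes per pixel (R, G, B, A), rows stored top to bottom.
// setPixel is a hypothetical helper, not part of the Canvas API.
function setPixel(data, width, x, y, r, g, b) {
  const i = (y * width + x) * 4; // index of the pixel's red byte
  data[i] = r;
  data[i + 1] = g;
  data[i + 2] = b;
  data[i + 3] = 255; // fully opaque
}

// Stand-in for ctx.createImageData(160, 120).data
const canvasWidth = 160;
const canvasHeight = 120;
const pixels = new Uint8ClampedArray(canvasWidth * canvasHeight * 4);
setPixel(pixels, canvasWidth, 10, 5, 255, 0, 0); // red pixel at column 10, row 5
```

In a browser, the same helper would be applied to the `data` array of an ImageData obtained from `ctx.createImageData(width, height)`, followed by `ctx.putImageData(imageData, 0, 0)` to make the change visible.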

Video Streaming Algorithm using WebSocket

  1. Establish a WebSocket connection between the browser client and the ESP32.
  2. The browser sends a “start” message to the ESP32. The start message contains the image resolution: 80x60, 160x120 or 320x240.
  3. The ESP32 begins capturing frames and sends them to the browser using webSocket.sendBIN. The image format is RGB565, so the total frame size is (frame size in pixels) x 2 bytes. In this solution, enough memory is allocated to hold one frame of 160x120x2 bytes (QQVGA).
  4. For the 320x240 resolution, the frame is captured twice. The first half of the frame is sent from the first capture and the remaining half from the second capture. A start flag and an end flag tell the browser the order of the partial frames.
  5. After receiving the end flag, the browser requests the next frame from the ESP32, and the ESP32 continues as in step 3.
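
The two-part transfer in step 4 could be handled on the browser side roughly as follows. This is only a sketch under stated assumptions: how the start and end flags are encoded on the wire is not described here, so the flag is passed in as a plain string, and `HalfFrameAssembler` is a hypothetical name.

```javascript
// Reassembles an RGB565 frame that arrives as two halves.
// Assumption: each binary chunk is tagged "start" (first half) or "end"
// (second half); the real wire encoding of these flags is not shown here.
class HalfFrameAssembler {
  constructor(width, height) {
    this.frame = new Uint8Array(width * height * 2); // 2 bytes per pixel
    this.half = this.frame.length / 2;
    this.gotStart = false;
  }

  // Returns the complete frame after the "end" half, otherwise null.
  push(flag, chunk) {
    if (chunk.length !== this.half) {
      throw new Error("unexpected chunk size: " + chunk.length);
    }
    if (flag === "start") {
      this.frame.set(chunk, 0);
      this.gotStart = true;
      return null;
    }
    if (flag === "end" && this.gotStart) {
      this.frame.set(chunk, this.half);
      this.gotStart = false;
      return this.frame; // the caller would now request the next frame
    }
    return null; // "end" without a preceding "start": ignore it
  }
}

// Usage with dummy data standing in for two received chunks.
const assembler = new HalfFrameAssembler(320, 240);
assembler.push("start", new Uint8Array(assembler.half));
const fullFrame = assembler.push("end", new Uint8Array(assembler.half).fill(0xff));
console.log(fullFrame.length); // 153600
```

Dropping an end chunk that arrives without a start chunk keeps a half-updated frame from being displayed.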

Implementation Details

1. Connect the ESP32 to a 5V supply. The ESP32 boots and configures itself as both an access point and a WiFi station. It connects to the best available WiFi network from among the provided options.

2. Connect the PC or smartphone to the Esp32AP access point.

3. Open Google Chrome browser and type

QQVGA (160x120) is the default display canvas.

4. The ESP32 acts as a web server and serves a web page containing a JavaScript program that connects to the ESP32 over WebSocket, receives the binary image data and displays it on an HTML5 canvas.

When the WebSocket is opened, the ESP32 sends its WiFi station IP address to the web client.

Thus the camera can have two IP addresses: a fixed one for its own access point, and a station IP assigned by the router when the ESP32 connects to another WiFi network.

The WebSocket client is the web browser, so our display device is cross-platform: the stream can be viewed on any PC or smartphone whose browser supports HTML5 canvas. The following code shows how the web client handles the WebSocket.
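
A minimal sketch of such a client-side handler, not the project's actual code, might look like this. The function name `attachCameraSocket`, the text of the start command and the assumption that every text message from the ESP32 carries the station IP are all made up for illustration.

```javascript
// Wires up a WebSocket-like object for the camera stream.
// Assumptions (not taken from the project's code): the "start <resolution>"
// command text, and that every text message is the station IP address.
function attachCameraSocket(socket, resolution, onFrame, onStationIp) {
  socket.binaryType = "arraybuffer"; // receive frames as ArrayBuffer, not Blob
  socket.onopen = () => {
    socket.send("start " + resolution); // hypothetical command format
  };
  socket.onmessage = (event) => {
    if (typeof event.data === "string") {
      onStationIp(event.data); // text message
    } else {
      onFrame(new Uint8Array(event.data)); // binary image data
    }
  };
}
```

In a browser this would be called with a real socket, for example `attachCameraSocket(new WebSocket(url), "160x120", drawFrame, showIp)`, where `url`, `drawFrame` and `showIp` are placeholders for the camera's WebSocket address and the page's own callbacks.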

Finally, let's see how the HTML canvas displays the binary data through pixel manipulation. The capture format is RGB565, so each pixel is represented by 16 bits. Each RGB565 pixel is converted to RGB888 before it is written to the canvas.
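
The conversion can be sketched as below. This is an illustration rather than the project's code: `rgb565ToRgba` is a hypothetical name, and the byte order (high byte first) is an assumption that may need to be swapped for a given camera configuration.

```javascript
// Converts an RGB565 frame to RGBA bytes suitable for ImageData.data.
// Assumption: each pixel arrives high byte first (RRRRRGGG GGGBBBBB);
// swap the two byte reads if the colours look wrong.
function rgb565ToRgba(src, dst) {
  for (let i = 0, j = 0; i + 1 < src.length; i += 2, j += 4) {
    const pixel = (src[i] << 8) | src[i + 1];
    const r5 = (pixel >> 11) & 0x1f;
    const g6 = (pixel >> 5) & 0x3f;
    const b5 = pixel & 0x1f;
    // Expand to 8 bits by copying the top bits into the low bits,
    // so that full-scale 5-bit and 6-bit values map to 255.
    dst[j] = (r5 << 3) | (r5 >> 2);
    dst[j + 1] = (g6 << 2) | (g6 >> 4);
    dst[j + 2] = (b5 << 3) | (b5 >> 2);
    dst[j + 3] = 255; // opaque
  }
  return dst;
}

const sample = new Uint8Array([0xf8, 0x00, 0x07, 0xe0]); // one red, one green pixel
const rgba = rgb565ToRgba(sample, new Uint8ClampedArray(8));
console.log(Array.from(rgba)); // [255, 0, 0, 255, 0, 255, 0, 255]
```

In a browser, `dst` would be the `data` array of the canvas ImageData, and `ctx.putImageData(imageData, 0, 0)` would then draw the converted frame.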

5. Let's switch to the Internet-connected WiFi network.

ESP32 Camera connected to WiFi Internet

Video in 80x60 (QQQVGA), 160x120 (QQVGA), 320x240 (QVGA)

Screenshot from Android Phone

PCB Board

Video Demo

In this article we achieved all our objectives: cheap hardware, a cross-platform display device (the web browser), fast asynchronous data transfer through WebSockets, WiFi connectivity, ease of use and portability.