Deterring foxes and badgers with TensorFlow Lite, Python, Raspberry Pi, Ring cameras & ultrasonic deterrents

James Milward
13 min read · Sep 24, 2023


The first attempt at object identification using SSDLite, MobileNetV2 and COCO left a lot to be desired

A recent influx of foxes and badgers in my garden, and the damage they caused, gave me the perfect opportunity to learn how to use ML to reduce the number of unwanted night-time visitors we’ve been experiencing.

I’ve been keeping an eye on TensorFlow and its object detection capabilities for quite a while, and this seemed like a good opportunity to give it a try. TensorFlow Lite especially stood out, as it can be deployed to low-powered edge devices like the Raspberry Pi 4 or ESP32/ESP32-CAM, which removes the need for costly hardware running continually to provide object detection capabilities.

TensorFlow Lite provides several object detection models via its Model Zoo, and I wanted to find one that would give a good balance of accurate identification and low computational requirements. The TensorFlow Lite object detection models have advanced considerably in the last few years, and models like SSD MobileNet V2 FPNLite can reach roughly 80% accuracy on low-powered devices while remaining far lighter than more complex R-CNN-based models or EfficientNet/EfficientDet.

Firstly though, why deter foxes and badgers?

I recognise the importance of living harmoniously with our wildlife, so much so that we have an active ground bee hive in our garden which I’m keen on keeping as we desperately need more pollinators and they aren’t causing us any harm.

However, foxes and especially badgers can cause an immense amount of damage in a short space of time. I have two young children, which poses a further issue, as these animals bring nasty insects and diseases along with them.

So I set myself a goal of building a tool that could deter both badgers and foxes without causing them any harm.

What about off the shelf solutions?

You may be reading this thinking that machine learning is overkill for building a wildlife deterrent. You’re probably right, but I love completely insane projects, and I’ve tried so many humane solutions to safely stop my unwanted garden guests, all of which failed.

So what did I try first? Scent-marking repellent, natural mixes of citronella/chilli, and blocking up the entry points along the fences in the garden. This just resulted in our fluffy friends ignoring my initial attempts and digging more holes. However, out of everything I tried, one solution offered a glimmer of hope — solar-powered ultrasonic repellents.

One of the many examples of ultrasonic repellents you can purchase on Amazon. I have a few of these throughout our garden which provided mixed results.

I placed a few of these devices across the garden which successfully reduced the number of nightly visits we experienced. However -

  • The deterrent has different modes for badgers and foxes which have to be set manually, meaning some areas aren’t covered for one of the two animals
  • They run 24/7, meaning that we trigger them when we’re in the garden which results in us turning them off and periodically forgetting to turn them back on
  • They don’t always get fully charged by the solar panel (cloudy days) and can go for days with no charge before we realise that they need plugging in

Using Python/TensorFlow/Arduino to improve the ultrasonic deterrent

Ultrasonic deterrents work by detecting heat with an onboard passive infrared (PIR) sensor and triggering the speaker to bathe the target in high-frequency noise (13.5–19.5 kHz for foxes and 19.5–24.5 kHz for badgers).

Unfortunately the PIR triggers the speaker every time it detects a heat signature (e.g. me or my family), so I needed to make this more intelligent. This is where TensorFlow comes in, as the Object Detection API can be utilised to detect animals with a high degree of accuracy.

If badgers and foxes could be accurately identified, the deterrent could be triggered remotely. The ultrasonic deterrent would need modifying to allow remote triggering, facilitated by a Wi-Fi-capable microcontroller (ESP8266). Furthermore, the ESP8266 could switch the deterrent to a specific mode so it operates at the frequency range matched to the detected animal.

Initial attempts at passing video from a Ring Stick Up camera to the pre-trained SSD MobileNet V2 320x320 provided poor results

I started off using a YouTube tutorial which walks you through passing images through a pre-trained TensorFlow model based on the COCO dataset. I built this out in Google Colab, as it lets you execute arbitrary Python code through a web browser and is especially suited to machine learning, with free GPU time on Tensor Core GPUs.
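If you want to try the same first step yourself, the core of it is loading a TFLite SSD model, resizing an image to the model’s input size and reading back the boxes, classes and scores. The sketch below is only a rough outline under my own assumptions — the model and image paths are placeholders, and the output tensor ordering varies between exported models — rather than the tutorial’s exact code:

import cv2
import numpy as np
import tensorflow as tf

MODEL_PATH = "detect.tflite"          # placeholder - a COCO-trained SSD MobileNet export
IMAGE_PATH = "garden_snapshot.jpg"    # placeholder test image

interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height, width = input_details[0]["shape"][1:3]

# Load the image and resize it to the model's expected input size
image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)
input_data = np.expand_dims(cv2.resize(image, (width, height)), axis=0)

# Quantised models expect uint8 input; float models expect normalised float32
if input_details[0]["dtype"] == np.float32:
    input_data = (np.float32(input_data) - 127.5) / 127.5

interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

# Output ordering varies between exports - commonly boxes, classes, scores
boxes = interpreter.get_tensor(output_details[0]["index"])[0]
classes = interpreter.get_tensor(output_details[1]["index"])[0]
scores = interpreter.get_tensor(output_details[2]["index"])[0]

for box, cls, score in zip(boxes, classes, scores):
    if score > 0.5:
        print(f"class {int(cls)} detected with score {score:.2f} at {box}")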

My first results were a little disappointing. I was extremely pleased that I had managed to get some Python together to pass images through TensorFlow, but I quickly understood that the publicly available models simply weren’t trained to identify badgers or foxes, so they ended up being detected as a sink, car, umbrella or bear (if I was lucky).

Training my own badger/fox model

An evening of Googling led me to discover this fantastic YouTube tutorial which walks you through training a TensorFlow Lite object detection model based on your own custom imagery.

The tutorial stresses the importance of getting a decent set of imagery together for the object you’re looking to detect with TensorFlow. I’d previously been able to get decent imagery from my Ring Stick Up camera, which had caught the badgers/foxes in the act of wrecking my garden, so off I went to the Ring portal. I spent another evening screen capturing videos until I had a few hundred images of the badgers and foxes, which I dumped into Google Drive.

I used LabelImg to annotate each image, marking the position of any fox or badger (alongside other garden items/animals) and outputting the labelling in the PASCAL VOC XML format which TensorFlow understands. I uploaded this to Google Drive, fired up Colab, configured my training and set the Python script running to train a new model for a few hours.
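For context, each annotation LabelImg writes out is a small XML file holding the class name and bounding box for every labelled object in the image. The snippet below (with a made-up filename) shows roughly what the training pipeline ends up reading from one of these files:

import xml.etree.ElementTree as ET

# Hypothetical annotation produced by LabelImg for one screen-captured frame
tree = ET.parse("badger_frame_001.xml")
root = tree.getroot()

for obj in root.iter("object"):
    label = obj.find("name").text        # e.g. "badger" or "fox"
    bbox = obj.find("bndbox")
    xmin = int(float(bbox.find("xmin").text))
    ymin = int(float(bbox.find("ymin").text))
    xmax = int(float(bbox.find("xmax").text))
    ymax = int(float(bbox.find("ymax").text))
    print(f"{label}: ({xmin}, {ymin}) -> ({xmax}, {ymax})")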

Note — I’d recommend purchasing a Google Colab plan to access the Nvidia A100 Tensor Core GPUs, as this significantly decreased the time it took to train models. Roughly £9 gets you 100 ‘compute units’, and one training session consumed around 15 of these.

Training the data model and monitoring in TensorBoard — accuracy increased as the model went through 30,000 steps of image training

After roughly 2.5 hours the model was looking good, and the rate of improvement was starting to flatten out, so I decided to pull the plug. I now had my hands on a freshly baked badger/fox TensorFlow Lite model.
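For reference, the tutorial handles the export step, but turning the trained model into a .tflite file is broadly a case of running the Object Detection API’s TFLite export script (export_tflite_graph_tf2.py) and then the TFLite converter — something along these lines, with placeholder paths:

import tensorflow as tf

# Hypothetical path to the SavedModel produced by the Object Detection API's
# export_tflite_graph_tf2.py script
converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantisation
tflite_model = converter.convert()

with open("badger_fox_detect.tflite", "wb") as f:
    f.write(tflite_model)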

badger != umbrella

240 annotated images and 2.5 hours of training provided a model with decent accuracy

Success! Running TensorFlow’s object detection API with my newly trained model gave decent results against a random sample set of my images. The overall Mean Average Precision (mAP) of the model was coming out at 62.21% for 2.5 hours of training — not bad. I could improve this by training it against more images but I had enough to get a proof of concept working.

Next, I needed to progress from still imagery and implement a solution for streaming video from the garden to a local instance of TensorFlow Lite running the newly baked fox/badger model. This would eventually be hosted on a Raspberry Pi 4, but could initially be built and deployed to Docker locally to enable rapid prototyping.

End-to-end concept of the system. MQTT provides lightweight communication and allows for multiple deterrents to be deployed and issued commands from a single Raspberry Pi 4.

The above concept set a rough architectural basis for how the system would hang together. With this in place, it felt like a good time to bring the software and hardware components of the system to life. Firstly, this contraption needed a decent name and thankfully ChatGPT has a wealth of bad ideas when it comes to product names based on ridiculous concepts …

Thanks ChatGPT!

Building the Furbinator 3000

Step 1 — Camera streaming via ring-mqtt

Now that I had a working detection model and badgers were no longer being identified as an umbrella, I needed to connect a night vision camera which provided streaming capability for TensorFlow to consume.

The popular Ring camera system in theory has all of the required attributes — high quality night vision imagery and internet connectivity, plus I already had a few installed. Unfortunately Ring has no official API, but there is a decent unofficial one, and the brilliant ring-mqtt project is built on top of that library to provide a solid foundation for streaming video from Ring cameras via RTSP and alerting to motion events via a rich MQTT implementation. Thankfully, this library turned out to be superb, with excellent documentation — thanks Tom Sightler / tsightler. Download it here.

Note — this library is designed to stream RTSP when a motion event is detected and shouldn’t be used for 24/7 feeds as this will cause issues (i.e. breaking the Ring fair usage policy).

I have configured this to run both the ring-mqtt container and an MQTT broker container with Docker Compose; the YAML for this is included in the docker folder of the Python app.

Step 2 — Building the Python TensorFlow app

The Python tensorflow-ring-animal-detector project I’ve put together to breathe life into the Furbinator 3000

I built a Python application to provide object detection through the TensorFlow API. This application also facilitates training and testing of new models.

The app first needs to be configured via constants.py and is then run from src/tf_model_stream_analysis.py. This application does the following (a simplified sketch of the MQTT wiring appears after the list) -

  1. Connects to a locally hosted MQTT broker and listens for Ring motion events from the ring-mqtt library
    — see the MQTTHandler class
  2. Upon a motion event being identified, starts an RTSP stream from the ring-mqtt library. I’ve ingested each stream via a multi-threaded video dispatcher to enable high performance when multiple cameras are used
    — see the ThreadedVideoDispatcher class
  3. The RTSP stream is passed to OpenCV, which captures the stream frame by frame using the FFMPEG codec. I’ve added a customisable FPS limiter to this to reduce the CPU overhead, which is especially helpful on the Raspberry Pi
    — see the VideoStream class
  4. Each video frame is passed to the TensorFlow Object Detection API, which is used in conjunction with the pre-trained model I have built
    — see the TFVideoAnalysis class
  5. Upon a fox/badger being detected, an MQTT message is published to the MQTT broker and a snapshot is taken and saved locally
    — see the ClassTracker and FurbinatorActivator classes

ClassTracker/FurbinatorActivator taking a snapshot of a badger
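To give a feel for how these pieces hang together, here is a stripped-down sketch of the listen-then-stream flow using paho-mqtt (1.x style callbacks) and OpenCV. This is not the app’s actual code — the classes above live in the repo — and the broker address, topic string, RTSP URL and payload check are placeholders that depend entirely on your ring-mqtt setup:

import cv2
import paho.mqtt.client as mqtt

BROKER_HOST = "localhost"                             # local MQTT broker container
MOTION_TOPIC = "ring/+/camera/+/motion/state"         # placeholder - check your ring-mqtt topics
RTSP_URL = "rtsp://localhost:8554/<camera_id>_live"   # placeholder stream URL

def analyse_stream(rtsp_url):
    # Grab frames from the RTSP stream; each frame would be handed to the
    # TensorFlow object detection / OpenCV motion detection pipeline
    capture = cv2.VideoCapture(rtsp_url)
    while capture.isOpened():
        ok, frame = capture.read()
        if not ok:
            break
        # ... run detection on `frame` here ...
    capture.release()

def on_connect(client, userdata, flags, rc):
    client.subscribe(MOTION_TOPIC)

def on_message(client, userdata, msg):
    # Payload format depends on your ring-mqtt configuration
    if msg.payload.decode() == "ON":
        analyse_stream(RTSP_URL)

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect(BROKER_HOST, 1883)
client.loop_forever()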

Step 3 — Reducing false positives

Oddly shaped leaves/grass/cobwebs presented multiple false positives — a real issue if we want to reliably trigger an ultrasonic deterrent. That patch of grass does look like a fox too! Argh

Following some testing I started to notice a number of false positives from the system. Thankfully most of these false positives were triggered from stationary objects such as fallen leaves, grass patterns or debris caught in cobwebs.

To get around this issue, I decided to combine motion detection with object detection. To detect movement, I compared each frame of the RTSP stream to the previous one and looked for pixels that had changed, largely based on the fantastic work in this blog -

import cv2
import numpy as np

# NOTE - Run the body of this inside a while loop which takes individual frames
# (frame_rgb) from a threaded cv2.VideoCapture() function. Initialise these two
# variables once, outside the loop, so the previous frame is carried between iterations
first_frame = None
frame_diff = None

# Greyscale + blur: colour adds unnecessary complexity and the blur smooths out noise
prepared_frame = cv2.cvtColor(frame_rgb, cv2.COLOR_BGR2GRAY)
prepared_frame = cv2.GaussianBlur(src=prepared_frame, ksize=(5, 5), sigmaX=0)

# Calculate difference and update previous frame
if first_frame is None:
    # First iteration - nothing to compare against yet
    first_frame = prepared_frame
else:
    # cv2.absdiff calculates the absolute difference between two arrays
    frame_diff = cv2.absdiff(src1=first_frame, src2=prepared_frame)
    first_frame = prepared_frame

    # Dilate the image a bit to make differences more visible;
    # more suitable for contour detection
    kernel = np.ones((5, 5), np.uint8)
    frame_diff = cv2.dilate(frame_diff, kernel, iterations=1)

    # Only keep areas that are different enough (> 25 out of 255)
    thresh_frame = cv2.threshold(src=frame_diff, thresh=25, maxval=255, type=cv2.THRESH_BINARY)[1]
    cv2.imshow("Motion Detection Analysis", thresh_frame)

    # Use motion contours when an object is detected and correlate
    motion_contours, _ = cv2.findContours(image=thresh_frame, mode=cv2.RETR_EXTERNAL, method=cv2.CHAIN_APPROX_SIMPLE)

Here we take each frame from a multi-threaded VideoCapture() process, convert it to greyscale (colour adds unnecessary complexity) and apply a Gaussian blur to smooth it out slightly. The absolute difference between the previous and current frame is calculated as a new frame; this frame is dilated to even out any anomalies, and then each pixel is thresholded to either white or black.

We then find the areas that have changed since the last frame (cv2.findContours), apply object detection to this frame, iterate over the detection score for each class detected in the frame and compare the location of the motion detection with the location of the object detection -

# Motion detection inside object detection - determine if the motion is near
# the object that's been detected. This improves object detection as it can
# mistakenly recognise stationary grass/leaves/patterns as animals
motion_near_object = False
if motion_contours is not None:
    for contour in motion_contours:
        (x, y, w, h) = cv2.boundingRect(contour)

        # Calculate difference between coords of detected object and
        # detected motion. ymin, xmin, ymax, xmax are the coords of the
        # object detection bounding box
        ymin_calc = abs(ymin - y)
        xmin_calc = abs(xmin - x)
        ymax_calc = abs(ymax - (y + h))
        xmax_calc = abs(xmax - (x + w))

        # Motion is considered near the object if every difference is within
        # the threshold (20% of the resolution width)
        threshold = self.resolution_w / 20
        if ymin_calc <= threshold and xmin_calc <= threshold and ymax_calc <= threshold and xmax_calc <= threshold:
            cv2.rectangle(img=frame, pt1=(x, y), pt2=(x + w, y + h), color=(255, 0, 0), thickness=2)  # blue
            motion_near_object = True

If the detected object is near the motion detection (the maximum distance I allowed between the two was 20% of the image width), we display the motion contour and set the motion_near_object flag. This looks a little something like this -

Note the fox sometimes being detected as a badger and an item on the floor being incorrectly identified as a bird. This is a side effect of a model that’s only been trained for 2.5 hours, and it went away with further training of the model.

The code can be improved, especially the comparison of the motion detection bounding boxes with the object detection bounding boxes, but for now this works and has significantly reduced the number of false positives.

To recap, we now have the ability to -

  • Stream video from Ring/RTSP cameras to Python
  • Analyse the video with both object detection in TensorFlow and motion detection in OpenCV
  • Trigger an event with the motion_near_object flag. In the Python app I put together, this publishes the following to MQTT, which a modified ultrasonic deterrent can listen for with a bit of Arduino -
self.mqtt_client.publish("foxbadger", " **strobe")

Step 4 — A physical prototype

Now that events are reliably published to MQTT when a fox/badger is detected, an ESP8266-equipped WeMos D1 Mini Pro was used to listen for this event and control the off-the-shelf deterrent.

WeMos D1 Mini Pro — ESP8266-equipped microcontroller with USB, WiFi and multiple GPIO pins, ideal for controlling the ultrasonic deterrent

The design is very simple — when the ESP8266 receives a message via MQTT, it does two things (a rough sketch of this logic follows the circuit note below)

  • Sets GPIO4 to HIGH for 22 seconds — this turns the ultrasonic deterrent on. The deterrent was pre-set to a mode which cycles through all frequencies and fires the onboard LEDs. (Note — a later design will set these as necessary via MQTT)
  • Sets GPIO5 to HIGH for 50ms, then LOW for 50ms, repeating for 22 seconds in total — creating a basic strobe
Basic circuit illustrating how the components are connected. NOTE — The IRF520 MOSFET was necessary between the 12v step-up converter and the 12v LED strip because the EN pin on the step-up converter allowed a small amount of voltage to leak when disabled, causing the LEDs to be very dimly lit in the OFF state. The MOSFET ensured no voltage reached the LEDs unless GPIO5 was HIGH. A decent 12v step-up converter would eliminate the need for this — mine was obviously very cheap.
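The firmware I actually flashed is the Arduino sketch linked below, but to illustrate the trigger logic, here is a rough MicroPython equivalent for the ESP8266. The ‘foxbadger’ topic and the pin/timing values come from the description above; the broker address and everything else are placeholders:

# Rough MicroPython sketch of the trigger logic - the real firmware is the
# Arduino code in the Furbinator3000 repo. Wi-Fi setup is omitted for brevity.
import time
from machine import Pin
from umqtt.simple import MQTTClient

deterrent = Pin(4, Pin.OUT)   # GPIO4 - enables the ultrasonic deterrent
strobe = Pin(5, Pin.OUT)      # GPIO5 - drives the LED strip via the MOSFET

def on_message(topic, msg):
    # Any message on the "foxbadger" topic fires the deterrent for 22 seconds
    deterrent.on()
    end = time.ticks_add(time.ticks_ms(), 22000)
    while time.ticks_diff(end, time.ticks_ms()) > 0:
        strobe.on()            # 50 ms on / 50 ms off = basic strobe
        time.sleep_ms(50)
        strobe.off()
        time.sleep_ms(50)
    deterrent.off()

client = MQTTClient("furbinator", "192.168.1.10")  # placeholder broker address
client.set_callback(on_message)
client.connect()
client.subscribe(b"foxbadger")

while True:
    client.wait_msg()   # block until a message arrives from the Python app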

The Arduino code I put together can be found in the Furbinator3000 repo here.

The components used are -

  • WeMos D1 Mini Pro + Antenna — buy here
  • 3.3v IRF520 MOSFET Driver Modules (x2) — buy here
  • 12v DC-DC converter step-up voltage booster — buy here
  • 12v waterproof LED strip — buy here
  • Modified ultrasonic deterrent — buy here

I’ve wired these all together on a breadboard and housed them in a box that I 3D printed with some leftover white PETG I had lying about. This has been glued onto the back of the battery compartment and I cut a hole through the cover and battery compartment with a 32mm flat wood bit. I’ll eventually put a PCB together in Fritzing which I can order with components pre-soldered but for now this works. The breadboard is especially useful at this stage as it’s incredibly easy to modify if I have any issues.

Hideous but it works
Testing the strobe, red/white LED and speaker with a manual MQTT publish

Deploying the Furbinator 3000

Success! Halfway through you can see the badger consider getting his claws into the lawn before swiftly deciding to move on.

As this was developed within Docker, setting up the Raspberry Pi 4 (4GB) with a 32GB SD card and deploying to Ubuntu was trivial. It seems to be handling object detection via the SSD MobileNet V2 FPNLite 640x640 model relatively well.
Note — I had to disable the DirectML variant of TensorFlow for this to work.

After a few nights of testing, it looks like the local wildlife have decided to quickly exit the garden when the Furbinator is doing its thing. I’m used to watching badgers ripping up my garden from this angle, and I cannot describe how happy it makes me that my garden will no longer look like a battlefield when I wake up to have my morning coffee.

What’s next?

Time will tell as to whether the badgers and foxes get used to this thing or not. I have a second version planned which is equipped with a small amplifier to play MP3s of dogs barking, but for now I’m going to sit back and see what happens. Plus, I like my neighbours, and I fear the dog-barking variant would be a sign that I’ve gone completely mad.

Watch out for a follow-up blog on how I deployed this to a Raspberry Pi, improved the model with more imagery, and how the model performs on low-powered compute.

Thanks for reading.

— James
