Computer Vision for Wildlife Monitoring in a Changing Climate

The Wild Chimpanzee Foundation’s Accelerator partnership with DrivenData focuses on automated, accurate, and accessible species detection in West Africa, applying machine learning to camera trap videos to reveal the effects of climate change on wildlife and to evaluate the impact of conservation efforts on species abundance.
By: Emily Miller & Mimi Arandjelovic

Our planet is facing two impending and inextricably linked crises: biodiversity loss and climate change. We need to be able to monitor wildlife populations to determine if, when, and how best to intervene for the sustainability of the world’s ecosystems. Camera traps are widely used in conservation research to capture still images and videos of wildlife without human interference. They’re powerful tools, but they aren’t being used to their fullest potential. Since camera traps cannot automatically label the species they observe, it takes the valuable time of teams of experts, or thousands of citizen scientists, to manually label this data and filter down to the videos of interest. Weeding out videos with no animals, often more than 50% of those captured, recorded when the camera is triggered by wind or rain, is especially time intensive.

Although automated labeling of camera trap still images is increasing, labeling of videos is still in its infancy. Videos can better inform us about animal behavior and species interactions, and they enable us to calculate population abundance measures, making biomonitoring of wild animal populations possible. Using statistical models for distance sampling, the frequency of animal sightings can be combined with the distance of each animal from the camera to estimate a species’ full population size. However, getting distances from camera trap footage currently entails an extremely manual, time-intensive process, taking a researcher more than 10 minutes on average to label distances for every one minute of video. This creates a second bottleneck for critical information that conservationists can use to monitor wildlife populations.
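To make distance sampling concrete, here is a minimal sketch of a standard point-transect density estimator in Python. Everything here is illustrative: the distances and effort are made-up numbers, and the simple moment-matching fit of a half-normal detection function is a stand-in for the maximum-likelihood fitting that dedicated distance sampling software performs.

```python
import numpy as np

# Illustrative distances (in meters) from the camera to each detected
# animal, as would be read off labeled camera trap footage.
distances = np.array([2.1, 3.5, 4.8, 1.2, 6.0, 3.3, 5.5, 2.9])

w = 10.0       # truncation radius: detections beyond w meters are discarded
effort = 24.0  # illustrative survey effort (e.g., camera-days)

# Fit a half-normal detection function g(r) = exp(-r^2 / (2 sigma^2)) by
# matching the mean squared distance (ignores truncation; a stand-in for
# the maximum-likelihood fit that real distance sampling tools perform).
sigma = np.sqrt(np.mean(distances**2) / 2)

# Average detection probability within radius w for a point transect:
# p = (2 / w^2) * integral_0^w g(r) * r dr, which is closed-form here.
p_detect = (2 * sigma**2 / w**2) * (1 - np.exp(-w**2 / (2 * sigma**2)))

# Density estimate: detections per unit area surveyed, corrected for
# imperfect detection and divided by effort.
n = len(distances)
density = n / (np.pi * w**2 * p_detect * effort)
print(f"sigma = {sigma:.2f} m, p = {p_detect:.2f}, density = {density:.5f}")
```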

Chimpanzees from a WCF camera trap in Guinea

Identifying animal presence or absence in a region is not only valuable for conservation, but has also been shown to be important for mitigating the effects of climate change. For example, forests with large mammal species perform more carbon sequestration than forests where these species are absent (see, for example, this work on forest elephants). Interventions that incentivize local communities not to hunt these species, and thus to benefit from their contributions to carbon sequestration (for example, Rebalance Earth), depend on cheap, reliable monitoring tools like camera traps.

A MPI-EVA PanAf camera trap mounted to observe behaviors at a termite mound in Senegal

At the Wild Chimpanzee Foundation (WCF), we have been collaborating with the Max Planck Institute for Evolutionary Anthropology (MPI-EVA), the German Center for Integrative Biodiversity Research (iDiv), and the incredible team at DrivenData to build one of the leading tools for species detection from camera trap videos, Zamba Cloud. Zamba Cloud is a user-friendly web application: users upload their camera trap videos, the videos are automatically processed in the cloud, and users receive a spreadsheet showing which species are most likely present in each video, allowing researchers to weed out false triggers and get straight to the videos of interest.

The models underlying Zamba Cloud are all open source and available through the Zamba python package. There are three default species detection models, with a usage sketch after the list:

  • Two African models, which can identify 32 common species from Central and West Africa and have been trained on over 14,000 one-minute videos from 13 different countries.
  • A European model, which can identify 11 common species in Western Europe and has been trained on over 13,000 videos.
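For example, running one of these default models from Python looks roughly like the sketch below. It follows the prediction interface documented for the zamba package, but class and argument names can change between versions, so treat this as a sketch and consult the package documentation for your installed release.

```python
# A rough sketch of species prediction with the open source zamba
# package; names follow its documented API but may vary by version.
from zamba.models.config import PredictConfig
from zamba.models.model_manager import predict_model

# Point the model at a directory of camera trap videos. "european"
# selects the Western Europe model; omitting model_name falls back
# to the default African species model.
config = PredictConfig(data_dir="example_vids/", model_name="european")

# Runs frame selection and classification, writing a CSV with one row
# per video and one probability column per species.
predict_model(predict_config=config)
```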

Zamba Cloud currently has its limits, though. The default models only support a limited set of species groups that may or may not include the species of interest at any given location. In addition, model accuracy varies across species and locations, which can deter conservationists from relying on the tool.

For species detection models to be useful, they need to be both fast and accurate. This is particularly tricky for videos: given that there are hundreds of frames in a one-minute video, it would be infeasible to predict on every frame as if it were an image classification problem. Instead, Zamba uses a two-tiered approach: an object detection model is run on frames from a video that has been downsampled to four frames per second, and a small subset of those frames with the highest probability of containing an animal is then passed into the classification model, which yields the species prediction for the video. Inaccurate animal detection in the frame selection stage can hamstring the accuracy of the second-level species classification and depth estimation models.
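In code, that two-tiered approach looks something like the sketch below. The functions here are hypothetical stand-ins, not Zamba’s internal API; the point is the shape of the pipeline, where only a handful of detector-ranked frames ever reach the much more expensive classifier.

```python
import random
from typing import List

# --- Hypothetical stand-ins for the real components (not zamba's API) ---

def load_frames(video_path: str, fps: int) -> List[int]:
    """Stub: treat each frame as an integer index in a one-minute video."""
    return list(range(60 * fps))

def object_detector(frame) -> float:
    """Stub: probability that a frame contains an animal."""
    return random.random()

def species_classifier(frames) -> dict:
    """Stub: species probabilities for the whole video."""
    return {"chimpanzee": 0.8, "blank": 0.2}

# --- The two-tiered pipeline described above ---

FRAME_RATE = 4        # frames per second after downsampling (per the text)
NUM_TOP_FRAMES = 16   # illustrative count of frames passed to tier two

def predict_species(video_path: str) -> dict:
    # Tier 1: score every downsampled frame for animal presence.
    frames = load_frames(video_path, fps=FRAME_RATE)
    scores = [object_detector(f) for f in frames]

    # Keep only the frames most likely to contain an animal. A weak
    # detector here starves the classifier of useful frames, which is
    # why frame selection accuracy matters downstream.
    ranked = sorted(zip(scores, frames), key=lambda pair: pair[0], reverse=True)
    top_frames = [frame for _, frame in ranked[:NUM_TOP_FRAMES]]

    # Tier 2: classify the species from just the selected frames.
    return species_classifier(top_frames)

print(predict_species("example.mp4"))
```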

With the support of the Patrick J. McGovern Foundation Accelerator program, we are transforming Zamba Cloud from a single-purpose application into one that supports the diverse needs of conservationists. We’ll be building on the existing functionality in Zamba Cloud to make it easier for users to train custom models for new locations or for different species categories of interest. We’ll be adding a distance estimation model to Zamba, which will enable population estimates. We’ll be retraining the species detection models on nearly ten times the number of videos used to date. And we’ll be retraining the frame selection model on an expanded set of nearly a million images to improve downstream accuracy and generalizability.

Curious elephants check out an MPI-EVA PanAf camera trap in R-Congo.

Our aim is for conservation organizations to have an accessible, accurate automated species detection tool that is routinely used as videos are collected from the field, along with a distance estimation tool so that populations of multiple species can be monitored in the same habitats. This will save countless hours of valuable time, and it has the potential to increase the value of camera trap data and improve data management practices not just for us, but for everyone working in biodiversity monitoring.
