Detect Marine Debris from Aerial Imagery

yhoztak
9 min read · Oct 15, 2018


Credit: http://unsplash.com/

I have been involved with the Mapping Marine Debris project, and we have made great progress so far, so I wanted to share my findings.

97%!

Starting from the summary of the story: we are now able to detect 97% of marine debris using a neural network architecture called RetinaNet, and one of the detections looks like this

it can detect various objects

And the confusion matrix for whether a detection overlaps a ground-truth bounding box in the test data looks like the following

We detected 97% of marine debris

That was the TL;DR. The rest of this post covers the details of how we got there.

In a previous blog post, I wrote about a simple use case of building an ML model to detect jellyfish as a practice project, but in a real-world scenario it is actually much harder to achieve a similar goal.

How did I get involved in this?

I love the ocean. I always feel better after spending time in the ocean, and I surf every week here in San Diego. I also grew up respecting nature, culturally, in Japan. After watching some documentaries about the current situation with coral reefs, global warming, ocean health, sustainability, and their effects on us in the near future, I thought I had better try to do something too.

What is the goal of this project?

The goal is to accurately detect the locations of marine debris (in Hawaii for now) and coordinate cleanups with organizations like https://www.808cleanups.org/ and https://kokuahawaiifoundation.org/news/detail/hawaii_environmental_cleanup_coalition_hecc

After learning about the use case and understanding the problem we needed to solve, we came up with this architecture:

Architecture notes

Basically, given various visual sources of beaches (starting with aerial photos from RMH), we want to detect the risk from marine debris and automatically generate a report like this

Learning about the Earth

Here are a couple of interesting things I learned about the Earth through this project so far.

Effects of natural disasters: This is the paper the team published for this project before I joined. I didn't realize the Tohoku earthquake and other natural disasters could affect the islands of Hawaii. Some boats actually drifted all the way to Hawaii after traveling for a long time. We all tend to focus on how natural disasters affect us directly, but it is good to think about how they can harm the environment and indirectly end up affecting us.

It's important to note that this project built upon work that was funded by the Ministry of the Environment of Japan through the North Pacific Marine Science Organization (PICES). The PICES project also had support from the State of Hawai'i Department of Land and Natural Resources (DLNR). These funders have given permission to share this data beyond that project's original purpose.

Garbage patch: In the Pacific Ocean, the currents look like the image below, and they create the "Garbage Patch" areas shown. Though the current focus is on the islands of Hawai'i, it would be good to investigate those areas too. The same currents are also why some of the marine debris floats to Hawaii. I added more detail in the References section below, along with other articles I read for this project.

GIS: Raster data (TIFF, EPSG) and geographic data can be really complex, and the files can be large. One raster image can contain so much detail (latitude, longitude, and altitude at various locations of the image) that it can basically create a 3D representation of the location. It can also have multiple layers, so it can represent land, roads, and cars, for example. There are many more coordinate systems than Lat/Long to tune for the required accuracy, and there is a whole list of them for each location here.

I found that these Jupyter notebooks from Planet do a good job of showing how to analyze raster data and what kind of data it can contain.
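As a small illustration of what a single one of these files carries, rasterio can report the CRS, bounds, resolution, and band count of a GeoTIFF (the file name here is hypothetical):

import rasterio

# Inspect one aerial GeoTIFF (path is hypothetical).
with rasterio.open('aerial_photo.tif') as src:
    print(src.crs)                 # coordinate reference system, e.g. EPSG:26904
    print(src.bounds)              # extent in CRS units (meters for UTM zones)
    print(src.res)                 # pixel size in CRS units
    print(src.count)               # number of bands/layers
    print(src.width, src.height)   # raster size in pixels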

Building the Neural Network

1. Preparing the data…

What we had were many TIFF images and a GeoJSON file with the locations of debris across the entire islands of Hawaii.

Raster data is big… Each original image was a TIFF of 150–300MB, easily adding up to more than 1TB of data. The team from RMH took these high-quality photos, then hired interns from the University of Hawaii to manually create digitized points of marine debris, for a total of 20,658 bounding boxes.

As is also done in similar challenges, we sliced the large images into much smaller tiles and converted them to JPEG. But since debris can sit at a tile boundary and get cut off, we made the tiles overlap at the boundaries and simply assigned each annotated point to the first tile in which it can be found.
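Here is a minimal sketch of that tiling step, assuming rasterio and Pillow and 8-bit RGB imagery; the tile size, overlap, and file paths are all illustrative:

import numpy as np
import rasterio
from rasterio.windows import Window
from PIL import Image

TILE = 1000     # tile size in pixels (illustrative)
OVERLAP = 100   # overlap so debris cut off at one tile edge is whole in a neighbor

with rasterio.open('aerial_photo.tif') as src:
    step = TILE - OVERLAP
    for row in range(0, src.height, step):
        for col in range(0, src.width, step):
            w = min(TILE, src.width - col)
            h = min(TILE, src.height - row)
            window = Window(col, row, w, h)
            # Read the RGB bands and reorder to (height, width, channels).
            data = np.moveaxis(src.read([1, 2, 3], window=window), 0, -1)
            # Assumes a tiles/ directory already exists.
            Image.fromarray(data).save('tiles/tile_%d_%d.jpg' % (row, col))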

The original coordinates in EPSG:26904 look like this

{'coordinates': [[[390494.6347470339, 2432764.6908447277],
    [390496.6347470339, 2432764.6908447277],
    [390496.6347470339, 2432766.6908447277],
    [390494.6347470339, 2432766.6908447277],
    [390494.6347470339, 2432764.6908447277]]],
 'type': 'Polygon'}

These need to be converted to pixel coordinates relative to the image, since that is what most existing object detection frameworks require.

I used rasterio and GDAL to preprocess this into pixel coordinates, and the main idea of the code is below.
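A minimal sketch of that conversion, assuming the GeoJSON is in the same CRS as the raster (EPSG:26904) and using rasterio's index() to map map coordinates to pixel row/column; file names are hypothetical:

import json
import rasterio

with rasterio.open('aerial_photo.tif') as src, open('debris.geojson') as f:
    for feature in json.load(f)['features']:
        ring = feature['geometry']['coordinates'][0]
        # Map each EPSG:26904 (x, y) vertex to (row, col) pixel indices.
        pixels = [src.index(x, y) for x, y in ring]
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        # Axis-aligned pixel bounding box for this annotation.
        x1, y1, x2, y2 = min(cols), min(rows), max(cols), max(rows)
        print(x1, y1, x2, y2)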

It was tricky to install these packages with their additional dependencies (such as gcc), so I ended up writing a Dockerfile so that I can install them on any VM quickly.

Details of the conversion from GeoJSON to ML-trainable data can be found at https://github.com/yhoztak/object_detection/blob/master/notebooks/convert_geojson_to_ml_trainable.ipynb

2. Architecture of choice: Keras RetinaNet (and others later)

I chose this because our use case doesn't require real-time response speed; accuracy is the most important thing. RetinaNet combines a feature pyramid architecture with a ResNet backbone, and it achieved the best performance according to some articles. A similar challenge for car detection achieved really good results with it, so it was the first choice to try. RetinaNet is well explained here.
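For reference, inference with the fizyr keras-retinanet package looks roughly like this; the model path and the 0.5 score threshold are illustrative, and the exact API may differ across versions:

import numpy as np
from keras_retinanet.models import load_model
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image

# Load a converted inference model (path is hypothetical).
model = load_model('snapshots/debris_inference.h5', backbone_name='resnet50')

image = preprocess_image(read_image_bgr('tiles/tile_0_0.jpg'))
image, scale = resize_image(image)

boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
boxes /= scale  # map boxes back to the original image scale

for box, score, label in zip(boxes[0], scores[0], labels[0]):
    if score < 0.5:  # detections are sorted by score, so we can stop early
        break
    print(label, score, box)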

3. Building Models

Now that I had the dataset and algorithm ready, and had done some practice with jellyfish, I was ready to start training models.
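For context on what training consumes: keras-retinanet's CSV generator expects one box per line in the form path,x1,y1,x2,y2,class_name plus a class-mapping file, and training can be kicked off from Python as in the sketch below (the annotation values are made up, the epoch/step counts mirror v4.b below, and flag names may vary across package versions):

# annotations.csv, one bounding box per line, e.g.:
#   tiles/tile_0_0.jpg,837,346,981,456,Plastic
# classes.csv, class name to id, e.g.:
#   Plastic,0
#   Net,1

from keras_retinanet.bin import train

# Equivalent to the retinanet-train command-line tool.
train.main([
    '--epochs', '7',
    '--steps', '5000',
    'csv', 'annotations.csv', 'classes.csv',
])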

After going through building and testing one, I realized a few things and kept tuning. Here are the details of each version:

Dataset:

v1: Used only relevant data from the Open Images Dataset, such as Boat and Tire.

v2: Used data only from RMH.

v3: Used transfer learning from v1, then trained on v2.

v4.a: Included negative images and more label types for the bounding box annotations (steps: 15,000, epochs: 20, just to see if it does any better over time).

v4.b: The same dataset with different parameters (epochs: 7, steps: 5,000).

Between v4.a and v4.b, the classification loss was actually better on v4.a, but in accuracy calculations with the test dataset, v4.b performed better. I guess v4.a was too biased toward the training data?

Here are some quick evaluations based on test images:

It detects various types: Boat, Tire, Net, Plastic, Buoys

Some of the predictions overlap each other and are hard to see, but it could detect Boat and Tire pretty well by the time of v3.

4. Testing and measuring accuracy

To measure accuracy, the model was trained with a 90:10 split (since our dataset is small; RetinaNet picks some of the 90% as a validation dataset to tune the model).

What matters for us in terms of measurement?

  • The first priority is to detect marine debris.
  • The second priority is to identify the type of marine debris accurately.

The confusion matrix for having/not having debris looks like the following

We detected 97% of marine debris

I actually found out that the negative images contain a lot of debris, so some of the false positives may actually be true positives. The model did really well by the time of v4.b.

For each marine debris type, the confusion matrix looks like the following

This means it was able to detect 66% of Plastic as Plastic, but what it labeled Plastic was often actually Foam.

Boat looks great at 100%, but if you look at the unnormalized numbers you see it is just 1 out of 1 correct, so we need more test data for Boat.

Here are some findings:

  • It actually performs pretty well at detecting Plastic, followed by Net and Buoys.
  • It seems to confuse Plastic with Buoys, and Net with Plastic.
  • Net and Line look pretty much the same, so we should group them together.
  • In v4, I added Wood and Others bounding boxes, but they may just add confusion, so it may be better to remove them; more grouping and removing of certain data may be ideal.
  • Debris/not-debris detection is performing really well, which is critical for this project.

Here is what some of the actual debris detection images look like:

ML Detected marine debris
  • We used to get good scores for Boat and Tire in v1, but those scores are lower now; more labels and more data actually confused the model for these two.

Image detection notes on each model:

v1 (with OID): It finds Boat and Tire pretty well, with high scores.

v2 (with marine debris dataset): Both accuracy and recall dropped compared to v1.

v3 (with OID + marine debris): It detects Net pretty well now. With transfer learning from v1 on OID, it recovers accuracy and recall.

v4 (with negative images): It detects more objects than before after adding negative images, but I didn't see much improvement in accuracy itself.

TODO: I need to re-run each of these models and measure actual accuracy data points in exactly the same way I did for v4, but here are the numbers:

v2: {'wrong_label': 18, 'correct_label': 76, 'missed': 584} : 13%
v3: {'wrong_label': 91, 'correct_label': 362, 'missed': 225} : 66%
v4: {'wrong_label': 271, 'correct_label': 386, 'missed': 20} : 97%
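The counting behind those buckets is conceptually simple: each ground-truth box is matched against the predictions by overlap. A sketch, with a hypothetical IoU threshold of 0.5:

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union) if union else 0.0

def bucket_ground_truth(gt_boxes, predictions, threshold=0.5):
    # gt_boxes and predictions are lists of (box, class_name) pairs.
    counts = {'wrong_label': 0, 'correct_label': 0, 'missed': 0}
    for gt_box, gt_class in gt_boxes:
        matches = [cls for box, cls in predictions if iou(gt_box, box) >= threshold]
        if not matches:
            counts['missed'] += 1         # no detection overlapped this box
        elif gt_class in matches:
            counts['correct_label'] += 1  # detected with the right class
        else:
            counts['wrong_label'] += 1    # detected, but as another class
    return counts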

What’s next?

All of my work has been committed at https://github.com/yhoztak/object_detection, and it has been good learning and good progress so far.

Yet further tuning and work is needed, such as:

  • Evaluate new detections of marine debris in non-annotated areas, since we found so many and some look decent.
  • Fix bounding boxes in the original dataset (more detail below).
  • Build the platform for the end-to-end application.
  • Write up the review process we came up with, and document other findings.
  • Try out different models, hyperparameter tuning, and noise reduction.
  • Build a demoable web app.

I will try to write more as we make more progress.

Problem with training data… We need to fix bounding boxes in the original dataset

  • While investigating more of what we have in the manually created dataset, we found that some of the bounding boxes have problems such as the following:
boxes too big
boxes off center or box too big

If we include too much sand in a box, the model will assume the sand is part of the pattern, so it is better to tighten the boxes so they align with the actual objects.

Great tools

While researching for this project, I ran into a couple of cool tools.

  • Fixing bounding boxes seems like a common problem, yet when I spent some time researching it was hard to find a good tool. Finally, I came across https://supervise.ly/, which seems like a great option and is free to try. It was a bit tricky to understand how the whole thing works, but it has been working impressively well so far. Below is a quick demo of how I'm using it while discussing with the team the best way to involve more annotators. It lets an agent on my machine upload images, which made uploading so much easier.

References

Dept. of Land and Natural Resources/PICES/Gov of Japan project

https://marinedebris.noaa.gov/sites/default/files/JTMD_Visualization_October.pdf

https://www.pifsc.noaa.gov/library/pubs/SP-10-003.pdf

https://blog.hawaii.edu/hcri/files/2016/07/MarineDebris_FinalReport_Vol2_All_Islands-2.compressed.pdf

http://histategis.maps.arcgis.com/apps/MapSeries/index.html?appid=e1e1464e56b14d80bf096b6e2fe132c4
