Whale Above the Stars

Emma Bernstein
Mar 27, 2019

A whale shark can be uniquely identified by its markings. Whale sharks have been reclassified as endangered, and therefore, learning more about the species is as important as ever. One part of this is to track their migration and growth patterns over time.

This blog will introduce you to the two main algorithms used for pattern recognition with digital image processing on whale sharks (one of which was originally used to track stars), and how they are being used to help save the species. Whether you want to learn more about the conservation efforts, or are interested in the algorithms behind the pattern matching, keep reading!

Marine conservation biologist Brad Norman founded ECOCEAN, based in Western Australia, in 1995 to assist with marine scientific research, conservation and education.

While the traditional satellite tagging method is effective for understanding migration patterns and keeping track of individuals, the process is invasive and whale sharks tend to shed their tags over time. It also limits the set of whale sharks being tracked to those encountered and tagged by research professionals. Luckily, Jason Holmberg and Zaven Arzoumanian collaborated to create software that identifies whale sharks by the unique spots on their skin. All that is needed is a photograph!

Whale Shark Collaborative Efforts

ECOCEAN collaborated with WILDBOOK to form an online matching system. This effort has become the largest whale shark monitoring program in the world, allowing divers to submit their own photographs of whale sharks that they have encountered. To date, there have been 56,269 reported sightings and 9,791 identified whale sharks!

Example of a good photograph of the whale shark’s pattern

The matching process is very simple for the spotter:

  1. Photograph: Take a photograph of the spot pattern behind the whale shark’s gills. You can read more about the photo requirements here.
  2. Submission: Submit the photograph or snippet from a video, along with other information such as the location where the shark was spotted, the sex and any scarring. You can submit the photograph or video still here.
  3. Verification: A researcher will verify and add any additional information.
  4. Match Algorithms: Two spot matching algorithms are run on the image (an adapted Groth algorithm and I3S).
  5. Match Results: Researchers compare the match results to the initial submission. They then either visually confirm a match to one of the whale sharks in the database or start a new whale shark’s profile.
Picture of shark and the detected spots
Possible matches returned from the Modified Groth algorithm
Variables used for the Groth Algorithm match

The spotters are updated throughout this process via email. If there is a profile match, then the spotter can read about the whale shark’s previous sightings and will be updated on any future encounters.

Adapting Groth’s Star Tracking Algorithm

In 1986, a pattern matching algorithm for tracking stars, now called the Groth algorithm, was developed for use with the Hubble Telescope. The Groth algorithm matches lists of two-dimensional coordinates.

The adapted algorithm starts with a list of coordinates representing the locations of a whale shark’s spots. The contrast between the white spots and the darker skin makes the spots easy to discern, so ‘blob extraction’ image manipulation works well for detecting them and recording their (x, y) coordinates.
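
To make the idea of blob extraction concrete, here is a minimal sketch, assuming the photograph has already been reduced to a grid of grayscale values. The function name, the threshold of 128 and the minimum blob size are illustrative assumptions, not settings taken from the actual software.

```python
from collections import deque


def extract_spot_centroids(image, threshold=128, min_pixels=2):
    """Return (x, y) centroids of bright connected regions in a grayscale grid.

    `image` is a list of rows of brightness values. The threshold and the
    minimum blob size are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill over 4-connected bright pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc), (pr, pc - 1), (pr, pc + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and image[nr][nc] >= threshold:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Ignore specks that are too small to be a real spot.
            if len(pixels) >= min_pixels:
                x = sum(pc for _, pc in pixels) / len(pixels)
                y = sum(pr for pr, _ in pixels) / len(pixels)
                centroids.append((x, y))
    return centroids


if __name__ == "__main__":
    toy_image = [
        [0, 0, 0, 0, 0, 0],
        [0, 255, 255, 0, 0, 0],
        [0, 255, 255, 0, 0, 0],
        [0, 0, 0, 0, 200, 0],
        [0, 0, 0, 0, 200, 0],
    ]
    print(extract_spot_centroids(toy_image))
```

Each bright region becomes a single point, so the output is exactly the kind of coordinate list the matching steps work with.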

Image processed for spotting coordinates
  1. Refine coordinates: The first step in the match process is refining the coordinate lists. After each ‘point’ is recorded and added to the list, we normalize the coordinates from their original units to the unitless interval [0,1], while preserving the aspect ratios.
    We define a ‘tolerance parameter’ (the uncertainty in each point’s position) and, whenever two points are closer together than a fixed multiple of that tolerance, eliminate one of them.
  2. Form triangles: Form a triangle from every combination of three points in the list.
    Each triangle is indexed such that the shortest side is between vertices 1 and 2, the intermediate being between 2 and 3, and the longest between 3 and 1. Each triangle also has an orientation, which dictates whether the intermediate side is clockwise or counterclockwise from the shortest side. Other properties are calculated from the triangles and recorded.
    This gives us a list of n(n - 1)(n - 2)/6 triangles, where n is the number of coordinates in the list.
  3. Filter triangles: We now have a list of all possible triangles between the points.
    Triangles with a large length ratio R (the longest side divided by the shortest side) can produce large tolerances (uncertainty around coordinates), and can therefore be falsely matched with many other triangles. To avoid this, we remove triangles whose length ratio is above a set limit (often R = 10 or R = 8).
    We also impose a constraint on the C-value of each triangle (in Groth’s formulation, the cosine of the angle at vertex 1). This is because spots on whale sharks typically lie on arcs. The triangles created between spots on the same arc are ‘flat’ and can be incorrectly matched. These matches do not give us much useful information about the shark’s unique patterns. We therefore typically retain only triangles with C < 0.99.
  4. Matching: An image rotation is applied so that the whale shark’s spine is effectively horizontal. The remaining triangles are then analyzed further to produce a list of potential matches.
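
Steps 1 to 3 above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code used by the matching system: the function names, the tolerance of 0.01 and the multiple of 3 are hypothetical, and C is taken to be the cosine of the angle at vertex 1.

```python
from itertools import combinations
from math import dist


def normalize(points):
    """Scale points into [0, 1] with one common factor, preserving aspect ratio."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]


def drop_close_points(points, epsilon=0.01, factor=3.0):
    """Keep a point only if it is at least factor * epsilon from every kept point."""
    kept = []
    for p in points:
        if all(dist(p, q) >= factor * epsilon for q in kept):
            kept.append(p)
    return kept


def order_vertices(a, b, c):
    """Order vertices so side 1-2 is shortest, 2-3 intermediate, 3-1 longest."""
    opposite = [(dist(b, c), a), (dist(a, c), b), (dist(a, b), c)]
    opposite.sort(key=lambda item: item[0])
    # Vertex 3 faces the shortest side, vertex 1 the intermediate, vertex 2 the longest.
    v3, v1, v2 = (vertex for _, vertex in opposite)
    return v1, v2, v3


def build_triangles(points, max_ratio=10.0, max_cos=0.99):
    """Form every triangle, then drop elongated (large R) and flat (large C) ones."""
    triangles = []
    for a, b, c in combinations(points, 3):
        v1, v2, v3 = order_vertices(a, b, c)
        shortest, longest = dist(v1, v2), dist(v1, v3)
        if shortest == 0:
            continue  # duplicate points cannot form a triangle
        ratio = longest / shortest
        # Cosine of the angle at vertex 1 (assumed definition of C).
        dot = (v2[0] - v1[0]) * (v3[0] - v1[0]) + (v2[1] - v1[1]) * (v3[1] - v1[1])
        cos_v1 = dot / (shortest * longest)
        # Sign of the turn from side 1-2 onto side 2-3 (y axis pointing up).
        cross = (v2[0] - v1[0]) * (v3[1] - v2[1]) - (v2[1] - v1[1]) * (v3[0] - v2[0])
        if ratio < max_ratio and cos_v1 < max_cos:
            triangles.append(
                {"vertices": (v1, v2, v3), "R": ratio, "C": cos_v1, "ccw": cross > 0}
            )
    return triangles


if __name__ == "__main__":
    raw_spots = [(120, 40), (300, 55), (210, 160), (122, 41), (480, 90)]
    spots = drop_close_points(normalize(raw_spots))
    for triangle in build_triangles(spots):
        print(round(triangle["R"], 3), round(triangle["C"], 3), triangle["ccw"])
```

Because `combinations` yields every set of three points exactly once, n points produce n(n - 1)(n - 2)/6 triangles before filtering. That count grows quickly, which is one reason the filtering step matters.
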
Example of a whale shark match

You can read more about the mathematics behind the Modified Groth algorithm here.

I3S Software for Whale Shark Identification

I3S requires more human interaction to help with the identification process. Once a photograph is uploaded, the researcher selects three reference points, which define the reference area. From there, the 12–40 most prominent spots within the reference area are selected. The advantage of this process is that human selection avoids the mis-markings that photo impurities can cause in automated detection.

The three reference points must be uniform from picture to picture. In the case of whale sharks, the reference points are the origins of the two dorsal fins and the origin of the pelvic fin.
After these are selected, 6 linear equations are derived (two for each reference point) that will ultimately transform all of the markings into the same coordinate system as the photograph being compared against. Mapping each reference point in the first photo onto the corresponding reference point in the second photo defines the matrix M. M can then be applied to all of the first photo’s markings, which are then compared against the other photographs in the system.
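
One way to picture those six equations is as an affine transformation: x' = a·x + b·y + c and y' = d·x + e·y + f, with one pair of equations per reference point. The sketch below solves for the six unknowns with Cramer's rule. It assumes M is a plain 2x3 affine matrix, and the function names are hypothetical, not part of the I3S software.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear")
    solution = []
    for col in range(3):
        # Replace one column with the right-hand side, as Cramer's rule requires.
        replaced = [row[:col] + [rhs[i]] + row[col + 1:] for i, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_references(src, dst):
    """Return M as two rows (a, b, c) and (d, e, f) mapping each src point to dst."""
    system = [[x, y, 1.0] for x, y in src]
    row_x = solve3(system, [x for x, _ in dst])  # three equations for x'
    row_y = solve3(system, [y for _, y in dst])  # three equations for y'
    return row_x, row_y


def apply_affine(m, points):
    """Apply M to every marking so it lands in the second photo's coordinates."""
    (a, b, c), (d, e, f) = m
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]


if __name__ == "__main__":
    # Hypothetical reference points picked in two photos of the same animal.
    refs_photo_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    refs_photo_2 = [(2.0, 3.0), (4.0, 3.0), (2.0, 5.0)]
    m = affine_from_references(refs_photo_1, refs_photo_2)
    print(apply_affine(m, [(0.5, 0.5), (0.25, 0.75)]))
```

Three non-collinear reference points are the minimum needed to pin down all six unknowns, which is why exactly three are selected.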

The algorithm then projects the two sets of spots on top of one another and calculates the distances between corresponding spots, ultimately leading to a list of most likely matches. A lower match score is better for the I3S identification.
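
To illustrate why a lower score means a better match, here is a deliberately simplified scoring sketch. It pairs each spot with its nearest unused spot in the other photo and averages the distances. This is an assumption made for illustration, not the metric I3S actually computes, and the shark names are made up.

```python
from math import dist


def match_score(query_spots, candidate_spots):
    """Mean distance from each query spot to its nearest unused candidate spot."""
    remaining = list(candidate_spots)
    total, pairs = 0.0, 0
    for spot in query_spots:
        if not remaining:
            break
        nearest = min(remaining, key=lambda other: dist(spot, other))
        total += dist(spot, nearest)
        remaining.remove(nearest)  # each candidate spot is paired at most once
        pairs += 1
    return total / pairs if pairs else float("inf")


def rank_candidates(query_spots, database):
    """Return (name, score) pairs sorted so the most likely match comes first."""
    scores = {name: match_score(query_spots, spots) for name, spots in database.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Spots are assumed to already share one coordinate system.
    query = [(0.0, 0.0), (1.0, 1.0)]
    database = {
        "shark_A": [(0.0, 0.1), (1.0, 1.0)],
        "shark_B": [(0.5, 0.5), (2.0, 2.0)],
    }
    for name, score in rank_candidates(query, database):
        print(name, round(score, 3))
```

Spots that line up closely after the transformation contribute almost nothing to the total, so photographs of the same animal should end up near the top of the list.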

You can read more about the I3S software here.

Not only are the image processing approaches more reliable than tagging whale sharks over long periods, but they also allow for a more democratic approach to keeping tabs on this species. Any whale shark observer can take a photograph of the animal and submit it to the online database that aggregates all photos of whale sharks. Community helping community!

What else?

The Groth algorithm is great at identifying and matching unique patterns, and the whale shark’s rigid structure made it easy to adapt the original star tracking algorithm.
While it would be great to have a similar system for matching other animals like manta rays, matching becomes more difficult when spots are contorted as the animal moves. For example, as a manta ray swims through the water, its belly compresses, and the shapes of the spots and the distances between them no longer appear uniform. This makes the matching process less reliable with the algorithms described above.

Manta ray identification
Polar bear identification by unique whisker patterns

Every animal has a different obstacle to overcome in terms of matching their unique patterns. However, algorithms have been adapted to take those factors into account. So no worries, we are similarly able to track manta rays, polar bears and many other animals!

The idea of pattern recognition in digital image processing extends beyond conservation efforts. It can be applied to almost any type of image by using key features to classify patterns in the input data. One of the most popular uses is facial recognition.

For conservation, pattern recognition is a great, non-intrusive way to keep track of animals’ migration patterns and population counts over time, and it has proven very effective and popular.

Keep spotting and uploading! Thanks for reading!
