Recently, Spatial Vision was invited to an open tender process which featured a hackathon to assess our capabilities/offering. It was the first tender process that we had been involved in which had this component so we were very excited to showcase our development capability.
One of the hackathon challenges was to create an app for recreational fishers that would identify the species of fish they caught, so they could decide whether to take it home or release it.
- We had only 5 days and 2 developers to build all the challenges (2 apps)
- After an initial estimate, we could spend only 2 days on this app
- The app had to be native and run on both Android and iOS
Beginning to Build
The user story we were given was: Given I have caught a fish at the beach, I want to know the species so I know whether I can keep it or release it.
We decided to build a feature based on the user story: it first captures a photo of the fish, then shows a list of candidate species sorted by score (%).
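The "sorted by score" part of that feature is simple to sketch. The snippet below is a minimal illustration in Python rather than the app's actual code; the function name `rank_species` and the assumption that scores arrive as fractions between 0 and 1 are ours.

```python
from typing import Dict, List


def rank_species(scores: Dict[str, float]) -> List[str]:
    """Return display lines for each species, highest score first.

    `scores` maps a species label to a score between 0 and 1
    (an assumed format, not necessarily what the service returns).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Format each score as a percentage with two decimals.
    return [f"{label}: {score * 100:.2f}%" for label, score in ranked]
```

Calling `rank_species` with a dictionary of label scores gives the list the app would show under the captured photo.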
Choosing Machine Learning Platform
First of all, our method of identifying fish was to use ‘machine learning’, more specifically image recognition/classification. It’s the hottest topic in the IT industry, and we had already played around with some ML services, such as Google Dialogflow and Vision API.
There were many services available, including Amazon Rekognition and Microsoft Computer Vision. However, their APIs are broader and focus on questions such as ‘what objects are in this photo?’ (object detection) or ‘is this person happy or sad?’ (face/sentiment detection), not ‘what kind of fish is this?’.
We wanted to consume this capability as an API rather than implement it ourselves, so we could focus on the user experience. We searched around and found a service called Vize.ai. It was still in beta at the time but provided exactly what we needed: a way to train a custom classifier on our own images and call it as an API.
Tasks and labels
Vize.ai models image recognition/classification in terms of tasks and labels. A task is the question being asked and the labels are its possible answers. In our case, the task is ‘identify fish’ and the labels are ‘Barramundi’, ‘Flathead’, ‘Snapper’ etc.
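To make the task/label relationship concrete, here is a small sketch of how the same structure could be represented locally before uploading anything. This is our own illustration, not Vize.ai's data model: the `Task` class, its fields, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    """A classification task: one question, many labels, photos per label."""

    name: str
    images_by_label: Dict[str, List[str]] = field(default_factory=dict)

    def add_image(self, label: str, path: str) -> None:
        # Creating a label implicitly the first time a photo is filed under it.
        self.images_by_label.setdefault(label, []).append(path)

    @property
    def labels(self) -> List[str]:
        return sorted(self.images_by_label)


# Example: one task with two labels (file names are made up).
fish_task = Task("identify fish")
fish_task.add_image("Snapper", "snapper_01.jpg")
fish_task.add_image("Barramundi", "barramundi_01.jpg")
```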
Creating Labels (the fish species)
We created a total of 8 labels covering the most commonly caught fish in Australia (in the context of fishing in a bay in Queensland).
- Bartailed flathead
- Dusky flathead
- Other flathead
- Sand whiting
- Trumpeter whiting
- Yellowfin bream
Collecting data (photos)
The next step was to collect sample photos of each species as a training data set.
Since we had a strict time constraint, we used Google image search and simply searched by species name. We then picked a variety of photos including:
- Taken by recreational fishers holding fish they had just caught
- Showing the fin or head features that are unique to the species
- Showing the colour of fish clearly
We included the recreational fishing photos because we wanted to see whether the model could classify well when a photo contains not only the fish but also its surroundings, such as a person, fishing gear, ocean, clouds and water.
We collected 20 to 30 photos per species and uploaded them to Vize.ai.
The next step was to train our model with the uploaded photos.
Vize.ai splits your images into training images and testing images. The majority of the images are used for training, while the remainder are used to test the accuracy of the model.
This process took just over half an hour, and the trained model reached 72% accuracy on the testing images.
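Vize.ai does this split and scoring for you, and we don't know the exact ratio or method it uses. As a rough sketch of the general idea only, the Python below holds back a fraction of each label's photos for testing and computes accuracy as the share of test photos labelled correctly. The 20% test fraction, the per-label split, and both function names are assumptions of ours.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_per_label(
    images_by_label: Dict[str, List[str]],
    test_fraction: float = 0.2,  # assumed ratio, not Vize.ai's
    seed: int = 0,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Shuffle each label's photos and hold some back for testing."""
    rng = random.Random(seed)
    train: Dict[str, List[str]] = {}
    test: Dict[str, List[str]] = {}
    for label, images in images_by_label.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        # Keep at least one test photo per label.
        n_test = max(1, round(len(shuffled) * test_fraction))
        test[label] = shuffled[:n_test]
        train[label] = shuffled[n_test:]
    return train, test


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of test photos whose predicted label matches the true one."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With 20 to 30 photos per label, a split like this leaves only a handful of test photos per species, so a single misclassified photo moves the accuracy figure noticeably.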
Classify (Test the model)
Vize.ai provides a ‘Classify Image’ feature, which allows you to test your model (labels + photos) easily. We simply collected some test photos and uploaded them to be classified.
The test result — Good
The 1st test photo was a photo of a snapper taken by a recreational fisher, and the relevance (score) was 98.83%! It was astonishing to see how easily we could build an image classification API with so few photos!
The test result — Not Good
The 2nd test was a little harder. We picked a photo of a sand whiting to see if our model could distinguish between sand whiting and trumpeter whiting.
The two species look very similar in shape. The only noticeable difference is that trumpeter whiting has a unique line pattern on its body, which we were hoping our model would recognise, given that all of our trumpeter whiting photos showed the pattern.
Unfortunately, our model thought it was a trumpeter whiting. Even worse, the second-highest score went to snapper!
We couldn’t determine why this went wrong or how we could improve it, as we had no information on what ML algorithms Vize.ai used or what kind of errors the model made on its testing images.
The test result — Bad
The next test was to classify a fish that was not in our model. We tested a photo of a tuna, which did not look anything like the species in our labels.
We expected very low scores for all of the labelled fish, but our model thought it was a snapper… (with a fairly confident score of 96.54%).
The test result — Ugly
Given the result of the tuna test, we decided to test a photo that was not even a fish. Our model thought this koala was “Other flathead” with a score of 67.4%.
Findings so far
- We didn’t need many photos to train a model when the photo to classify was obvious or clearly distinguishable
- The API can still give a high score (over 95%) even when the fish does not look like any of the existing labels
- The API gives a lower score (in the 60% range) if the subject is not a fish (this could be used to determine whether there is a fish in the photo)
- It was not easy to re-train our model based on the test results, as we didn’t know how the model made its decisions (the algorithms) in the first place or how we could refine them
- Nonetheless, it was good enough for us to create an app using this API so we could showcase a prototype
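The third finding suggests a simple guard: reject a result when even the top score is low. The sketch below shows that idea in Python. The 0.8 cut-off is an arbitrary value we picked for illustration, not something we tuned, and `confident_top_prediction` is a hypothetical helper name.

```python
from typing import Dict, Optional, Tuple


def confident_top_prediction(
    scores: Dict[str, float],
    min_score: float = 0.8,  # illustrative cut-off, not a tuned value
) -> Optional[Tuple[str, float]]:
    """Return the best (label, score) pair, or None if it is not confident."""
    if not scores:
        return None
    label, score = max(scores.items(), key=lambda item: item[1])
    return (label, score) if score >= min_score else None


# The koala photo (top score 67.4%) would be rejected...
koala = confident_top_prediction({"Other flathead": 0.674})
# ...but the tuna photo (96.54% "Snapper") would still pass.
tuna = confident_top_prediction({"Snapper": 0.9654})
```

As the tuna case shows, a threshold like this only helps with photos that are not fish at all. It does nothing for a fish species that is missing from the labels, which is the second finding.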
Building a prototype app
We didn’t have any more time to find an alternative ML service, so we built a native app that takes a photo of a fish, calls the Vize.ai API, and lists the results by relevance. The following are the screenshots of the application prototype.
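The API call step in that flow might look roughly like the Python sketch below. It is not the real Vize.ai API: the endpoint URL, the JSON field names (`task`, `image_base64`) and the `Token` authorization scheme are placeholders we made up to show the shape of such a call, and the prototype itself was a native app rather than a Python script.

```python
import base64
import json
import urllib.request

# Placeholder URL, NOT the real Vize.ai endpoint.
PLACEHOLDER_ENDPOINT = "https://api.example.com/v1/classify"


def build_classify_request(
    image_bytes: bytes,
    task_id: str,
    api_token: str,
    endpoint: str = PLACEHOLDER_ENDPOINT,
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the photo.

    The payload layout and auth header are hypothetical; check the
    service's own documentation for the real ones.
    """
    payload = {
        "task": task_id,
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {api_token}",
        },
        method="POST",
    )
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen` and decoding the JSON response into label scores for display.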
We created 2 apps successfully within the time available and we’re happy with the outcome.
We also discussed internally the possibility of further refining our prototype and potentially releasing it as a product. The following are some of our thoughts on how to refine it:
- Combine with another classification API such as Vision API to check whether there is a fish in the photo before calling our model, so we can avoid classifying a photo that is not a fish
- Further research alternative recognition APIs/services and compare their implementation and insights
- To distinguish between closely related species (e.g. dusky flathead, trumpeter whiting), we might need photos that contain only the fish, by extracting the fish from each photo
- We need experts to label photos correctly when introducing a new label, so our training data set captures the distinguishing characteristics of the fish
- Automate retraining
- Collect GPS location and time as additional metadata to help identify the fish, on top of image recognition
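Two of these ideas, the pre-check for a fish and the use of location/time metadata, can be combined into one small pipeline. The Python sketch below is only a thought experiment: `contains_fish` and `classify_species` stand in for calls to a general-purpose detector and to our trained model, and the `prior` dictionary (how likely each species is at a given place and time) is hypothetical data we would still need to source from experts.

```python
from typing import Callable, Dict, Optional

Scores = Dict[str, float]


def rerank_with_prior(scores: Scores, prior: Scores, floor: float = 0.05) -> Scores:
    """Weight each image score by a location/time prior and renormalise.

    Labels missing from `prior` get `floor`, so metadata can demote a
    species but never rule it out completely.
    """
    weighted = {
        label: score * max(prior.get(label, floor), floor)
        for label, score in scores.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return dict(scores)
    return {label: value / total for label, value in weighted.items()}


def identify(
    photo: bytes,
    contains_fish: Callable[[bytes], bool],       # stand-in for a generic detector
    classify_species: Callable[[bytes], Scores],  # stand-in for our trained model
    prior: Optional[Scores] = None,
) -> Optional[Scores]:
    """Return species scores, or None when the photo shows no fish."""
    if not contains_fish(photo):
        return None
    scores = classify_species(photo)
    return rerank_with_prior(scores, prior) if prior else scores
```

In the sand whiting case above, a prior that says trumpeter whiting is rarely caught at the fisher's location could be enough to flip the two top results, although that depends entirely on how good the prior data is.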
Overall, taking part in this hackathon was a great exercise: we learned that using an ML service (API) in an application is very easy, and we could think of many features that could make use of one.