Team R&D 2018 — Cloud Vision API

Hiroki Gota · Code道 · Mar 16, 2018

Overview

At Spatial Vision, we're trying a new approach to R&D: Team R&D (or simply, a hack day). Traditionally, our R&D process has looked like this:

  1. We identify a piece of technology or a topic we'd like to understand, based on potential work
  2. We create a set of criteria/goals for the R&D
  3. A senior developer researches it, reads the documentation, and creates a prototype
  4. The senior developer presents the outcome to management and the team

While this approach most likely produces a better outcome from a financial point of view, it does not scale well, and there are a number of other people who are interested in working on R&D.

So we've decided to switch to a team-based approach and produce a working result, like a hack day.

Our basic rule is to form a team (3–5 people) in which each person has a role and everyone works collaboratively.

Problem: Determine the appropriateness of uploaded photos

We've built a number of in-field mobile applications used by the general public as part of citizen science programs (http://www.spatialvision.com.au/citizenscience/).

At the last DevFest Melbourne, in 2017 (http://gdgmelbourne.com/), we discovered that the Google Cloud Vision API (https://cloud.google.com/vision/) has evolved significantly and now makes it very easy to 'detect inappropriate content'.

Epic user story

As a citizen science program administrator, I want to see how many of the uploaded photos are actually appropriate for scientific purposes, so that I can filter out inappropriate photos.

An app user can report a bird found in the field by selecting a species and attaching some photos.
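
As a rough illustration of what "filtering out inappropriate photos" could look like once annotations are available, here is a minimal sketch. The thresholds, field names, and the isAppropriate helper are our own assumptions for this write-up, not part of the actual app:

```typescript
// Hypothetical filter rule: keep a photo if Cloud Vision's SafeSearch says
// explicit content is unlikely AND at least one detected label matches the
// reported category (e.g. "Bird"). Thresholds are illustrative only.
const FLAGGED_LIKELIHOODS = ['POSSIBLE', 'LIKELY', 'VERY_LIKELY'];

function isAppropriate(
  labels: string[],          // label descriptions returned by Cloud Vision
  adultLikelihood: string,   // SafeSearch "adult" likelihood, e.g. "VERY_UNLIKELY"
  reportedCategory: string,  // category chosen by the app user, e.g. "Bird"
): boolean {
  const looksExplicit = FLAGGED_LIKELIHOODS.includes(adultLikelihood);
  const matchesCategory = labels.some(
    (label) => label.toLowerCase().includes(reportedCategory.toLowerCase()),
  );
  return !looksExplicit && matchesCategory;
}

// Example: a photo labelled ["bird", "beak", "wildlife"] with adult = "VERY_UNLIKELY"
// passes; a photo with no bird-related labels would be flagged for review.
```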

Solution model

We have a number of photos with some metadata, such as species name and category (Bird, Frog, Fish, etc.). We can create an API that fetches the photos and metadata from a data source, uses the Cloud Vision API to annotate them, and finally visualises the annotation results so we can see the quality of the reported photos.
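
A minimal sketch of that flow in Node/TypeScript follows. The PhotoRecord shape, the fetchPhotoRecords source, and the annotatePhoto wrapper are illustrative assumptions; the real data source is the citizen science program's own backend:

```typescript
// Solution model sketch: fetch photo records, annotate each with Cloud
// Vision, and collect the results for the visualisation front end.

interface PhotoRecord {
  id: string;
  species: string;   // e.g. "Superb Fairy-wren"
  category: string;  // e.g. "Bird", "Frog", "Fish"
  imageUri: string;  // publicly reachable URL of the uploaded photo
}

interface AnnotatedPhoto extends PhotoRecord {
  labels: string[];         // label descriptions from Cloud Vision
  adultLikelihood: string;  // SafeSearch likelihood, e.g. "VERY_UNLIKELY"
}

// Placeholder data source; in practice this would call the program's API
// or database that stores the reported sightings.
async function fetchPhotoRecords(): Promise<PhotoRecord[]> {
  return [
    { id: '1', species: 'Superb Fairy-wren', category: 'Bird', imageUri: 'https://example.com/photos/1.jpg' },
  ];
}

// annotatePhoto wraps the Cloud Vision call shown in the next section.
declare function annotatePhoto(
  imageUri: string,
): Promise<{ labels: string[]; adultLikelihood: string }>;

async function buildVisualisationData(): Promise<AnnotatedPhoto[]> {
  const records = await fetchPhotoRecords();
  const annotated: AnnotatedPhoto[] = [];
  for (const record of records) {
    const { labels, adultLikelihood } = await annotatePhoto(record.imageUri);
    annotated.push({ ...record, labels, adultLikelihood });
  }
  return annotated; // handed to the web front end for charting and filtering
}
```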

Photo annotation API and visualisation

The Cloud Vision API provides an annotate endpoint, which returns annotations for a given photo. (You can try it online: https://cloud.google.com/vision/)
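
For reference, here is roughly what that call looks like from Node using the @google-cloud/vision client library. This is a sketch based on the library's documented ImageAnnotatorClient; it assumes authentication is configured via GOOGLE_APPLICATION_CREDENTIALS, and the image URL is hypothetical:

```typescript
import { ImageAnnotatorClient } from '@google-cloud/vision';

const client = new ImageAnnotatorClient();

// Ask for labels (what is in the photo) and SafeSearch (is it appropriate)
// in a single annotate request.
async function annotatePhoto(imageUri: string) {
  const [result] = await client.annotateImage({
    image: { source: { imageUri } },
    features: [
      { type: 'LABEL_DETECTION', maxResults: 10 },
      { type: 'SAFE_SEARCH_DETECTION' },
    ],
  });

  const labels = (result.labelAnnotations ?? []).map((l) => l.description ?? '');
  const safeSearch = result.safeSearchAnnotation; // adult, violence, racy, ...
  return { labels, adultLikelihood: String(safeSearch?.adult ?? 'UNKNOWN') };
}

// Example usage:
annotatePhoto('https://example.com/photos/bird-123.jpg')
  .then(({ labels, adultLikelihood }) => {
    console.log('Labels:', labels.join(', '));
    console.log('Adult likelihood:', adultLikelihood); // e.g. "VERY_UNLIKELY"
  })
  .catch(console.error);
```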

The team

Given we had a solution, the next step was to decide who does what. We decided to assign roles by lottery rather than by skill set, so people could work on an area they had never worked in before (or, with luck, end up with something they are already good at).

  1. Elnaz: API Developer, building a Node API to get an annotation for each photo
  2. Parham: Data Specialist, preparing the test data, including the JPEG files and metadata
  3. Ryan: Quality Assurance, documenting user acceptance criteria and performing exploratory testing
  4. Craig: Web Developer, visualising the annotation data
  5. Hiroki: Journalist, presenting the outcome on our blog

The result

