Team R&D 2018 — Cloud Vision API

Hiroki Gota
Mar 16, 2018


At Spatial Vision, we’re trying a new approach to R&D: Team R&D (or simply a hack day). Traditionally, our R&D process looks like this:

  1. We identify a technology or topic we’d like to understand, based on our potential work
  2. We create a set of criteria and goals for the R&D
  3. A senior developer researches the topic, reads the documentation and creates a prototype
  4. The senior developer presents the outcome to management and the team

While this approach most likely produces a better outcome from a financial point of view, it does not scale well, and it leaves out a number of people who are interested in doing R&D work.

So we’ve decided to switch to a team-based approach and produce a working result, as in a hack day.

Our basic rule is to form a team of 3 to 5 people, in which each person has a role and everyone works collaboratively.

Problem: Determine the appropriateness of uploaded photos

We’ve built a number of in-field mobile applications used by the general public as part of citizen science programs.

At DevFest Melbourne 2017, we discovered that the Google Cloud Vision API has evolved so much that it now seems very easy to ‘Detect inappropriate content’.

Epic user story

As a citizen science program administrator, I want to see how many of the uploaded photos are actually appropriate for scientific purposes, so that I can filter out inappropriate photos.

For example, an app user can report a bird found in the field by selecting a species and attaching some photos.

Solution model

We have a number of photos with some metadata, such as species name and category (Bird, Frog, Fish etc.). The plan has three steps: create an API that fetches the photos and metadata from a data source; annotate the photos with the Cloud Vision API; and finally visualise the annotation results to see the quality of the reported photos.
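To make the model a little more concrete, here is a minimal sketch of the step between annotation and visualisation. It is not the team’s actual code: the `ReportedPhoto` shape, the `assessPhoto` and `summarise` helpers, and the rule that a photo “looks right” when one of its labels mentions the reported category are all assumptions made for illustration. Fetching the photos and calling the Vision API are left out, so the labels are simply passed in.

```typescript
// Hypothetical record shape; field names are illustrative only.
interface ReportedPhoto {
  file: string;      // e.g. "sighting-001.jpg"
  species: string;   // species selected by the app user
  category: string;  // e.g. "Bird", "Frog", "Fish"
}

interface AssessedPhoto extends ReportedPhoto {
  labels: string[];          // label descriptions from the annotation step
  matchesCategory: boolean;  // assumed quality signal, see assessPhoto
}

// Assumption: a photo "looks right" when at least one label mentions
// the reported category (case-insensitive substring match).
function assessPhoto(photo: ReportedPhoto, labels: string[]): AssessedPhoto {
  const wanted = photo.category.toLowerCase();
  const matchesCategory = labels.some(
    (label) => label.toLowerCase().indexOf(wanted) !== -1
  );
  return { ...photo, labels, matchesCategory };
}

// Totals that a chart or table could be built from.
function summarise(photos: AssessedPhoto[]): { total: number; matching: number } {
  const matching = photos.filter((p) => p.matchesCategory).length;
  return { total: photos.length, matching };
}
```

A substring match is deliberately crude. It would miss a photo labelled only with a species name, so a real check would probably need a mapping from labels to categories.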

Photo annotation API and visualisation

The Cloud Vision API provides an annotate API, which returns annotations for a given photo. (You can also test this online.)
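As a rough illustration of what a call to the annotate API involves, the sketch below builds a request body for the REST endpoint `images:annotate`, asking for label detection and safe-search detection, and interprets the safe-search likelihoods that come back. The helper names `buildAnnotateRequest` and `isFlagged`, and the rule of flagging a photo when `adult` or `violence` is `LIKELY` or higher, are assumptions for this example rather than what the team built. Authentication and the HTTP call itself are omitted.

```typescript
// Likelihood values used by the Vision API, ordered from least to most likely.
const LIKELIHOODS = [
  "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
] as const;
type Likelihood = typeof LIKELIHOODS[number];

// Subset of the safeSearchAnnotation object in an annotate response.
interface SafeSearchAnnotation {
  adult: Likelihood;
  violence: Likelihood;
  medical?: Likelihood;
  spoof?: Likelihood;
}

const VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

// Build the URL and JSON body for one base64-encoded image.
function buildAnnotateRequest(imageBase64: string, maxLabels: number = 10) {
  return {
    url: VISION_ENDPOINT,
    body: {
      requests: [
        {
          image: { content: imageBase64 },
          features: [
            { type: "LABEL_DETECTION", maxResults: maxLabels },
            { type: "SAFE_SEARCH_DETECTION" },
          ],
        },
      ],
    },
  };
}

// Hypothetical rule: flag a photo when adult or violence reaches the threshold.
function isFlagged(
  safeSearch: SafeSearchAnnotation,
  threshold: Likelihood = "LIKELY"
): boolean {
  const limit = LIKELIHOODS.indexOf(threshold);
  return [safeSearch.adult, safeSearch.violence].some(
    (value) => LIKELIHOODS.indexOf(value) >= limit
  );
}
```

The body would be sent as a POST to the URL with an API key or OAuth token. Each entry in the response’s `responses` array can then carry `labelAnnotations` and `safeSearchAnnotation` for the matching request.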

The team

With a solution in mind, the next step was to decide who does what. We decided to assign roles by lottery rather than by skill set, so that people could work in an area they had never worked in before (or they might still draw a role they are already good at).

  1. Elnaz: API Developer, building a Node API to get an annotation for each photo
  2. Parham: Data specialist, preparing the test data including the jpg files and metadata
  3. Ryan: Quality Assurance, documenting user acceptance criteria and performing exploratory testing
  4. Craig: Web Developer, visualising the annotation data
  5. Hiroki: Journalist, presenting the outcome on our blog

The result


The Tao (Chinese: 道; pinyin: Dào; literally: “the Way” ) Code道— is a publication about all things app development. Our way — Learn through sharing

Written by Hiroki Gota

I am a technical lead at Spatial Vision, Melbourne Australia. I enjoy creating apps and automating processes in a team environment.


