Thanks David! Google’s Cloud Vision API is pre-trained and (as far as I am aware) cannot be retrained with new data. So the Vision API does a really good job of recognising a beer bottle, but it can’t tell me what brand of beer it is looking at. With our home-trained model we were able to supply our own training data, which lets the algorithm distinguish between Lagunitas IPA and Crazy Mountain Pale Ale instead of just recognising them both as beer. In addition, because we were working with a real-time video feed, the round-trip latency of hitting a web-based service would have killed our performance.
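
To put the latency point in rough numbers, here is a minimal sketch of how a network round trip caps the frame rate when frames are classified one at a time. The function name `max_fps` and both timing constants are hypothetical, chosen purely for illustration; they are not measurements from our setup or from the Vision API.

```python
def max_fps(per_frame_latency_ms: float) -> float:
    """Upper bound on frames per second when frames are processed one after another."""
    if per_frame_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / per_frame_latency_ms


# Hypothetical figures for illustration only, not measured values.
LOCAL_INFERENCE_MS = 50.0      # assumed time to classify one frame on the local machine
NETWORK_ROUND_TRIP_MS = 200.0  # assumed extra time to upload a frame and get a reply

local_fps = max_fps(LOCAL_INFERENCE_MS)
remote_fps = max_fps(LOCAL_INFERENCE_MS + NETWORK_ROUND_TRIP_MS)

print(f"local model: up to {local_fps:.1f} frames/s")
print(f"web service: up to {remote_fps:.1f} frames/s")
```

Under these assumed figures the web service tops out at a fifth of the local frame rate. Sending requests in parallel would raise throughput, but each individual result would still arrive a full round trip late, which is what matters for a live feed.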

I strongly suspect that the Cloud Vision API is also running Inception-v3 on the backend.

