Ok Google, what do you see in my picture?
I like my Google Home and I am constantly trying to build serious use cases with Dialogflow and Actions on Google, but since it's the Christmas period I wanted to do something a lot more fun.
The result is the following: I can take a picture from my iPhone and then immediately ask my Google Home to describe what it can see in it.
Below are the basic steps in Node-RED:
- I take the pictures from my phone using the Google Drive app. The first part of the flow connects to my Google Drive account and downloads the most recent picture from a specific directory (see the first sketch after this list).
- The second part sends the raw bytes of the picture to the Microsoft Azure Computer Vision API. It is the only public API I found that exposes a natural-language description of a picture (see the second sketch after this list).
- Because I am French, the English output is translated into French.
- The text is sent to my Google Home, which behaves as a Cast device.
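To make the first step concrete, here is a minimal sketch of what that part of the flow does, written as a plain Node.js script rather than the actual flow nodes. It assumes Node.js 18+ (for the global fetch) and an OAuth access token with Drive scope; the folder ID and environment variable name are hypothetical placeholders, not my real configuration.

```javascript
// Minimal sketch: list the newest image in a Drive folder and download
// its raw bytes. FOLDER_ID and DRIVE_TOKEN are hypothetical placeholders.
const FOLDER_ID = "your-drive-folder-id";
const TOKEN = process.env.DRIVE_TOKEN;

async function latestPicture() {
  // Ask the Drive v3 API for the single most recent image in the folder
  const params = new URLSearchParams({
    q: `'${FOLDER_ID}' in parents and mimeType contains 'image/'`,
    orderBy: "createdTime desc",
    pageSize: "1",
  });
  const list = await (
    await fetch(`https://www.googleapis.com/drive/v3/files?${params}`, {
      headers: { Authorization: `Bearer ${TOKEN}` },
    })
  ).json();
  const file = list.files[0];

  // alt=media returns the file content itself rather than its metadata
  const res = await fetch(
    `https://www.googleapis.com/drive/v3/files/${file.id}?alt=media`,
    { headers: { Authorization: `Bearer ${TOKEN}` } }
  );
  return Buffer.from(await res.arrayBuffer());
}
```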
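And a sketch of the Computer Vision call behind the second step. In the real flow the contrib node wraps this for you; the region in the URL and the AZURE_VISION_KEY variable are placeholders you would replace with your own subscription details.

```javascript
// Minimal sketch: POST the raw image bytes to the Azure Computer Vision
// "describe" endpoint. The region and AZURE_VISION_KEY are hypothetical.
async function describePicture(imageBuffer) {
  const endpoint =
    "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/describe";
  const res = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": process.env.AZURE_VISION_KEY,
      "Content-Type": "application/octet-stream",
    },
    body: imageBuffer,
  });
  const data = await res.json();
  // The API returns ranked natural-language captions with confidences
  return data.description.captions[0].text; // e.g. "a dog sitting on a couch"
}
```

Chaining latestPicture() into describePicture() reproduces the first two steps of the flow; the translation and casting steps stay in their contrib nodes.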
I had to install the following components, which do not ship with Node-RED by default (npm install them from your Node-RED user directory, typically ~/.node-red):
- node-red-contrib-viseo-ms-oxford: to query the Microsoft Cognitive Services Computer Vision API and get information about an image.
- node-red-contrib-file-buffer: to read a file's contents into a buffer.
- node-red-contrib-image-output (optional): a simple image output node, handy for previewing the downloaded picture.
- node-red-contrib-google-translate: a simple Google Translate node.
- node-red-contrib-cast: to send text-to-speech to my Google Home.
The full sequence can be triggered via an HTTP endpoint. I have configured a simple IFTTT applet whose trigger is the Google Assistant and whose action calls the Node-RED endpoint. A dozen seconds later, my Google Home gives me the result.
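The endpoint itself is just an HTTP In node at the start of the flow, so the IFTTT action boils down to a single request. The URL below is a placeholder for wherever your Node-RED instance lives:

```javascript
// Sketch: what the ifttt applet's action effectively does — one HTTP
// call against the flow's entry point. The host and path are placeholders.
fetch("http://my-node-red-host:1880/describe-picture").catch(console.error);
```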
This is extremely easy to set up. In the end it is just a series of building blocks you have to connect together.
I hope you will enjoy the result and that it gives you ideas for setting up fun flows of your own.