Putting the AI in API

In September 2016 I attended DevFestDC. Google had a pretty big presence and showcased a few awesome new Google Cloud APIs for artificial intelligence applications. I was particularly impressed with how easy Google makes it to try out each API through a simple web UI.

Vision

If you use Google’s Photos app, then you might have noticed that the search feature is AMAZING. It automatically detects the objects in your photos and indexes them, so you can search for practically anything.

The technology behind this is now available through the Google Cloud Vision API. Upload a photo, and it tells you what’s in it. Here’s an example:

The tabs along the top display the information from the JSON response, e.g., the Faces tab shows you the face detection results:
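
You can get the same JSON outside the web UI with a plain REST call. Here’s a minimal Python sketch, assuming you have an API key; the key, image filename, and use of the requests library are my own illustration, not something shown in the demo:

import base64
import json

import requests

API_KEY = "YOUR_API_KEY"  # placeholder: create a key in the Google Cloud console
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate?key=" + API_KEY

# The API expects the image bytes base64-encoded inside the JSON body
with open("photo.jpg", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

body = {
    "requests": [{
        "image": {"content": content},
        "features": [
            {"type": "LABEL_DETECTION"},  # what objects are in the photo
            {"type": "FACE_DETECTION"},   # what the Faces tab shows
        ],
    }]
}

response = requests.post(ENDPOINT, json=body)
print(json.dumps(response.json(), indent=2))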

Natural Language

You can try Google’s Cloud Natural Language API for free by entering some text to analyze on their website:

As you can see, it detects entities and provides links to them.
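
The same entity analysis is available programmatically. Here’s a sketch against the v1 analyzeEntities method; the API key and sample sentence are placeholders of my own:

import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities?key=" + API_KEY

body = {
    "document": {
        "type": "PLAIN_TEXT",
        "content": "I attended DevFestDC, hosted by Capital One in McLean.",
    },
    "encodingType": "UTF8",
}

entities = requests.post(ENDPOINT, json=body).json().get("entities", [])
# Each entity carries a type, a salience score, and (when known) a Wikipedia link
for entity in entities:
    print(entity["name"], entity["type"],
          entity.get("metadata", {}).get("wikipedia_url", ""))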

The Syntax tab is particularly cool — it shows you exactly how it parsed everything:
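
Programmatically, that parse comes back as a list of tokens from the analyzeSyntax method. A sketch, with the same placeholder API key as above:

import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSyntax?key=" + API_KEY

body = {
    "document": {"type": "PLAIN_TEXT", "content": "Google makes impressive APIs."},
    "encodingType": "UTF8",
}

tokens = requests.post(ENDPOINT, json=body).json().get("tokens", [])
# Each token carries its part of speech and a dependency-parse edge
for token in tokens:
    print(token["text"]["content"],
          token["partOfSpeech"]["tag"],
          token["dependencyEdge"]["label"])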

Speech

Their Cloud Speech API also looked impressive. You upload an audio file, and you get back the text. To demonstrate that it supports over 80 languages, the presenter picked a random person from the audience, who tested it (successfully) in Hindi.
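
A synchronous recognition request looks roughly like this. This is a sketch against the current v1 REST endpoint; the API key, audio filename, and encoding settings are placeholders you’d adjust to match your recording:

import base64
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize?key=" + API_KEY

# The API expects the audio bytes base64-encoded inside the JSON body
with open("recording.wav", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

body = {
    "config": {
        "encoding": "LINEAR16",   # raw 16-bit PCM; match your file's format
        "sampleRateHertz": 16000,
        "languageCode": "en-US",  # one of the 80+ supported languages
    },
    "audio": {"content": content},
}

response = requests.post(ENDPOINT, json=body)
for result in response.json().get("results", []):
    for alt in result["alternatives"]:
        print(alt["transcript"], alt.get("confidence"))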

One key feature is the ability to provide custom hints along with each audio input. As you can see from my example (I repeated the sentence from the example above), “DevFestDC” isn’t automatically recognized:

{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "I attended fscc hosted by Capital One and McClain and I learned a lot about new API is available from Google",
          "confidence": 0.83363587
        }
      ]
    }
  ]
}

But if you provide hints like “DevFestDC” along with the audio input, then it does detect it perfectly (even when the input language is Hindi)!
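
In the request body, those hints are just extra phrases in the recognition config. Something like this, assuming the current v1 field name (the hint phrases here are my own example):

config_with_hints = {
    "encoding": "LINEAR16",
    "sampleRateHertz": 16000,
    "languageCode": "en-US",
    "speechContexts": [
        {"phrases": ["DevFestDC"]}  # words the recognizer should favor
    ],
}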