How to detect image contents from Ruby with Amazon Rekognition

Rekognition is a new Amazon Web Service that “makes it easy to add image analysis to your applications.” It can detect faces and objects, and even let you store libraries of faces for future recognition.

If you’ve ever used an AWS service from Ruby before, doing some simple image rekognition (sic) is straightforward.

Create a .env file with your AWS credentials

AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=[put that key here]
AWS_SECRET_ACCESS_KEY=[and the other one here]

Get the credentials from AWS, as you would for any other service. (For extra security, use IAM to create credentials solely for Rekognition.) Note that, at the time of writing, Rekognition is only available in the US East (N. Virginia), US West (Oregon), and EU (Ireland) regions.

Create a Gemfile to get a simple project going

source 'https://rubygems.org'
gem 'dotenv'
gem 'aws-sdk'

The dotenv gem pulls in the .env file as environment variables for your program. The aws-sdk gem is Amazon’s official Ruby SDK for AWS.
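
If the .env file is missing or has a typo, the first sign is usually a credentials error from the SDK. A minimal sketch of a fail-fast check, assuming the three variable names from the .env file above; `missing_aws_vars` is a hypothetical helper, not part of dotenv or the AWS SDK:

```ruby
# Hypothetical helper (not part of dotenv or the AWS SDK): lists which of the
# variables from the .env file are missing or blank.
REQUIRED_AWS_VARS = %w[AWS_REGION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY].freeze

def missing_aws_vars(env = ENV)
  REQUIRED_AWS_VARS.select { |name| env[name].to_s.strip.empty? }
end

# Typical use, right after Dotenv.load:
#   missing = missing_aws_vars
#   abort "Missing in .env: #{missing.join(', ')}" unless missing.empty?
```

Taking the environment as an argument (defaulting to ENV) keeps the check easy to try out with a plain hash.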

Write a Ruby program to query Amazon Rekognition

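require 'dotenv'
Dotenv.load # copy the values from .env into ENV
require 'aws-sdk'

# Region and credentials are picked up from the environment.
client = Aws::Rekognition::Client.new

# Read the image in binary mode so the bytes reach Rekognition unaltered.
resp = client.detect_labels(
  image: { bytes: File.binread(ARGV.first) }
)

# Print each label with its confidence, truncated to a whole number.
resp.labels.each do |label|
  puts "#{label.name}-#{label.confidence.to_i}"
end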

First, we load our libraries and load in the environment variables from .env. The AWS SDK will use your credentials automatically.

Next, we create a client object and call its detect_labels method (really, Rekognition’s DetectLabels method) with the raw data of a file whose name is passed in via the first command line argument (ARGV.first). Finally, we print out the labels and confidence levels returned.
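
DetectLabels also accepts max_labels and min_confidence parameters, which let Rekognition do the trimming for you. A rough sketch of a wrapper that uses them; `top_labels` and its default values are my own invention for illustration, and `client` is assumed to behave like Aws::Rekognition::Client:

```ruby
# Hypothetical wrapper around detect_labels. `client` is anything that responds
# to detect_labels the way Aws::Rekognition::Client does.
def top_labels(client, image_bytes, max_labels: 5, min_confidence: 60)
  resp = client.detect_labels(
    image: { bytes: image_bytes },
    max_labels: max_labels,         # cap on how many labels come back
    min_confidence: min_confidence  # drop labels below this percentage
  )
  # Return plain [name, confidence] pairs, rounded to one decimal place.
  resp.labels.map { |label| [label.name, label.confidence.round(1)] }
end

# With the real SDK (needs credentials and network access):
#   require 'aws-sdk'
#   client = Aws::Rekognition::Client.new
#   top_labels(client, File.binread('myimage.jpg')).each do |name, pct|
#     puts "#{name}: #{pct}%"
#   end
```

Passing the client in, rather than creating it inside the method, means you can swap in a fake object while experimenting and avoid paying for API calls.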

Do some detecting

If the above file were called detect.rb, for example, you could run this:

ruby detect.rb myimage.jpg

Let’s say that myimage.jpg looked like this:

[image: myimage.jpg]

The above Ruby script would produce:

Freeway-64
Road-64
Dirt Road-63
Gravel-63
Asphalt-56
Tarmac-56
Intersection-55

The labels of what’s been detected are on the left, with the percentage ‘confidence’ of the detection algorithm on the right.
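
The confidence values come back as floats, and the script truncates them with to_i. If you would rather keep a decimal place, hide the weaker guesses, and sort the rest, something like the following could work. It is only a sketch: `Label` is a stand-in struct for the SDK's label objects, `format_labels` is a hypothetical helper, and the sample numbers are made up for illustration.

```ruby
# Stand-in for the label objects the SDK returns; the real ones also respond
# to name and confidence.
Label = Struct.new(:name, :confidence)

# Keep labels at or above min_confidence, highest first, one string per label.
def format_labels(labels, min_confidence: 60)
  labels
    .select { |label| label.confidence >= min_confidence }
    .sort_by { |label| -label.confidence }
    .map { |label| format('%-12s %5.1f%%', label.name, label.confidence) }
end

# Illustrative values only: the output above shows truncated integers.
sample = [
  Label.new('Asphalt', 56.2),
  Label.new('Freeway', 64.8),
  Label.new('Gravel', 63.1)
]
puts format_labels(sample)
```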

Have fun, and remember that Amazon will charge you $1 per 1,000 images processed unless you’re on the free tier.

Random terrible ideas for this stuff if you’re bored

  • Scan Twitter/Instagram for pictures, detect what’s in them, then automate responses using the labels detected, such as “I love [label]!” or “Oh wow your [label] is [superlative]!” and make Twitter/Instagram even worse than they already are. But you’ll get more followers.
  • Crawl any source of photos and pick out all those with a high confidence match on both ‘cats’ and ‘pizza’. Then create a Facebook account or email newsletter that only posts pictures of cats that are near pizza.
  • The detect_labels method returns whether the orientation of the image had to be corrected to do the detection, so you could use it as an expensive image orientation detector.
  • Scan avatars your users upload and don’t allow kangaroos or dogs in tuxedos to use your app. They’re hard to monetize.
  • Use a webcam plus a digital door lock and automatically unlock the door when the cam sees your children. OK, it might fail some of the time, but your kids will find sleeping outdoors to be a great adventure!
  • Do evil stuff with the facial detection and recognition features I didn’t cover above. I’m sure I saw Amazon suggest using this tech in ‘digital billboards’ to detect things about the demographics of people walking past or something. That sort of stuff surely only has positive outcomes.
  • I’m bored now, go and listen to Poppy, she’s going to be a big star.
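
If you do go after the cats-near-pizza idea, the core check is small. A sketch, assuming labels shaped like the ones detect_labels returns; `all_labels_present?`, the `Detected` struct, the 90% threshold, and the exact label names ("Cat", "Pizza") are all assumptions for illustration:

```ruby
# Stand-in for the SDK's label objects (name plus confidence percentage).
Detected = Struct.new(:name, :confidence)

# Hypothetical helper: true when every wanted label is present at or above
# min_confidence. Label names are compared case-insensitively.
def all_labels_present?(labels, wanted, min_confidence: 90)
  confident = labels
    .select { |label| label.confidence >= min_confidence }
    .map { |label| label.name.downcase }
  wanted.all? { |name| confident.include?(name.downcase) }
end

# Made-up detection results for two photos.
cat_with_pizza = [Detected.new('Cat', 97.2), Detected.new('Pizza', 93.5)]
cat_alone = [Detected.new('Cat', 98.0), Detected.new('Pizza', 41.0)]

puts all_labels_present?(cat_with_pizza, %w[cat pizza]) # true
puts all_labels_present?(cat_alone, %w[cat pizza])      # false
```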