The Markable Update: The Demo

Markable · Published in Markable.AI
Apr 13, 2017 · 6 min read

So, about that “we’ll be posting every Tuesday and Thursday” thing… Well, we got busy. But we’ve been busy with some really cool stuff, which we’re excited to announce is available today for you to try out.

Knowing is Half the Battle

At the end of January, we posted about the intricacies of “multi-object detection” in photos and videos, and believe it or not, that’s actually the easy part of what we work on here at Markable. You see, teaching a computer when to call a collection of pixels a “dress” or a “boot” is a pretty generalized concept. You can, with relative ease, show a computer vision program enough blobs of pixels labeled “dress” and it will generally get the idea of what shape can be a dress and what cannot.

The difficult part is teaching the computer what it means for two “dress blobs” to have some sense of “sameness”. In other words, the computer must first find a dress in a photo, then search against a collection of hundreds of thousands of other dresses it has seen and return the ones that look the most similar (or, optimally, the exact dress). And then it must perform that search for every object detected in the photo (check it out at the bottom of the article).
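The detect-then-search loop can be sketched in a few lines. To be clear, this is not Markable’s actual pipeline: `detect_objects` and `embed` are hypothetical stand-ins for trained models, and the catalog is a toy lookup.

```python
import math

# Hypothetical sketch of the detect-then-search loop described above.
# detect_objects and embed are stand-ins for real models, not Markable's API.

def detect_objects(photo):
    # A real detector returns (label, crop) pairs; here we fake two detections.
    return [("dress", "crop_a"), ("boot", "crop_b")]

def embed(crop):
    # A real model maps pixels to a feature vector; here we use a toy lookup.
    toy = {"crop_a": [0.9, 0.1], "crop_b": [0.2, 0.8]}
    return toy[crop]

# Toy catalog: item id -> (category label, feature vector)
catalog = {
    "dress_1": ("dress", [0.88, 0.12]),
    "dress_2": ("dress", [0.30, 0.70]),
    "boot_1":  ("boot",  [0.15, 0.85]),
}

def search(photo, top_k=1):
    results = {}
    for label, crop in detect_objects(photo):
        q = embed(crop)
        # Rank catalog items of the same category by Euclidean distance.
        candidates = [(math.dist(q, vec), item)
                      for item, (cat, vec) in catalog.items() if cat == label]
        results[label] = [item for _, item in sorted(candidates)[:top_k]]
    return results

print(search("party_photo.jpg"))  # {'dress': ['dress_1'], 'boot': ['boot_1']}
```

The same search runs once per detected object, which is why every item in a photo gets its own set of results.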

The Long and Short of Features

For this section, let us consider a car. I know you were expecting clothing, but our exact feature models are still a bit of a secret which we’re not ready to discuss quite yet.

How do we know it is a car and not a motorcycle or bus? Because it has a very distinct set of attributes — or as we call them features — and a limited range of values that each feature can have that make it a car. A limited feature set for a car might be:

Consider this car
  • Wheels
  • Doors
  • Seats
  • Windshield
  • Trunk
  • Paint color

A motorcycle too has:

  • Wheels
  • Seats
  • Paint color

And a bus:

  • Wheels
  • Doors
  • Seats
  • Windshield
  • Trunk
  • Paint color

The separation between a motorcycle and the “car + bus” group is easy to make, with the number of wheels making the biggest difference. But if cars and buses share the same feature set, how do we humans, and in turn our computers, know with certainty which is which?

From that small collection, some important features emerge: namely, the number of people the vehicle carries (seats) and the number of doors it has, with windshield and trunk as less distinguishing features. The difference, or distance, between the values of these features is the basis for defining “similarity”.
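Here is a small numeric version of that idea. The feature values are made up for illustration, and one wrinkle is worth noting: raw seat counts would swamp every other feature, so each feature is scaled to [0, 1] before measuring distance (real systems learn weightings rather than hand-scaling).

```python
import math

# Toy feature vectors, with invented values:
# [wheels, doors, seats, windshield (0/1), trunk (0/1)]
vehicles = {
    "car":        [4, 4, 5, 1, 1],
    "bus":        [6, 2, 40, 1, 1],
    "motorcycle": [2, 0, 2, 0, 0],
}

# Scale each feature by its maximum so no single feature dominates.
maxes = [max(v[i] for v in vehicles.values()) for i in range(5)]
scaled = {name: [x / m for x, m in zip(v, maxes)]
          for name, v in vehicles.items()}

# After scaling, the motorcycle sits far from both car and bus,
# while car and bus are comparatively close to each other.
for a, b in [("car", "bus"), ("car", "motorcycle"), ("bus", "motorcycle")]:
    print(a, "vs", b, round(math.dist(scaled[a], scaled[b]), 2))
```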

The Nitty Gritty

So the car to bus comparison is pretty easy. There are large distances between quite a few of the features, so wide in fact that we have different names for the two — car and bus — like we would for pants and leggings, or coats and blazers.

You know that you probably shouldn’t respond with items categorized as “bus” when your friend asks about a car, so what do you return?

The Search Image
The Results

Which is more similar? Make your decision and then we’ll walk through the logic.

On the left, we have the same brand and model, but it is a different color from what your friend asked about (assume color is fixed per result), and it is a hatchback, so it has a different kind of trunk. It also might be a plane, given the size of the wing slapped on the back.

On the right, we have a different make and model, but the same color, trunk style and, although it’s not a feature we account for, price.

Which do you give back to your friend as the “best match”? Is your personal valuation of brand and model the same valuation your friend has? Do you try to interpret if they’re asking for “the same item in a different color, with some slightly different features” or if they’re asking for “an item with the same features, but it doesn’t need to be the exact one”? Since they’re your friend, you can probably make a reasonable guess as to which of those two options they want.

But now suppose the person is not your friend, there are thousands of requests coming in per minute, and the person doesn’t just want to know about the car, but also about the wheels on the car. On top of that, instead of two result cars to consider, there are 20,000, and instead of a nice photo like the one above, your friend gave you one like this:

Shot with iPhone

But you don’t have any result photos from that angle (though you saw a bunch like it in training, so you kind of know what you’re doing). Still, you’re expected to do just as well on that photo as on a glamour shot, and you can’t take much longer on it than on any other request, because the user will give up if you take too long.

A little overwhelming, right? We thought so too, so we programmed a computer to do it.

Quantifying Qualitative Values

So the big reveal is that our models consider “similarity” much like we do. There are no exact, hard-and-fast rules about which combinations of feature distances mean something. That’s the benefit of using Convolutional Neural Networks: across their tens of layers and millions of points of decision making, associations spring forth.

Once those associations are learned, the model can relate new data back to them, and other products with the same associations of features can be returned as results.
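A common way to compare learned associations is cosine similarity between embedding vectors. The sketch below uses invented three-dimensional vectors as stand-ins for the high-dimensional embeddings a trained CNN would produce; the item names are hypothetical.

```python
import math

# Invented stand-ins for CNN embeddings; real ones have hundreds of dimensions.
red_dress    = [0.90, 0.20, 0.10]
maroon_dress = [0.85, 0.25, 0.15]
blue_coat    = [0.10, 0.30, 0.95]

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, near 0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

print(round(cosine(red_dress, maroon_dress), 3))  # close to 1.0
print(round(cosine(red_dress, blue_coat), 3))     # much lower
```

No one hand-writes a rule saying “maroon is close to red”; the training process places similar items near each other, and a simple similarity measure does the rest at query time.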

In Real Life

So we were going to put a bunch of screenshots here, but instead, have one:

Still ironing out a few kinks

And go to demo.markable.ai to try it yourself!

Conclusion

Search is really a broader topic than this post has time to dig into. But essentially, it involves identifying, scoring, and combining features for each of our result items, so that when a user makes a search call, we can look for feature sets and scores that most closely resemble those the user passed us.
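The score-and-combine step can be boiled down to a few lines. This is a toy version under stated assumptions: the feature names, weights, and catalog items are all made up, and real systems tune the weights rather than picking them by hand.

```python
# Toy score-and-rank step: per-feature scores for each item, combined into a
# single weighted distance to the query, then sorted. All values are invented.

query = {"color": 0.2, "shape": 0.9, "texture": 0.5}
catalog = {
    "item_a": {"color": 0.25, "shape": 0.85, "texture": 0.55},
    "item_b": {"color": 0.90, "shape": 0.10, "texture": 0.40},
    "item_c": {"color": 0.30, "shape": 0.80, "texture": 0.20},
}
weights = {"color": 1.0, "shape": 2.0, "texture": 0.5}  # shape matters most here

def score(item_feats):
    # Weighted squared distance: lower means more similar to the query.
    return sum(weights[f] * (item_feats[f] - query[f]) ** 2 for f in weights)

ranked = sorted(catalog, key=lambda item: score(catalog[item]))
print(ranked)  # ['item_a', 'item_c', 'item_b'] — closest first
```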

If you like what you’ve seen and want to learn more about getting our APIs running in your own applications, sign up.

If you’d like to get into more detail about features, as usual, we recommend Computerphile.



Visual search technology for fashion. Tweet us @MarkableAI