Magic? Probably — Markable Cracked the Hardest Shell in Computer Vision AI

Joy Tang
Published in Markable.AI
Jun 23, 2017

Q1: What is Markable?

The short answer is “We turn any photo and video into a virtual shopping mall through computer vision AI technology,” but that isn’t entirely correct, nor is it the best description of Markable as a company.

Domain-specific AIs Will Rule the Enterprise Space

For a lot of people, hearing the words “Artificial Intelligence” immediately brings to mind names like HAL 9000 (2001: A Space Odyssey), Bishop (Aliens), and, incorrectly (flexing nerd muscles here), Cortana from the Halo series. That is to say, “AI” for most people is strictly tied to “general AI”: systems that are capable of learning in multiple domains of expertise and of becoming experts in domains in which they have no prior experience.

While general AI is certainly a lofty and aspirational goal, for the time being and probably for a long time into the future, AI-as-a-business won’t see much development of general AI. Rather, it is our belief that domain-specific AIs will rule the enterprise space. Exceptional “generalist” companies or even companies with more than one very narrow area of speciality are extremely rare.

Q2: Why Fashion? Why Video?

That’s the second most common question that we get (after a bit of confusion as to what exactly a computer vision company is). Thankfully the answer is simple:

Reason 1: Talent Acquisition. We only want to hire people who are excited about solving an unsolved problem. Fashion is considered one of the most difficult sub-domains of computer vision. Clothing, unlike cars, is deformable and has a tendency to take the shape and pose of its wearer, meaning that a shirt has a minimum of 7 billion distinct possible shapes. It is 100 times harder than facial recognition or the recognition of rigid objects that never deform. Pretty daunting stuff, which is why our assessment of the market looks like this:

Reason 2: Consumer Sensation. An average American spends 5.5 hours watching video online every day. The highest emotional conversion points in video are the characters: what they look like and what they wear to make them who they are. Video is consuming the internet and will soon be the single largest subset of traffic on the web, and we don’t want to miss that wave.

Make Something Your Customers Love

Traditional media companies making their transition to the internet don’t want to be beholden to the whims of other companies to distribute their content and provide technologies to drive consumer interactions.

This is where the first bit of magic happens. Our customers are content companies, but our users are their developers. Being developers ourselves, we know that accessing our underlying AI technologies needs to be easy and seamless, and we have noticed that not many current solutions are. Our customers are here for the industry-leading AI, but their developers are here because we spend as much time wrapping our AI in a developer-friendly UX as we do building the AI itself.

Make Something Your Customer’s Customers Love

Here’s the second bit of magic, and probably the most difficult to pull off. Since our customers are consumer-facing companies, their customers are the ever-fickle “consumers”, who have such a wealth of video content available to them that they’ll lose attention if even the slightest hair is out of place.

So we asked ourselves: how do we add value for our customers’ customers without requiring any additional input? Input actually happens to be the answer in this case: we simply removed it. Detection and Search at Markable require no viewer interaction at all. They just happen. You do not need to draw a bounding box like on Pinterest. Dozens of boxes just appear. Knowledge is suddenly at the end users’ fingertips.

The secret isn’t particularly fancy; we just start the computer vision technology running as soon as the viewer’s content loads, and then we run it a lot until they navigate away from the content. That’s it: magic to them, screaming GPU fans to us. Our customers love it too, because it doesn’t distract their users from engaging with the actual content our customers have laboriously produced.
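For the developers in the audience, the "start on load, keep running until they leave" idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not Markable's actual code: `run_detection`, `detect`, `viewer_is_present`, and `every_nth` are hypothetical names invented here, and the detector is a stand-in callable rather than a real model.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical box format for this sketch: (x, y, width, height).
Box = Tuple[int, int, int, int]


def run_detection(
    frames: Iterable[object],
    detect: Callable[[object], List[Box]],
    viewer_is_present: Callable[[], bool],
    every_nth: int = 1,
) -> List[Tuple[int, List[Box]]]:
    """Run `detect` on frames from the moment content loads until the viewer leaves.

    No viewer input is requested at any point; boxes are simply produced.
    `every_nth` (an assumption of this sketch) samples frames to spare the GPU.
    """
    results: List[Tuple[int, List[Box]]] = []
    for index, frame in enumerate(frames):
        # Stop as soon as the viewer navigates away from the content.
        if not viewer_is_present():
            break
        # Skip frames that fall between sampling points.
        if index % every_nth != 0:
            continue
        results.append((index, detect(frame)))
    return results
```

In a real player the loop would likely run in the background and stream boxes to the page as they arrive; the sketch collects them in a list only to keep the example small.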

Conclusion

Exemplary new products take the best of current top products and make it better. Making the best AI in the world doesn’t mean anything if it’s only accessible to other AI researchers; it needs to be accessible to people who just want that kind of awesome to exist on their platform.

Henry Ford understood this when he created the Model T, widely regarded as the first car to open travel to the everyday middle-class American. Unlike other cars of its era, the Model T was not a race car (it was a general utility vehicle), it didn’t require 10 years of engineering experience to maintain, and it came in at a price point that most everyone could stomach. The trick Ford figured out was how to streamline his production lines so that a car with the necessary features was within reach of every American to own and operate. Like Markable, Ford understood the value of optimizing the backend to create a revolutionary product. On the industry side, Ford’s production systems quickly became best practice, and on the consumer side, the world fell in love with the automobile.

So what does Markable do? We lose sleep over the delicate fusion of AI and User Experience so that our customers can focus on making their users that much happier.
