Fireside chat with Ramzi Rizk, CTO of EyeEm

Earlier this year we hosted a DataSeries fireside chat at the most well-known matchmaking startup event in the Nordics and the Baltics: Arctic15.

Mike Reiner discussed lessons learned from EyeEm with Ramzi Rizk, zooming in on data labeling challenges and the relevance of tapping into communities.

By way of background, Mike Reiner is a Venture Partner at Open Ocean and the co-founder of City AI. Mike invests in innovative startups from Europe and beyond, in several areas and industries, including AI (which he is particularly fond of).

Ramzi Rizk is a Lebanese-born entrepreneur, and founder and CTO of EyeEm. EyeEm, for those of you who don’t know, is the Berlin and San Francisco-based photography community and marketplace. EyeEm’s distinct blend of talented photographers sharing beautiful and authentic photos, and unique photo indexing and search technologies, has made it one of the fastest growing photography platforms in the world. The EyeEm Community now boasts 13 million photographers in over 100 countries. EyeEm’s yearly Photography Award and Festival is the largest photography competition in the world, and the company’s apps and website are regular features and award winners in both Google’s and Apple’s app stores. Furthermore, EyeEm has regularly won various European and global awards for design, product, and technology. Before starting EyeEm, Ramzi was a Ph.D. candidate with years of research and lectures on the social, economic and technological aspects of privacy in social media. Ramzi is a passionate software architect, photographer, and pianist.

“Good photographers know what the rules of photography are, and great photographers know how to break them”

Key Takeaways

  1. Reduce reliance on massive datasets. We are evolving in a direction where algorithms are being built in a semi-supervised or even unsupervised fashion. It is important to turn the research problem into a more engineerable one. A stated mission for every AI-oriented company should be: how do you make your problem a deterministic problem? For EyeEm, it had to do with reducing reliance on a massive dataset. Don’t forget that, at the end of the day, someone does have to clean those massive datasets.
  2. Make use of all the resources at your disposal. EyeEm encountered a data labeling problem early on with user-generated content. It came down to what seemed to be a simple issue: the word ‘kind’ means two different things in two different languages (‘child’ in German, and ‘being affectionate or loving’ in English). Searching for ‘k-i-n-d’ therefore produced photos of both “old women being nice” and “little children running around.” The solution? As Ramzi puts it, EyeEm found it in Amazon Mechanical Turk, commonly referred to as “M-Turk”, which helped with the evaluation and the clean-up in the early stages.
  3. Tap into your community (and create a community if you don’t already have one). EyeEm was both fortunate and wise in tapping into the global community of photographers and photo enthusiasts. At times when they had no information, they would run content sourcing campaigns framed as photography missions. They would make a request for “happy families on the beach” or “modern work environments”, and people across the world would deliver on it. And here, thanks to the ability of EyeEm’s models to train and learn on small datasets (yes, small datasets), the initial problem was gradually reduced to curating anywhere from 30 to 40 photos that fit the topic and training their internal classifier on the fly.
  4. Focus on building an actual solution and don’t build a solution based purely on the “buzzwords” that exist today. What does this mean? Don’t build a company with the hope of exiting by simply “doing AI”. Instead, find an actual problem and aim to solve it: the required techniques, data, and hardware are likely already available and at your disposal. Furthermore, they are likely (relatively) cheap! Indeed, we are further incorporating deep learning, which is becoming the de facto approach, as is reinforcement learning. But ultimately it depends on what data you have and what goals you are trying to achieve. Pick a focused goal and get to work!
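
To make takeaway 3 a little more concrete, here is a minimal sketch of how a topic classifier could be trained from a few dozen curated examples. It is not EyeEm's actual method, which the talk does not describe in detail. It assumes each photo has already been turned into a feature vector by some pretrained model, and it uses a simple nearest-prototype rule with cosine similarity. The function names, the toy 3-dimensional vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors into one prototype."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_topic_classifier(curated_vectors, threshold=0.8):
    """Build a classifier from a small curated set (e.g. 30 to 40 photos).

    'Training' here is just computing the prototype of the curated set;
    the threshold is an assumed value that would need tuning on real data.
    """
    prototype = centroid(curated_vectors)

    def matches(vector):
        return cosine(vector, prototype) >= threshold

    return matches


# Toy stand-ins for embeddings of curated "happy families on the beach" photos.
curated = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [1.0, 0.0, 0.1]]
is_on_topic = train_topic_classifier(curated)

on_topic = is_on_topic([0.85, 0.15, 0.05])   # close to the curated photos
off_topic = is_on_topic([0.0, 0.1, 0.9])     # points in a different direction
```

Because the only thing learned is one prototype vector, a new mission can be set up as soon as the curated photos are available, which is one plausible reading of "on the fly".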

Feel free to share your comments on the lessons above with our growing DataSeries network. Either way, I would highly recommend watching the full interview: it is definitely worth 17 minutes of your time!



