Data Science, Machine Learning and Artificial Intelligence for Art

Meet the Thread Genius team at Sotheby’s

Jun 30, 2018 · 6 min read

This article was originally posted on the Towards Data Science publication.

Data Science, Machine Learning and Artificial Intelligence are fields from computer science that have already penetrated many industries and companies around the world. Their adoption is almost certainly correlated with the rise in “big data” over the last decade.

Advanced data analytics has the potential to transform the way companies understand insights, organize activity, and create value. Progress in programming languages, open source libraries and cloud computing has also made it easier to apply these methods effectively to data.

The art market is a sector where the data analytics revolution had yet to properly begin, until now.

This blog post will explain how state-of-the-art data science, machine learning (ML) and artificial intelligence (AI) methods are being used in the art market by Thread Genius, a firm acquired by Sotheby’s, the oldest international auction house in the world (Est. 1744). I’ll give you some insight into our team dynamics, the problems we are solving and how we are doing it.

An auction at Sotheby’s in 1957

What exactly is machine learning and artificial intelligence?

That’s a good question. Firstly, data science is a discipline where data is used and analyzed to test hypotheses, answer questions and understand insights.

Machine Learning is when computational tools and statistical techniques are leveraged to give computers the ability to learn from, and with, data. Yufeng G from Google Cloud uses a more refined definition in his article: “using data to answer questions”.

Artificial Intelligence is when computational tools start to possess cognitive abilities — for the purposes of this post, AI will refer to “deep learning” techniques that use artificial neural networks.

Who are Thread Genius?

Thread Genius is an artificial intelligence startup founded by Ahmad Qamar and Andrew Shum in 2015 and acquired by Sotheby’s in January 2018. Both founders worked at Spotify before starting Thread Genius. The main use of its technology was a visual search engine for the fashion industry that applied deep learning techniques using artificial neural networks.

Thread Genius using deep learning to identify similar handbags

By training artificial neural networks, Thread Genius was able to recognize clothing in images and find visually similar items. Read their Medium article, “Robo Bill Cunningham: Shazam for Fashion With Deep Neural Networks” to learn more.
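To give a feel for how visual similarity search can work, here is a minimal sketch. It assumes a neural network has already turned each image into a numeric vector (an "embedding") and then ranks catalog items by cosine similarity to a query vector. The item names, the three-dimensional vectors and the function names are all made up for illustration; this is not Thread Genius's actual model or code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, catalog, top_k=2):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; a real network would output hundreds of dimensions.
catalog = {
    "red_tote": [0.9, 0.1, 0.0],
    "crimson_satchel": [0.8, 0.2, 0.1],
    "blue_clutch": [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.05]  # embedding of the image being searched with

results = most_similar(query, catalog)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In this toy catalog the two red bags rank above the blue clutch, because their vectors point in nearly the same direction as the query.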

Interestingly, Thread Genius also applied this technology to art: read “Art Genius: Discovering artwork with visual search” to learn more.

Art Genius — Thread Genius using deep learning to identify similar art

How is our team broken down?

Now, Thread Genius is a growing team of machine learning engineers, software and data engineers, data scientists and designers based at Sotheby’s Headquarters in New York City.

Our initial efforts involve software development of large scale data pipelines for cleaning and standardizing the troves of historical Sotheby’s data so that we can undertake data analysis and apply ML and AI at scale.
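As a rough illustration of what "cleaning and standardizing" a record can mean, here is a small sketch in plain Python. The field names (`artist`, `price`, `sale_date`) and the input formats are assumptions invented for this example; they do not describe Sotheby's actual data or pipeline code.

```python
from datetime import datetime


def standardize_record(raw):
    """Normalize one hypothetical sale record into consistent types and formats."""
    # Collapse stray whitespace and fix capitalization in the artist name.
    artist = " ".join(raw["artist"].split()).title()

    # Assumed price format: three-letter currency code, then an amount with commas.
    price_text = raw["price"].replace(",", "").strip()
    currency = price_text[:3].upper()
    amount = float(price_text[3:])

    # Assumed date format: day/month/year; emit ISO 8601 for downstream tools.
    sale_date = datetime.strptime(raw["sale_date"].strip(), "%d/%m/%Y").date()

    return {
        "artist": artist,
        "currency": currency,
        "amount": amount,
        "sale_date": sale_date.isoformat(),
    }


raw_record = {
    "artist": "  claude   MONET ",
    "price": "usd 1,250,000",
    "sale_date": " 14/05/2015",
}
clean_record = standardize_record(raw_record)
print(clean_record)
```

At scale, a function like this would typically run as one step inside a larger data pipeline rather than on a single record.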

Read this article in Fast Company about us for further detail.

What are we trying to solve?

So, what are some of the hypotheses we’re trying to test and questions we’re trying to answer by using and analyzing data?

The galleries at Sotheby’s New York

Sotheby’s has some of the best data in the art market: historical transactions, individuals’ preferences for art at every price point, images, object and artwork information, and much more. By using this data effectively we hope to achieve the following missions:

Leveraging the Sotheby’s Mei Moses database. This embodies our efforts around analyzing art-as-an-asset. The Sotheby’s Mei Moses dataset is a unique database of over 50,000 repeat auction sales in eight collecting categories — the earliest recorded auction sale was in the early-17th Century! It was first developed in 2002 by New York University Stern School of Business Professors Jianping Mei, PhD and Michael Moses, PhD — read the academic paper here.

The dataset uses the purchase prices of the same painting at two distinct moments in time (i.e., repeat-sales) to measure the change in the value of unique works of art. We plan to use this information to analyze how the value of unique objects has moved through time and to compare the investment performance of art-as-an-asset to that of other asset classes.
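To make the repeat-sales idea concrete, here is a simplified sketch that computes the compound annual return for a single pair of sales of the same work. The prices and dates are invented, and this is only the arithmetic for one pair; it is not the methodology behind the Sotheby’s Mei Moses indices, which aggregate many such pairs.

```python
from datetime import date


def annualized_return(buy_price, sell_price, buy_date, sell_date):
    """Compound annual return implied by two sales of the same object."""
    years = (sell_date - buy_date).days / 365.25
    if years <= 0:
        raise ValueError("sell_date must be after buy_date")
    if buy_price <= 0:
        raise ValueError("buy_price must be positive")
    # Solve buy_price * (1 + r) ** years == sell_price for r.
    return (sell_price / buy_price) ** (1 / years) - 1


# Hypothetical repeat sale: the same painting sold twice, ten years apart.
rate = annualized_return(
    buy_price=100_000,
    sell_price=200_000,
    buy_date=date(2000, 1, 1),
    sell_date=date(2010, 1, 1),
)
print(f"Annualized return: {rate:.2%}")
```

A price that doubles over ten years works out to roughly 7% per year, a figure that can then be set side by side with returns from other asset classes over the same period.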

Unlocking supply. We want to make it easier for our clients to sell their works of art if they choose to do so. Our aim is to use data to provide a lower barrier to help people sell their art. We are currently developing products to provide price transparency through various machine learning techniques.

A recommendation engine. Before the acquisition, Thread Genius specialized in taste-based image recognition and recommendation technologies using convolutional neural networks. By using Sotheby’s data, we will recommend artworks or objects coming for sale to our clients using deep learning.

Building the best data products. By bringing all three of these missions together, our aim is to improve operational efficiency and build the best data products in the art market so that our clients can get the best experience and transparent information when engaging with art at Sotheby’s.

How are we going to do this?

We primarily use Google Cloud Platform for all of our work: everything from data cleaning in Dataprep and data processing and standardization in Dataflow, to data storage in BigQuery, data analysis in Datalab, and finally, ML and AI using GCP’s whole suite of machine learning capabilities.

We primarily code in Python, but our software developers are using Node and Ruby for backend development. We will be building custom applications for some of the missions laid out above.

Various products and software offered by Google Cloud Platform

Why this problem is hard

Although we are using advanced data analytics to understand insights from images and data, art is fundamentally subjective — both in value and in taste.

Whenever we discover insights from our analyses, it is vital to validate them against the domain knowledge our Specialists possess. It cannot be overstated how important it is to have human involvement throughout this process. We are fortunate to have the best art specialists in the world at Sotheby’s who can help us along the way.

Moreover, this will be the first time anyone is doing anything like this in the art market — we’re essentially working on a blank canvas. A challenge like this is hugely exciting and we’re glad to be guiding the way for future developments.

Auction of Contemporary Art at Sotheby’s New York

Want to help us out?

Sotheby’s aim has always been to be a leader in innovation and technology in the art market and to support the future of art and technology.

We are excited to be applying advanced machine learning and artificial intelligence in the art market and working directly with our Specialists at Sotheby’s so that we can create the best data products in the industry.

If you have a background in data science, machine learning, NLP and/or AI and are interested in changing the world, feel free to reach out to us for a chat. We’re always interested in speaking with you, our audience.

As we continue on our journey towards data science, we will continue to write about our projects on this publication and in more detail, so stay tuned!

Thank you for reading!

Before you leave…

If you found this article helpful, hold down the 👏 button below and share the article on Facebook, Twitter or LinkedIn so that everyone can benefit from it too.


Supporting the future of art and technology


Written by

Supporting the future of art and technology. Sotheby’s auction house, Est. 1744


