From Old Masters to AI Masters: Identifying AI-Generated Artwork with DataRobot

Abdul Jilani · Published in DataRobot
7 min read · Aug 9, 2023

Can AI produce a masterpiece like an Old Master? Depending on who you ask, the answer will be either a resounding yes or a vehement no. Regardless, AI-generated art has taken the world by storm. It allows anyone without knowledge of painting or photography to create art using just language prompts. This advance in deep learning has spawned a new discipline called Prompt Engineering.

Generative models like DALL·E and ChatGPT are transforming workloads in organizations. By generating assets, they help improve productivity, resulting in tangible benefits like cost savings and new revenue. However, not all developments are positive. News of artists across the globe boycotting AI-generated artwork has been making the rounds for the past few months. Issues like the use of copyrighted material for training, creative integrity, and the industrialization of art are at the forefront of the discussion.

Regardless of which camp one is in, it’s important to be able to detect whether a creative asset is man-made or generated by an AI. This allows organizations and artists to identify such art and act accordingly. Pinpointing how an asset was created enables better decisions about attribution, revenue and pricing, moderation, and more.

So, can we use machine learning to identify whether a creative asset like digital artwork was produced by a human or generated by an AI? Yes, we can. I’ll show how you can do it using DataRobot Visual AI, without in-depth knowledge of deep learning or computer vision.

Training Your Computer Vision Model

One approach to building this classification model is supervised learning. We need to collect a set of images, both man-made and AI-generated. Computer vision models work better with more data: the more varied the training images, the better the final model generalizes. We will be using this HuggingFace dataset by Abhishek Thakur to build our image classifier model. The training dataset consists of a folder of images and a .csv file with file names and corresponding labels identifying whether each image was created by a human or generated by an AI. Below is a sample of the images from the dataset.

For more information on how to prepare your image dataset for Visual AI in a few simple steps, please refer to this blog post.
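If you prefer to assemble the files yourself, here is a minimal sketch using the `datasets` library. The dataset id `competitions/aiornot`, the `image`/`label` column names, and the 0/1 label mapping are assumptions about this particular dataset (competition datasets may also require a HuggingFace login), so verify them against the dataset card.

```python
# Minimal sketch: export the HuggingFace images plus a labels CSV for Visual AI.
# Dataset id, column names, and label mapping are assumptions -- check the card.
import os
import pandas as pd
from datasets import load_dataset

ds = load_dataset("competitions/aiornot", split="train")

os.makedirs("images", exist_ok=True)
rows = []
for i, example in enumerate(ds):
    path = os.path.join("images", f"img_{i:05d}.png")
    example["image"].save(path)  # PIL image stored in the dataset
    rows.append({"image_path": path,
                 "label": "ai" if example["label"] == 1 else "human"})

# Visual AI expects a CSV that maps each image file to its label.
pd.DataFrame(rows).to_csv("labels.csv", index=False)
```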

Once the dataset is ready, simply connect it to DataRobot to start building computer vision models. You can do this via the low-code UX, through hosted notebooks, or by pulling DataRobot into your favorite notebook via its API, much like a modern data science library.

DataRobot Low-Code UX

DataRobot Notebooks
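For the API route, a minimal sketch with the DataRobot Python client might look like the following. The endpoint and token are placeholders, and method names can vary slightly between client versions:

```python
import datarobot as dr

# Placeholders: point these at your own DataRobot instance and API token.
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Upload a zip archive containing the images folder and labels.csv.
project = dr.Project.create(sourcedata="artwork_dataset.zip",
                            project_name="AI vs Human Artwork")

# Start Autopilot on the label column; DataRobot detects the image column
# and assembles Visual AI blueprints automatically.
project.analyze_and_model(target="label")
project.wait_for_autopilot()
```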

DataRobot runs multiple experiments, builds multiple computer vision models, and recommends the best among them. DataRobot Visual AI uses the latest pre-trained architectures to accomplish this image classification task through supervised learning. All of this is possible at low cost and compute because DataRobot models use smart parameter search and optimization, tested over hundreds of datasets, along with ImageNet pre-trained weights. The architecture of the recommended model is shown below.

It uses a DataRobot-customized (pruned), pre-trained MobileNet as its convolutional neural network, which extracts features from the training images. The MobileNet weights come from pre-training on ImageNet, which matters for this exercise: because ImageNet is built from real photographs, MobileNet encapsulates the characteristics of human-made images, and DataRobot leverages this knowledge.
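DataRobot’s pruned MobileNet variant is internal to the platform, but the underlying transfer-learning pattern is easy to illustrate. Here is a sketch with torchvision (not DataRobot code) showing an ImageNet-pretrained MobileNet used as a frozen featurizer:

```python
# Illustrative sketch of the transfer-learning pattern described above,
# using torchvision's ImageNet-pretrained MobileNet as a frozen featurizer.
# DataRobot's pruned variant is internal; this only mirrors the idea.
import torch
import torchvision.models as models

mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
mobilenet.eval()

featurizer = mobilenet.features           # convolutional feature extractor
for p in featurizer.parameters():
    p.requires_grad = False               # keep the ImageNet knowledge frozen

with torch.no_grad():
    batch = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed image
    fmap = featurizer(batch)              # (1, 1280, 7, 7) feature map
    features = fmap.mean(dim=(2, 3))      # global average pool -> (1, 1280)
# `features` then feeds a simple classifier head (e.g., logistic regression).
```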

AI-generated images often contain artifacts that give away the fact that they are not real. The old joke that many artists fail at drawing hands has somehow carried over to our AI models as well. DataRobot is able to pick up on these artifacts to accurately classify the images.

In computer vision, we use activation maps to understand which features the model uses to make a prediction. Activation maps help us validate whether the model is using relevant features from the images or running into issues like overfitting or leakage.
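DataRobot computes these maps for you, but to make the concept concrete, here is an illustrative Grad-CAM-style sketch in PyTorch (again, not DataRobot’s internal code):

```python
# Minimal Grad-CAM-style sketch of how an activation map can be computed.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
model.eval()

acts, grads = {}, {}
layer = model.features[-1]                       # last convolutional block
layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

img = torch.randn(1, 3, 224, 224)                # stand-in for a real image
score = model(img).max()                         # top predicted class score
score.backward()

weights = grads["v"].mean(dim=(2, 3), keepdim=True)   # per-channel importance
cam = F.relu((weights * acts["v"]).sum(dim=1))        # weighted activation map
cam = F.interpolate(cam.unsqueeze(0), size=(224, 224),
                    mode="bilinear", align_corners=False)
# `cam` highlights the image regions that drove the prediction.
```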

To understand the concept of target leakage in computer vision, please refer to our blog post on target leakage in medical imaging.

Below are the activation maps from DataRobot, starting with those of AI-generated images.

In the above images generated by AI, DataRobot models are able to identify artifacts that show faulty anatomical features in the images.

DataRobot models are able to understand the semantics of inanimate objects with respect to positional perspective and physics.

Let us now look at images generated by humans. DataRobot models are able to identify key features and compositions that are native to human-generated images and artwork.

In the above images generated by humans, DataRobot models are able to understand and validate anatomy in the representations of living things. Because of the presence of photographs in the training data, DataRobot models are able to classify non-artwork images as well.

In the above images generated by humans, DataRobot models are able to understand the semantics of inanimate objects with respect to positional perspective and physics.

From the above samples of activation maps, it is clear that DataRobot Visual AI models are able to leverage popular pre-trained deep learning computer vision architectures to identify whether an image was created by a human or an AI.

Here are some examples where the model fails to predict the target correctly. These images seem to be devoid of the above-mentioned faults and features, which appears to confuse the model.

So, what can we do to improve model performance? You can add:

  • More training data for the model to generalize on.
  • Additional multi-modal features related to the image, such as author name, image source, and license details.

DataRobot Visual AI models will readily utilize any multi-modal features included in the training data, as sketched below.
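Extending the training CSV with extra columns is all it takes; the column names and placeholder values below are illustrative, not a fixed DataRobot schema:

```python
import pandas as pd

df = pd.read_csv("labels.csv")  # image_path, label (from the earlier sketch)

# Illustrative metadata columns; in practice these come from your catalog.
df["author_name"] = "unknown"
df["image_source"] = "huggingface dataset"
df["license_details"] = "see source"

df.to_csv("labels_multimodal.csv", index=False)
# Zipped together with the images, this CSV lets DataRobot blueprints combine
# image features with the tabular columns automatically.
```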

As for model performance metrics, the recommended model from DataRobot has an accuracy of 94% and shows no leakage. Other performance metrics are shown below.

DataRobot can also run Comprehensive Autopilot, which executes more complex blueprints and supports advanced custom tuning for even higher performance. Comprehensive Autopilot achieves 99.8% accuracy on this dataset.
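As a sketch, the leaderboard and its metrics can also be read back through the Python client (continuing from the earlier snippet, with the client configured and `project` created); which metrics appear depends on the project’s settings:

```python
import datarobot as dr

# Continuing from the earlier snippet: client configured, `project` created.
models = project.get_models()          # leaderboard, best-performing first
best = models[0]
print(best.model_type)
print(best.metrics.get("Accuracy"))    # per-partition scores for the metric

# Comprehensive Autopilot is requested when starting modeling, e.g.:
# project.analyze_and_model(target="label",
#                           mode=dr.AUTOPILOT_MODE.COMPREHENSIVE)
```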

This model can be quickly deployed into production as a RESTful web service with a single click.
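Through the API, the equivalent of that single click might look like this sketch, assuming your instance has a dedicated prediction server (`best` is the recommended model from the previous snippet):

```python
import datarobot as dr

# Assumes at least one prediction server is available on your instance.
server = dr.PredictionServer.list()[0]

deployment = dr.Deployment.create_from_learning_model(
    model_id=best.id,
    label="AI Artwork Detector",
    default_prediction_server_id=server.id,
)
print(deployment.id)  # use this id with the Prediction API to score new images
```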

The recommended model uses MobileNet, which is a very efficient architecture for both training and inference. However, DataRobot also allows users to select other architectures. Below is an example of selecting the EfficientNetV2-S architecture inside a DataRobot blueprint and running an experiment to see if it improves on the existing model’s performance.
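One hedged way to try this programmatically is an advanced tuning session; the task and parameter names below are hypothetical placeholders, so list the real ones first rather than trusting these strings:

```python
# Sketch only: inspect the tunable parameters of the recommended model first.
session = best.start_advanced_tuning_session()
print(session.get_task_names())  # locate the image featurizer task

# Hypothetical task/parameter names, shown for illustration only:
# session.set_parameter(task_name="Pretrained Image Featurizer",
#                       parameter_name="network", value="efficientnetv2_s")
# job = session.run()  # trains the tuned variant for comparison
```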

In a world where content is being created at an unimaginable scale, you can use DataRobot Visual AI to train and deploy models that can identify AI-generated images in order to make fair and data-driven decisions, leading to the best experience for your users and stakeholders.

Abdul Jilani is a Data Science Leader and Inventor @Evolve AI Labs.