Introducing “Inspect” by Truepic, and why Detection of Photo Editing is a Losing Game

Jeffrey McGregor
Truepic
Dec 6, 2019

Today we’re releasing a free, publicly available tool called Inspect. It’s a JPEG metadata viewer & the culmination of one year of research on post-capture photo forensics (editing detection). Inspect is meant to educate the public on two key image quality factors: pixel quality & metadata quantity. Below is the story of how our team arrived at this tool & our position on the detection of still image editing.

In Inspect, digital images are assessed for image quality and metadata quantity. Users of Inspect can learn more about the fundamental flaws of JPEG imagery & how pixel- and metadata-level edits are undetectable.

Last year, Truepic acquired Fourandsix Technologies. The goal of the acquisition was to explore post-capture photo forensics technologies alongside world-renowned digital forensics leader Dr. Hany Farid. In the year that followed, we hired a specialized team, built upon the technology, processed hundreds of thousands of images from real clients, and iterated on core forensic concepts. It was a valuable experience that helped affirm our belief in Controlled Capture technology, as we ultimately found that post-capture detection of digital image manipulations will have limited near-term impact on the world.

While Truepic’s flagship technology, Controlled Capture, can be (and has been) used to document the world in photographs with extremely high trust, it is fundamentally limited to images captured through our specialized camera application. As with most software, there are significant scaling challenges that can only be overcome with time & brand recognition. So, while those leveraging Truepic’s system can capture images with verified contents, time, date, and location, such as this…

Controlled Capture image of the high-profile Hanoi Summit between the leaders of the United States and North Korea, taken by a journalist; February 27, 2019.

…the other ~2 trillion images a year without verifiable provenance will still be called into question. This is true in society and business alike, as Airbnb, Uber, & countless other organizations have recently realized.

Two key discoveries by our team have driven our conclusion that the detection of edits and manipulations to still photos will be a losing game: (1) the fundamental way the JPEG file format works, and (2) the modification and compression applied when images are uploaded and shared across the internet.

The JPEG* file format is fundamentally flawed, making metadata edits immune to detection

Open your iPhone, adjust the date & time in the Settings app, and snap a photo. Guess what? The camera-original metadata attached to your photo carries the newly altered time & date, not the accurate time & date (altered metadata fields include: Created, Modified, CreateDate, DateTimeOriginal, SubSecCreateDate, SubSecDateTimeOriginal, etc.). No detection mechanism on the planet can detect this simple time and date adjustment. The root of the issue is that the standard approach of attaching EXIF data to a JPEG file is broken: even knowing that an image has not been modified after capture does not mean its metadata was accurate at the time of capture.
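To see what a device actually records, here is a minimal sketch, assuming Python with a recent version of the Pillow library and a hypothetical local file named photo.jpg. It simply prints the capture-time fields written into EXIF; if the phone’s clock was changed before the shot, these values carry the altered time, and nothing in the file marks them as wrong:

```python
# Minimal sketch (Pillow assumed; photo.jpg is a hypothetical file).
# Prints the EXIF capture-time fields exactly as the device wrote them.
from PIL import Image
from PIL.ExifTags import TAGS

EXIF_IFD = 0x8769  # pointer to the Exif sub-IFD, where capture times are stored

exif = Image.open("photo.jpg").getexif()

# DateTime lives in the primary IFD; DateTimeOriginal/Digitized in the Exif sub-IFD.
fields = {TAGS.get(tag, tag): value for tag, value in exif.items()}
fields.update({TAGS.get(tag, tag): value
               for tag, value in exif.get_ifd(EXIF_IFD).items()})

for name in ("DateTime", "DateTimeOriginal", "DateTimeDigitized"):
    print(f"{name}: {fields.get(name)}")
```

Whatever the device clock said at capture is all that ends up in these fields; there is no second, trustworthy copy to check them against.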

What does that mean for the trillions of photos on the internet? If they were captured on a smartphone, or almost any other capture device whose settings can be adjusted, their date & time, and thus their provenance, are impossible to re-establish through any automated method, even if the data is determined to be “camera original.”

The same goes for location details. If the location perceived by the device’s native camera is manipulated, the same problem exists: the modified location is stamped onto the JPEG file as camera-original data. Turning on Developer Mode on an Android device allows anyone to perform this location adjustment in a couple of minutes.
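The GPS block can be inspected the same way. Here is a minimal sketch, again assuming Pillow and the same hypothetical photo.jpg, that reads the location fields the camera app recorded; a mocked location lands in these fields exactly like a real GPS fix:

```python
# Minimal sketch (Pillow assumed; photo.jpg is a hypothetical file).
# Reads the GPS fields the camera wrote into EXIF; spoofed coordinates
# are stored here in exactly the same way as genuine ones.
from PIL import Image
from PIL.ExifTags import GPSTAGS

GPS_IFD = 0x8825  # pointer to the GPS sub-IFD

exif = Image.open("photo.jpg").getexif()
gps = {GPSTAGS.get(tag, tag): value for tag, value in exif.get_ifd(GPS_IFD).items()}

print(gps.get("GPSLatitudeRef"), gps.get("GPSLatitude"))
print(gps.get("GPSLongitudeRef"), gps.get("GPSLongitude"))
```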

What does this mean? Anyone can make simple, novice-level adjustments on their device, snap a photo, upload it to the internet, and not a single programmatic detection technology will accurately flag the potentially malicious changes to the metadata. This is an incredibly challenging problem & has been the focus of much discussion around disinformation, fraud, and online visual deception.

*We should note that this problem is not specific to the JPEG/JFIF file format, as other formats (such as PNG) suffer from these same issues. That said, JPEG/JFIF is by far the most widely adopted file format for sharing still photographs across the internet.

Almost every internet service and image upload pipeline modifies & compresses images

When a digital image is uploaded to the internet, or sent through standard messaging applications, two critical changes are made to the file:

  1. The metadata is often completely stripped for privacy reasons.
  2. The image is compressed to preserve bandwidth & improve upload speeds (a minimal sketch of such a pipeline follows below).
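Here is that sketch, assuming Pillow and two hypothetical filenames, original.jpg and shared.jpg. It mimics a typical upload pipeline: cap the resolution, re-encode at a lower JPEG quality, and write no EXIF at all. The re-encoded output is what any downstream detection tool actually gets to see:

```python
# Minimal sketch of a typical upload pipeline (Pillow assumed;
# original.jpg and shared.jpg are hypothetical filenames).
from PIL import Image

src = Image.open("original.jpg")
print("EXIF bytes before:", len(src.info.get("exif", b"")))

shared = src.copy()
shared.thumbnail((1280, 1280))  # cap the longest side, as many services do
shared.save("shared.jpg", "JPEG", quality=70, exif=b"")  # re-encode and drop the metadata

print("EXIF bytes after: ", len(Image.open("shared.jpg").info.get("exif", b"")))
```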

Looking at a very common example of this… here’s a table showing Apple Mail’s standard compression options & the resulting photo sizes:

Here’s a digital image that has been edited in Photoshop to include a surfer riding a wave backwards:

When uploaded or sent across the internet, the image is compressed & resized, which reduces both the fidelity of the image and the number of pixels available to analyze. Based on the table above, the resulting images appear as follows:

With each subsequent compression & size adjustment, the pixel fidelity necessary to perform programmatic detection of the manipulation is reduced, and eventually lost completely:

This has led our team to the conclusion that manipulated pixels in images are hard to detect, and often nearly impossible to detect once the image has been heavily compressed and/or substantially resized. That happens to almost every image uploaded to the internet (or sent over SMS, email, etc.).
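To make the effect concrete, here is a minimal sketch, assuming Pillow and a hypothetical edited photo named edited.jpg, that simulates a few successive share hops. Each hop resizes and re-encodes the file, and the resolution, file size, and pixel detail that forensic analysis depends on shrink at every generation:

```python
# Minimal sketch (Pillow assumed; edited.jpg is a hypothetical edited photo).
# Each "hop" simulates one upload or share: resize, then re-encode as JPEG.
import io
from PIL import Image

def share_once(img, max_side, quality=70):
    """Simulate one share hop: cap the longest side, then re-encode as JPEG."""
    hop = img.convert("RGB")
    hop.thumbnail((max_side, max_side))
    buf = io.BytesIO()
    hop.save(buf, "JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf), buf.getbuffer().nbytes

img = Image.open("edited.jpg")
for hop_number, max_side in enumerate((1600, 1280, 800), start=1):
    img, nbytes = share_once(img, max_side)
    print(f"hop {hop_number}: {img.size[0]}x{img.size[1]} px, {nbytes / 1024:.0f} KB")
```

The exact numbers will vary with the source photo, but the direction is always the same: less resolution and less detail at every hop.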

As one real-world example of these fundamental flaws, Truepic ran a proof-of-concept with one of the nation’s largest insurance carriers, processing 150k images through our detection system. After analysis, we found that the average image quality score was just 272 (1250 represents a modern smartphone) and the average metadata quantity score was 62 (100 represents an average unmodified & untouched image). This means that, for a single representative enterprise, the quality & metadata degradation of the images it receives from customers negates over 10 years of camera advancement, resulting in images that are the visual equivalent of those from a smartphone produced in 2009. Detection algorithms are therefore at a natural disadvantage in helping root out image-based fraud & deception, as they are forced to process images with the visual quality of a 10-year-old digital photo.

With nearly every internet service stripping digital image metadata and compressing & resizing the images it handles, it becomes apparent that detecting false images online is not a practical approach. Until we move to non-destructive file formats, and change how we treat & respect these file types during transmission across the internet, it will be difficult, if not impossible, to detect still image manipulation accurately & at scale. Even if internet services adjust their platforms to better address compression and resizing, the first problem, testing for altered time, date, & location, will remain a fundamental issue. It is worth noting that these detection issues also highlight the critical challenge of detecting “Cheapfakes”: rudimentary image & video manipulations that are still the most common type of visual deception. Furthermore, stripped & inaccurate metadata exacerbates the problem of misattributed images, which we’ve seen recently everywhere from California to Syria.

Social media-fueled fake news in India last year highlights the worst-case scenario that these issues present. Misattributed images combined with false narratives can spread very quickly, go undetected, and lead to violence:

“As Patil, a veteran officer with a flat-top haircut and thick mustache, speaks, he begins swiping through gruesome crime scene photos stored on his phone before coming to a video that officers believe helped stoke fear and unease. It shows photos of pale, lifeless children laid out in rows, half covered with sheets. A voiceover warns parents to be vigilant and on the lookout for child snatchers. The photos are not fake, but they don’t show the victims of a murderous Indian kidnapping ring either. The pictures are of children who were killed in Syria during a chemical attack on the town of Ghouta in August 2013, five years earlier and thousands of miles away.” — Timothy McLaughlin, Wired

This has led us to our final conclusion: establishing data integrity at the source is the definitive solution to trust in media. Experts refer to this as “the provenance approach.” You will continue to see our team leading efforts to establish the standards around this technology, and working hard to democratize access & drive tangible, real-world impact for both business & society. This includes work on data integrity from source cameras, non-destructive file formats, and collaboration with the entire internet ecosystem on shared standards.

The Inspect tool that we are releasing today, meanwhile, will help educate the public on why traditional photos, and the data contained in them, are not a source of high-trust information.

What’s next?

Based on the lessons learned during our research, our team will focus its energy on three areas:

  1. Controlled Capture technology & Truepic Vision, available to everyone: Our team has recently made it possible for any organization to sign up through our website & use Controlled Capture technology to capture high-trust images & videos from third parties. Our goal is to distribute this technology as widely as we can, and to empower every organization with high-trust capture tools to make better image-based decisions. Available here.
  2. Truepic & Social Responsibility: We recently announced a grant program that allows any organization with a social-impact-oriented mission to use our trusted Controlled Capture tools at reduced or zero cost. We are committed to helping organizations that are on a mission to improve the state of our world.
  3. Inspect as an educational tool: We will continue to improve upon the public Inspect tool based on usage and feedback. Our goal is to educate those looking to learn more about the fundamental challenges in detecting edits to imagery. This free metadata viewer also serves as a natural extension of our social responsibility values. If the landscape changes in the future, and post-detection of image editing becomes possible at internet scale, we will look to re-invest in these efforts.

For any questions, our team can be reached at info@truepic.com; for inquiries about the grant program, please contact mounir@truepic.com.

A very special thank you to Justin, Amy, Nick, Ryan, Oliver and everyone else that worked on the Inspect project.
