Fighting Deepfakes, protecting your privacy and not leaving anyone behind — a view from the trenches
A recent article published by Axios highlighted the crucial importance of ‘verify-at-capture’: a new approach to verify photos and videos at the precise moment they’re taken, leaving no room for doubt about their authenticity.
Historically, ‘digital forensics’ was used to determine the authenticity of a media item by looking for artifacts an editing process may have left behind. But given the sheer scale of photos and videos uploaded to the web daily, along with the increasing sophistication and accessibility of AI-generated media, the article highlighted the need for a more sustainable solution.
“I don’t believe forensics can work in the long run,” says Pawel Korus, a professor of engineering at NYU. “It was never reliable enough to begin with, and it’s starting to break as cameras are doing more and more interesting things.”
These issues are at the heart of what we do at Serelay and ones we care deeply about (and I am well aware our friends at Truepic share these concerns), so I thought it worth sharing our current outlook on them.
My goal in sharing these thoughts is not to suggest we have everything sorted, but rather to open up the discussion. That being said, I do believe verify-at-capture can not only restore our faith in visual media, but also do so while maintaining inclusiveness and end-user privacy.
Let’s look at it point by point.
“The people who will be de facto excluded in a system of authentication will be people who are in the Global South, use a jailbroken phone, probably are women, probably are in rural area,” — Sam Gregory, Program Director, Witness
From a simple pricing perspective, I don’t see charging the end user as an economic model that currently exists in our nascent industry, or one that players in the ecosystem are banking on.
For example, Serelay will launch Idem, its end-user app, later this month in partnership with a global media organisation. The app will allow users to share in a verifiable manner on social media and also to submit photos and videos to our media partner(s). This allows media organisations to receive photos and videos from around the world, and to verify the authenticity of content, time and location in zero time (any submitted media will have been verified by the time it lands on the editorial desk).
So end users don’t pay. Media organisations do, but at a fraction of the cost of staff time associated with legacy content verification, with a much higher level of confidence in any user-generated content they end up publishing, and with far better protection in case of dispute.
Geographic and Technological Exclusion
Of course, one can also be excluded from a free service if it requires advanced infrastructure (like a 4G data connection), expensive hardware (a high-end device), or consumables (a mobile data allowance).
When we look at these issues, though, I think we need to view the evolving state of verify-at-capture in line with the evolution of technology in general (automobiles were not affordable pre-Model T) and of software/hardware in particular (those jailbroken phones mentioned in the quote above will significantly outperform a 1960s IBM mainframe).
The first verify-at-capture solutions to come to market followed a chain-of-custody approach: as soon as you captured the photo or video on your mobile device, it was replicated on the verification provider’s servers (and could later be used as a reference point).
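The chain-of-custody idea can be sketched in a few lines. This is a simplified illustration, not any provider's actual implementation: here only a SHA-256 fingerprint stands in for the full server-side replica, and `server_store` is a hypothetical stand-in for the provider's backend.

```python
import hashlib
import time

def register_capture(media_bytes, server_store):
    """Record a reference fingerprint of a capture the moment it is taken.

    A real chain-of-custody system would replicate the whole file over a
    reliable connection; hashing here just illustrates the reference-point
    idea.
    """
    digest = hashlib.sha256(media_bytes).hexdigest()
    server_store[digest] = {"captured_at": time.time(), "size": len(media_bytes)}
    return digest

def verify_copy(media_bytes, server_store):
    """A later copy matches the original only if its hash was registered."""
    return hashlib.sha256(media_bytes).hexdigest() in server_store
```

Any pixel-level edit to the file changes the digest, so an altered copy no longer matches the registered reference.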
While this approach requires a robust Wi-Fi or mobile connection, there are emerging approaches which do not. Serelay, for example, has developed an Integrity Vector approach: the computation of over 100 mathematical attributes relating to the media file at the point of capture, with only these values, alongside rich sensor metadata, ever transferred securely off-device. This means verify-at-capture capabilities can now be delivered not only to areas with a patchy mobile network, but in some instances where one is altogether absent (and where a connection is available we use less than 15 KB per photo or video, which should accommodate even the most modest mobile data allowance).
In terms of hardware, I think it’s fair to say most startups, at least in the US and Europe, tend to start development with a bias towards high-end devices. However, a solution with poor support for the long tail of devices, or with an outsized SDK, is unlikely to succeed outside some very narrow industry verticals. At Serelay we currently support Android all the way back to Lollipop 5.0 (API 21), which means we can support a six-year-old Nexus 4 or a new phone you can buy for as little as £10 in the UK.
Lastly, we do acknowledge that some geographies may have a higher prevalence of jailbroken devices. But while detecting whether a phone is jailbroken (or rooted, on Android) is an important part of what we do (in fact, just this week we found we are able to detect rooted phones in cases Google’s own SafetyNet misses), there are two subtleties here:
We never disable a capture event, i.e. we will always let the user capture the photo or video, and it is up to the party performing the verification to decide on its approach to rooted or jailbroken devices.
As checking whether a device is rooted or jailbroken is not 100% conclusive, some of our newer checks (and I’m sure this is the case for other startups) reduce our reliance on it. For example, we can now model in real time the pattern of a GPS reading to determine whether it is a genuine physical reading or a mocked one, with no reliance on any device integrity checks.
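A heavily simplified version of that kind of GPS plausibility check can be sketched as follows. The single speed threshold here is my own stand-in for the real-time pattern modelling described above, which will be far richer in practice; the function names and the 70 m/s limit are assumptions for illustration.

```python
import math

def plausible_track(fixes, max_speed_mps=70.0):
    """Flag a GPS track whose implied movement is physically implausible.

    `fixes` is a list of (timestamp_s, lat, lon) tuples in capture order.
    A mocked location often "teleports" relative to earlier genuine fixes,
    which this crude speed check catches.
    """
    def haversine_m(lat1, lon1, lat2, lon2):
        # Great-circle distance in metres between two lat/lon points
        r = 6_371_000
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = p2 - p1, math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    for (t1, la1, lo1), (t2, la2, lo2) in zip(fixes, fixes[1:]):
        dt = t2 - t1
        if dt <= 0:
            return False  # timestamps must move forward
        if haversine_m(la1, lo1, la2, lo2) / dt > max_speed_mps:
            return False  # implied speed exceeds plausible travel
    return True
```

Note that nothing here asks the operating system whether it is rooted; the signal comes entirely from the shape of the data.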
Data and Privacy
The Serelay solution does not do any background monitoring of a user’s device, and to the best of my knowledge neither do other startups in this space.
We strongly believe that content authentication should be agnostic to user identity, so we do not have any user accounts, and any metadata we store relating to a media item will never be connected to an end user.
But aside from our specific privacy-by-design approach, I don’t think verify-at-capture providers have a strong economic case for monetising data. The nature of photo and video capture, even for a relatively prolific camera user, is non-continuous. It’s hard to see companies in this space competing with consumption apps such as social networks or navigation services.
Lastly, data used for verification tends to be less granular than the data it verifies. For example, we may use cell-tower information to double-check the GPS reading of a specific photo. Notably, the cell tower does not give us more knowledge about the user (GPS is far more accurate); it speaks only to the veracity of that reading.
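That cross-check can be sketched as follows. The tower coordinates, cell IDs, and the 35 km range are hypothetical values I've chosen for illustration; a real system would query a cell-tower database rather than a hard-coded table.

```python
import math

# Hypothetical coarse tower locations keyed by cell ID (illustrative only)
TOWERS = {"310-410-1234": (51.752, -1.258)}

def gps_consistent_with_cell(cell_id, lat, lon, max_range_m=35_000):
    """Sanity-check a GPS fix against the serving cell's coarse location.

    The tower position is far less granular than the GPS fix, so this
    reveals nothing new about the user; but a fix tens of kilometres
    outside the cell's plausible range suggests the GPS was mocked.
    """
    tower = TOWERS.get(cell_id)
    if tower is None:
        return None  # unknown cell: no corroboration either way
    r = 6_371_000  # Earth radius in metres (haversine distance)
    p1, p2 = math.radians(tower[0]), math.radians(lat)
    dl = math.radians(lon - tower[1])
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a)) <= max_range_m
```

The asymmetry is the point: the verifying signal (a ~35 km circle) is strictly coarser than the data it vouches for (a ~5 m GPS fix).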
The De-facto alternative — Verification by Reputation
In the absence of a universal standard for verification, we tend to apply a ‘reputation filter’: we trust media based on the individual or organisation that captured or published it. This means we are often inherently biased towards trusting people who look like us or share our political or ideological beliefs. For all the challenges relating to verify-at-capture, I think at the end of the day it is our best chance of providing a more level playing field, and it is important we maintain an open and transparent discussion while building it.