Why Watermark (Everything)?

A cryptographer’s perspective.

Jiaxin Guan
Trufo
5 min read · Feb 19, 2024


When browsing the internet, have you ever wondered if an image you saw is an actual photo taken by a camera or an artificially generated or altered image? Well, I certainly have.

An AI-generated image showing an explosion at the Pentagon went viral on Twitter in May 2023. (Image: Twitter/@WhaleChart)

The Internet is so filled with misinformation that distinguishing useful, authentic information from the rest has almost become a necessary survival skill in the 21st century. To be fair, misinformation existed long before the Internet; but with the advent of modern technology, especially generative AI models, it has become increasingly difficult to identify misinformation just by looking at the content itself. As a quick exercise: which of the following images do you think are real photos, and which are not?

Quick Exercise: Identify which photos are taken by a camera, and which are generated by AI. (Image Sources: Shutterstock; Getty Images; Dornob.com; various generative AI models)

That doesn’t seem like an easy task, does it?

Answer to the Exercise Above: Only the highlighted ones are captured by a camera. The rest are all AI-generated. How many did you get right?

Modern generative AI models like Stable Diffusion and DALL-E can produce photo-realistic images within seconds, making misinformation both easier to produce and harder to distinguish. A natural challenge arises: what mechanisms can we use to facilitate distinguishing AI-generated content?

One practice that the AI industry has been exploring is to embed watermarks into AI-generated images. Specifically, providers of generative AI models have been developing their own watermarking schemes, so that whenever one of their models generates an image, an invisible watermark is attached to it. Later on, the image can be checked for such a watermark to determine whether it was generated by AI.
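To make the idea concrete, here is a minimal sketch of what such a scheme could look like. It is a deliberately simplified toy, not any provider's actual scheme: a keyed pseudorandom pattern is written into the least-significant bits of the pixels, and the same key is used to check for it later. All function names here are ours.

```python
import numpy as np

def embed_watermark(image: np.ndarray, key: int) -> np.ndarray:
    """Write a keyed pseudorandom bit pattern into the least-significant
    bits (LSBs) of an 8-bit grayscale image."""
    rng = np.random.default_rng(key)
    pattern = rng.integers(0, 2, size=image.shape, dtype=np.uint8)
    # Clear the LSB of every pixel, then store the keyed pattern there.
    return (image & 0xFE) | pattern

def detect_watermark(image: np.ndarray, key: int, threshold: float = 0.9) -> bool:
    """Report whether the keyed pattern is present in the LSBs."""
    rng = np.random.default_rng(key)
    pattern = rng.integers(0, 2, size=image.shape, dtype=np.uint8)
    match_rate = np.mean((image & 1) == pattern)
    return match_rate >= threshold

# Usage: the generator embeds at creation time; a verifier checks later.
original = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
watermarked = embed_watermark(original, key=42)
print(detect_watermark(watermarked, key=42))   # True
print(detect_watermark(original, key=42))      # False: unkeyed LSBs match ~50% of the time
```

Real schemes hide the signal far more carefully than raw LSBs, but the basic structure is the same: embed with a secret at generation time, verify with that secret afterwards.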

While this approach does work in an ideal world where everyone abides by the rules, unfortunately, it does not quite work in an adversarial setting.

Most importantly, this only works if all generative AI models, big or small, implement watermarking. And by "all", we require not 99% or 99.99%, but an absolute 100%. As long as there exists one model that does not properly watermark its generated content, we cannot rely on these watermarks. It is true that if you see an image with a watermark, you know it is "AI-generated", but what about an image without such a watermark? You cannot tell whether it came from a camera or from a rogue gen AI model that simply refuses to watermark. In other words, this approach effectively embeds trust in the absence of a watermark.

Additionally, from a usability point of view, checking the validity of a watermark requires private information from the gen AI model that produced it. As a result, each model must have its own separate verification process. To check whether an image is AI-generated, the user must therefore query every generative AI model in the world for verification of the watermark. What happens if the user queries, say, only 50% of all the gen AI models? Then we are back in the scenario from the previous paragraph: to know for sure that an image is not "AI-generated", you have to query 100% of these models, which is hugely impractical.
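The asymmetry is easy to see in code. Below is a hedged sketch of the verification workflow implied above; the interface and names are hypothetical, not any real provider's API:

```python
from typing import Callable, Dict

# Hypothetical interface: each provider exposes its own verifier, which
# typically depends on that provider's private watermarking key or service.
Verifier = Callable[[bytes], bool]

def is_ai_generated(image_bytes: bytes, verifiers: Dict[str, Verifier]) -> bool:
    """True if ANY queried provider's verifier detects its own watermark.

    A False result only means that none of the providers we queried claims
    the image. It says nothing about providers we did not, or could not,
    query, so it cannot certify an image as camera-captured.
    """
    return any(verify(image_bytes) for verify in verifiers.values())
```

A positive answer is meaningful; a negative answer is only as strong as the coverage of the verifier list, which in practice is never complete.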

Finally, if you want to trust an image without a watermark to be "camera-captured", robustness is crucial: watermarks should persist through any tampering with the image. However, making watermarks robust is very challenging, if not impossible. Because the watermark is invisible and hence has a negligible impact on the original image, it can, roughly speaking, only live in the noise of the image. Accordingly, there are negative results [1,2] showing that by applying denoising or other slight perturbations to the image, one can quite easily remove these watermarks and thereby change an image from "AI-generated" to "camera-captured", breaking the robustness requirement.
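Continuing the toy sketch from earlier (and reusing embed_watermark, detect_watermark, and the watermarked image defined there), even a mild, visually negligible perturbation such as a 3×3 box blur wipes out a noise-level watermark. The schemes attacked in [1,2] are far more sophisticated than this toy, but the sketch illustrates the underlying tension between invisibility and robustness:

```python
import numpy as np

def mild_blur(image: np.ndarray) -> np.ndarray:
    """A 3x3 box blur: a very mild, visually negligible perturbation."""
    padded = np.pad(image.astype(np.float32), 1, mode="edge")
    out = np.zeros(image.shape, dtype=np.float32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + image.shape[0],
                          1 + dx : 1 + dx + image.shape[1]]
    return np.clip(out / 9.0, 0, 255).astype(np.uint8)

perturbed = mild_blur(watermarked)
print(detect_watermark(perturbed, key=42))  # False: the LSB pattern is gone
```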

But in real life, the more common-sense practice is to embed trust in the existence of a watermark. Think about the paper money we use: we put anti-counterfeiting measures like watermarks into the real money, and that is what creates trust; we do not rely on counterfeiters to put watermarks into fake money. In short, trust should come from the existence of a watermark, not the absence of it.

At Trufo, we imagine an alternative approach: watermarks are placed into non-AI-generated content. If an image contains such a Trufo watermark, then we know it can be trusted. We will talk more about the technical details of Trufo in future blog posts.

References:

[1] Zhang, Hanlin, et al. “Watermarks in the sand: Impossibility of strong watermarking for generative models.” arXiv preprint arXiv:2311.04378 (2023).

[2] Zhao, Xuandong, et al. “Invisible Image Watermarks Are Provably Removable Using Generative AI.” arXiv preprint arXiv:2306.01953 (2023).
