Realtime Adaptive Watermarking for Millions of Images
Watermarking content comes with many challenges. It’s especially tricky on a dual-sided digital marketplace like Creative Market, where there are several perspectives to consider. First and foremost, we have to protect content from being stolen, so the watermark has to be clearly visible and difficult to bypass. We also want to give the consumer a friction-free browsing experience with the best possible representation of what they’re going to buy. And of course we want to give the content creator (seller) confidence in the process while not occluding or distorting important parts of their creation.
At Creative Market we resize all of our images on the fly. Not only does the watermark have to play nice with many different textures and color conditions, but it has to be quick enough to serve in realtime. This gives us the freedom to change any watermark or any thumbnail parameter and see the changes almost instantly, but this comes with steep performance requirements.
Here’s everything that needs to happen the first time a user requests a thumbnail (subsequent requests will be pulled from a CDN):
- Load the input image from disk
- Crop and resize it to the desired thumbnail size
- Apply sharpening
- Load and apply watermark to image
- Compress the JPG
You may say that’s a tall order for a realtime system, but luckily we have OpenCV and lots of optimized C++ at our disposal! We use Arion, an open source C++ library that makes heavy use of OpenCV, to perform all of the image manipulation procedures (steps 2–4). We call into these procedures from Go via our Goarion wrapper. Lastly, we package everything up into a custom service written in Go that serves the HTTP responses and handles configuration and logging. We love open source at Creative Market, and we make an active effort to contribute back to the community; check out our work here.
Now that you have a good overview of the full system, let’s focus on just the watermarking portion.
The intuition behind our adaptive watermarking approach has to do with ensuring high contrast on brighter parts of the image. This comes from the fact that watermarks typically have predominantly light bodies. The watermark will show up great on dark parts of the image, but will look completely faded on light parts, especially on high-frequency (busy) textures.
While we don’t explicitly focus on the frequency content of the image, we adjust the blend of the watermark based on the perceived brightness of the image beneath. Let’s take a look at our approach.
This is an apples-to-apples comparison of a constant blend approach vs our adaptive approach. Both have a baseline of 10%, but the adaptive one (right) is allowed to go up to 50% on lighter parts of the image. Both versions look very similar on the top left region, but start to really differ as we get into the lighter regions. Even though the adaptive watermark can go up to 50%, it’s not too overbearing. Note that on the bottom left it’s still very difficult to see the adaptive watermark, but this can be easily fixed by including dark regions in the watermark itself.
Here is the magic formula for determining how much to blend:
blend = blendDelta * log10(1 + normFactor * brightness) + blendMin
blendDelta is just blendMax - blendMin (the maximum we are willing to blend minus the minimum). Adding 1 inside the log makes sure we always have a valid result, and the normFactor constant (9/255) ensures the final output never exceeds blendMax: at full brightness (255), log10(1 + 9) = 1, so the blend lands exactly on blendMax.
As can be seen in the plot, the log function applies a more aggressive blend at lower brightness levels without introducing any discontinuities.
For the brightness value we use a formula for perceived brightness: 0.299 * r + 0.587 * g + 0.114 * b where the inputs represent the red, green, and blue components of a pixel. For speed we use the following approximation:
(r + r + r + b + g + g + g + g) >> 3
This replaces three floating-point multiplications with cheap integer additions and a bitwise shift. If you recall, shifting right by 3 bits is the same as dividing by 8, so the approximation is equivalent to: 0.375 * r + 0.5 * g + 0.125 * b (not too far off from the original weights).
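A quick sketch of the approximation in Go (`perceivedBrightness` is an illustrative name, not Arion’s API):

```go
package main

import "fmt"

// perceivedBrightness approximates 0.299r + 0.587g + 0.114b using only
// integer additions and a right shift: (3r + 4g + b) / 8.
func perceivedBrightness(r, g, b uint32) uint32 {
	return (r + r + r + b + g + g + g + g) >> 3
}

func main() {
	// Pure white keeps full brightness; pure green lands near half.
	fmt.Println(perceivedBrightness(255, 255, 255)) // 255
	fmt.Println(perceivedBrightness(0, 255, 0))     // 127
}
```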
Lastly, we only do any of these computations if the alpha channel on the watermark (which always comes from a PNG file) is greater than 0. In other words: if there is nothing to blend we move on to the next pixel.
Real World Samples
While the position and sizing of the watermark have changed a little since the originals were collected (in fact the new watermark is smaller), the difference in contrast is clearly visible on both light and dark backgrounds with high-frequency content.
All of these samples and formulas are great, you say, but how about a performance benchmark? Well, I’m glad you asked, because we have a 40-core machine at our disposal, so let’s see if we can make it break a sweat.
In this synthetic benchmark we use the same input image (667x1000 px) and output three different thumbnail sizes: 100x100, 640x480, 1024x768. The output is cropped and then down-sampled (or up-sampled as is the case with the third output size).
Adding the adaptive watermark increases the minimum latency by 2.3ms and decreases throughput by about 14%. However, even at 572.7 thumbnails per second, our 99th percentile with the watermark is less than 140ms.
You still might be skeptical, since this is just a synthetic benchmark… How does this system behave in the real world? The service has been operating in production for quite some time now, generating a wide range of thumbnails (note that not all images get watermarks). Over a two-week period we hit 94ms at the 75th percentile and 280ms at the 95th.
tl;dr (aka the summary)
At Creative Market we use an adaptive watermarking technique that gets applied to images on the fly. Our infrastructure allows us to quickly change any resizing or watermarking parameter and see the results in near-realtime (there is no need to batch-resize all of our images). Our adaptive technique adjusts the watermark blend based on the brightness of the image beneath to give higher contrast where needed.
We use a standalone service written in C++ and Go to meet tight performance requirements. In synthetic benchmarks we achieve sub-140ms times at the 99th percentile at 572 image resizes / sec (this includes loading the image from disk and generating the final JPG). The full system also performs well in production, where we serve a wide range of image sizes and settings (not all have watermarks) at 280ms (95th percentile).