Hidden data in your image files

If you had a chance to catch my Google I/O talk, Image Compression for Android Developers, then you’ll be familiar with this little friend, the 16x16 pixel image:

This image is 16x16 pixels, and a single color. Its name has been hidden to protect its identity.

This image was used to show how Photoshop’s “SAVE” and “SAVE FOR WEB” produce strikingly different file sizes because of the different optimizations each one performs on the image data.

The small image proved its worth, and helped me illustrate the changes… However, after the talk, something began to dawn on me: For a 16x16 pixel image, those sizes are still waaaaayyyyy too big.

So, I decided to dig in a bit more to figure out what was going on.

Exported PNG file — 947 bytes

Now, I would expect that any image palettizer worth its salt should be able to compress the target image to < 512 bytes: if you consider 8bpp, plus a single palette entry of 24 bits, you’d end up with ~259 bytes of raw data (and only ~67 bytes at 2bpp). Even if you used 200 bytes for header information, you’d still be well in the ballpark of acceptable.
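That back-of-the-envelope estimate is easy to check in a few lines of Python. This is only a rough sketch: the function name is made up for illustration, and it counts raw palette indices plus the palette itself, ignoring compression and any container overhead.

```python
def indexed_size_bytes(width, height, bits_per_pixel, palette_entries=1):
    """Raw (uncompressed) size of an indexed image: pixel indices plus an RGB palette."""
    # Round each row up to a whole number of bytes.
    row_bytes = (width * bits_per_pixel + 7) // 8
    pixel_bytes = row_bytes * height
    palette_bytes = palette_entries * 3  # 24 bits per palette entry
    return pixel_bytes + palette_bytes


for bpp in (8, 4, 2, 1):
    print(f"{bpp} bpp: {indexed_size_bytes(16, 16, bpp)} bytes")
```

For the 16x16 image this gives 259 bytes at 8bpp and 67 bytes at 2bpp, so either way there is plenty of room left under 512 bytes.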

But when you export the target image from Photoshop to an indexed PNG, using the Save For Web feature, you get an image that is 947 bytes in size.

Opening up the file in a hex editor quickly shows off why:

I’ve seen some horrible things on the internet… but this might be the worst…

Right in the middle of my optimized PNG, sits a massive block of XML data. Take a second to let that soak in.

Turns out that even with the Save For Web option, Photoshop can’t resist an opportunity to insert a block of XML data that identifies where the image came from, and how fancy it is. The result is 840 bytes just for XML data that has nothing to do with my image content. And these images are being sent around, all over the internet…
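You don’t need a hex editor to spot this. A PNG file is an 8-byte signature followed by a series of chunks, each laid out as a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC. The sketch below walks that structure and reports how big each chunk is. The function name is hypothetical, and the guess that the XML lives in a text chunk such as iTXt (where XMP metadata is typically stored) is an assumption, not something verified against this particular file.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(data):
    """Return (chunk_type, data_length) pairs for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    offset = 8
    while offset + 8 <= len(data):
        # Chunk header: 4-byte big-endian length, then 4-byte ASCII type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        chunks.append((chunk_type.decode("ascii"), length))
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"IEND":
            break
    return chunks


# Usage (the file name is a placeholder):
# with open("exported.png", "rb") as f:
#     for name, size in list_png_chunks(f.read()):
#         print(name, size)
```

Anything beyond IHDR, PLTE, IDAT and IEND that takes up hundreds of bytes in a 16x16 image is worth a closer look.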

JPG isn’t any better — 1,333 bytes

Do the same thing with a JPG file, exported at the lowest-quality “Save For Web” setting. You end up with a 1,333 byte file chock-full of XML lovin’.
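The same kind of inspection works for a JPG. The header of a JPEG file is a sequence of marker segments: a 0xFF byte, a marker byte, then a 2-byte big-endian length that includes the length field itself. Metadata usually sits in the APPn segments (markers 0xE0 to 0xEF), with XMP and Exif typically in APP1 (0xE1). Here is a minimal sketch that lists the segments up to the start of the scan data. The function name is made up, and it assumes a well-formed file with no padding bytes between segments.

```python
import struct


def list_jpeg_segments(data):
    """Return (marker, payload_length) pairs for the segments before the scan data."""
    if data[:2] != b"\xff\xd8":  # SOI marker
        raise ValueError("not a JPEG file")
    segments = []
    offset = 2
    while offset + 4 <= len(data):
        if data[offset] != 0xFF:
            raise ValueError(f"expected a marker at offset {offset}")
        marker = data[offset + 1]
        if marker == 0xDA:  # SOS: compressed image data follows
            break
        # The 2-byte length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[offset + 2:offset + 4])
        segments.append((marker, length - 2))
        offset += 2 + length
    return segments


# Usage (the file name is a placeholder):
# with open("exported.jpg", "rb") as f:
#     for marker, size in list_jpeg_segments(f.read()):
#         print(hex(marker), size)
```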

Use a better tool

Listen, XML is cool and all (wait. No it isn’t.) but I don’t want it inside my image data.

The point here is that even if you’re using “Save For Web”, it’s important to also use a third-party tool to strip out this kind of unneeded data.
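To give a feel for what such a tool does at its simplest, here is a hedged sketch that copies a PNG while dropping every chunk that isn’t needed to decode the pixels. The function name and the keep-list are my own choices for illustration. A real optimizer does much more (recompressing the pixel data, for one), so treat this as a demonstration, not a replacement.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Kept chunks: the four critical chunk types, plus tRNS for palette transparency.
KEEP = {b"IHDR", b"PLTE", b"IDAT", b"IEND", b"tRNS"}


def strip_png_metadata(data):
    """Return a copy of the PNG bytes containing only the chunks listed in KEEP."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    out = [PNG_SIGNATURE]
    offset = 8
    while offset + 12 <= len(data):
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        end = offset + 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type in KEEP:
            out.append(data[offset:end])  # copied verbatim, CRC included
        offset = end
        if chunk_type == b"IEND":
            break
    return b"".join(out)
```

One caveat: this keep-list also throws away color-related chunks such as gAMA, sRGB and iCCP, which can change how the image renders in some viewers. Add them to KEEP if that matters for your images.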

Remember, users have to pay to download every bit of content from your app, and any bloat, no matter how small, costs them something.

For PNGs, check out a list of tools and suggestions here.
For JPGs, check out a list of tools and suggestions here.

For more on data compression, and everything you should know about it, check out this awesome book.