Serving compressed SVG files

Over the years, Scalable Vector Graphics (SVG) has become the de facto standard for vector images on the web. One of its disadvantages, however, is its verbose XML-based format, which results in relatively large files. Luckily, the plaintext nature of XML lends itself well to compression. The SVG standard reserves the .svgz extension for gzip-compressed SVG files, which typically come out at about 20–50% of the original file size. When you are not actively editing them, saving SVG files as svgz makes sense: it saves disk space, the compression is lossless, and it spares the time needed for on-the-fly compression when serving them over the web. Serving compressed SVG files correctly from a web server takes some care, however.
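As a rough illustration of the compression step, here is a minimal Python sketch using the standard gzip module. The function names and the sample document are made up for this example, and the sample is deliberately repetitive, so the ratio it prints is not representative of real-world SVG files.

```python
import gzip


def compress_svg(svg_text: str) -> bytes:
    """Return gzip-compressed bytes that could be saved as a .svgz file."""
    return gzip.compress(svg_text.encode("utf-8"), compresslevel=9)


def decompress_svgz(svgz_bytes: bytes) -> str:
    """Undo compress_svg; gzip is lossless, so the original text comes back."""
    return gzip.decompress(svgz_bytes).decode("utf-8")


# A small, artificial SVG document (hypothetical sample data).
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">\n'
    + "".join(f'  <circle cx="{i}" cy="{i}" r="5" fill="red"/>\n' for i in range(50))
    + "</svg>\n"
)

svgz = compress_svg(sample_svg)
original_size = len(sample_svg.encode("utf-8"))
print(f"{original_size} bytes -> {len(svgz)} bytes ({len(svgz) / original_size:.0%})")
```

Writing the returned bytes to a file with a .svgz extension gives a file that SVG tools supporting the format should be able to open directly.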

HTTP headers for .svgz files

Standards-conforming SVG viewers support .svgz files, as long as they are served with the correct HTTP headers:

Content-Type: image/svg+xml
Content-Encoding: gzip

It may seem strange that both .svg (plaintext SVG) and .svgz (compressed SVG) files are served with the same MIME type. The difference is in the Content-Encoding header, which combined with the Content-Type describes the type of content being served. In this case they distinguish between ‘SVG, plain’ and ‘SVG, gzipped’, i.e. a readable XML format or a bunch of binary gibberish.

Content-Encoding vs. Transfer-Encoding

There also exists an HTTP/1.1 header Transfer-Encoding, which looks similar to the Content-Encoding, but is actually fundamentally different. The Content-Encoding indicates the ‘natural’ encoding of the content (in which it is stored and handled), whereas the Transfer-Encoding only describes a temporary encoding applied for (faster or more reliable) transport. A client that receives an HTTP response with

Content-Type: application/sql
Transfer-Encoding: gzip

should thus decompress the content and offer plaintext SQL to the user. However, when the response headers are

Content-Type: application/sql
Content-Encoding: gzip

the server actually meant to present gzip-compressed SQL. The client should not decompress such content and should probably offer to save the binary data as a .sql.gz file. For .svgz files, this means that the client should not decompress the content as soon as it is received, but rather pass the compressed byte stream to the SVG viewer (which will then do the decoding itself).
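The distinction drawn above can be sketched in a few lines of Python. This models the strict reading described in this article, not what real browsers do; the function name body_for_user is hypothetical, and the sketch assumes a single gzip transfer coding rather than a list such as "gzip, chunked".

```python
import gzip
from typing import Mapping


def body_for_user(headers: Mapping[str, str], raw_body: bytes) -> bytes:
    """Return the bytes a strictly behaving client would hand on.

    A gzip Transfer-Encoding is treated as a transport detail and undone.
    A Content-Encoding is treated as part of the content and left alone.
    """
    # Header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if normalised.get("transfer-encoding") == "gzip":
        return gzip.decompress(raw_body)
    return raw_body


sql = b"SELECT 1;"
packed = gzip.compress(sql)

# Transfer-Encoding: the user ends up with plaintext SQL.
plain = body_for_user({"Content-Type": "application/sql", "Transfer-Encoding": "gzip"}, packed)

# Content-Encoding: the user ends up with the gzip-compressed bytes.
still_packed = body_for_user({"Content-Type": "application/sql", "Content-Encoding": "gzip"}, packed)
```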

Part of the confusion stems from the fact that transfer encoding is an HTTP/1.1 concept and that browser support for gzip transfer encoding (which a client should announce in a TE header) is very limited. As a consequence, people looking for an efficient way to transfer their JavaScript, CSS and HTML started to serve these files with a compressed body and a Content-Encoding header (which was already present in HTTP/1.0) to indicate the encoding used. Browsers happily contributed to the spread of this kludge by decompressing such content even when downloading to disk, as users surely did not expect to find binary gibberish on their disks when saving the source of a JavaScript file compressed this way. This leaves us in a state of widespread misuse of the Content-Encoding header as a substitute for Transfer-Encoding that will probably never be fixed in HTTP/1.x.

Serving compressed files for download

This poses a problem when serving .svgz files for download (with Content-Disposition: attachment), e.g. to a user who clicks a download button in a file-manager-like application. In such a case, the user probably wants to end up with the compressed (binary) content of the .svgz file on disk, not with a plaintext file that has a .svgz extension. However, due to the abuse of the Content-Encoding header described above, users expect other compressed file types to be saved as readable plaintext, so clients have a hard time deciding what to do. In an attempt to escape this bind, most browsers implement some kind of guessing logic based on the extension of the downloaded file. This logic does not always yield the desired result, however, especially when there is a custom filename parameter in the Content-Disposition header.

To work around these problems, we should avoid serving content that has a ‘natural’ compression with a Content-Encoding header if the file has an attachment Content-Disposition. Just leaving the Content-Encoding header out for .svgz files is not an option, as it would make it look as if we were serving plaintext SVG. A simple solution is to serve these files (only when downloaded) with a Content-Type of application/octet-stream ("Arbitrary binary data; just save these bytes to disk, please") and no Content-Encoding:

Content-Type: application/octet-stream
Content-Disposition: attachment; filename="image.svgz"

When serving .svgz files for inline purposes (display in a viewer) we still send the recommended headers to allow correct rendering:

Content-Type: image/svg+xml
Content-Encoding: gzip
Content-Disposition: inline
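The two cases could be combined in a small helper like the Python sketch below. The function name svgz_response_headers and its parameters are hypothetical, not part of any framework, and the sketch assumes the filename is plain ASCII without double quotes, so no escaping is attempted.

```python
from typing import Dict


def svgz_response_headers(filename: str, as_download: bool) -> Dict[str, str]:
    """Pick response headers for a stored .svgz file.

    Assumes `filename` is plain ASCII and contains no double quotes.
    """
    if as_download:
        # Download: present the gzip bytes as opaque binary data, with no
        # Content-Encoding, so the client saves them to disk unchanged.
        return {
            "Content-Type": "application/octet-stream",
            "Content-Disposition": f'attachment; filename="{filename}"',
        }
    # Inline display: the headers recommended for .svgz files.
    return {
        "Content-Type": "image/svg+xml",
        "Content-Encoding": "gzip",
        "Content-Disposition": "inline",
    }


download_headers = svgz_response_headers("image.svgz", as_download=True)
inline_headers = svgz_response_headers("image.svgz", as_download=False)
```

In both cases the response body is the same stored .svgz byte stream; only the headers describing it differ.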

Conclusions

  • Saving and serving SVG content as .svgz-files is worthwhile, but takes some care
  • Know the difference between Content-Encoding and Transfer-Encoding
  • Serving naturally compressed content for download with a Content-Encoding header is unreliable; serve it as a non-encoded octet-stream instead

Originally published at www.moxio.com.
