Make User Agents great again!

Long ago, someone introduced a useful feature to the world wide web that let webmasters see which applications their visitors used to access websites. This feature was called “user agents” and was intended to enable better statistics and tailored responses for special devices.

Over the years, user agents became steadily more absurd as everyone tried to impersonate everyone else, up to the point where we are now:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36

Look at it. This is my current user agent. It contains the following values:

  • Mozilla
  • AppleWebKit
  • Gecko
  • Chrome
  • Safari

If you have ever fiddled around with user agent parsing, you’ll probably know I’m using Chrome on an Apple device. If you haven’t, you’re probably thinking “What on earth is this abomination?”
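
To make the abomination a little more tangible, here is a minimal Python sketch that splits the string above into its “name/version” product tokens and its parenthesised comments. The function name and the regular expression are my own illustration, not a robust parser; real user agents contain plenty of cases this ignores.

```python
import re

UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) "
      "AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/55.0.2883.95 Safari/537.36")

# Either a parenthesised comment, or a "name/version" product token.
# Assumption: comments are not nested, which holds for the example above.
TOKEN = re.compile(r"\(([^)]*)\)|([^\s/()]+)/(\S+)")

def split_user_agent(ua):
    """Return (products, comments) found in a User-Agent string."""
    products, comments = [], []
    for comment, name, version in TOKEN.findall(ua):
        if name:
            products.append((name, version))
        else:
            comments.append(comment)
    return products, comments

products, comments = split_user_agent(UA)
for name, version in products:
    print(name, version)
for comment in comments:
    print("comment:", comment)
```

Four products and two comments, and only one of those six pieces names the browser that is actually running.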

This collection of different browsers, rendering engines and sparse operating system details originates from the earlier days of the internet, when things were more complicated: you always had to make sure the client would understand the response you sent it and that it supported the advanced features you wanted to use. So webmasters (yes, being a nerd was less sexy a few decades ago, dear React hipster) checked for the browser name and delivered specially built sites for each of them.
When new browsers appeared on the stage, they faced a serious issue: how do you make sure the web server sends you its shiny, bleeding-edge version with all those fancy features you support?
So vendors started to reuse the user agents of other browsers and appended their own names to them, until the UA became what it is now: a complete and utter mess.

Why should I care?

The user agent is transmitted as an HTTP request header. Let’s look at the headers of a typical HTTP request:

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36
Accept-Language: de,de-DE;q=0.8,en-US;q=0.6,en;q=0.4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Connection: close
Connect-Time: 0
Accept-Encoding: gzip
Upgrade-Insecure-Requests: 1

See this? Around 36% of these request headers is taken up by the user agent, whose main job by now is to list every rendering engine the browser would like to be mistaken for. This means both client and server waste bandwidth and energy on every single request.
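
If you want to check the share yourself, a rough Python sketch follows. It counts only the header lines shown above and ignores the request line and line terminators, so the exact percentage depends on what you include; the function name is hypothetical.

```python
UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) "
      "AppleWebKit/537.36 (KHTML, like Gecko) "
      "Chrome/55.0.2883.95 Safari/537.36")

REQUEST_HEADERS = [
    "User-Agent: " + UA,
    "Accept-Language: de,de-DE;q=0.8,en-US;q=0.6,en;q=0.4",
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Connection: close",
    "Connect-Time: 0",
    "Accept-Encoding: gzip",
    "Upgrade-Insecure-Requests: 1",
]

def user_agent_share(header_lines):
    """Fraction of the header bytes taken up by the User-Agent value."""
    total = sum(len(line.encode("utf-8")) for line in header_lines)
    ua_bytes = 0
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "user-agent":
            ua_bytes += len(value.strip().encode("utf-8"))
    return ua_bytes / total

share = user_agent_share(REQUEST_HEADERS)
print(f"User-Agent share of header bytes: {share:.1%}")
```

Counted this way, the value alone lands at roughly a third of the headers, and a little higher if you include the header name.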

Additionally, apart from being pretty verbose already, the user agent has historically been abused to submit other details about the client, including the security patch level, third-party applications installed on the machine, or even static user IDs that identify the client across all websites, which makes it a privacy issue, too.

What to do about it?

Look at these examples of clean user agent strings:

Chrome/55.0.2883.95 (MacOs/10.12.0)
Firefox/45.7.0 (Linux.Ubuntu/16.10)
Firefox/51.0.1 (Windows/10.1607)
IExplorer/11 (Windows.32/6.1.7601)
Facebook/78.0 (Linux.Android/6.0.1)

These contain every value a web server could reasonably need: browser name and version, OS name and version, and optionally the distribution name or bitness.
This is enough to collect visitor metrics and offer matching downloads, while excluding rendering engines and compatibility levels. By now, any developer checking these to deliver browser hacks is doing things wrong and should be discouraged from developing applications this way.
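
A pleasant side effect is that such strings are trivial to parse. The Python sketch below assumes a grammar inferred from the five examples above, namely “Browser/Version (OS[.Detail]/OSVersion)”. Nobody has specified this format, so the field names and the regular expression are purely illustrative.

```python
import re
from typing import NamedTuple, Optional

class CleanUA(NamedTuple):
    browser: str
    browser_version: str
    os: str
    os_detail: Optional[str]  # distribution name or bitness, if present
    os_version: str

# Assumed grammar: Browser/Version (OS[.Detail]/OSVersion)
CLEAN_UA = re.compile(
    r"(?P<browser>[A-Za-z]+)/(?P<browser_version>[0-9.]+) "
    r"\((?P<os>[A-Za-z]+)(?:\.(?P<os_detail>[A-Za-z0-9]+))?"
    r"/(?P<os_version>[0-9.]+)\)"
)

def parse_clean_ua(ua):
    """Parse a clean user agent; return None if it does not match."""
    match = CLEAN_UA.fullmatch(ua)
    if match is None:
        return None
    return CleanUA(**match.groupdict())

for ua in [
    "Chrome/55.0.2883.95 (MacOs/10.12.0)",
    "Firefox/45.7.0 (Linux.Ubuntu/16.10)",
    "IExplorer/11 (Windows.32/6.1.7601)",
]:
    print(parse_clean_ua(ua))
```

One expression covers every example, with no lookup tables of impersonation quirks. Feed it today’s Chrome string and it simply returns None.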

This is a plea to all browser vendors and app developers: Please start switching to sensible user agents again.