SSL certificate revocation and how it is broken in practice
Explore certificate revocation solutions: CRL, OCSP, OCSP stapling, must-staple, CRLSets. Check out server implementation issues and browser support
Public Key Infrastructure (PKI) is the system that makes it possible to sign and validate certificates, keep lists of revoked certificates, and distribute CA public keys. The goal of PKI is to enable secure communication between parties who have never met before.
The most common use of PKI is the myriad of websites secured with TLS/HTTPS, which use SSL certificates to establish trust for particular domain name(s) and to authenticate the server side. Once a certificate is signed by the CA (certificate authority), it remains valid for a specific duration. When it’s about to expire, you usually renew it or buy a new one. But sometimes you need to revoke a certificate before it expires, usually because of a private key compromise.
Certificate revocation is the process of invalidating an issued SSL certificate. Ideally, browsers and other clients should detect that a certificate is revoked in a timely manner, show a security warning that the certificate is no longer trusted, and prevent the user from using the website any further.
Let’s explore various approaches to address certificate revocation.
Certificate Revocation Lists (CRL)
The original design was for CAs to manage and publish lists of revoked certificates, so browsers and other clients can download them and check a certificate’s status against them. This worked well in the past when there were few websites and certificates, but at today’s scale of the Internet it’s practically infeasible for CAs to manage these huge lists and for clients to download them whenever they need to check revocation status.
The CRL architecture introduces a dependency between the client and the CA infrastructure, exposing the client to CA server availability issues and downtime.
Nowadays the original CRLs are effectively ignored by end clients.
Online Certificate Status Protocol (OCSP)
OCSP is an improvement over CRL: a protocol for checking whether an SSL certificate has been revoked. Instead of downloading the complete list of revoked certificates, the client submits a request to a CA server, which returns a signed response with the certificate’s current status. OCSP is much more lightweight, as only one record is retrieved at a time, and it can provide more up-to-date information than CRLs, which are downloaded and cached on the client for some time.
Still, it suffers from many issues:
- An additional dependency and query between the client and CA servers during the TLS handshake, which adds latency.
- Poor, unreliable CA infrastructure that is prone to availability problems. The CA servers are targets for DoS attacks, and a slow OCSP response adds latency too.
- Privacy compromise. The browser leaks to the CA servers which website is being accessed and who is accessing it.
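To make the difference between the two lookups concrete, here is a minimal toy model. It involves no networking and no signatures, and the serial numbers, `check_with_crl`, and `ToyOcspResponder` are hypothetical names invented for illustration. It only contrasts searching a locally held list with asking a responder about one certificate.

```python
from dataclasses import dataclass

# Hypothetical data: serial numbers the CA has revoked.
REVOKED_SERIALS = {0x1A2B, 0x3C4D, 0x5E6F}


def check_with_crl(serial: int, downloaded_crl: set) -> str:
    """CRL style: the client holds the full list and searches it locally."""
    return "revoked" if serial in downloaded_crl else "good"


@dataclass
class ToyOcspResponder:
    """Stand-in for a CA's OCSP responder (no networking, no signatures)."""
    revoked: set

    def query(self, serial: int) -> str:
        # The responder sees which certificate the client asks about,
        # which is where the privacy leak comes from.
        return "revoked" if serial in self.revoked else "good"


crl_copy = set(REVOKED_SERIALS)            # the whole list travels to the client
responder = ToyOcspResponder(REVOKED_SERIALS)

print(check_with_crl(0x3C4D, crl_copy))    # revoked
print(responder.query(0x9999))             # good
```

With a CRL the cost is the size of the list; with OCSP the cost is one round trip per certificate, plus the fact that the responder learns what you asked for.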
Soft-fail behavior
Given the dependency on poor, unreliable CA infrastructure, browsers and other clients tend to use soft-fail behavior: when they don’t receive an OCSP response in a timely manner or encounter an error, they ignore the failure, assume the certificate is valid, and allow access to the website. Some browsers just show a warning that the user can bypass. Chrome, for example, does not use OCSP at all and relies on its own proprietary mechanism called CRLSet. The reason for soft-fail is that an unavailable CA server should not block access to every website that uses its certificates.
Soft-fail behavior gives us a false sense of security. It’s fine when you get a revocation warning, but when you don’t, you can’t tell whether the cert is really valid or whether there is an issue with the OCSP infrastructure. For example, an attacker can block OCSP traffic and cause revocation checks to pass.
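The difference between the two policies fits in a few lines. This is a simplified sketch, not how any particular browser implements it. `Policy` and `accept_certificate` are hypothetical names, and a missing response is modelled as `None`.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    SOFT_FAIL = "soft-fail"
    HARD_FAIL = "hard-fail"


def accept_certificate(ocsp_status: Optional[str], policy: Policy) -> bool:
    """Decide whether to accept a certificate.

    ocsp_status is "good", "revoked", or None when no response arrived
    (timeout, network error, or traffic blocked by an attacker).
    """
    if ocsp_status == "revoked":
        return False
    if ocsp_status == "good":
        return True
    # No answer: soft-fail assumes the certificate is fine, hard-fail refuses.
    return policy is Policy.SOFT_FAIL


# An attacker who blocks OCSP traffic turns "revoked" into "no answer".
print(accept_certificate(None, Policy.SOFT_FAIL))  # True
print(accept_certificate(None, Policy.HARD_FAIL))  # False
```

Under soft-fail, the only case that is ever rejected is the one an attacker can trivially avoid.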
Firefox shows a SEC_ERROR_REVOKED_CERTIFICATE error when it gets a revoked status from the OCSP responder.
Chrome does not use OCSP at all and says the cert is OK with a green “secure” badge, but if you dig a bit deeper, it tells you the certificate is revoked 😕
Right now there is no reliable way to switch to hard-fail behavior. Essentially, revocation is broken. There are a couple of attempts to address this issue, like proprietary mechanisms (Chrome CRLSet, Firefox OneCRL) or the OCSP must-staple extension, but there is still no 100% working solution.
OCSP Stapling
OCSP stapling moves the querying of the OCSP server from the client to the HTTPS server. The HTTPS server periodically polls the OCSP server for the revocation status of its own certificate(s) and sends the OCSP response along with the certificate (staples it) to the client during the TLS handshake, in a CertificateStatus message.
OCSP responses are short-lived (valid for around a week). They are signed by the CA, so the client can trust them.
The OCSP stapling approach solves several issues inherent to regular OCSP:
- Removes the dependency between the client and CA servers. No additional query, faster TLS handshake.
- Protects website visitors’ privacy. Since browsers don’t talk to CA servers any more, they don’t leak browsing activity.
- More resistant to CA server availability issues, since the web server caches OCSP responses, which are valid for several days.
- Less load on CA servers, since there are fewer HTTPS servers than clients/visitors.
Still, the biggest problem with OCSP stapling is that the stapled response is optional, not mandatory. Clients don’t know whether to expect or require a stapled OCSP response from a website. An attacker who holds a stolen, revoked certificate can simply serve it without stapling. Browsers will fall back to regular OCSP, which can again be blocked, and the browser will accept the certificate. We still have soft-fail behavior with a false sense of security.
Nginx configuration
ssl_stapling on;
ssl_stapling_verify on;
Configure a DNS resolver so Nginx can resolve the OCSP server’s IP address:
resolver 127.0.0.11 valid=300s ipv6=off;
resolver_timeout 5s;
Also, some folks point out that you need to supply the root and intermediate certificate chain via ssl_trusted_certificate, but I’ve tried it and it works fine without it. I just have an ssl_certificate directive pointing to a chain of the website certificate plus the intermediate one (without the root CA cert).
ssl_certificate /var/ssl/foobbz.site/certs/fullchain.rsa.pem;
Nginx issues
The bitter truth is that Nginx is not that good at handling and serving OCSP stapling. 😞
The first request handled by an Nginx worker process never has a stapled OCSP response. Nginx initiates a lazy OCSP query afterwards, and subsequent requests will most likely include the OCSP response. Note that the OCSP stapling cache is per worker process, meaning you can get several initial requests without OCSP stapling when they are processed by different worker processes, each with a cold OCSP cache.
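The cold-cache effect can be illustrated with a toy simulation. It is a model of the behavior described above, not Nginx code. `ToyWorker` is a hypothetical class, and the lazy fetch is assumed to finish right after the handshake that triggered it.

```python
class ToyWorker:
    """Models one worker process with its own, initially empty, OCSP cache."""

    def __init__(self) -> None:
        self.cached_response = None

    def handle_handshake(self):
        # Staple whatever is cached right now (nothing on the first handshake).
        stapled = self.cached_response
        if self.cached_response is None:
            # Lazy fetch: in this model it completes only after the handshake.
            self.cached_response = "good"
        return stapled


workers = [ToyWorker() for _ in range(2)]
# Handshakes land on worker 0, worker 1, then worker 0 again.
results = [workers[i].handle_handshake() for i in (0, 1, 0)]
print(results)  # [None, None, 'good']
```

With N workers, up to N handshakes can go out without a staple after every restart, which matters a lot once must-staple enters the picture.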
There are workarounds like warming up OCSP cache beforehand, but that’s too much crap.
You can check OCSP stapling on your own with the following command:
openssl s_client -host foobbz.site -port 443 -status < /dev/null
Valid OCSP stapled response should look like:
OCSP response:
======================================
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Cert Status: good
This Update: Jan 4 12:00:00 2018 GMT
Next Update: Jan 11 12:00:00 2018 GMT
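If you want to script this check, the fields of interest can be pulled out of the text output. The sketch below parses the sample shown above. `parse_stapled_response` is a hypothetical helper, and it assumes the output keeps this field layout and reports times in GMT.

```python
import re
from datetime import datetime, timezone

SAMPLE = """\
OCSP response:
======================================
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Cert Status: good
This Update: Jan 4 12:00:00 2018 GMT
Next Update: Jan 11 12:00:00 2018 GMT
"""


def parse_stapled_response(text: str) -> dict:
    """Pull the certificate status and validity window out of the output."""

    def field(name: str) -> str:
        match = re.search(rf"^\s*{re.escape(name)}:\s*(.+?)\s*$", text, re.MULTILINE)
        if match is None:
            raise ValueError(f"field not found: {name}")
        return match.group(1)

    def timestamp(name: str) -> datetime:
        parsed = datetime.strptime(field(name), "%b %d %H:%M:%S %Y GMT")
        return parsed.replace(tzinfo=timezone.utc)

    return {
        "status": field("Cert Status"),
        "this_update": timestamp("This Update"),
        "next_update": timestamp("Next Update"),
    }


info = parse_stapled_response(SAMPLE)
lifetime = info["next_update"] - info["this_update"]
print(info["status"], lifetime.days)  # good 7
```

When no response is stapled, the expected fields are missing and the helper raises `ValueError`, which a monitoring script can treat as "no staple".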
There is yet another issue. When the OCSP responder reports a revoked certificate status, Nginx does not staple it at all, and they say it’s by design. Rather confusing to me 😕, because it completely breaks the “must-staple” solution described below.
So you see that the current server-side implementation is far from robust, which makes an otherwise good idea quite useless in practice.
Chrome CRLSet and Firefox OneCRL
A CRLSet is Google’s own list of revoked certificates that it compiles and embeds inside Chrome. The lists are updated automatically by regularly crawling the CRLs of the major CAs around the world. Chrome does not query OCSP servers or download CRLs; instead, it simply checks its own CRLSet for the certificate’s status when visiting a secure website.
It’s like the regular CRL approach, except that the browser does not need to contact the CA’s servers and download a list. Instead, the list is already embedded right into the browser and is updated in a timely manner.
Of course, CRLSets cannot encompass every revoked certificate on the Internet. Instead of targeting end-server leaf certificates and DV certificates, they focus on high-value intermediate CA certificates. This makes it possible to quickly block an intermediate CA certificate in an emergency, when its private key is compromised, and so prevent an attacker from impersonating any site they like by signing their own child certificates.
Also, such lists might include high-value EV certificates.
Firefox has an analogous solution called OneCRL. In addition, Firefox uses the regular OCSP approach.
Must-staple extension
As said before, OCSP stapling is good because it offloads OCSP requests from the browser to the server, but it’s optional. Browsers have no idea whether a stapled response is expected or not, and therefore they use soft-fail behavior, which is like a seat belt that pretends to protect you but breaks in an emergency. So, meet the “must-staple” extension.
Must-staple is simply a flag in the certificate that makes OCSP stapling mandatory: it instructs the browser that the certificate must be served with a valid OCSP response, or the browser should hard-fail the connection.
This flag is set when the CA generates the certificate for you. If you’re using the Let’s Encrypt CA, clients like certbot or acme.sh support issuing a certificate with the “Must-Staple” extension:
Example with acme.sh (ocsp-must-staple flag):
$ acme.sh --issue --ecc --keylength ec-256 -d foobbz.site -d www.foobbz.site --standalone --staging --ocsp-must-staple
Example with certbot (must-staple flag):
certbot certonly --non-interactive --cert-name foobbz.site -d foobbz.site,www.foobbz.site -m admin@foobbz.site --agree-tos --preferred-challenges http-01 --rsa-key-size 2048 --standalone --staging --must-staple
To check whether a certificate has the “Must-Staple” flag, look for the 1.3.6.1.5.5.7.1.24 extension ID:
$ openssl x509 -in /var/ssl/foobbz.site/certs/cert.ecc.pem -text -noout
X509v3 Subject Alternative Name:
DNS:foobbz.site, DNS:www.foobbz.site
1.3.6.1.5.5.7.1.24:
0....
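The unreadable `0....` value is just DER bytes printed as text: `0x30` happens to be the ASCII character `0`, and the four bytes after it are non-printable. The sketch below DER-encodes the extension OID and searches a certificate for it. `encode_oid` and `looks_like_must_staple` are hypothetical helpers, and the byte search is a heuristic, not a proper X.509 parser.

```python
def encode_oid(dotted: str) -> bytes:
    """DER-encode an object identifier (tag 0x06, short-form length)."""
    arcs = [int(part) for part in dotted.split(".")]
    # The first two arcs share one byte (enough for the OIDs used here).
    body = bytearray([arcs[0] * 40 + arcs[1]])
    for arc in arcs[2:]:
        # Base-128 encoding, high bit set on every byte except the last.
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)
            arc >>= 7
        body.extend(reversed(chunk))
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(body)]) + bytes(body)


MUST_STAPLE_OID = encode_oid("1.3.6.1.5.5.7.1.24")
# Extension value: SEQUENCE { INTEGER 5 }, where 5 is status_request.
MUST_STAPLE_VALUE = bytes.fromhex("3003020105")


def looks_like_must_staple(der_certificate: bytes) -> bool:
    """Heuristic byte search; a real check should parse the extension list."""
    return MUST_STAPLE_OID in der_certificate and MUST_STAPLE_VALUE in der_certificate


print(MUST_STAPLE_OID.hex())  # 06082b06010505070118
```

The function expects DER bytes; a PEM file can be converted first, for example with `ssl.PEM_cert_to_DER_cert` from the standard library.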
Alternatively, use the Qualys SSL Server Test.
Now, given a certificate with the “Must-Staple” extension, if I turn off stapling altogether in Nginx, the browser should block me with an error because it fails to find a stapled OCSP response during the TLS handshake.
ssl_stapling off;
Firefox reports a cryptic MOZILLA_PKIX_ERROR_REQUIRED_TLS_FEATURE_MISSING error, as expected. But Chrome says the cert is good. Recall that Chrome does not follow the OCSP standard, even when it comes to stapling and must-staple 😞
The must-staple idea is great and makes it possible to switch to hard-fail behavior. The solution also scales well and does not introduce a client-side performance hit. And it makes it impossible for an attacker to use a stolen, revoked certificate.
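As a rough sketch of why the flag closes the gap, consider a toy client that looks only at what the server stapled. `client_accepts` is a hypothetical function, and the fallback to regular OCSP is collapsed into "accept", on the assumption that an attacker blocks it.

```python
from typing import Optional


def client_accepts(stapled_status: Optional[str], must_staple: bool) -> bool:
    """Toy client decision for one handshake.

    stapled_status is "good", "revoked", or None if the server stapled nothing.
    """
    if stapled_status == "revoked":
        return False
    if stapled_status == "good":
        return True
    # Nothing stapled: the flag turns this into a hard failure. Without it,
    # this model stands in for the soft-fail fallback by accepting.
    return not must_staple


# Attacker serves a stolen, revoked certificate and omits the staple.
print(client_accepts(None, must_staple=False))  # True
print(client_accepts(None, must_staple=True))   # False
```

The same branch that stops the attacker also locks out legitimate visitors whenever an honest server fails to staple.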
Despite being a substantial improvement over regular OCSP, must-staple is not a silver bullet and not a 100% working solution. Primarily, it suffers from server-side implementation issues and a lack of widespread client support. If the server fails to reliably staple the OCSP response, or staples a corrupted or erroneous response, or the client is not happy with the stapled response, whatever goes wrong, you lock users out of the website completely because of the browser’s hard-fail behavior. This is a huge risk, and web servers like Nginx and Apache are not mature at OCSP stapling yet.
Note that there is an experimental Expect-Staple HTTP response header, which helps you as a site owner monitor how reliably you staple good OCSP responses, and how well clients accept those responses, before switching to hard-fail must-staple behavior.
Conclusion
Given everything said above, there is no ready-to-go, 100% working and reliable solution to make browsers detect revoked certificates in a timely manner and refuse to connect to such websites.
OCSP must-staple is a great idea, but it is not practical due to server-side implementation issues, and it carries the risk of blocking a website completely. Chrome’s CRLSet solution is good, but it addresses only high-value intermediate CA certificates.
When it comes to end-server certificates, you might decide to give up on revocation mechanisms like OCSP stapling and must-staple altogether. Just follow security best practices. Reduce the validity period of the certificate and renew it more frequently, to shrink the time frame in which an attacker can use a stolen certificate. And yes, it sounds trivial, but keep your private keys safe. Do not allow CAs to generate a private key for you, protect it with a password, etc.
Resources
Revocation is broken — https://scotthelme.co.uk/revocation-is-broken/
The current state of certificate revocation (CRLs, OCSP and OCSP Stapling) — https://www.maikel.pro/blog/current-state-certificate-revocation-crls-ocsp/
HTTPS Certificate Revocation is broken, and it’s time for some new tools | Ars Technica — https://arstechnica.com/information-technology/2017/07/https-certificate-revocation-is-broken-and-its-time-for-some-new-tools/
OCSP Must-Staple — https://scotthelme.co.uk/ocsp-must-staple/
The Problem with OCSP Stapling and Must Staple and why Certificate Revocation is still broken — Hanno’s blog — https://blog.hboeck.de/archives/886-The-Problem-with-OCSP-Stapling-and-Must-Staple-and-why-Certificate-Revocation-is-still-broken.html
ImperialViolet — Revocation checking and Chrome’s CRL — https://www.imperialviolet.org/2012/02/05/crlsets.html
Google Chrome will no longer check for revoked SSL certificates online | Computerworld — https://www.computerworld.com/article/2501274/desktop-apps/google-chrome-will-no-longer-check-for-revoked-ssl-certificates-online.html
Damn it, nginx! More bugs, this time with SSL OCSP stapling. — https://blog.crashed.org/nginx-stapling-busted/