Non-technical people who sell IT services or products sometimes answer the question “How reliable is your system?” like this: “Everything is protected by HTTPS.” If the person asking is just as non-technical, the question is closed at once and everyone is satisfied with the level of security. I have heard such conversations many times, and they were funny.
The Internet community actively promotes HTTPS. The main idea is to move the whole Internet to encrypted traffic by a certain year, and modern machines are powerful enough to allow it. HTTPS is always a good thing, but you need to know the pitfalls that come with it.
The aim of this article is to show that it is possible to listen in on the HTTPS traffic of a user (let’s call him Mr X) without him noticing.
We will not cover the latest research and exploits for breaking HTTPS. Instead, let’s go back to basics and look at simple, long-known techniques.
HTTPS is HTTP + SSL (today TLS, the successor to SSL). HTTP is an open data transfer protocol; open means that data is transmitted in clear text. SSL is a protocol that secures the connection through encryption. So our task is to intercept the raw traffic of Mr X and find out which XXX sites he visits in the evenings. We are not like Mr X and do not visit such sites, so as an example we will use the Bing search engine, which can still work over both HTTPS and HTTP.
Below is an example of what intercepted traffic looks like in Wireshark for the same Bing request over HTTP and over HTTPS.
HTTPS does encrypt all application data, including the URLs that the client requests. But HTTPS runs on top of TCP/IP, so information about where the traffic is going is still available in plaintext: MAC addresses, IP addresses and ports.
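To make this concrete, here is a minimal Python sketch that reads the addressing fields from a raw IPv4 packet carrying TCP. The function names, the sample packet and the addresses in it (taken from documentation ranges) are illustrative assumptions, not something captured from real traffic.

```python
import socket
import struct


def parse_ipv4_tcp(packet: bytes) -> dict:
    """Extract source/destination addresses and ports from a raw IPv4+TCP packet."""
    version = packet[0] >> 4
    if version != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if packet[9] != 6:  # protocol number 6 is TCP
        raise ValueError("not a TCP segment")
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # The first four bytes of the TCP header are the two port numbers.
    src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
    return {"src_ip": src_ip, "src_port": src_port,
            "dst_ip": dst_ip, "dst_port": dst_port}


def build_sample_packet() -> bytes:
    """Build a hypothetical packet: 192.0.2.10:51514 -> 198.51.100.7:443."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 0, 0, 64, 6, 0,  # version/IHL ... TTL, protocol, checksum
        socket.inet_aton("192.0.2.10"),
        socket.inet_aton("198.51.100.7"),
    )
    tcp_header = struct.pack("!HH", 51514, 443) + bytes(16)
    return ip_header + tcp_header


print(parse_ipv4_tcp(build_sample_packet()))
```

None of these fields depend on whether the payload that follows is encrypted, and destination port 443 by itself already suggests HTTPS.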
Using online tools (for example, whois.domaintools.com), you can find out who an IP address belongs to, and a simple Bing query shows which sites are hosted on that IP address (for example, https://www.bing.com/search?q=ip:22.214.171.124).
Let’s go further. A web server can host several sites, and each can have its own SSL certificate, so the IP address alone is not enough. When the first request arrives, the web server has to know which site the connection is for. There is no encryption at this point, because it has not been established yet.
So, even before encryption begins, the client must somehow tell the server the domain of the site, so that the web server can route the request to the right resource. That means we should look at the very first request from the client to the server, the one that starts the encryption setup. Once again, we open our beloved Wireshark and take a look.
Here we can find something interesting:
1) The first request contains, unencrypted, the domain name of the site with which the HTTPS connection is being established.
2) The second message, the server’s response, delivers the certificate itself to the client machine unencrypted (TLS 1.3 changes this and encrypts the certificate), and the certificate states which domain it was issued for. In the case of Bing, the certificate also includes the Subject Alternative Name extension, which lists the domains the certificate is valid for (the Bing certificate even contains addresses of the staging environment).
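The domain name in the first request travels in the Server Name Indication (SNI) extension of the TLS ClientHello message. The Python sketch below shows how it could be pulled out of the raw bytes of that first record. It assumes the whole ClientHello fits in one TLS record, and build_client_hello is a hypothetical helper that produces a stripped-down message only for this demonstration.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the host name from the SNI extension of a TLS ClientHello, if present."""
    try:
        # TLS record header: content type (1), version (2), length (2).
        if record[0] != 0x16:  # 0x16 = handshake
            return None
        pos = 5
        # Handshake header: message type (1), length (3).
        if record[pos] != 0x01:  # 0x01 = ClientHello
            return None
        pos += 4
        pos += 2 + 32  # client version + random
        pos += 1 + record[pos]  # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len  # cipher suites
        pos += 1 + record[pos]  # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if name_type == 0:  # host_name
                    return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error):
        return None  # truncated or malformed input
    return None


def build_client_hello(host: str) -> bytes:
    """Build a minimal, hypothetical ClientHello that carries only an SNI extension."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni_data = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0, len(sni_data)) + sni_data
    body = (
        b"\x03\x03" + bytes(32) + b"\x00"      # version, random, empty session id
        + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
        + b"\x01\x00"                          # one compression method (null)
        + struct.pack("!H", len(ext)) + ext
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


print(extract_sni(build_client_hello("www.bing.com")))
```

A real capture would feed the TCP payload of the first client packet to extract_sni; no keys are needed, because this message is sent before encryption starts.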
An HTTP proxy such as Fiddler (in the picture above) already knows how to extract this information from the traffic.
It would seem that this is all, but there is another way to find out which resources Mr X visited without breaking HTTPS. When he types a site address in the browser, the first request does not go to the server that hosts the site; it goes to a DNS server to get the IP address for that domain. Ordinary DNS queries are not encrypted, so just by listening to DNS traffic you can build a history of the resources Mr X visited, and from the IP address of the DNS server you can even estimate where he was at the time.
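As an illustration, the queried name can be read straight from the payload of a classic DNS query sent over UDP port 53. The following Python sketch parses the question section of such a message. The function names and the sample query are assumptions made for the example, and name compression is deliberately not handled.

```python
import struct


def parse_dns_question(payload: bytes) -> str:
    """Return the first queried name from a raw DNS message (the UDP payload)."""
    _query_id, _flags, qdcount = struct.unpack_from("!HHH", payload, 0)
    if qdcount < 1:
        raise ValueError("message contains no question")
    pos = 12  # the fixed DNS header is 12 bytes long
    labels = []
    while True:
        length = payload[pos]
        if length == 0:  # a zero-length label ends the name
            break
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        pos += 1
        labels.append(payload[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels)


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a hypothetical A-record query for the given name."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(part)]) + part.encode("ascii") for part in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


print(parse_dns_question(build_dns_query("www.bing.com")))
```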
It’s time to sum up:
1) Without breaking the encryption, we can still track which XXX resources (protected or not) Mr X visits in the evening.
2) You can write a script that filters the traffic and keeps a full history of what Mr X visited over, say, the last year (we cannot say what he did there, but we can say with certainty that he was there).
3) If I have direct access to the traffic (because I am the administrator who controls routing, or the Internet provider the traffic passes through), I do not even need to do anything on the client machine to redirect traffic to me.
4) Wi-Fi or satellite Internet can be poorly protected, and knowing the client’s address, you can determine which resources it talks to.
5) And most important of all: Mr X has noticed nothing, and we are already gathering information about him.
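The script mentioned in point 2 could look roughly like the Python sketch below. It assumes that the observations have already been extracted from traffic as (timestamp, client address, domain) tuples; that input format, the function name and the sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def build_history(observations):
    """Group (timestamp, client_ip, domain) tuples into a per-client visit history."""
    history = defaultdict(dict)
    for ts, client, domain in sorted(observations):
        entry = history[client].setdefault(
            domain, {"first_seen": ts, "last_seen": ts, "hits": 0}
        )
        entry["last_seen"] = ts  # input is sorted, so this is the latest so far
        entry["hits"] += 1
    return dict(history)


# Hypothetical observations for one client.
sample = [
    (datetime(2018, 1, 5, 21, 30), "192.0.2.10", "www.bing.com"),
    (datetime(2018, 1, 6, 22, 10), "192.0.2.10", "www.bing.com"),
    (datetime(2018, 1, 6, 22, 15), "192.0.2.10", "example.org"),
]

for client, domains in build_history(sample).items():
    for domain, info in domains.items():
        print(client, domain, info["first_seen"], info["last_seen"], info["hits"])
```

The history records only that a domain was contacted and when, which matches the limits described above: it shows where Mr X was, not what he did there.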
Denis Koloshko — CISSP, CEO at Dhound