Understanding Tor And Its Anonymity

Published in Systems and Network Security

10 min read · Apr 10, 2021

By Yice Jiang and Jiameng Duan

Introduction

Tor, short for “The Onion Router”, is a free and open-source project that enables anonymous communication by routing Internet traffic through a worldwide volunteer overlay network of more than seven thousand relays. It conceals users’ locations, IP addresses and activities from network monitoring. The goal of Tor is to protect users’ personal privacy and their freedom to communicate confidentially. Tor also lets web services stay anonymous, which is what makes the dark web possible. However, Tor is not invulnerable: it has weaknesses that can defeat its anonymity.

Usage of Tor

Tor is good for anyone who wants to keep their online activities private. Typical uses include, but are not limited to, getting around censorship imposed by a local government, hiding your IP address and preventing your browsing habits from being linked to you. The Tor network can also host websites that are reachable only by other Tor users, known as the Dark Web. You can find almost anything on the Dark Web, from used cars to drugs and maybe even worse, as long as you know the specific URL of the site. Browse at your own risk.

If you are reading news on MSN or watching videos on YouTube, you probably don’t need Tor. It will only slow down your connection without providing any benefit. If you are on public Wi-Fi, make sure you use HTTPS for every website you visit, and consider a VPN to encrypt your traffic rather than Tor, because in that situation you need to secure your traffic rather than anonymize it.

Tor is better than nothing if you have no VPN. Even so, you should not log in to any website over Tor, especially financial ones such as your bank, because you don’t know who controls the relays in the middle and, most importantly, the exit node.

In summary, don’t bother with Tor if you don’t really need to be anonymous.

Basic concept

Onion routing is the foundation of Tor. It applies encryption at the application layer in nested layers, like the layers of an onion. Normally, when a user sends data to or requests data from a destination, the traffic goes directly to the destination’s IP address. With Tor, it works very differently. The Tor browser first builds a circuit by randomly selecting relays from the Tor network (three by default). It then encrypts the data once for each relay in the circuit, with each layer also carrying the IP address of the next relay, and sends the result along the circuit. When the packet reaches the first relay, that relay decrypts one layer to reveal the IP address of the next relay and passes the remaining encrypted data on. This repeats at every relay until the packet reaches the last one, the exit node, which decrypts the innermost layer and forwards the packet to the destination without knowing the source IP address.

The destination thinks the request comes from the exit node and responds to the exit node. The response then travels back through the same circuit to the source. Because each relay sees only part of the route, no single relay, and no observer watching a single relay, can learn both the source and the destination of the data.
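To make the layering concrete, here is a minimal Python sketch of wrapping and peeling the layers. It is a toy and not Tor’s actual cryptography: the SHA-256 based keystream, the three random hop keys and the names `wrap` and `route` are all our own assumptions for illustration. The real protocol negotiates a key with each relay and also carries next-hop information, which this sketch leaves out.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing key || counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hop_keys: list[bytes]) -> bytes:
    """Client side: apply one layer per relay, exit layer first, entry layer last."""
    for key in reversed(hop_keys):
        payload = xor_layer(key, payload)
    return payload


def route(onion: bytes, hop_keys: list[bytes]) -> bytes:
    """Relays in circuit order each remove only their own layer."""
    for key in hop_keys:
        onion = xor_layer(key, onion)
    return onion


# Hypothetical circuit keys for the entry, middle and exit relays.
hop_keys = [os.urandom(32) for _ in range(3)]
message = b"GET / HTTP/1.1"

onion = wrap(message, hop_keys)              # what leaves the client
after_entry = xor_layer(hop_keys[0], onion)  # what the entry relay passes on
delivered = route(onion, hop_keys)           # what the exit relay forwards
```

After the entry relay removes its layer, `after_entry` is still unreadable; the original message only reappears once all three layers have been removed.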

Traffic inside the Tor network is carried in units called cells. Each cell has a fixed size of 512 bytes, which makes traffic analysis harder. Like packets, cells consist of a header and a payload.
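As a rough illustration of fixed-size cells, the Python sketch below pads every payload so that each cell is exactly 512 bytes. The header layout used here (a 2-byte circuit ID followed by a 1-byte command) and the zero padding are simplifying assumptions of ours, and `pack_cell` and `unpack_cell` are hypothetical helper names.

```python
import struct

CELL_SIZE = 512                          # fixed cell size from the text
HEADER = struct.Struct(">HB")            # assumed: 2-byte circuit ID, 1-byte command
PAYLOAD_SIZE = CELL_SIZE - HEADER.size   # bytes left for the payload


def pack_cell(circ_id: int, command: int, data: bytes) -> bytes:
    """Build one cell, zero-padding the payload to the fixed size."""
    if len(data) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one cell")
    return HEADER.pack(circ_id, command) + data.ljust(PAYLOAD_SIZE, b"\x00")


def unpack_cell(cell: bytes) -> tuple[int, int, bytes]:
    """Split a cell into circuit ID, command and (still padded) payload."""
    if len(cell) != CELL_SIZE:
        raise ValueError("cells must be exactly CELL_SIZE bytes")
    circ_id, command = HEADER.unpack_from(cell)
    return circ_id, command, cell[HEADER.size:]


short_cell = pack_cell(circ_id=7, command=3, data=b"hello")
full_cell = pack_cell(circ_id=7, command=3, data=b"x" * PAYLOAD_SIZE)
```

Both `short_cell` and `full_cell` have the same length on the wire, so an observer cannot tell a 5-byte message from a full one by size alone.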

When using Tor, it is common that the website greets you in a foreign language, as it presumes your country and language from your IP address, which is the IP address of the exit node.

This is an example of using Tor. Even though I am physically located in Canada, it displays the website in Dutch.

Hidden services

Figure 2. Connection From A Client To A Hidden Service

With the Tor network, not only users but also servers can hide their identities. A web server can be hosted inside the Tor network so that clients requesting the service do not know where it is hosted, i.e., its IP address. Hidden services are also known as onion sites, and the address of a hidden service usually has the form xxxdomain.onion. When a hidden service is set up, it chooses and contacts several relays to act as its introduction points. The hidden service also generates an onion service descriptor, which contains the introduction points and its public key. The descriptor is then signed with the service’s private key and published to the directory servers.

To connect to a hidden service, a client first queries the Tor directory for the introduction points associated with the service. The client also chooses a rendezvous point and gives it a one-time secret. It then sends a request, containing the rendezvous point’s address and the one-time secret, to one of the introduction points, which passes it on to the service. Once the service receives the request, it builds a circuit through the Tor network to the rendezvous point and sends it the one-time secret. The rendezvous point compares the two secrets and, if they match, notifies the client that a connection is established. From that point on, the rendezvous point simply relays traffic between the client and the service, like a middle node.

The connection can be much slower than an ordinary Tor connection because more hops are involved: there are at least two hops from the client to the rendezvous point and at least three hops from the server to the rendezvous point (Tor Project). The screenshot below shows the circuit created to connect to a hidden service.

Figure 3. Circuit Shown In The Tor Browser
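The matching of the one-time secret at the rendezvous point can be sketched in a few lines of Python. This is only a toy model under our own assumptions: the `RendezvousPoint` class, its `establish` and `join` methods and the 20-byte secret are hypothetical, and the real protocol adds circuits, encryption and a key exchange on top.

```python
import hmac
import os


class RendezvousPoint:
    """Toy relay that connects two sides once both present the same secret."""

    def __init__(self) -> None:
        self._waiting: set[bytes] = set()

    def establish(self, secret: bytes) -> None:
        """Client side: register the one-time secret and wait for the service."""
        self._waiting.add(secret)

    def join(self, secret: bytes) -> bool:
        """Service side: present the secret; succeed only on an unused match."""
        for waiting in self._waiting:
            if hmac.compare_digest(waiting, secret):
                self._waiting.discard(waiting)  # one-time: cannot be reused
                return True
        return False


rendezvous = RendezvousPoint()
secret = os.urandom(20)            # chosen by the client

rendezvous.establish(secret)       # client -> rendezvous point
connected = rendezvous.join(secret)  # service -> rendezvous point
replayed = rendezvous.join(secret)   # a second attempt with the same secret
```

Here `connected` is `True` and `replayed` is `False`, because the secret is discarded after its first successful use.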

Vulnerabilities of Tor

Vulnerability based on bandwidth. As mentioned above, a Tor client selects three nodes to build a circuit in the Tor network, and it uses the Bandwidth-Weighted Random Selection algorithm to do so. The client first contacts a directory server to download a list of the available nodes. In this list, some nodes are marked as Guard and some as Exit: the entry node can only be chosen from the Guard nodes, and the exit node only from the nodes with the Exit mark. The algorithm then selects nodes according to their bandwidth: the larger a node’s bandwidth, the higher the probability that it is selected. This improves performance but also introduces a security risk. Consider the circuit in Figure 2: if an attacker controls both the entry node and the exit node, the attacker can correlate the traffic entering and leaving the circuit, and the anonymity of the whole circuit is defeated. An attacker can exploit the selection algorithm by setting up high-bandwidth nodes in the Tor network, which raises the probability of controlling both the entry and exit nodes. Researchers measured the bandwidth of a group of Tor nodes and found that the differences are significant and that only a few nodes have a high bandwidth (> 4000 Kb/s) (Shen et al., 2018).

Figure 4. Bandwidth Distribution (Shen et al., 2018)
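To see how much bandwidth weighting can favor an attacker, here is a small Python sketch. It assumes the simplest possible form of weighting, where a node’s selection probability is its share of the total bandwidth in its group. The relay names and bandwidth figures are made up for illustration, and the real selection algorithm has additional rules that are not modeled here.

```python
import random


def selection_probabilities(bandwidths: dict[str, float]) -> dict[str, float]:
    """Probability of each relay being picked: its share of total bandwidth."""
    total = sum(bandwidths.values())
    return {name: bw / total for name, bw in bandwidths.items()}


def pick_relay(bandwidths: dict[str, float], rng: random.Random) -> str:
    """Draw one relay at random, weighted by bandwidth."""
    names = list(bandwidths)
    weights = [bandwidths[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical relays: name -> bandwidth in Kb/s.
guards = {"guard-a": 500, "guard-b": 500, "attacker-guard": 4000}
exits = {"exit-a": 1000, "exit-b": 1000, "attacker-exit": 6000}

p_entry = selection_probabilities(guards)["attacker-guard"]
p_exit = selection_probabilities(exits)["attacker-exit"]
p_both = p_entry * p_exit  # chance the attacker holds both ends of a circuit

sample_entry = pick_relay(guards, random.Random(0))
```

With these made-up numbers, the attacker’s two relays account for most of the bandwidth in each group, so `p_both` comes out at roughly 0.6: more than half of all circuits would have both ends under the attacker’s control.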

Attack On Exit Node. Tor nodes are run by volunteers, so the Tor network can be attacked by any entity that controls a significant fraction of its nodes.

In January 2020, a group of threat actors started adding nodes to the Tor network. By May 2020, the number of exit nodes under their control had peaked at 380. The attackers performed an SSL-strip attack at the exit node: the exit node intercepts the victim’s request and forwards it to the web server, and when the server answers with an HTTPS link, the attacker downgrades it to HTTP, provided the web server does not have HSTS enabled, and sends the HTTP link back to the victim (Cimpanu, 2020). The victim’s subsequent traffic is then readable at the exit node.
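From the defender’s side, whether a site offers this protection can be read from its response headers. The Python sketch below checks a response for the Strict-Transport-Security header, which is how HSTS is announced. The function names are our own, and the sketch simplifies things: HSTS only helps once the browser has seen the header over a valid HTTPS connection (or the site is preloaded), which is not modeled here.

```python
def hsts_max_age(headers: dict[str, str]) -> int:
    """Return the max-age of the Strict-Transport-Security header, or 0."""
    for name, value in headers.items():
        if name.lower() != "strict-transport-security":
            continue
        for directive in value.split(";"):
            key, _, val = directive.strip().partition("=")
            if key.lower() == "max-age":
                try:
                    return max(int(val.strip().strip('"')), 0)
                except ValueError:
                    return 0  # malformed value: treat as no policy
    return 0


def announces_hsts(headers: dict[str, str]) -> bool:
    """True if the response asks browsers to insist on HTTPS in future."""
    return hsts_max_age(headers) > 0


# Hypothetical response headers for illustration.
protected = {"Strict-Transport-Security": "max-age=31536000; includeSubDomains"}
unprotected = {"Content-Type": "text/html"}
```

A site that returns headers like `unprotected` gives the browser no reason to refuse a downgraded HTTP link, which is the situation the attackers relied on.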

Weakness In Traffic Flow. Fingerprinting attacks are low-cost and can hurt users’ anonymity. A typical Tor circuit has three types of nodes: an entry node, a middle node and an exit node. As shown in Figure 5, when building the circuit, the client first establishes a connection with the entry node and performs a key exchange. The circuit is then extended from the entry node to the middle node, and from the middle node to the exit node. Because every message used to extend the circuit passes through the entry node, the entry node carries more traffic during circuit creation than the other nodes. It is therefore possible to infer the type of a node by capturing and analyzing the traffic that flows through it.

Figure 5. Packets Flow in a Circuit Creation Progress (Shen et al., 2018)
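One way to see why the entry node is busier during circuit creation is to count messages in a simplified model. The Python sketch below assumes that the circuit is extended one hop at a time and that each extension costs one request and one reply, both of which cross every relay already in the circuit. The function name and the two-messages-per-hop figure are our assumptions and are not measurements from Shen et al.

```python
def messages_handled(num_hops: int = 3) -> list[int]:
    """Count messages each relay handles while the circuit is built hop by hop.

    Index 0 is the entry node, the last index is the exit node.
    """
    handled = [0] * num_hops
    for target in range(num_hops):        # extend the circuit to relay `target`
        for relay in range(target + 1):   # every relay up to and including it
            handled[relay] += 2           # one request passes in, one reply back
    return handled


counts = messages_handled()  # [6, 4, 2] for entry, middle, exit
```

Even in this crude model the entry node handles three times as many messages as the exit node, so the position of a relay in the circuit leaves a visible trace in its traffic.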

In addition, the circuits built when a client and a hidden service establish a connection have distinctive characteristics. Experiments have shown that circuits between a hidden service and its introduction point are long-lived compared to other circuits and have exactly three outgoing cells. The circuit built by a client to a randomly chosen rendezvous point starts with a pattern of two outgoing cells followed by one incoming cell. The circuit built by a hidden service to the rendezvous point chosen by the client has more outgoing cells than incoming ones (Kwon et al., 2015). Based on these characteristics, it is possible to train classifiers that distinguish these circuits. Attackers can learn private information about the client if they control the entry and/or exit nodes, and with these methods they have a better chance of locating the entry or exit nodes of a circuit.
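The characteristics listed above can be turned into a toy rule-based labeler, shown below in Python. This is not the classifier from Kwon et al., who train models on real traces. The function name, the label strings, the order of the rules and the one-hour threshold for “long-lived” are all assumptions made for illustration.

```python
OUT, IN = 1, -1  # direction of each cell as seen on the circuit


def label_circuit(cells: list[int], lifetime_s: float,
                  long_lived_s: float = 3600.0) -> str:
    """Guess the circuit type from cell directions and circuit lifetime."""
    outgoing = cells.count(OUT)
    incoming = cells.count(IN)
    # Long-lived with exactly three outgoing cells: service to introduction point.
    if lifetime_s >= long_lived_s and outgoing == 3:
        return "service-to-introduction-point"
    # Starts with two outgoing cells, then one incoming: client to rendezvous point.
    if cells[:3] == [OUT, OUT, IN]:
        return "client-to-rendezvous-point"
    # More outgoing than incoming cells: service to rendezvous point.
    if outgoing > incoming:
        return "service-to-rendezvous-point"
    return "unknown"


# Hypothetical traces for illustration.
intro = label_circuit([OUT, OUT, OUT], lifetime_s=7200)
client_rp = label_circuit([OUT, OUT, IN, IN, IN], lifetime_s=30)
service_rp = label_circuit([OUT, IN, OUT, OUT], lifetime_s=30)
```

A real attack would replace these hand-written rules with a trained classifier, but the sketch shows how little information (direction and timing of cells, not their content) is needed to tell the circuit types apart.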

Tor-related products

The Tor Browser is the flagship product of the Tor Project. Research on “onion routing” began in the mid-1990s, and the Tor Browser was announced in early 2008. It consists of a modified Mozilla Firefox ESR web browser, the TorButton, TorLauncher, NoScript and HTTPS Everywhere Firefox extensions, and the Tor proxy (Perry et al., 2013). It runs on different operating systems, such as Windows, Linux and macOS. The Tor Browser starts the Tor background processes automatically and directs traffic through the Tor network. When a session is closed, the browser deletes privacy-sensitive data such as cookies, username/password pairs and browsing history.

Tor Messenger is another product of the Tor Project. It is messaging software based on Instantbird, with Tor and OTR (Off-the-Record Messaging) built in. It supports several communication protocols and implements all chat protocols in JavaScript, a memory-safe language. It was first released on October 29th, 2015, and maintenance stopped on September 28th, 2017 because Instantbird itself was discontinued. The Tor Messenger development team stated that it would be impossible to handle vulnerabilities found in the future, as Tor Messenger is based on discontinued software.

There is also third-party software that uses Tor as a component, such as BitTorrent clients, Bitmessage, TorChat and OnionShare.

Using Tor with VPN

To maximize security and anonymity, the best approach is to use a VPN and Tor together. There are two ways to combine them: Tor over VPN and VPN over Tor.

Tor over VPN means you connect to a VPN first and then access the Tor network. Your IP address is hidden by the VPN, because the VPN encrypts your data and routes it through one of its servers before it reaches the Tor network. As a result, all of your traffic is protected and the volunteer-run Tor nodes cannot see your IP address. The drawback of this method is that it does not protect you from compromised exit nodes, because neither Tor nor the VPN encrypts the traffic between the exit node and the destination.

VPN over Tor means you connect to the Tor network first and then to the VPN. It is more complicated than the previous method, as you may need to configure the VPN manually. With this method, the traffic leaving the exit node goes to one of the VPN’s servers and from there to the destination. The big advantage is protection from compromised exit nodes: the traffic passing through the exit node is encrypted by the VPN, so the exit node cannot learn your IP address or any other information. The disadvantage is that your ISP or other network surveillance will know you are using Tor, even though they don’t know what you use it for.

Conclusion

In this post, we introduced the background of Tor and its possible weaknesses. Tor protects its users from tracking, and the Tor network is run and supported by volunteers. However, Tor has weaknesses that can hurt its users’ anonymity. Attackers might exploit the weakness in the node selection algorithm to gain control of the entry and exit nodes of a circuit. Compromised exit nodes might expose traffic through SSL-strip attacks. Traffic flow patterns during circuit creation could help attackers learn the type of a node. Using Tor over VPN hides your data on its way to the entry node, while using VPN over Tor protects your data on its way from the exit node to the web server.

References

Perry, Mike, et al. “The Design and Implementation of the Tor Browser [DRAFT].” Tor Project, 2013.

“How Do Onion Services Work?” Tor Project, community.torproject.org/onion-services/overview/.

Shen, Shiyu, et al. “Weakness Identification and Flow Analysis Based on Tor Network.” Proceedings of the 8th International Conference on Communication and Network Security — ICCNS 2018, 2018, doi:10.1145/3290480.3290481.

Cimpanu, Catalin. “A Mysterious Group Has Hijacked Tor Exit Nodes to Perform SSL Stripping Attacks.” ZDNet, 10 Aug. 2020, www.zdnet.com/article/a-mysterious-group-has-hijacked-tor-exit-nodes-to-perform-ssl-stripping-attacks/.

Kwon, Albert, et al. “Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services.” Proceedings of the 24th USENIX Conference on Security Symposium (SEC’15), USENIX Association, 2015, pp. 287–302.

Written by 783Cybersecurity Group Post