Resident Evil: Understanding Residential IP Proxy as a Dark Service

This article presents our work on residential IP proxies and their security issues. Our paper has been accepted by the IEEE Symposium on Security and Privacy 2019, the premier conference for computer security research. You can find more technical details in our research paper.

Web proxies are commonly used for purposes such as anonymizing a user's identity and accessing digital content that is available only in certain countries. However, traditional proxies such as VPNs, and newer ones such as the Tor network, share a common limitation: their IP addresses are known to be non-residential, because most of them are deployed in data centers. Since the IP address ranges of data centers are widely known, and some proxy networks such as Tor even publish the list of their IP addresses, traffic destinations such as Netflix can easily identify visits made through those proxies and degrade service for that traffic. Indeed, as reported in its blog, Netflix already employs measures to block visits from web proxies, and previous research shows that visits from the Tor network are degraded by many web services.

In our work, we identified an emerging and increasingly popular web proxy service: residential IP proxy as a service, which we call RPaaS. Since the beginning of 2017, we have observed more and more companies offering this kind of service, starting from three and growing to tens by December 2017. Many of them claim to have millions of residential IP addresses available to proxy your traffic, so that you will never get blocked.

To use such a service, you just need to subscribe to it and configure your clients with the address (either an IP address or a domain name) of its gateway. Your traffic will then be routed through millions of residential IP addresses to its destinations. The common service model, which works in a backconnect manner, is shown below. Everything looks good, right?

RPaaS service model from the outsider’s perspective
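
From the customer's side, using such a gateway amounts to ordinary HTTP proxy configuration. The following is a minimal sketch in Python, not any provider's actual client: the gateway host `gateway.example.net`, the port `8000`, and both function names are hypothetical, and real services typically also require credentials, which are omitted here.

```python
import urllib.request


def gateway_proxies(host, port):
    """Return a proxy mapping that points HTTP and HTTPS traffic at one gateway.

    The host and port are placeholders; a real provider supplies its own.
    """
    gateway = f"http://{host}:{port}"
    return {"http": gateway, "https": gateway}


def make_opener(host, port):
    """Build a urllib opener whose requests are all sent via the gateway."""
    handler = urllib.request.ProxyHandler(gateway_proxies(host, port))
    return urllib.request.build_opener(handler)


# Hypothetical usage (not executed here, since the gateway does not exist):
#   opener = make_opener("gateway.example.net", 8000)
#   with opener.open("http://probe.example.org/", timeout=30) as resp:
#       print(resp.status)
```

The client only ever talks to the gateway; which residential address finally contacts the destination is decided behind the gateway and is invisible to the customer.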

However, as we all know, residential IP addresses are typically assigned directly by ISPs to home owners. If those services' claims are true, how can they harvest so many residential IP addresses? Besides, what are those services currently used for, and are any malicious activities involved?

In our work, we aim to answer these simple but difficult-to-answer questions by leveraging a series of techniques including infiltration, traffic fingerprinting, and machine learning. In this article, we focus on presenting our answers to those questions; for research details, please read our paper.

What are the scale and distribution of residential IP proxies?

We subscribed to five popular RPaaS services as ordinary customers and infiltrated them by sending traffic from clients we controlled, through their services, to our own servers. In this way, we recorded which IP addresses were used as proxies to relay our traffic.
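
The server side of this kind of infiltration can be sketched as follows. This is an illustrative toy rather than our measurement infrastructure: the names `ProbeHandler`, `observed`, and `start_server` are hypothetical, and the probe identifier is simply assumed to be carried in the URL path. The point is that the source address the server sees for each probe is the address of the proxy that relayed it.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

observed = []                 # (probe_id, source_ip) pairs seen by the server
_lock = threading.Lock()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The path carries a probe identifier chosen by our own client.
        probe_id = self.path.lstrip("/")
        # client_address[0] is the peer IP, i.e. the relaying proxy.
        with _lock:
            observed.append((probe_id, self.client_address[0]))
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # silence the default per-request logging


def start_server(host="127.0.0.1", port=0):
    """Start the logging server in a background thread and return it."""
    server = ThreadingHTTPServer((host, port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A controlled client would then fetch a URL such as `/probe-0001` through the gateway, and the matching entry in `observed` reveals which address relayed that particular probe.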

In a five-month time window, we collected 6.2 million IP addresses from those five services; their distribution across the geolocation and IP spaces is shown below. Those IP addresses are distributed across 238 countries and regions, 28,035 /16 network prefixes, and 52K+ ISPs.

Global distribution of the 6.2-million IP addresses used as web proxies
IP space distribution of the 6.2-million IP addresses used as web proxies
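
Counting /16 network prefixes, as in the figures above, means grouping addresses by their first two octets. Here is a small sketch using Python's standard `ipaddress` module; it assumes IPv4 addresses, the function names are hypothetical, and the sample addresses come from reserved documentation ranges rather than from our dataset.

```python
import ipaddress
from collections import Counter


def slash16(ip):
    """Return the /16 network that contains an IPv4 address, as a string."""
    # strict=False lets us pass a host address instead of a network address.
    return str(ipaddress.ip_network(f"{ip}/16", strict=False))


def prefix_histogram(ips):
    """Count how many addresses fall into each /16 prefix."""
    return Counter(slash16(ip) for ip in ips)


if __name__ == "__main__":
    sample = ["198.51.100.1", "198.51.7.9", "203.0.113.7"]  # toy data
    for prefix, count in sorted(prefix_histogram(sample).items()):
        print(prefix, count)
```

The number of distinct keys in the histogram corresponds to the count of /16 prefixes covered by a set of addresses.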

Are those IP addresses truly residential ones?

To answer this question, we trained a classifier with high precision and recall to determine whether each captured IP address is residential. Among the 6.2 million captured IP addresses, 5.9 million (95.22%) are predicted to be residential.

How can those services recruit so many residential IP addresses?

Among the five RPaaS services, Luminati was found to have a volunteer recruitment program named Hola, through which users can install its software and proxy others' traffic in exchange for free relaying of their own traffic. For the other four, we did not find any volunteer recruitment programs.

Furthermore, we conducted real-time device fingerprinting (inside and outside; see our paper for details) whenever we captured an IP address, and we successfully identified the device type and vendor information for 547,497 IP addresses. To our surprise, 237,029 of them turned out to be IoT systems, such as web cameras, DVRs, and printers.

More importantly, we identified non-gateway IoT devices (IoT devices that are not gateways such as NAT devices or routers) in all five services, including Luminati. Considering that Luminati did not provide software for users to install on IoT devices, and that we identified no volunteer recruitment channels for the other services, it is quite suspicious that those IoT devices show up as residential proxies.

The following table shows the identified devices and their vendor information.

Device type distribution of residential IP addresses used as proxies

What are those residential proxies used for?

With help from a leading security company, we got access to a real-time traffic dataset of suspicious software installed on users' devices. Through precise correlation, we identified 67 different PUPs (potentially unwanted programs) used as residential proxies, 50 of which were flagged by VirusTotal with multiple alarms. The following table shows the top 10 PUPs and the services they belong to. As you can see, PUPs were identified for all five services.

Further analysis of the traffic logs shows that residential IP proxies are used for a mix of legitimate and malicious purposes. Among the top 1,000 traffic destinations, ads account for 75%, search engines for 8%, shopping for 7%, malicious websites for 5%, and social networks for 2%.

In addition, 9.36% of the destination addresses were reported as malicious by VirusTotal (68.92% labeled as malware sites, 29.97% as malicious sites, and 2.24% as phishing sites).

Top 10 PUPs and Services where they were identified. PO for Proxies Online; GS for Geosurf; IP for IAPS; LU for Luminati; PR for ProxyRack.

Other findings and implications

We also found evidence of collusion between different services. For example, IAPS is likely a reseller of Luminati, while Proxies Online and Geosurf are two brands of the same company. One important indicator of those relations is the overlap of IP addresses between services, as shown below. Please refer to our paper for more evidence.

Overlap of IP addresses between different services. PO for Proxies Online; GS for Geosurf; IP for IAPS; LU for Luminati; PR for ProxyRack.
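
An overlap measurement of this kind reduces to pairwise set intersection over the addresses captured from each service. The sketch below is one possible way to compute it, not necessarily the exact metric used in our paper: the function name is hypothetical, the shared fraction is normalized by the smaller set as an illustrative choice, and the services "A", "B", and "C" with their addresses are toy data.

```python
from itertools import combinations


def pairwise_overlap(ips_by_service):
    """Map each pair of services to (shared count, share of the smaller set)."""
    result = {}
    for a, b in combinations(sorted(ips_by_service), 2):
        shared = ips_by_service[a] & ips_by_service[b]
        smaller = min(len(ips_by_service[a]), len(ips_by_service[b]))
        fraction = len(shared) / smaller if smaller else 0.0
        result[(a, b)] = (len(shared), fraction)
    return result


if __name__ == "__main__":
    toy = {
        "A": {"192.0.2.1", "192.0.2.2", "192.0.2.3"},
        "B": {"192.0.2.2", "192.0.2.3"},
        "C": {"198.51.100.1"},
    }
    for pair, (count, fraction) in pairwise_overlap(toy).items():
        print(pair, count, f"{fraction:.0%}")
```

A pair whose smaller pool is almost entirely contained in the larger one is a candidate for a reseller or shared-backend relationship, which then needs to be confirmed with other evidence.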

We also found that some IP addresses were blacklisted by detectors of IoT botnet campaigns on the same day they were serving as proxies. In addition, residential proxies show different traffic patterns compared to traditional bots, which poses new challenges for detecting them.
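
The same-day check can be pictured as a join on (IP address, date) between proxy sightings and blacklist events. The following is a minimal sketch under assumed record formats: the function name, the tuple layout, and the sample records are all hypothetical and do not reflect the actual blacklist feeds we used.

```python
from datetime import date


def same_day_hits(proxy_sightings, blacklist_events):
    """Return (ip, day) pairs present in both inputs.

    Each input is an iterable of (ip, day) tuples, where day is a
    datetime.date on which the IP was seen as a proxy or was blacklisted.
    """
    return sorted(set(proxy_sightings) & set(blacklist_events))


if __name__ == "__main__":
    sightings = [                       # toy data
        ("192.0.2.10", date(2017, 11, 3)),
        ("192.0.2.11", date(2017, 11, 3)),
    ]
    blacklist = [
        ("192.0.2.10", date(2017, 11, 3)),   # same IP, same day: a hit
        ("192.0.2.11", date(2017, 11, 9)),   # same IP, different day: no hit
    ]
    print(same_day_hits(sightings, blacklist))
```

Matching on the day as well as the address matters because residential addresses are often reassigned, so a blacklist entry from a different date may refer to a different host.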

Conclusion

Although they claim to anonymize their customers, those services were found to have security issues in two respects. One is the suspicious recruitment of such a large number of residential IP addresses; the other is the use of those services as infrastructure for malicious activities.