Incentivizing Censorship Measurements via Circumvention [SIGCOMM 2018]

Can we join forces to FINALLY free the Internet?

Sean Choi
Computer Science Literature Review
7 min read · Sep 2, 2018


Our goal is to provide an easy, general and high-level view of this paper’s contributions, along with our take on its implications. Please refer to the actual paper for more details if this post interests you, and please feel free to contact us about errors, changes and suggestions!

You have probably heard stories about how you can’t access Google, Facebook and YouTube in China, or Google Translate in Saudi Arabia. These stories are the result of Internet censorship, a term for the control or suppression of what can be accessed, published, or viewed on the Internet. Internet censorship began as more and more people started to share information online. Since its inception, it has grown to be quite prevalent across many countries (around 70 countries in 2017, to be more exact). Surprisingly, it’s not only China, Saudi Arabia and North Korea that censor what can be viewed on the Internet. The list includes South Korea, the United Kingdom and even the United States, the country that built the Internet in the first place. In fact, the United States and the United Kingdom have both been named among the enemies of the Internet. This shows how prevalent censorship and Internet surveillance are across the globe.

FIG. 1. Internet Censorship in China is often called the Great ‘Firewall’ of China

However, this doesn’t mean that everyone censors the same content or feels the same way about the government censoring what they see on the Internet. Although censoring something from the public sounds quite oppressive, a survey performed in 2012 among 10,000 users across 20 countries showed that around 71% of people agree with having some form of censorship. Although I believe information should flow freely between people, I also agree with having some form of censorship, because I think there are things on the Internet that should not be shared by anyone with anyone else, e.g., child pornography, terrorist propaganda, and illegal gambling. The list of things that should not be shared differs from person to person, so the extent of censorship varies greatly from country to country. Some countries only ban content that is illegal, whereas others ban content that exposes sensitive information or questions the government or the religion the country is based on. Some countries may present themselves as if they do not censor anything, yet snoop on what you are viewing online. In any case, there is censorship in many corners of the globe today, and people react very differently to what the government censors and how.

Background

Now let’s talk about the fun stuff. There are many ways you can censor something online. First, there are the non-technical and boring(?) methods, such as putting regulations in place, sending a bunch of people to jail for sharing banned content, bribing publishers to post content that meets the government’s standards, and much more. Aside from these, there are the technical methods, which are used quite often. Most of the technical methods deal with how your computer talks to the server. One of the most widely used has to do with filtering and/or altering the IP address of the server. An IP address is a number (32 bits long for IPv4 or 128 bits long for IPv6) that tells everyone where your computer sits on the Internet. This is how you can find someone online and how someone can find you back. Now, if you type a web address, e.g., Youtube.com, into your browser (the full address is also called a Uniform Resource Locator (URL)), something called the Domain Name System (DNS) takes the domain name in the URL and gives you the IP address of a computer that serves the content you want to see. What governments frequently do is block a certain set of IPs or URLs that serve sensitive content, or alter DNS responses to point to a different IP address when sensitive URLs are requested. There is a series of other, more complicated methods that governments use, but these cover a large share of the Internet censorship that happens today.
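To make the two methods above a bit more concrete, here is a minimal Python sketch of a toy censor that either tampers with DNS answers or blocks IP addresses, plus a function that tells the two apart. Everything in it is made up for illustration: the `ToyCensor` class, the `classify` function, the hostnames and the addresses (taken from ranges reserved for documentation) are my own assumptions and do not come from the paper. It also assumes we know the true IP address of each site, which a real measurement tool would have to learn from a resolver it trusts.

```python
import ipaddress
from dataclasses import dataclass, field


def address_bits(addr: str) -> int:
    """Return the bit length of an IP address: 32 for IPv4, 128 for IPv6."""
    return ipaddress.ip_address(addr).max_prefixlen


@dataclass
class ToyCensor:
    """Hypothetical censor: poisons some DNS answers and drops some IPs."""
    poisoned_names: dict = field(default_factory=dict)  # hostname -> fake IP
    blocked_ips: set = field(default_factory=set)

    def resolve(self, hostname: str, true_dns: dict) -> str:
        # A tampering resolver returns a fake address for sensitive names.
        return self.poisoned_names.get(hostname, true_dns[hostname])

    def can_connect(self, ip: str) -> bool:
        # IP blocking: traffic to listed addresses never gets through.
        return ip not in self.blocked_ips


def classify(hostname: str, true_dns: dict, censor: ToyCensor) -> str:
    """Compare what the user sees with the known-good answer."""
    true_ip = true_dns[hostname]
    seen_ip = censor.resolve(hostname, true_dns)
    if seen_ip != true_ip:
        return "dns-tampering"
    if not censor.can_connect(seen_ip):
        return "ip-blocking"
    return "reachable"


# Made-up hostnames and documentation-only addresses.
TRUE_DNS = {
    "video.example": "192.0.2.10",
    "news.example": "198.51.100.20",
    "blog.example": "203.0.113.30",
}
censor = ToyCensor(
    poisoned_names={"video.example": "192.0.2.99"},
    blocked_ips={"198.51.100.20"},
)

for host in TRUE_DNS:
    print(host, classify(host, TRUE_DNS, censor))
```

Real censors and real measurement tools are far messier than this, but the sketch shows why the two methods look different from the user’s side: with DNS tampering you get a wrong address, while with IP blocking you get the right address and simply cannot reach it.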

If you think you can do it, you can. -John Burroughs

Whenever someone tries to block you from doing something, there are always people trying to fight against it. This is where Internet censorship circumvention comes in. There are many ways to bypass an Internet censorship system and gain access to the censored material. Some of the widely used methods are alternate DNS servers, proxy sites that display blocked material, Tor, which allows anonymous connections, and Virtual Private Networks (VPNs), shown in FIG. 2, which allow users to securely connect to relay servers outside of the country. There is also a category of cute products called pocket VPNs, such as this or this, which let you plug in a USB stick that reroutes your Internet traffic to an external VPN server.

FIG. 2. VPN Overview (From Wikimedia Commons)

To combat these circumvention methods, governments and Internet providers continually make improvements(?) to block more content. Then people fight back by finding ways around the latest changes. There are many publications, such as this, that study how governments and people react to changes in circumvention methods. This is an ongoing, and probably never-ending, fight between the people and the government.

Which Problems is this Paper Trying to Solve?

This paper aims to improve the performance of circumvention methods using crowdsourcing and more data. Current circumvention methods are slow to adapt to the latest changes made by Internet providers. In fact, the paper shows that different Internet providers can implement different blocking methods, and that even within the same Internet provider, blocking methods can differ across URLs. Thus, it is critical for a circumvention system to gather this information quickly for each Internet provider and URL, and to adapt to the latest censorship methods as soon as possible.

How is this Paper Solving these Problems?

This paper chose crowdsourcing to collect data across various Internet providers and URLs. The authors built a system called C-saw, which continuously gathers censorship measurements through crowdsourcing and uses these measurements to make adaptive, data-driven circumvention decisions. Users who participate in the crowdsourcing program are incentivized by better circumvention performance. A user can opt in to let the system take measurements on some of the content they visit. C-saw also takes security seriously. It makes a best attempt to keep its users anonymous, so that they do not have to worry about being traced. The paper covers the design of the system in more depth if you are interested.
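To give a feel for what "crowdsourced measurements driving circumvention" could look like, here is a rough Python sketch. To be clear, this is not C-saw’s actual design or algorithm: the `MeasurementStore` class, the `choose_path` function, the blocking labels, the workaround names and the ISP names are all hypothetical, and the real system has to deal with anonymity, false reports and much more. The sketch only captures the idea from the paragraph above: users report what they observe per Internet provider and URL, and the system uses those reports to pick a cheap workaround instead of always taking the slowest, most general route.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from a reported blocking type to a cheap workaround.
WORKAROUND = {
    "none": "direct",
    "dns-tampering": "alternate-dns",
    "ip-blocking": "relay",
}

# Assumption: when nothing is known, fall back to the most general option.
DEFAULT_PATH = "relay"


class MeasurementStore:
    """Toy crowdsourced database keyed by (ISP, URL)."""

    def __init__(self):
        self._reports = defaultdict(Counter)

    def report(self, isp: str, url: str, blocking_type: str) -> None:
        # Each participating user adds one observation.
        self._reports[(isp, url)][blocking_type] += 1

    def most_reported(self, isp: str, url: str):
        # Return the blocking type seen most often, or None if no reports.
        counts = self._reports.get((isp, url))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


def choose_path(store: MeasurementStore, isp: str, url: str) -> str:
    """Pick a path for this ISP and URL based on what the crowd reported."""
    blocking = store.most_reported(isp, url)
    if blocking is None:
        return DEFAULT_PATH
    return WORKAROUND.get(blocking, DEFAULT_PATH)


store = MeasurementStore()
# The same URL can be blocked differently by different providers.
store.report("isp-a", "video.example", "dns-tampering")
store.report("isp-a", "video.example", "dns-tampering")
store.report("isp-a", "video.example", "none")
store.report("isp-b", "video.example", "ip-blocking")
store.report("isp-a", "blog.example", "none")

for isp, url in [("isp-a", "video.example"), ("isp-b", "video.example"),
                 ("isp-a", "blog.example"), ("isp-b", "blog.example")]:
    print(isp, url, choose_path(store, isp, url))
```

The incentive is visible even in this toy version: the more users report what they see, the more often the system can pick something lighter than a full relay, so everyone’s pages load faster.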

What are the Results?

The results show that C-saw is capable of loading blocked pages much faster than some of the other circumvention tools available (Tor or Lantern). They also show that C-saw loads unblocked pages faster than those other tools do. Of course, unblocked pages would load faster still if no circumvention tool were used at all, but it is mostly unclear which pages are blocked or unblocked in the first place.

C-saw was also deployed to 123 real users in Pakistan, where it was able to circumvent the government censorship system and made it possible for the users to look at Twitter or Instagram (a really HUGE and meaningful win for @realdonaldtrump followers and pro-IGers in Pakistan. Humanity is saved, Internet is freed!).

My Take on this Paper

Internet censorship and circumvention methods are always a sensitive topic to discuss. Looking only at the technical aspects is quite fun, but I think some of the technical methods C-saw implements can still be quite dangerous in practice. Although the paper mentions that it makes a best attempt to make the users of the system hard to trace, if C-saw ever gets really popular, the government can trace users who visit C-saw download pages and track all connections from those users. Also, the government may be able to reverse engineer C-saw and create a fake version that sends the same set of data that C-saw currently collects, while tracing the users of this fake version. As much as I want to find ways to avoid oppressive laws, I would be really cautious about simply downloading and using these experimental tools.

We are looking for passionate writers from all fields of CS!

I believe that one of the most important habits a grad student must have is reading papers on a regular basis. I thought that being a paper reviewer with a light amount of responsibility would give me an incentive to read more papers. So, I personally started this blog to record what I was reading, and I found it really, really helpful. I hope that more people can join me and have a great learning experience in the process! Feel free to email me at yo2seol@cs.stanford.edu if interested!


Stanford, SF, SV-based educator & researcher & engineer writing about interesting technical things. seanschoi.com