American Muslim Tracking Exposed: behind the curtain of

Amnesty International released a report in February, a few weeks after the Muslim Ban. This project approaches the matter from a technical perspective to show how the current Internet business model can easily lead to religious profiling.

Based on this report, keep in mind the following key points:

  • a big market of user profiles exists, including ethnicity-based segmentation
  • the tech industry has voiced criticism of Trump’s travel ban. In the first week of February, 97 companies joined forces by filing an amicus brief in support of a lawsuit against Trump’s immigration orders
  • nearly 3,000 tech professionals have also signed a pledge to, among other things, refuse to cooperate in building discriminatory databases

In this pledge, the signing parties made this important promise:

we are choosing to stand in solidarity with Muslim Americans, immigrants, and all people whose lives and livelihoods are threatened by the incoming administration’s proposed data collection policies. We refuse to build a database of people based on their Constitutionally-protected religious beliefs. We refuse to facilitate mass deportations of people the government believes to be undesirable.

Our question is: “how can the modern online ecosystem be used to create religion-based registries?”

Spoiler: with the collaboration of the advertising firms above. Note: each website under analysis is associated with a color; the brown one simply has many more tracker inclusions than the others.

① Someone apparently collects harmless data

Every website has trackers, often many of them, and each site has its own reasons for including them (sometimes ignorance, often financial or technical motives). In fact, Princeton University has been running a study that keeps track of the phenomenon:

Trackers keep profiles; it is their business. When companies provide a third-party service (and a website, for whatever reason, enables it), it is because they want to link users to the content visited, using technical identifiers hidden in the device.
If your website is, for example, selling loans, the tracker’s records would look like this: “user X-12425, on 1st of July 2017, is looking for a loan”.
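
To make the loan example concrete, here is a minimal sketch of what such a record could look like as a data structure. The field names (`user_id`, `day`, `interest`) and the `describe` helper are hypothetical, chosen only for illustration; no real tracker's schema is implied.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TrackerRecord:
    """One observation kept by a hypothetical third-party tracker."""
    user_id: str    # pseudonymous identifier stored on the device (e.g. a cookie)
    day: date       # when the page embedding the tracker was loaded
    interest: str   # topic inferred from that page

    def describe(self) -> str:
        # Render the record in the sentence form used in the text above.
        return f"user {self.user_id}, on {self.day.isoformat()}, is looking for {self.interest}"


record = TrackerRecord(user_id="X-12425", day=date(2017, 7, 1), interest="a loan")
print(record.describe())
```

Note that nothing in the record names the person: the identifier alone is enough to connect this visit to every other visit made from the same device.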

Frequency defines who you are

Pick a piece of content. A user accesses it once? It happens. Twice? It happens too. Thrice? You’re looking for it. More? You like it. Five days per week? You have to be deep into the subject. More than 40 weeks per year? Hey, User, you live for it.
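
The escalation above can be sketched as a small function. This is an illustration under stated assumptions, not a description of any real tracker: the function name, the labels, and the exact cut-offs are hypothetical, visits are counted as distinct days, and the input is assumed to cover a single year.

```python
from collections import defaultdict
from datetime import date, timedelta


def interest_level(visit_days):
    """Label one user's interest in one kind of content from the days it was visited."""
    days = set(visit_days)  # several visits on the same day count once
    days_by_week = defaultdict(set)
    for d in days:
        year, week, _ = d.isocalendar()
        days_by_week[(year, week)].add(d)

    active_weeks = len(days_by_week)
    busiest_week = max((len(v) for v in days_by_week.values()), default=0)

    if active_weeks > 40:      # "more than 40 weeks per year"
        return "lives for it"
    if busiest_week >= 5:      # "five days per week"
        return "deep into the subject"
    if len(days) > 3:
        return "likes it"
    if len(days) == 3:
        return "looking for it"
    if days:
        return "it happens"
    return "no signal"


monday = date(2017, 1, 2)
work_week = [monday + timedelta(days=i) for i in range(5)]
print(interest_level([monday]))    # a single visit
print(interest_level(work_week))   # five days within the same week
```

The only input is a list of dates; the label is produced without the user ever stating an interest.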

② A rich grey market to sell and remix

These are the companies involved in the data business. We can confirm their presence by analyzing the trackers in mobile apps or by looking at the websites which rely on them. The companies we can verify are the ones displayed in color, on the right. On the left (the grey market) you can see the companies which buy, enrich, analyze, and resell personal or group profiles. They offer business-to-business services.

And that’s the market of the data brokers: a fast and dynamic market, where databases change ownership and data is resold while companies bid on user profiles. There is, in fact, an entire economy based on attention. Companies pay to appear on your device or to run a tracking script on it. And as you can see, the grey area, that is, the companies we can’t see directly, is bigger than the one we know. What we see is only the tip of this iceberg:

Do you remember the words from the pledge? I feel they are completely true. The signers condemn the Muslim ban and refuse to build any religious database. And if somehow it happens, I believe some courageous person will blow the whistle to the media, or report it to the police. It is illegal but, even more, it is unethical to create such segmentation in society. It has been done in the past, and such grave crimes must not be repeated.

③ Can a registry be an oppression tool in 2017? Nope: it’s just an algorithm

Inside a tracking company, nobody is actually doing anything wrong. The daily goal is to “appear as much as possible in the users’ browsers,” to study where they navigate and thus attribute some interests to them. This is, for instance, what could be collected:

The log is an apparently valueless set of records with limited privacy and ethical implications.

It is harmless, isn’t it? But among the many services displayed above in the grey area, you can even buy webpage content analysis. For example, you can acquire tools that allow you to classify websites. Their assets are based on associations like the following:
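
As a rough illustration of such associations, the sketch below maps sites to categories and looks up the categories for a visited URL. The domains (all under the reserved `.example` suffix), the category names, and the `categories_for` helper are hypothetical; a commercial classification service would expose something far larger, through its own interface.

```python
from urllib.parse import urlsplit

# Hypothetical associations of the kind a content-classification service sells.
SITE_CATEGORIES = {
    "loans.example": {"finance", "loans"},
    "prayer-times.example": {"religion", "islam"},
    "halal-recipes.example": {"food", "islam"},
}


def categories_for(url):
    """Return the categories associated with the site that serves `url`."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Unknown sites simply carry no category.
    return SITE_CATEGORIES.get(host, set())


print(sorted(categories_for("https://www.prayer-times.example/today")))
```

Each association looks innocuous in isolation; it becomes sensitive once it is joined with a log of who visited what.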

And now? Just take your non-private, non-sensitive logs, buy access to the service above, and run an algorithm like this:

An example of an algorithm full of political meanings.

The aforementioned algorithm is far less accountable than a database; it decides, based on parameters, who is a Muslim and who is not. It could look accurate, but it can’t be 100% right: religion is pretty intimate. Still, an algorithm does not accept appeals, even if you are not Muslim.
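
A minimal sketch shows how few lines such a decision needs, and how arbitrary its parameters are. Everything here is hypothetical: the log, the site categories, the `label_users` function, and the two thresholds are made up to illustrate the risk, not taken from any real company.

```python
from collections import Counter

# Hypothetical browsing log: (pseudonymous user id, visited site).
LOG = [
    ("X-12425", "prayer-times.example"),
    ("X-12425", "prayer-times.example"),
    ("X-12425", "halal-recipes.example"),
    ("X-12425", "loans.example"),
    ("X-90210", "loans.example"),
    ("X-90210", "halal-recipes.example"),  # could be anyone who likes the food
]

# Hypothetical output of a purchased site-classification service.
SITE_CATEGORIES = {
    "prayer-times.example": "islam",
    "halal-recipes.example": "islam",
    "loans.example": "finance",
}


def label_users(log, site_categories, category, min_visits, min_share):
    """Return the users whose visits to `category` pass both thresholds."""
    total = Counter()
    matching = Counter()
    for user, site in log:
        total[user] += 1
        if site_categories.get(site) == category:
            matching[user] += 1
    return {
        user
        for user, hits in matching.items()
        if hits >= min_visits and hits / total[user] >= min_share
    }


strict = label_users(LOG, SITE_CATEGORIES, "islam", min_visits=3, min_share=0.5)
loose = label_users(LOG, SITE_CATEGORIES, "islam", min_visits=1, min_share=0.5)
print(strict, loose)
```

With the stricter threshold only one user is labeled; lowering `min_visits` labels both, including the user who only opened a recipe page. Whoever picks the parameters decides who ends up on the list, and the people on it never see the decision.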

The example above is just one among many. Similar attacks can be carried out using GPS coordinates obtained from mobile apps with access to location data. The data sources are many. Even if such a database can’t exist, algorithms with political meaning can be run by many of the companies described above.
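
The location variant can be sketched in the same spirit: count how many of a device's location pings fall within a given radius of a place considered sensitive. The coordinates below are made up, and the function names and the 100-metre radius are assumptions for illustration; the distance itself is the standard haversine formula.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (latitude, longitude) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def pings_near(pings, place, radius_m):
    """Count the location pings that fall within `radius_m` of `place`."""
    return sum(1 for lat, lon in pings if haversine_m(lat, lon, *place) <= radius_m)


# Hypothetical coordinates, not real places.
sensitive_place = (40.0000, -74.0000)
pings = [(40.0003, -74.0002), (40.0500, -74.0000), (40.0001, -74.0001)]
print(pings_near(pings, sensitive_place, radius_m=100))
```

Combined with the frequency reasoning described earlier, repeated presence near one address is enough to attach a label to a device, again without any database that explicitly records religion.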

Considering the complexity of such a market, we can only aim at reducing third-party trackers as much as possible, polluting their activity when possible, and disincentivizing their business.

❎ A matter of corporate responsibility

Is a website responsible for its users? Maybe. It is a matter of corporate responsibility, and a standard answer does not exist.

Users have adopted some solutions in the past, such as installing an ad blocker. It is a basic self-defense technique, but it is viable only for those who know about this technology. It is not the political settlement I seek.

Laws and policies? They are slow and easily infiltrated by lobbyists, and despite all the weird policies, we saw that:

we can’t live on an Internet without trackers

Advertising and user profiling constitute the primary business model online, and when someone tried to change it with more than words, they got sued.

A company producing content has, among its corporate responsibilities, that of making a profit and sustaining the business. Changing that business model could be a long-term goal, but this project addresses an urgent condition and offers a viable solution.

We isolated four groups of websites whose usage can profile their users as Muslim. None of these categories (I’m generalizing a bit here) is in the business of producing content; their business model is different. Ideally, they have no reason to use third-party trackers.

For all of them, there is a list and some basic visualizations intended to make the phenomenon of web surveillance more concrete. The website owners have not been contacted. The sites are just a small sample of all those intended for a Muslim audience. Besides, not all the web admins or the companies behind those websites are aware of the third-party tracker problem (you can help, especially if you know them).

If you want to do your own data analysis, the API description is here:

They have trackers, they are intended for a sensitive audience, and they can, maybe, remove the trackers from their pages.

▻ Next steps?

The website contains more details and links. All the actors in the chain can do something, but this is more of a collaborative analysis: researchers and analysts can reuse the data and debate the potential data abuses. Moving towards a cleaner web ecosystem is the goal of the tracking-exposed initiative. The analysis can be extended to new categories; feel free to propose your changes on the GitHub repository.

Thanks to

Rahma Sghaier, Andrea Raimondi, IACAP meeting 2017; written by Claudio Agosti