Mitmproxy:

Your D.I.Y. Private Eye

Part 2/2: How To Use A Free Tool To Catch Companies Collecting Data About You Without Your Consent


What if I told you that every time you go to a website or open up an app, there is a company that you have never heard of that is collecting information about you? What if I told you that the website or app you are trusting is likely knowingly sharing your personal information? What if I told you that the majority of websites and apps do not know how to implement basic security practices, which leaves them hackable by anyone?

My name is Max and I’m a Computer Science & Public Policy major. I spent the summer of 2015 learning about privacy and security. After diving into the weeds and seeing the state of the field, I was shocked at how little the average consumer knows about what companies can do, and are doing, with their customers’ data. Think again about the questions I asked above. What if that company has your Social Security Number? What if it has your personal health data?

An illustration from Lifehacker.com

I personally believe that consumers should have a right to own their own data and should have a right to know who is using their personal information and for what. It is for that reason that I would like to share a free and open source tool called Mitmproxy that can be downloaded by anyone to investigate the privacy and security practices of companies. If you have not already installed Mitmproxy, go back to Part 1: How To Install Mitmproxy first.

Here in Part 2: How To Analyze Mitmproxy, I will show you how to use Mitmproxy to the fullest to catch companies collecting data about you without your permission. I will also give you some tips and tricks for checking out some of the security practices of these companies.


Mitmproxy Use & Analysis — How to Examine, Save and Edit Packets

Congrats on completing the setup!

To begin analyzing traffic with Mitmproxy, decide where you want to capture a session. To capture on your computer, set your browser’s proxy settings to localhost. To capture on your phone, set the phone’s HTTP proxy settings to point to your computer’s IP address.

Now type into Terminal/Konsole:

mitmproxy --host

(that is “mitmproxy(space)(dash)(dash)host” in case the rendering is still funky)

And let’s begin! Now that we have Mitmproxy up and running we can play around and learn about how to use this tool to our advantage. A good introduction guide to what you see within Mitmproxy is here.

When you’re experimenting with Mitmproxy I would recommend you browse one app, save all that information, quit and then restart it to browse another app. Same thing goes for websites. Otherwise you might receive noise from different apps and websites and not know what is coming from where.

The most important thing to remember is to save your captures before you quit! Hit “w” to save, “a” to save all of the capture, and then name it whatever you want and hit enter. Then “q” and “y” to quit. I recommend creating a folder on your desktop called captures. On Terminal/Konsole you will need to navigate to that folder to save things there. Look up on Google how to navigate to a folder with the “cd” command. To reopen a capture type in Terminal/Konsole:

mitmproxy --host -r filename

and hit Enter. Make sure you are in the folder where the capture was saved.

But to begin, start by going to an app or website and playing around for a few seconds. You should see Mitmproxy begin to populate with information.

Essentially, Mitmproxy is receiving a stream of HTTP and HTTPS requests and responses, which this guide calls packets (Mitmproxy’s own name for a request together with its response is a flow). Each packet contains data that its recipient needs in order to show you, the consumer, a beautiful display on your phone or computer. A packet is broken down into several distinct parts, but for our purposes only a few are relevant. The first is the host: this is who the packet is being sent to. The host can usually be determined from the URL of each packet, which appears in the packet summary on Mitmproxy’s main display. Let’s look at a few:

The names at the beginning of the URL indicate who is receiving information

As you move your arrow keys up and down you can toggle between packets. Type a capital f — “F” to have the arrow follow your packets in real time as they populate. The information on display is the packet summary. You can see the names of the companies receiving information in the red box in the beginning of the URL. Look in the middle for tracker.marinsm — yikes!
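If you would rather let a script pull out the company names than eyeball every URL, here is a minimal Python sketch of the idea. The URLs are made-up stand-ins for rows in the list (only the host names come from this guide), and host_of is a helper name I picked for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase host name of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


# Hypothetical URLs standing in for rows of the Mitmproxy main display.
captured_urls = [
    "https://jawbone.com/",
    "https://secure.adnxs.com/seg?add=1",
    "http://idsync.rlcdn.com/sync",
    "https://jawbone.com/store",
]

# Count how many packets went to each host.
host_counts = Counter(host_of(url) for url in captured_urls)
for host, count in host_counts.most_common():
    print(f"{count:3d}  {host}")
```

Any host that is not the site you meant to visit is a candidate third party worth a closer look.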

Example of the status code 200 which means “successfully delivered”

The colored number below the URL, to the left of the red box, is called a status code. It tells us whether something was delivered successfully or not. Learn more about them here.

To the right of the status code, also in the packet summary, is the type of file.

Types of files that are being sent back and forth behind the scenes when you interact with a webpage or app

JSON, JS (for JavaScript) and image types are particularly interesting. JSON is where sensitive data is usually transferred. JavaScript is where a company’s code loads, and certain image types are used by third parties to place tracking pixels and cookies that monitor your behavior.
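If you ever forget what a status code means, Python’s standard library knows the official names. This is just a small lookup sketch; describe_status is a name I made up for it.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code followed by its standard reason phrase."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Python does not know this number as a standard HTTP status.
        return f"{code} (not a standard status code)"
    return f"{code} {status.phrase}"


for code in (200, 204, 301, 404, 500):
    print(describe_status(code))
```

As a rule of thumb, codes in the 200s mean success, the 300s are redirects, the 400s mean the request was at fault, and the 500s mean the server was.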

What the normal consumer sees when browsing a webpage

By the way, here is what I’m looking at on my browser when I’m getting the above packets on Mitmproxy.

By hitting Enter on any of the packets you will be able to look inside the packet to see what information is being sent. Hit “q” to go back and “space bar” to get to the next packet without going back to the main packet summary screen.

The information inside of a packet in Mitmproxy

Inside a packet we see the header (below the packet summary) and the body (below the header). Sometimes we’ll see gibberish and not know what to do with it, and that is okay. Sometimes we will see gibberish that we can decode, called base64. We’ll talk about that later.

Each packet header and body has a Request and Response. Hit tab to go between them.

Request vs. Response inside of a Mitmproxy Packet

The request is your phone or computer saying “Dear Host, please send me some information” and the response is the Host replying to your phone or computer saying “Loud and clear, Captain, here is the information.” Juicy information can be sent in either.

For the host to receive any information about us, it must be sent in a Request. Many times, the Response will also contain lots of our juicy information. After you log in somewhere, we can find a request that sends our username and password, and a response that shows our information. Like this:

A packet that tells Google to allow me to log in to my Gmail account

Notice the email and password in the bottom left. I’ve blacked out some information for obvious reasons. See up top that the host is a Google ServiceLoginAuth address: it is authenticating my information.

Google’s Response to me logging in to my Gmail account

Here is the response of the same packet. In the body we can see the page that will load. All of that gibberish is called HTML and helps me see all the normal stuff I would expect to see when I log onto my email.

If you EVER see sensitive information like Personally Identifiable Information (email, name, date of birth, password) sent over HTTP and not HTTPS then this is a terrible security practice and means that any hacker listening to your network can acquire that information.
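To make that check concrete, here is a rough Python sketch that flags a URL when it is plain HTTP and its query string carries sensitive-looking fields. The field names in SENSITIVE_KEYS and the example.com URLs are my own assumptions for illustration; real apps name their fields however they like, and sensitive data can just as easily sit in the body.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical field names; adjust to whatever you actually see in a capture.
SENSITIVE_KEYS = {"email", "name", "dob", "password"}


def insecure_sensitive_fields(url):
    """Return the sensitive-looking query fields sent over unencrypted HTTP."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        return []  # HTTPS (or anything else) is not flagged by this check
    return [key for key, _ in parse_qsl(parts.query)
            if key.lower() in SENSITIVE_KEYS]


leaky = "http://example.com/login?email=me%40example.com&password=hunter2"
safer = "https://example.com/login?email=me%40example.com&password=hunter2"
print(insecure_sensitive_fields(leaky))
print(insecure_sensitive_fields(safer))
```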

Recognize that you might not see all of this information on the mobile application or website user interface, however it is being sent behind the scenes whether you like it or not. Here is my view on an app and what I see behind the scenes:

Mitmproxy capturing HTTP and HTTPS packets on your computer from your Android Phone

Here is my view on a website and what I see behind the scenes:

Mitmproxy capturing HTTP and HTTPS packets via Chrome

In the header, the most interesting fields are the Cookies and the Referer. The cookies show who is tracking us, and the Referer shows which site caused the request to be sent to the Host, in other words who allowed the Host to get the information. Here is why that is juicy. Let’s see what happens when I visit Jawbone.com.

A third party called AppNexus, tracking me as I visit Jawbone.com

In the bottom picture above, the “Referer” field shows Jawbone.com, so Jawbone is allowing the host to come in. The host is something weird, “secure.adnxs.com”, and at the bottom it sets a Cookie that looks like a bunch of gibberish. After a Google search for adnxs, we know that secure.adnxs.com belongs to AppNexus, a third party that uses cookies to track your movements on Jawbone.com.
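The reasoning in that example (the Referer is one site, the host setting the cookie is another) can be sketched in Python. The cookie name and value below are invented, and site_of is a crude helper of mine that just compares the last two labels of a host name, which is an assumption that breaks for endings like .co.uk.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def site_of(host):
    """Crude 'site' of a host: its last two labels (wrong for e.g. .co.uk)."""
    return ".".join(host.lower().split(".")[-2:])


def third_party_cookies(request_host, referer, set_cookie_header):
    """Return cookie names set by a host that differs from the Referer's site."""
    referer_host = urlsplit(referer).hostname or ""
    if site_of(request_host) == site_of(referer_host):
        return []  # same site as the page we visited: first party
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    return sorted(cookie.keys())


# Hypothetical header values modeled on the Jawbone example.
names = third_party_cookies(
    "secure.adnxs.com",
    "https://jawbone.com/",
    "tracker_id=abc123; Path=/; Domain=.adnxs.com",
)
print(names)
```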

The really juicy stuff is sent in the body. Let’s talk about how to break apart a body:

A packet of information sent to Jawbone with analytics about my movements on the Webpage

Okay, so here we have a packet where Jawbone is loading something about me. Notice all the cookies, including one from an analytics company called Optimizely. In the Events field we are looking at JSON notation. Curly braces group a set of fields and square brackets hold a list. To the left of each colon is the field name, and to the right is its value. For example, in the bottom right: “page”: “https://jawbone.com”.

Looking at more JSON packets (especially once you click on your profile on a company’s site) will yield more juicy information.

By hitting “q” we are back at the main screen. Wow, look at all the third parties that Jawbone allows to get information about you. Select each one and hit Enter to find out exactly what is being sent.
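If a JSON body gets too deeply nested to read by eye, a few lines of Python can flatten it into field and value pairs. The body below is a made-up example shaped like the one in the screenshot (only the “page” field comes from this guide), and walk is a helper name I chose.

```python
import json

# Hypothetical JSON body; only the "page" field is taken from the guide.
body = '{"events": [{"name": "pageview", "page": "https://jawbone.com"}]}'


def walk(value, path=""):
    """Yield (path, leaf) pairs for every leaf value in parsed JSON."""
    if isinstance(value, dict):       # curly braces: named fields
        for key, child in value.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):     # square brackets: a list of items
        for index, child in enumerate(value):
            yield from walk(child, f"{path}[{index}]")
    else:
        yield path, value


fields = dict(walk(json.loads(body)))
for path, value in fields.items():
    print(f"{path} = {value}")
```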

A scary amount of 3rd parties collecting information about me
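You can also have Mitmproxy list the hosts for you as they appear. I am assuming a recent Mitmproxy here, where a script loaded with the -s option can define a request function that is called for every request and can read flow.request.pretty_host; older releases used a different script interface, so check the documentation for your version. fake_flow is a hypothetical helper of mine so the script can be tried without Mitmproxy running.

```python
from types import SimpleNamespace

seen_hosts = set()


def request(flow):
    """Hook assumed to be called by Mitmproxy once per client request."""
    host = flow.request.pretty_host
    if host not in seen_hosts:
        seen_hosts.add(host)
        print(f"new host: {host}")


def fake_flow(host):
    """Build a stand-in flow object for trying the hook offline."""
    return SimpleNamespace(request=SimpleNamespace(pretty_host=host))
```

Saved as, say, list_hosts.py (a file name I made up), it would be started with something like mitmproxy -s list_hosts.py.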

Let’s click on one and investigate:

An example of a third party, “idsync.rlcdn.com” who is setting about 10 different cookies on my browser to track me

Okay, so we have something weird called idsync.rlcdn.com that is setting a bunch of cookies on my browser. The cookies contain lots of numbers.

We can go to cookiepedia.com and type in the host name. After playing around, we realize that we need to lop off the idsync part and search for just rlcdn.com. Cookiepedia then tells us that this cookie is used by LiveRamp.

Searching for the source of idsync.rlcdn.com via Cookiepedia we can see that the third party is LiveRamp, a company that does Targeting & Advertising

Yup, this certainly looks like a third-party tracking company. Now email them and tell them to go away! You can also search Google for their cookie names to figure out what the cookies mean.

A list of Mitmproxy options with “shift-?”

Now let’s learn what we can do with Mitmproxy commands. Hit “shift-?” for help, which gives you the list of things you can do. Scroll down with your arrow keys to see more. Hit “q” to go back.

We can also edit some information. **Disclaimer: please do not look for the personally identifiable information of other users, and respect their privacy.** If we want, we can take a packet, modify it and send it again to get back information that maybe we aren’t supposed to see. With “i” we can intercept a packet before it is ever sent, modify it and then send it. Let’s change my email address, which is sent in the URL of the packet below. The %40 in place of the @ symbol is URL encoding: characters with a special meaning in URLs are written as a percent sign followed by their hexadecimal character code.

Okay, so to edit, press “e”. Look in the bottom left of the image. We are going to edit a request for my email information (to ask for my other email account’s information), so hit “q” to edit the query.

Press Tab to go over to Value and then Enter to edit.
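If you want to translate between the %40 form and the readable form yourself, Python’s standard library can do it in both directions. A tiny sketch, using the hacker123 address from this guide:

```python
from urllib.parse import quote, unquote

encoded = "hacker123%40gmail.com"

# Turn percent-escapes back into the characters they stand for.
decoded = unquote(encoded)
print(decoded)

# Go the other way: escape every reserved character, including "@".
re_encoded = quote(decoded, safe="")
print(re_encoded)
```

This is handy when the value you type into a URL needs to be in its encoded form.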

Changing my email address to another email address.

Hit Esc to stop typing, “q” to go back to the packet, and then “r” to replay the packet, which now asks Google for the email of hacker123, another email address that I’ve made. You can see this in the URL of the picture below.

The green replay symbol to the left of URL tells us that we were successful in getting some sort of response

Notice the little “replay” symbol to the left of the URL. The green 204 code means the request succeeded (204 stands for “No Content”), and we can see the change in the URL. Let’s check the response:

No content in the response after we change the email to hacker123@gmail.com

Alas no substantive response! Google is checking to make sure you are who you say you are. But this will not be the case for negligent companies.

Sometimes you will want to edit information in the body of a Request before sending it to see what the Response is. There is no rhyme or reason why software developers will send certain information in the header or the body or wherever.

Sometimes we need to edit the email inside the body of a packet.

Here the email and password are in the body. Hit “e” to edit and “r” for raw body. Because you are now editing JSON, you will have to use a special editor to edit and save your work.

We see all of the raw body information jumbled together. Press “i” to insert information. We are now editing in an editor called Vim. Google it to learn more. You can see in the bottom right and left I’ve changed the email again.

The raw body of our packet as seen in the Vim editor

When I’m done, I press “esc” and then type “:wq” for save (w) and quit (q), then Enter. One more time, that is colon-w-q. Now we press “q” to get back and “r” to replay!

Replaying the email request for hacker123@gmail.com. You can see the change in the bottom left corner

It worked! Look in the bottom left. But again, no response. Google is too smart.

Sometimes we will see jumbled information that we can actually decode. If you see a long run of letters and digits (possibly with “+” or “/”) that sometimes ends in “=” or “==”, it could be base64, an encoding that programmers use to pack arbitrary data into plain text characters that are safe to send as text. Learn more here.

A third party analytics company called Mixpanel uses it:

A packet where Mixpanel is sending itself some information in the URL. It looks like gibberish but it is actually base64 encoded

Look at the gibberish after the api.mixpanel.com part of the URL. It starts with “eyJl”, which is the base64 encoding of “{"e”. A leading “ey” is the telltale sign of base64-encoded JSON, since we know that JSON bodies usually begin with “{”. Copy everything from “ey” all the way to the % on the last line (the % begins a URL-encoded character and marks the end of the base64 text!). If this is confusing, just copy the first hundred or so characters. If you copy too much then it won’t decode properly. Paste it into base64decode.org

Using base64decode.org we can find out the secret information

If you look at the bottom, we see the translation: Mixpanel is recording information about my computer and which website I came from (google.com)!
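If you would rather not paste captured data into a website, the same decoding can be done locally in Python. The payload below is invented (I build it in the script so you can see both directions); decode_b64_json is a helper name of mine, and it also tolerates URL-escaped or missing “=” padding.

```python
import base64
import json
from urllib.parse import unquote

# Hypothetical tracking payload, encoded the way the guide describes.
payload = {"event": "pageview", "properties": {"referrer": "https://google.com"}}
encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
print(encoded[:4])  # JSON that starts with {"e always encodes to these 4 chars


def decode_b64_json(text):
    """Decode base64 text (optionally URL-escaped or unpadded) into JSON."""
    text = unquote(text).strip()
    text += "=" * (-len(text) % 4)  # restore any stripped padding
    return json.loads(base64.b64decode(text))


print(decode_b64_json(encoded))
```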

Limiting, or searching for, just the packets with information sent to Google

If you want to search within all of your packets for specific information (say, all information exchanged with Google), hit “l” (the lowercase letter el), type your search and hit Enter. See: only Google.
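To get a feel for what a limit does, here is a rough Python analogue that keeps only the URLs matching a search pattern. This is my own sketch of the idea, not Mitmproxy’s actual matching code, and the URLs are made up.

```python
import re


def limit(urls, pattern):
    """Keep only the URLs in which the pattern is found (case-insensitive)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [url for url in urls if rx.search(url)]


# Hypothetical captured URLs.
urls = [
    "https://accounts.google.com/ServiceLoginAuth",
    "https://api.mixpanel.com/track/",
    "https://www.google.com/search?q=adnxs",
]
only_google = limit(urls, "google")
for url in only_google:
    print(url)
```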

Again, you can type a capital f — “F” to have the arrow follow your packets in real time so you don’t have to scroll down after clicking a new link or going to a new mobile application.

Well that about ends the guide!

The last thing to remember is to save your captures before you quit! Hit “w” to save, “a” to save all of the capture, and then name it whatever you want and hit Enter. Then “q” and “y” to quit. To reopen a capture later, navigate to the folder where you saved it and type in Terminal/Konsole:

mitmproxy --host -r filename

Conclusion

I hope you enjoyed this guide — good luck finding things, hacking things and saving the world!

If you find some seriously poor privacy and security practices at companies that you are investigating, feel free to let them know by emailing them; they’ll hopefully appreciate it. If they are unresponsive or do not fix the poor practices, report the behavior to the Federal Trade Commission, the authority that can take action against companies with bad data security practices. You can file a complaint here.

Questions, concerns or something not working? E-mail me at maxpg@princeton.edu.


Max Greenwald is a Computer Science & Public Policy Major at Princeton University and spends his time at hackathons and learning about information and security. His favorite animal is the Emperor Penguin. Read more of his work at www.maxgreenwald.io/blog
