Why Internet Connection Records are (mostly) useless!

History

In the good old days, people used to speak to each other and when they weren’t in the vicinity of the other person maybe they’d pick up the handset of their rotary dial telephone and call them.

When a call was made, each digit dialled broke the circuit that number of times (i.e. dial 9 and there’d be 9 pulses down the line). These pulses were picked up by the local telephone exchange and would cause a ‘uniselector’ to move to that position. That would then connect to another one, either in the same exchange or in another exchange, and so on until the other party’s phone was reached, at which point it would ring.

A uniselector (The Museum of Technology)

If the other party picked up the phone, the connection was established and the two parties could talk to each other.

This was known as a circuit switched connection as a ‘virtual’ circuit between the two callers was maintained across any phone networks involved until one end hung up and then the ‘circuit’ would be destroyed.

Circuit switched networks have been the mainstay of telephony networks up until very recently (even mobile phone networks still use them).

Unfortunately, as the circuit is maintained for the duration of the call, no one else can use that circuit (or any part of it), which is inefficient in terms of bandwidth use. So if an exchange only had 30 circuits out of it to other exchanges, the maximum number of calls it could handle would be 30 (assuming all calls went out of the exchange), though there might be hundreds if not thousands of phones connected locally to the exchange.

If the security services wanted to tap a number, they’d go to the exchange where its number was associated with an actual telephone line and physically connect cables (a pair) to the point where the telephone circuit terminated in the exchange. So that the tap wasn’t obvious, each cable in the pair was often routed around the exchange along a different path, and single cables were terminated just to confuse the casual onlooker.

The Digital Age

When digital telephones and exchanges were introduced (people may remember System X), telephone exchanges got much smaller. No longer were there banks of uniselectors clicking away. Calls were switched digitally and, rather than being sent in analogue, were converted by the exchange to a digital format (in Europe A-law pulse code modulation, and in the US/Japan µ-law PCM) which encoded the call as a 64 kbit/s stream of information (56 kbit/s in US/Japan).
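The 64 kbit/s figure falls out of simple arithmetic. The sketch below assumes the usual telephone-grade PCM parameters of 8,000 samples per second and 8 bits per sample. The 7-bit case is the common explanation for the 56 kbit/s figure (one bit per sample not being reliably available for user data on some trunks), and is included here as an assumption rather than a full account.

```python
SAMPLES_PER_SECOND = 8000  # assumed standard sampling rate for telephone PCM


def pcm_bitrate(bits_per_sample):
    """Bits per second for one voice channel at the assumed sampling rate."""
    return SAMPLES_PER_SECOND * bits_per_sample


print(pcm_bitrate(8))  # 64000 bit/s, i.e. 64 kbit/s
print(pcm_bitrate(7))  # 56000 bit/s, i.e. 56 kbit/s
```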

The system was still based on a circuit switched paradigm, so a call meant a virtual circuit was maintained through the telephone network for the life of the call.

Digital exchanges meant calls could easily be logged and extensive use was made of Call Data Records or CDRs. These maintain a record of who originated the call, who it was made to, when it was made and for how long. Initially CDRs were used for billing (you could save the CDR, then at some point run a billing program against them which would then work out how much the call cost and bill the customer).
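As a rough illustration of the "save the CDR, bill it later" idea, here is a minimal sketch. The record layout, the phone numbers and the flat per-minute tariff are all made up for the example; real CDR formats vary by vendor and carry many more fields.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


# Hypothetical CDR layout holding only the four fields described above.
@dataclass(frozen=True)
class CallDataRecord:
    caller: str        # who originated the call
    callee: str        # who it was made to
    started: datetime  # when it was made
    duration_s: int    # how long it lasted, in seconds


PENCE_PER_MINUTE = 5  # assumed flat tariff, purely for illustration


def bill(cdrs):
    """Return the total charge in pence per caller, charging per started minute."""
    totals = defaultdict(int)
    for cdr in cdrs:
        minutes = -(-cdr.duration_s // 60)  # ceiling division
        totals[cdr.caller] += minutes * PENCE_PER_MINUTE
    return dict(totals)


cdrs = [
    CallDataRecord("01632960001", "01632960002", datetime(2015, 11, 9, 11, 20), 125),
    CallDataRecord("01632960001", "01632960003", datetime(2015, 11, 9, 12, 5), 60),
    CallDataRecord("01632960002", "01632960001", datetime(2015, 11, 9, 13, 0), 10),
]
print(bill(cdrs))
```

The same stored records that drive the bill also answer "who called whom, when, and for how long", which is exactly why they are of interest beyond billing.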

CDRs also became a good way to keep track of people. Communications companies had to keep CDRs for a time period and they could be requested by security services/police with the relevant paperwork.

When mobile went digital (with the advent of GSM) there was even a flag that could be set in the subscriber record that was ‘legal trace’, so it can be assumed that this was possible on fixed network exchanges too. This allowed authorised bodies to set the flag, and all of that subscriber’s mobile call records, text messages (and probably calls) would be logged somewhere and made available to the body requesting the info.

CDRs are the backbone of telephony networks, and even though many of those networks no longer use circuit switched technology and have moved to packet based networks (which are more efficient as data is only sent when someone is talking), every time a call is made there will be a CDR.

The Internet is NOT a telephone network

When the people behind the IP Bill (that’s likely to become the IP Act soon) were writing the legislation, there was obviously some thinking done. We have telephone networks and we have CDRs, well we have this Internet thing so we can have an Internet Connection Record or ICR just like we have CDRs.

Unfortunately the Internet doesn’t work like that, and considerable effort has to be put in to collect an ICR.

When there was dial-up Internet, a customer would initiate a call to their ISP, the modems would do their beep beep negotiations and then the customer would be connected to the Internet. The equipment would have to do some authentication to ensure that the customer was authorised to use that Internet Service Provider (ISP), and maybe they’d be allocated an IP address and that info could be logged. But once the customer was connected they could send packets anywhere into the network, and the ISP didn’t really have any knowledge of where they went.

Nowadays there is still some authentication done when say a broadband line is brought up and once again the connection/IP address is logged somewhere. If the customer accesses some site, the authorities can then ask the ISP who the IP address belonged to at a particular time, then they know the customer.
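The "who had this IP address at this time" lookup can be sketched as below. The session log, customer names and time windows are hypothetical; a real ISP would query its authentication or address-allocation logs, whatever form those take.

```python
from datetime import datetime, timezone


def utc(*args):
    """Small helper to build a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical session log: (customer, ip, session_start, session_end).
# The same address is handed to different customers at different times.
SESSIONS = [
    ("customer-A", "111.123.111.123", utc(2015, 11, 9, 8, 0), utc(2015, 11, 9, 12, 0)),
    ("customer-B", "111.123.111.123", utc(2015, 11, 9, 12, 30), utc(2015, 11, 9, 18, 0)),
]


def who_had(ip, when):
    """Return the customer holding `ip` at time `when`, or None if nobody did."""
    for customer, session_ip, start, end in SESSIONS:
        if session_ip == ip and start <= when < end:
            return customer
    return None


print(who_had("111.123.111.123", utc(2015, 11, 9, 11, 20)))
```

The timestamp matters as much as the address: without it, a dynamically allocated IP cannot be tied to one customer.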

Cleanfeed

The big ISPs did have to implement what was known as Cleanfeed, which applied some basic packet filtering to dubious sites on the Internet (such as child porn).

Here the Internet Watch Foundation (IWF) provide a list of banned URLs, which are then converted to IP addresses. When a customer tries to access that IP address the access equipment redirects the request to a web proxy which will then check to see if the actual URL requested matches the banned URL and then either block it, or post a message saying the image/etc was blocked.
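The two-stage check described above (a cheap IP match first, then a full URL match at the proxy) might look something like this. It is a sketch only: the URLs, the addresses and the lookup table standing in for DNS are invented, and the real block list is not public.

```python
from urllib.parse import urlsplit

# Hypothetical block list and a static table standing in for DNS resolution.
BANNED_URLS = {"http://images.example.org/banned.jpg"}
RESOLVED = {
    "images.example.org": "192.0.2.10",
    "news.example.com": "192.0.2.20",
}

# Stage 1 input: the IP addresses that the banned URLs resolve to.
SUSPECT_IPS = {RESOLVED[urlsplit(url).hostname] for url in BANNED_URLS}


def handle_request(url):
    """Classify a request as 'direct', 'proxied' or 'blocked'."""
    ip = RESOLVED[urlsplit(url).hostname]
    if ip not in SUSPECT_IPS:
        return "direct"    # most traffic never touches the proxy
    if url in BANNED_URLS:
        return "blocked"   # stage 2: the proxy matched the full URL
    return "proxied"       # allowed, but it still went via the proxy


print(handle_request("http://news.example.com/"))
print(handle_request("http://images.example.org/ok.jpg"))
print(handle_request("http://images.example.org/banned.jpg"))
```

Note the "proxied" case: every request to a host that shares an IP address with a banned URL goes through the proxy, even when the page requested is perfectly fine.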

This all became public when an image on Wikipedia was blocked (a Scorpions album cover) and Wikipedia noticed pretty much all the UK traffic to Wikipedia came from 6 IP addresses (the web proxies of the big six ISPs). As well as highlighting that Cleanfeed was in operation, it meant that access slowed down, because all Wikipedia access was going through web proxies.

ICRs

Now the Government wants all ISPs to log every Internet connection made, i.e. they want to know what site was accessed, the time of the access and by whom. It’s only the first access to the site, and no metadata about what was accessed is stored.

There are some issues here: that’s a LOT of information to store, it has to be stored somewhere and it has to be stored securely (that information is valuable to the authorities and to hackers who can use the information for illegal purposes).

There are technical issues to all parts of the above, but with time and money it’s all solvable (who pays for it is another matter altogether, though Government has set aside money to support this, but will it be enough?).

The question, though, is whether an ICR is that relevant. If Facebook is accessed, then the ICR should contain a URL to Facebook.

So the ICR would be something like

2015–11–09 T 11:20 UTC 111.123.111.123 https://www.facebook.com/

The first bit is the date/time in UTC, followed by the IP address of the requestor and then the URL accessed (there’s probably other information stored like source port and destination port, where the source port is usually a number above 1024 and the destination port is 80 for http and 443 for https).
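To make the record concrete, here is a sketch of parsing a line in that layout. The format is only the illustrative one used in this post (written with plain ASCII hyphens in the date), not anything defined by the legislation, and the ICR class and parse_icr function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlsplit


# Hypothetical record type for the illustrative ICR line used in this post.
@dataclass(frozen=True)
class ICR:
    when: datetime
    source_ip: str
    url: str

    @property
    def site(self):
        """The host name is all the record really tells you."""
        return urlsplit(self.url).hostname


def parse_icr(line):
    # Assumed layout: "<YYYY-MM-DD> T <HH:MM> UTC <source ip> <url>"
    date, _t, time, _utc, source_ip, url = line.split()
    when = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")
    return ICR(when.replace(tzinfo=timezone.utc), source_ip, url)


icr = parse_icr("2015-11-09 T 11:20 UTC 111.123.111.123 https://www.facebook.com/")
print(icr.site, icr.source_ip, icr.when.isoformat())
```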

That may be useful for the authorities, but it doesn’t give much information. Yes, the authorities know you accessed Facebook, but not what you’ve done on Facebook.

The authorities may be luckier with Google searches, as the search string can be embedded in the first query.

So something like http://google.co.uk/search?q=query+goes+here (there’s also a lot more that can be appended, each extra query parameter following an &).
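Pulling the search string back out of such a URL is trivial, which is the point. The sketch below uses Python’s standard urllib.parse; the extra hl=en parameter is just an illustrative addition to show how further parameters are separated.

```python
from urllib.parse import parse_qs, urlsplit

# The example URL from above, plus one illustrative extra parameter.
url = "http://google.co.uk/search?q=query+goes+here&hl=en"

parts = urlsplit(url)
params = parse_qs(parts.query)  # '+' in a query string decodes to a space

print(parts.hostname)   # the site: all a host-only record would keep
print(params["q"][0])   # the search terms, if the full URL were logged
```

Whether the search terms end up in a record therefore depends entirely on whether the full URL or only the host part is kept.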

However, say you’re using Facebook Messenger. Aha, the authorities want to know who you’re talking to. Unfortunately most IP communications programs don’t work that way. The client connects to the messaging server over a SINGLE connection and all communication goes over that single connection, so all the authorities know is that you were using Facebook Messenger and that’s it. That could be useful, as with that data they can then go with a warrant to Facebook and ask them to give up the information on who you were messaging, but then they’re probably doing that anyway …

So in this case you’d get

2015–11–09 T 11:20 UTC 111.123.111.123 (port) messenger.facebook.com (port)

There is no ‘John is speaking to James’.

So unless the ISP delves into the packet stream and extracts information from it, an ICR probably doesn’t hold a huge amount of useful information (and deep packet inspection is extremely costly in terms of resources).

Worse, if the end user is using a Virtual Private Network (VPN), then there will only be one ICR and all the ICR will contain is

2015–11–09 T 11:20 UTC 111.123.111.123 (port) VPN-endpoint (port)

that will be it, as everything is tunnelled in the VPN.
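The effect of the tunnel on what gets logged can be sketched as follows. This is a toy model with made-up host names: it treats an ICR as "first access to a destination" and assumes the ISP sees only the outer connection when a VPN is in use.

```python
def isp_view(destinations, vpn_endpoint=None):
    """Return the distinct destinations an ISP could log, in first-seen order.

    With a VPN, every connection appears to go to the VPN endpoint.
    """
    seen = []
    for dest in destinations:
        visible = vpn_endpoint if vpn_endpoint else dest
        if visible not in seen:
            seen.append(visible)
    return seen


browsing = [
    "www.facebook.com",
    "google.co.uk",
    "messenger.facebook.com",
    "www.facebook.com",
]
print(isp_view(browsing))                                  # three records
print(isp_view(browsing, vpn_endpoint="vpn.example.net"))  # one record
```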