Stop Kidding Yourself About Data Privacy

Gregory Leman
Software Grognard
May 10, 2022

Most of this will be old hat for people who work in these areas, but everybody else probably doesn’t realize exactly how much Big Tech knows about every little detail of their lives. I was recently involved in a conversation where someone said “But that data is anonymized! It’s not useful for tracking actual people!” Tsk, tsk.

Back in the 1990s, I was the CEO/founder of a startup that had a real-time mapping system for public safety. That doesn’t seem like much now that we’re all carrying smartphones with a variety of GPS mapping solutions, but it was cutting edge back then. Police/Fire/EMS were installing GPS receivers on their vehicles and the E911 systems were providing caller id/location to dispatch centers. Our system was able to show the locations of all units in relation to the calls they were answering. It was a big deal. I was in the dispatch center of a major metro area the night we first turned on the system. About an hour in, a dispatcher noticed that an ambulance responding to a drive-by shooting was headed in the wrong direction: the crew had heard “South” over the radio when they should have gone North. Catching that sped up the response by 20 minutes. The dispatcher turned to me and said “You just saved someone’s life.”

The big problem back then was that people were starting to carry cell phones, and a lot of calls came into E911 dispatch centers without the location data they got from landlines. There was interest in using pings from cell towers to triangulate the locations of callers, but the pings couldn’t narrow a location down to a specific area. In one case, a caller had a heart attack while in his car and couldn’t speak. The dispatcher had every unit in the area take turns switching on its siren, then used the sound coming over the radio to narrow down where the caller was.

I remember thinking back then that the tower pings wouldn’t turn out to be useful for dispatch in real time because they weren’t precise enough, but if the police started collecting this data, they could match it to crimes and get an idea of who was in the area at the time. One of the techniques we used whenever a new crime spree happened was to overlay a map of the crimes with the locations of newly released parolees. Believe it or not, when a lot of people get out on parole, they do not immediately become model citizens. We caught a lot of bad guys. Being able to combine that data with tower pings of the subjects would solve even more.
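For the curious, the overlay idea boils down to a distance check. This is only a rough sketch, not the system we actually built: the coordinates, the `addr-1`/`addr-2` labels, the 1 km radius, and the function names are all made up for illustration.

```python
import math

# Hypothetical incident locations as (latitude, longitude). Invented values.
INCIDENTS = [(40.000, -75.000), (40.005, -75.000), (40.200, -75.000)]

# Hypothetical addresses of interest, keyed by an arbitrary label.
ADDRESSES = {"addr-1": (40.002, -75.000), "addr-2": (40.300, -75.000)}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def overlay(incidents, addresses, radius_km=1.0):
    """Count incidents within radius_km of each address; drop addresses with none."""
    counts = {}
    for label, (lat, lon) in addresses.items():
        nearby = sum(
            1 for ilat, ilon in incidents
            if haversine_km(lat, lon, ilat, ilon) <= radius_km
        )
        if nearby:
            counts[label] = nearby
    return counts

print(overlay(INCIDENTS, ADDRESSES))
```

An address that keeps showing up near a cluster of incidents is a lead, nothing more. Add tower pings as another layer and the list of leads gets a lot shorter.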

Remember the Scott Peterson case in 2004? I think a lot of people missed the significance of the fact that the cops found Laci’s body by comparing the cell phone tower pings of Scott’s phone with his story of where he had gone fishing. This was the first time I was aware that police were comparing historical location data from tower pings to their theory of a case. Scott is still serving a life sentence.

In the late 1990s cities started to install red light cameras and license plate readers. Police started getting real-time access to this data. They could enter a plate, get an alert when that plate was picked up, and dispatch the closest unit to find the vehicle. Police also started installing automated readers in their cars. Many departments would cruise the parking lots of motels and have the system automatically check for wants and warrants. It was a simple matter of going to the front desk to find out which room to go pick up their “customer.”

And then smartphones happened.

Privacy Has Left The Building

Somewhere around 2006 I attended a talk at a conference by a data scientist from IBM. He laid out the following:

  • There is a massive trove of location information that can be purchased from the telecommunications industry. It’s anonymized, but it shows the paths of where everyone travels on a day-to-day basis.
  • Public data such as voter registrations and property taxes can be used to build a profile of every person and where they live.
  • There is a massive database of cash register transactions that show the group of things people buy, with the exact time and date. This is also anonymized.
  • There are multiple sources of data that can be purchased that tell us where people are employed.

His point was that it would soon be possible to combine all of this data and completely de-anonymize it. If we know where you work and where you live, we can figure out which anonymized id is really your phone. We can also look at the transactions that match with when you’re at the grocery store and over a long enough time period we can determine which receipts are yours. We’ll know what you buy on certain days of the week, and when you arrive at the store we could send you a coupon. Amazingly prescient.
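To make the home/work trick concrete, here is a minimal sketch in Python. It is not anything the IBM speaker showed. The pings, the grid cells, the names, and the night/workday hour windows are all invented assumptions. The idea is simply: the place a device sits at night is probably home, the place it sits during business hours is probably work, and that pair is often unique to one person.

```python
from collections import Counter, defaultdict

# Hypothetical anonymized pings: (device_id, hour_of_day, grid_cell).
PINGS = [
    ("dev-a", 2, "cell-17"), ("dev-a", 3, "cell-17"), ("dev-a", 10, "cell-42"),
    ("dev-a", 14, "cell-42"), ("dev-a", 23, "cell-17"),
    ("dev-b", 1, "cell-17"), ("dev-b", 4, "cell-17"), ("dev-b", 11, "cell-90"),
    ("dev-b", 15, "cell-90"), ("dev-b", 22, "cell-17"),
]

# Hypothetical identified records assembled from public and purchased data:
# name -> (home cell, work cell).
PEOPLE = {
    "Alice Example": ("cell-17", "cell-42"),
    "Bob Example": ("cell-17", "cell-90"),
}

def infer_home_work(pings):
    """Return {device_id: (most common night cell, most common workday cell)}."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for device, hour, cell in pings:
        if hour >= 21 or hour < 6:      # assumed "at home" window
            night[device][cell] += 1
        elif 9 <= hour < 17:            # assumed "at work" window
            day[device][cell] += 1
    result = {}
    for device in set(night) & set(day):
        home = night[device].most_common(1)[0][0]
        work = day[device].most_common(1)[0][0]
        result[device] = (home, work)
    return result

def reidentify(pings, people):
    """Attach a name to a device only when exactly one person has its (home, work) pair."""
    by_pair = defaultdict(list)
    for name, pair in people.items():
        by_pair[pair].append(name)
    matches = {}
    for device, pair in infer_home_work(pings).items():
        candidates = by_pair.get(pair, [])
        if len(candidates) == 1:
            matches[device] = candidates[0]
    return matches

print(reidentify(PINGS, PEOPLE))
```

Notice that both devices share a home cell, the way neighbors in an apartment building would. The workplace is what breaks the tie. Real data is messier, but the more days of pings you have, the fewer ties are left.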

I remember thinking “They’ll never be able to handle that much data.”

Indeed, the only thing holding us back was processing power and data storage. In 2006, 1GB of disk could be purchased for $0.32. 1GB now costs about $0.02, and that’s because prices have spiked due to the current supply issues. In 2006 the cloud was just getting started. Today I can buy a virtual server for $2.50/month with a couple of keystrokes.

That data scientist was thinking too small. None of us would have believed back then the amount of data people would voluntarily give up just to load an app onto their smartphone.

When people tell me they’re concerned about online privacy, I usually point out that they’re carrying a tracking device in their pocket with them at all times.

The Data Ecosystem

Location Intelligence is now a mature market, a $14B sector. There are dozens of companies that act as brokers for all of the location data being scooped up by the various apps on smartphones and tablets all over the world. They anonymize the data and attach an id that lets them track the movements of a device without tying it to a person. But as the IBM guy postulated back in 2006, it turned out to be pretty easy to de-anonymize the data.

Every time you click “I Agree” to the terms of service on an app for your phone, you’re agreeing to hand over your location data to yet another vendor. You didn’t think those apps were really free, did you? TANSTAAFL. Walk into a national chain store and you’ll often start seeing ads for things you’re interested in purchasing at that store. They know who you are and where you are.

This data is now extremely accurate. Instead of the cell tower pings, this is actual GPS location data coming from the apps. Periodically I misplace my iPhone and use the “Find My iPhone” app to find it. It can show me which corner of the house I left it in. GPS-enabled smartphones are typically accurate to within a 4.9m (16 ft) radius.

There’s a lot of money to be made with this data. Big companies are willing to pay handsomely for it, because the analytics they get from it really do turn into sales.

It’s available for anyone who can afford the purchase price. Don’t kid yourself that the government doesn’t have all of it.

But Wait, There’s More!

Location Intelligence is just one sector. It’s just one facet of knowing everything about you. Sure, they know how often you visit Wendy’s on your way home from work. But they also have data on everyone you interact with. They can compute the social graph of everyone you’ve spent a certain amount of time with in the same area. If you’re using a service such as Gmail there is an AI reading every email. Every text you send/receive is analyzed and added to your social graph. They’re reading everything you post on social media.
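The co-location part of that social graph is not complicated. Below is a small sketch of one way it could be computed, under my own assumptions: pings are already bucketed into time windows and map cells, and two devices count as "together" once they share a cell in at least two windows. The data and the threshold are invented, and I have no idea what any particular vendor actually runs.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical pings: (device_id, time_bucket, grid_cell).
# A time bucket could be a 15-minute window; values are made up.
PINGS = [
    ("dev-a", 1, "cafe"), ("dev-b", 1, "cafe"), ("dev-c", 1, "gym"),
    ("dev-a", 2, "cafe"), ("dev-b", 2, "cafe"), ("dev-c", 2, "cafe"),
    ("dev-a", 3, "office"), ("dev-b", 3, "office"), ("dev-c", 3, "gym"),
]

def colocation_graph(pings, min_shared_buckets=2):
    """Return {(device_1, device_2): shared_count} for pairs seen together often enough."""
    # Group devices by the (time bucket, cell) they were observed in.
    groups = defaultdict(set)
    for device, bucket, cell in pings:
        groups[(bucket, cell)].add(device)
    # Every pair of devices in the same group shared that place and time once.
    weights = Counter()
    for devices in groups.values():
        for pair in combinations(sorted(devices), 2):
            weights[pair] += 1
    return {pair: n for pair, n in weights.items() if n >= min_shared_buckets}

print(colocation_graph(PINGS))
```

The threshold is what separates the person you had lunch with from the stranger who stood behind you in line once. Stack a few weeks of data on top and the edges that survive are your family, your coworkers, and your friends.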

They’re doing this because it works. Companies wouldn’t be spending money to be able to target their ads if they weren’t getting a return on their investment.

I’ve often wondered how long it will be before my health insurance provider starts sending me texts that I’m 30 minutes late and should go to the gym to work off that side of fries I ordered last night instead of the salad. Totally doable.

Little Green Army Men

I know someone who was convinced that smartphones are listening to our conversations and targeting ads based upon what they hear. Everything I’ve read on this subject debunks this claim. My own impression is that the amount of processing power it would take just to listen and turn what was heard into actionable information would be fairly mind-blowing.

I suggested an experiment. We would have a few conversations about a product that neither of us had ever expressed any interest in, in any way, ever. No emails, no blog posts, no texts, never visited a store that sells it, nothing. For the next few days we discussed my desire to purchase some “Little Green Army Men.” Remember the children’s toy that kids choked on in the 1970s? Yes, I was planning on making a fairly substantial purchase of little green army men.

Two days later I started seeing ads in my feed for little green army men. I cannot explain that.

There’s no such thing as privacy.
