The Murky Ethics of Data Gathering in a Post-Cambridge Analytica World

Marketers can lead the way in a post-Cambridge Analytica world, where data collection rules are murky and consumers are creeped out

Published in AMA Marketing News, May 31, 2018

Facebook changes its policies frequently, like a child switching up the rules to a game of his own making: a sly update when it’s personally beneficial, a knee-jerk pivot when in trouble. Everyone else — the platform’s users and advertisers — is left scampering and contorting to comply.

The Cambridge Analytica scandal shed more light on the social platform’s inner workings than perhaps any algorithm or design update before it. The Guardian and The New York Times found that the data firm paid to acquire Facebook users’ personal information through an outside researcher, Aleksandr Kogan, who created a data-harvesting personality quiz app that told users (in fine print) that it was collecting the information for academic purposes, a claim that Facebook did not verify and that was not true. Although only 305,000 people participated in the quiz and consented to having their data harvested, their friends also had their profiles scraped, bringing the estimated number of those affected to 87 million.

Facebook rescinded the ability to obtain data from friends of consenting users without their permission in 2015, but it’s unclear if companies that engaged in this sort of data collection deleted the information they pulled before the access was denied. The old policy was part of Facebook’s open platform style, which saw CEO Mark Zuckerberg inviting developers to build their apps on the website.

Cambridge Analytica’s ability to circumvent the rules left consumers feeling uneasy (even though most marketers were well aware of this practice already). What followed was a cascade of realizations about Facebook’s fast and loose policies. For instance, apps like the personality quiz aren’t the only way that companies harvest user profiles: In April, Facebook’s Chief Technology Officer Mike Schroepfer told Slate that he believes most users on Facebook could have had their public profile data harvested by third parties through contact information. The comment was related to a Facebook feature that allowed users to search for other users via phone number or e-mail address, a practice that Facebook says was abused by hackers who scoured Facebook using lists of contact information to locate and grab users’ public profile information. The feature was subsequently disabled, but Schroepfer said, “we believe most people on Facebook could have had their public profile scraped in this way.”

The ability to collect more granular data continues to grow, sometimes faster than guidelines can be written. In a blow to Facebook’s freewheeling practices, Zuckerberg was called to Congress. Europe handed down its own set of data collection rules, too. As Facebook policies continue to morph, marketers are mulling over whether the social platform is still the golden child of online ad targeting, or if guidance and filtering could help Facebook reach its potential.

Facebook-Advertiser-User Codependence

Facebook needs advertising to sustain its business, advertisers need Facebook users to sell their products, and Facebook users need the platform to remember their aunt’s birthday. It’s Facebook’s codependent world, and we’re all just living in it.

Users’ reliance on the platform hasn’t been monetized (for now), but advertisers and Facebook itself can put a dollar sign on their relationship: In its first quarterly earnings report since the Cambridge Analytica news broke, Facebook reported $11.8 billion in advertising revenue, a 50% increase from the same period last year.

On the marketing side, Facebook remains a great deal for advertisers. The cost per thousand impressions on Facebook is $5.12 (compared with LinkedIn at $16.99, Instagram at $4.20, Pinterest at $3.20 and Snapchat at $2.95), according to data science and martech firm 4C’s “The State of Media Q1 2018” report. Facebook also yields the highest click-through rate at 1% (compared with Pinterest at 0.48%, Snapchat at 0.37%, LinkedIn at 0.25% and Instagram at 0.17%), making its cost-per-click of 48 cents the most efficient among social media platforms. The accuracy of its ad reporting, however, has recently been questioned. Facebook has admitted to measurement issues, including miscalculations of average watch time of videos, organic reach of posts, video ad completion rate, average time spent on instant articles and referral traffic from Facebook to websites and mobile apps.
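The link between those figures is simple arithmetic: cost per click is roughly the cost per thousand impressions divided by the number of clicks those thousand impressions produce. The Python sketch below applies that formula to the rounded numbers quoted from the 4C report. The function names are illustrative, and because the inputs are rounded, the Facebook result lands near, but not exactly on, the 48 cents cited.

```python
# Rounded figures quoted above from 4C's "The State of Media Q1 2018" report.
# platform: (cost per thousand impressions in dollars, click-through rate as a fraction)
PLATFORMS = {
    "Facebook": (5.12, 0.01),
    "LinkedIn": (16.99, 0.0025),
    "Instagram": (4.20, 0.0017),
    "Pinterest": (3.20, 0.0048),
    "Snapchat": (2.95, 0.0037),
}


def cost_per_click(cpm, ctr):
    """Estimate cost per click: dollars per 1,000 impressions / clicks per 1,000 impressions."""
    return cpm / (1000 * ctr)


def rank_by_cpc(platforms):
    """Return (platform, estimated cost per click) pairs, cheapest first."""
    estimates = [(name, cost_per_click(cpm, ctr)) for name, (cpm, ctr) in platforms.items()]
    return sorted(estimates, key=lambda pair: pair[1])


for name, cpc in rank_by_cpc(PLATFORMS):
    print(f"{name}: ${cpc:.2f} per click")
```

With these inputs, Facebook comes out cheapest per click even though Pinterest and Snapchat charge less per thousand impressions, because Facebook's higher click-through rate spreads the cost over more clicks.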

Even if users’ attachment to Facebook can’t be financially quantified, perhaps the fact that they didn’t peel themselves off the platform after the Cambridge Analytica story is illustration enough. Facebook lost about 2.8 million U.S. users under age 25 last year, but it still boasts more than 1 billion daily active users. And despite the momentary media commotion, the #deleteFacebook movement didn’t catch fire. A Reuters/Ipsos survey of 2,194 American adults following the Cambridge Analytica news found about half of Facebook users said they did not recently change the amount that they used the site, and another quarter reported they were using it more. Only the remaining quarter said that they were using it less, had stopped using it or deleted their account entirely.

The codependence among all parties may be one reason why the ethics of data collection have become murky. Though Cambridge Analytica’s data-gathering might have been a shock to users, many in the marketing industry were well aware of the practice.

Alexandra Samuel, an independent technology writer and the former vice president of social media for Vision Critical, wrote an article in The Verge about the siren song of shady data-gathering tactics, such as those used by Cambridge Analytica. She wrote that Facebook’s “generous access” to friend data was known to many marketers and software developers, as was the tactic of disguising data-mining as fun apps, pages or quizzes.

“I don’t think we can generalize about whether, when and why specific companies or marketers venture into unethical or ethically nebulous territory when it comes to data collection and targeting,” Samuel wrote in an e-mail.

She says a marketer’s willingness to dive into these waters can depend on their business model; whether they’re working in a company or industry with clear guidelines around how data is collected and managed; their level of tech knowledge (that is, do they know what kind of data is available or have the skills to collect and use it?) and their own personal ethics.

In her Verge article, Samuel wrote that it may be difficult to reform the industry of data collectors and marketing shops, which have grown to maximize the amount of data collected and the precision of ad targeting. “Social networks and other advertising platforms may set up various processes that notionally screen out data aggregators or manipulative advertisers, but as long as these companies run on advertising revenue, they have little incentive to promote transparency among data brokers and advertisers,” she wrote.

Facebook Reels It In

One of the few controls on shady user privacy practices may be the fear of bad publicity, Samuel says.

“At this point, the real check [would be] the availability of competitive platforms that behave better,” she says. “There certainly is a market opportunity for a social media platform to build a user base on the strength of respect for privacy, though so far none of those efforts have really taken off. I do believe it’s just a matter of time before that happens. The fear of that alternative may motivate Facebook and others to make some real changes.”

Whatever the motivation, Facebook has moved to take some interest in user privacy. There were rumblings, prompted by Facebook COO Sheryl Sandberg, of making a paid version of Facebook where users would not see ads. More immediately, Facebook tweaked its advertising program.

Marketers on Facebook could traditionally leverage three types of data streams for ad targeting: data gathered by Facebook, which could include information on users’ habits and usage of the platform, web browsing history and cellular location; data that advertisers collected themselves and uploaded, such as names and e-mail addresses of the customers who visit their stores; and data provided by third parties. This final source, from companies known as data brokers, includes insights gleaned by firms such as Acxiom, Oracle, Epsilon and Experian. These companies build profiles by gathering data over a period of years from government and public records, consumer contests, warranties and surveys and private commercial sources (namely, loyalty programs and subscription lists).

At the end of March, Facebook made an unusual and succinct announcement: It plans to shut down “partner categories,” a product that “enables third-party data providers to offer their targeting directly on Facebook.” “While this is common industry practice, we believe this step, winding down over the next six months, will help improve people’s privacy on Facebook,” the statement read.

The decision centers on the idea that Facebook has less control over where and how third-party data aggregators collect their data, which is risky. This wouldn’t have stopped the Cambridge Analytica debacle, but the development could have major repercussions for the broader digital advertising ecosystem. Should others in the industry follow Facebook’s lead and distance themselves from data brokers, it could also mean an increase in transparency into their work with personal data.

Kristen Walker, a marketing professor at California State University, Northridge, cautions that this move may be less privacy-oriented and more lip service from Facebook.

“If they were really concerned about ending this service to improve people’s privacy, they’d explain how it violated people’s privacy,” Walker says. “And how does it violate people’s privacy? Is it the storage, the use, the rental? Maybe the real concern for Facebook, beyond meeting [General Data Protection Regulation] guidelines, is maintaining people’s trust with illusions of transparency and concern for their privacy. And, really, isn’t Facebook a data broker, anyhow? The entire distribution system and access to consumer data is undergoing a change. Whether it is an improvement for privacy or not is still a question.”

Walker says ending partner categories doesn’t necessarily mean that Facebook won’t use these partners or access their data to boost targeting; it may just mean that marketers won’t have visible access to that information.

“It actually could mean less transparency at a certain level for marketers who may be interested in the source of Facebook data they use to target consumers,” Walker says. “Marketers want access to consumers and data broker info can help, but this service makes Facebook less of a dealer and more an intermediary. Nothing’s stopping them from buying the data and repackaging it as theirs for marketers to utilize.”

In another move to improve customer privacy (feigned or otherwise), Facebook is also changing its “custom audiences” policy by adding a tool that will require advertisers to verify that they gained consent to use e-mail addresses uploaded to the platform. Custom audiences allow advertisers to target users on the site by uploading e-mails, phone numbers and other information and cross-referencing it against Facebook user profiles. The permissions tool Facebook is developing will require an advertiser, along with the agencies or other organizations that obtain the information, to confirm that the third-party data in a custom audience has been responsibly sourced.
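To make the cross-referencing step concrete, here is a rough Python sketch of how an uploaded list might be matched against platform profiles while honoring a consent flag. The article does not describe Facebook's actual matching mechanism. The SHA-256 hashing of normalized e-mail addresses, the `consent_confirmed` field and every function name here are assumptions made for illustration.

```python
import hashlib


def normalize_email(email):
    """Trim whitespace and lowercase so the same address always hashes identically."""
    return email.strip().lower()


def hash_identifier(value):
    """One-way SHA-256 digest of an identifier (an assumed scheme, not taken from the article)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def match_custom_audience(uploaded_rows, platform_hashes):
    """Return hashed identifiers that appear on both sides and carry confirmed consent."""
    matched = set()
    for row in uploaded_rows:
        if not row.get("consent_confirmed"):
            continue  # skip contacts the advertiser cannot vouch for
        digest = hash_identifier(normalize_email(row["email"]))
        if digest in platform_hashes:
            matched.add(digest)
    return matched


# Hypothetical data: two addresses known to the platform, three uploaded by an advertiser.
platform_hashes = {
    hash_identifier(normalize_email(address))
    for address in ("pat@example.com", "lee@example.com")
}
uploaded_rows = [
    {"email": "  Pat@Example.com ", "consent_confirmed": True},
    {"email": "lee@example.com", "consent_confirmed": False},
    {"email": "sam@example.com", "consent_confirmed": True},
]
audience = match_custom_audience(uploaded_rows, platform_hashes)
print(len(audience))  # prints 1: only the consented, matching address
```

In this sketch, consent is checked before any matching happens, so an address without confirmed consent never reaches the audience even when the platform recognizes it.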

There’s a strong chance that these moves were at least hastened in response to Cambridge Analytica and GDPR, the latter of which restricts how personal data is collected and handled and focuses on ensuring that users comprehend and consent to the data collected about them. The European Union rule requires companies to spell out why data is being collected and whether it will be used to create profiles of people’s actions and habits. Consumers in the EU have to opt in, not search for ways to opt out; and they have a right to access the data companies store about them, to correct inaccurate information and to limit the use of decisions made by algorithms.

Whatever the reason for change, it’s not entirely clear whether these policy shifts increase transparency. It may be up to marketers to come armed with flashlights in the social media swampland.

The Argument for Draining (or at Least Straining) the Swamp

It’s not easy convincing marketers that not all data is worth gobbling up and storing, but Cambridge Analytica and GDPR may be enough to urge marketers to consider the quality of what they collect and their ethical responsibilities.

“Millennials grew up in digital, meaning they had to learn all the things the hard way about what you should and shouldn’t put out in the public space,” says Jessica Best, director of data-driven marketing at Barkley. “We’re in that space on the advertising side right now, where we actually have to think about what we should and shouldn’t do morally, ethically in data. We’re going to start to take some responsibility. There’s a lot of excess data being stored in unsecure ways right now in vendor databases and company databases. We’re going to have to be a little bit more responsible, and I think that’s going to be a good thing.”

Although the trend has been toward increased personalization in marketing, there comes a point where the environment begins to feel creepy to the customer. There’s a difference between seeing ads from a company you previously purchased from and receiving targeted communication that speaks very directly to your lifestyle from a brand you never interacted with.

There’s precedent for this, Best says. “People have not liked being marketed to since about two years after marketing became a thing. The difference is that the scalable level to which we’re able to use and activate data has become alarming. […] What’s scary is the level to which somebody can cultivate a profile of me and understand my ‘why,’ or the algorithm can construct this fairly accurate representation of what will motivate me. That’s where we’re not accustomed to it yet: We’ve scaled past the comfort level of those on the receiving end. It’s become creepier faster than it’s become acceptable.”

In theory, any marketer could spend time on Facebook learning about users and their interests, but technology has allowed this to happen on a much larger scale, and faster. In this way, the data becomes almost weaponized — especially if used in a manner that could cause harm.

At Cal State, Northridge, Walker’s research has explored the difference between shared data and surrendered data, advocating for controls on the amount and type of information that social media users provide. The amount of data coming down the pipeline is too much and not always accurate. Placing some sort of filter or slowing the trickle of information could result in more robust and appropriate targeting.

“What if we asked consumers what information they’d like us to know?” Walker says, “because I can guarantee you that if I know what you want me to know, that’s going to help me address and serve your wants and needs.”

Not only could marketers see their dollars go further, but consumers would likely appreciate seeing ads that are relevant to their lives. When Facebook made it possible for users to see the list of advertisers that had uploaded their information, it became clear that not all marketers benefit from such gratuitous access (this author found that the vast majority of political groups or candidates holding my information weren’t remotely relevant to me in location or ideology). When marketing works well, it works for the advertiser and the consumer.

Regulating the Flow of Data — in Plain Sight

No one is expecting consumers to take the lead on how their data is collected, stored and used, as the nature of technology is to help them multitask more and move faster. In short, they’re too busy and too overwhelmed to stop and read paragraphs of technical language before downloading an app that provides a few minutes of mental escape. The onus falls on marketers to guide data policies.

The industry is defensive of its ability to self-regulate, and most marketers in the U.S. hope to continue on this route, even as overseas legislation is enacted. Sentiment among Americans differs from that of their European counterparts: Privacy is viewed as a privilege in the U.S. and as a right in the EU. As such, marketers in the States have been freely roaming the information highways.

Yet this isn’t the first time that advertisers have grappled with issues of audience deception. To potentially avoid legislation in the U.S., digital marketers may want to focus on increased transparency and ethical behavior as it relates to personal data. In fact, advertisers can serve as guides for the relatively young technology industry.

“It’s going to require demanding quality data,” Walker says. “Then it’s going to require someone to stand up and say, ‘We can help you with this.’ Marketers understand that we’re supposed to avoid deception, and we see a lot of newer industries learning that the hard way.”

Despite the new policy guidelines set by Facebook, experts caution against expecting the social media giant or others in the tech industry to lead the way on data regulation. Marketers will need to play the role of middleman, filtering the rivers of customer data that pour through social media and choosing what is quality and how to use it ethically. Being transparent to consumers will be a balancing act: Data collection must have consumers’ approval, but to avoid the “ick” factor of seemingly creepy ad targeting, the curtain can’t be pulled away completely to reveal all of marketing’s tricks.

Best cites a greeting card company as a hypothetical example of striking an appropriate balance. Based on a consumer’s buying behavior (e.g., purchasing a “Congratulations on your new baby boy!” card), the company would know if a household has a young child. Rather than speaking directly to that knowledge (“Here’s a perfect onesie for 3-month-old Timmy!”), advertisers can simply show appropriate products. The ads are relevant, but not invasive.

“I believe this entire conversation has really been about the marketer self-regulating,” Best says.

As marketers use data to create segments and personas, it’s their job to regulate just how much information is collected and be transparent about the ways it’s used.

Thinking Like a Market Researcher

Market researchers are quick to point out that Cambridge Analytica is not a market research firm, but the data-driven industry’s principles may be worth a review by digital marketers.

A core tenet of market research ethics is transparency. Firms like Forrester, Nielsen and Kantar follow guidelines that require researchers to be up-front with subjects: providing interviewees with clear descriptions of their work, keeping recorded conversations classified and generally withholding all personally identifiable information. These are ethics that allow for segmentation, not personalization.

Julia Clark, senior vice president of public affairs in the U.S. for Ipsos, emphasizes the need in market research for informed consent, in which research participants provide information with knowledge of how that data will be used and by whom.

“The first thing I think of when I think of ethics in research is the ethical responsibility we have toward participants and respondents,” Clark says. “That means protecting their information and data and utilizing the best data control protocols. It also means not stressing them out. You don’t want to administer a survey that’s going to leave them with a real sense of unease or ask them questions that are going to make them so uncomfortable they want to stop.”

GDPR seems to take a cue from these guidelines and gets at the heart of a modern challenge for market researchers: how to ensure that people do not suffer adverse consequences as marketing relies increasingly on secondary data, defined as information collected for another purpose and subsequently used in research, versus primary data collection (surveys).

“One stream is useful but messy,” Clark says of secondary data. “The other (primary data) is very clean but probably not as comprehensive and loose. Navigating between those two is at the crux of all of this.”

Reg Baker, executive director of Marketing Research Institute International, says the industry is working to expand the ethics that cover primary data to also address secondary data. He says it is essential that no one is harmed related to the use of data, whether obtained in person, on the phone, through Facebook posts or via Amazon transactions.

The International Chamber of Commerce and the European Society for Opinion and Marketing Research (ESOMAR) revised their International Code in 2016 to account for this concern, adding a new article addressing secondary data. The code defines harm as “tangible and material harm (such as physical injury or financial loss), intangible or moral harm (such as damage to reputation or goodwill), or excessive intrusion into private life, including unsolicited personally-targeted marketing messages.”

This last point, perhaps appealing to advertisers, is what market researchers want no part of. It’s one of the industry’s historic principles, and it has no plans to change that in the digital era.

“We don’t deliver data to clients that allows them to identify individual data records and associate them with a person,” Baker says. “As part of the research, we may develop a profile on a whole bunch of people whom we have information on, but that remains protected and confidential, and it’s not what we would deliver to a client. We’re here to help companies make decisions, a bet based on the information about the marketplace. But we’re not in the business of actually giving them the weapons to go at individuals so that they can change their behavior or sell them something.”

That’s the business Cambridge Analytica was involved in. The company mined users’ data and created profiles, then used those profiles to reallocate advertising and other communications. It was a full feedback loop. In market research, by contrast, the data are used to create segments. Individuals whose data were used in the research might receive the segmented marketing communication, but never in a personally targeted manner.

“We (market researchers) are capturing the information, but we are not deploying that directly back to the person we captured it from,” Clark says. “It’s being aggregated, it’s being anonymized.”
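The distinction Clark and Baker draw, between delivering segments and delivering individual records, can be sketched in a few lines of Python. This is an illustration only: the field names, the sample records and the minimum segment size are hypothetical, not a standard that either researcher cites.

```python
from collections import defaultdict

MIN_SEGMENT_SIZE = 3  # hypothetical threshold; smaller groups are suppressed


def aggregate_by_segment(records, segment_key, measure_key, min_size=MIN_SEGMENT_SIZE):
    """Return per-segment counts and averages with no individual-level fields."""
    groups = defaultdict(list)
    for record in records:
        groups[record[segment_key]].append(record[measure_key])

    report = {}
    for segment, values in groups.items():
        if len(values) < min_size:
            continue  # too few people to report without risking identification
        report[segment] = {
            "respondents": len(values),
            "average": sum(values) / len(values),
        }
    return report


# Hypothetical respondent-level data that stays with the researcher.
respondents = [
    {"name": "A. Example", "email": "a@example.com", "age_band": "25-34", "monthly_spend": 40},
    {"name": "B. Example", "email": "b@example.com", "age_band": "25-34", "monthly_spend": 50},
    {"name": "C. Example", "email": "c@example.com", "age_band": "25-34", "monthly_spend": 60},
    {"name": "D. Example", "email": "d@example.com", "age_band": "35-44", "monthly_spend": 90},
]

# What the client receives: segment-level figures only.
client_report = aggregate_by_segment(respondents, "age_band", "monthly_spend")
print(client_report)
```

The client report contains the 25-34 segment (three respondents, average spend of 50) and omits the single 35-44 respondent, whose figures would otherwise point to one identifiable person. Names and e-mail addresses never leave the respondent-level data.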

General Data Protection Regulation (GDPR)

What is it?

The regulation is a new set of rules from the European Union that are designed to improve individuals’ control over their personal data. The rules replace the 23-year-old Data Protection Directive 95/46/EC, and aim to harmonize data privacy laws across Europe.

GDPR affects organizations located within the EU, but it also applies to organizations located outside of the region if they monitor the behavior of EU data subjects. It applies to all companies processing and holding the personal data of subjects residing in the EU, regardless of the company’s location.

Under the regulation, personal data is defined as any information related to a natural person or “data subject” that can be used to directly or indirectly identify the person. It can be a name, a photo, an e-mail address, bank details, posts on social networking websites, medical information, a computer IP address or a host of other identifiers.

Parental consent will be required to process the personal data of children under the age of 16 for online services.

How are marketers affected?

Organizations need to pay attention to the rules: Violations could lead to fines as high as $24.6 million or 4% of global annual revenue, whichever is larger.
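The “whichever is larger” rule means the ceiling scales with company size. A minimal Python sketch of that arithmetic follows, using the dollar figure quoted above. The function name and the example revenues are hypothetical, and actual fines are set case by case below this ceiling.

```python
FIXED_CAP_USD = 24_600_000  # dollar figure quoted in the article
REVENUE_SHARE = 0.04        # 4% of global annual revenue


def max_gdpr_fine(global_annual_revenue_usd):
    """Upper bound on a fine: the larger of the fixed cap and 4% of revenue."""
    return max(FIXED_CAP_USD, REVENUE_SHARE * global_annual_revenue_usd)


# Hypothetical companies of different sizes.
for revenue in (100_000_000, 615_000_000, 1_000_000_000):
    print(f"Revenue ${revenue:,}: maximum fine ${max_gdpr_fine(revenue):,.0f}")
```

Under these numbers, the fixed cap applies to any company with global revenue below $615 million. Above that point, the 4% share takes over.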

GDPR limits the amount of data that marketers can collect on European consumers, who have more options about what data companies can see about them. Customers must actively give consent; implied consent is unacceptable. The consent must be informed, specific, unambiguous and revocable. That means the consent request may not be buried in long-winded terms and conditions that use complex legal language. The customer also has the right to withdraw consent at any time.

The type of data collected must be adequate, relevant and limited to what is necessary for the intended purpose of collection. Information may not be used in a way that would be incompatible with the intended purpose for which it was collected. Data may not be shared or transferred to another organization without consent from the person to do so.

Customers also reserve the right to be forgotten, meaning they may request that their personal data be removed from any database or cookie pool. Marketers will need to have processes that can erase collected data should a user submit a request for withdrawal. Users also reserve the right to correct or update any data.

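Additionally, the data that an organization obtains from a consenting user must be protected. Data breaches must be reported to the relevant supervisory authority within 72 hours of discovery, and affected consumers must be notified without undue delay when a breach poses a high risk to them.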