Cambridge Analytica and The Power of Data

Erika D
May 23, 2019
Image via: http://nymag.com/intelligencer/2018/03/what-is-cambridge-analytica-and-who-is-christopher-wylie.html

In 2013, The Psychometrics Centre at the University of Cambridge conducted a study that was published in the Proceedings of the National Academy of Sciences. In this study, over 58,000 participants took personality tests and other surveys and questionnaires to provide personal information about themselves, including sexual orientation, religion, drug use, and whether their parents had stayed together until they were 21 years old. The study then examined these participants’ Facebook likes to see how accurately a user’s likes could predict personal traits. The results showed that Facebook likes were highly predictive of personal attributes, including political views and personality traits (Kosinski, Stillwell, Graepel, 2013). Later studies at the school further supported the predictive power of Facebook likes. These studies were the inspiration behind the Cambridge Analytica model.

Image via: https://www.pnas.org/content/pnas/112/4/1036.full.pdf (2015 study by Youyou, Kosinski, Stillwell)
Image via: https://fivethirtyeight.com/features/this-algorithm-knows-you-better-than-your-facebook-friends-do/

The Model

How did Cambridge Analytica create their model? According to whistleblower and former Cambridge Analytica employee Christopher Wylie, “when you’re building an algorithm, you first need to create a training set”. To get this training set, Cambridge Analytica needed to gather data, which they planned to do through personality tests and Facebook data (Hern, 2018). Of course, the controversy in the scandal was how Cambridge Analytica obtained their data, but more on that in a bit.

As Wylie goes on to clarify in his interview with The Guardian, their feature set, or independent variables, was the Facebook data, and the target variables were personality traits and political orientation. So to build their model, they needed to get enough people to complete personality tests and to obtain each person’s corresponding Facebook information.
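To make the feature/target setup concrete, here is a minimal sketch of a supervised model that learns a personality label from Facebook likes. Everything in it is an assumption for illustration: the page names, the six toy participants, and the choice of a hand-written logistic regression are all made up, and nothing here is claimed to be the code or the algorithm Cambridge Analytica actually used.

```python
import math

# Hypothetical toy training set: one row per survey participant.
# Each column is 1 if the participant liked that (made-up) page, else 0.
PAGES = ["page_a", "page_b", "page_c"]
likes = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
    [0, 1, 0],
]
# Target variable: 1 if the participant's personality test put them
# high on some trait, else 0 (labels invented for this example).
high_trait = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Probability that a user with these likes scores high on the trait."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return sigmoid(z)


def train_logistic(features, targets, lr=0.5, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, y in zip(features, targets):
            error = predict_proba(weights, bias, row) - y
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train_logistic(likes, high_trait)

# Score users who never took the test, using only their likes.
p_new = predict_proba(weights, bias, [1, 0, 0])    # liked page_a only
p_other = predict_proba(weights, bias, [0, 0, 1])  # liked page_c only
print(f"P(high trait | likes page_a) = {p_new:.2f}")
print(f"P(high trait | likes page_c) = {p_other:.2f}")
```

The point of the sketch is the last step: once the model is trained on people who took the test, it can be applied to anyone whose likes are available, whether or not they ever answered a questionnaire.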

Dr. Aleksandr Kogan

In an effort to build the best predictive model, Cambridge Analytica contacted the previously mentioned Psychometrics Centre at the University of Cambridge for assistance. When the Centre declined, the firm recruited Dr. Aleksandr Kogan, then a psychology professor and researcher at Cambridge, who knew the techniques used in the school’s studies on personality and Facebook likes (Rosenberg, Confessore, Cadwalladr, 2018). In 2014, Kogan created an app called “thisisyourdigitallife”. This application allowed users to take personality tests and then scraped information from each user’s Facebook page, as well as information on the user’s friends. Estimates vary for how many profiles the application collected data on, but the consensus appears to be around 87 million.

A big part of the controversy was that Kogan had told users and Facebook only that he was collecting information for academic purposes, not for political targeting. It is important to note that at the time this app was available, web developers were often given access by Facebook to scrape a user’s Facebook information along with their friends’ data, so while what Kogan did was not honest, it was not illegal.

Through five-factor personality tests, which score individuals on the OCEAN scales (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism), and Facebook data, Cambridge Analytica now had a model and a large amount of information on people: personality traits, demographic location, and political views. Through other models and research, they were able to create ads that targeted people not just on their demographic and political background, which is a common technique in political marketing campaigns, but on their psychological background as well.

Email between Wylie & Kogan. Image via: https://www.nytimes.com/2018/03/17/us/politics/cambridge-analytica-trump-campaign.html

The Model Put to Use

Cambridge Analytica began to run ads that targeted people in a way and at a scale never seen before. If someone scored high on Conscientiousness, they may have seen the ad below, described by Wylie: “It was targeting conscientious people. It was a picture of a dictionary and it said ‘Look up marriage and get back to me’. For someone who is conscientious, it is a compelling message: a dictionary is a source of order, and a conscientious person is more deferential to structure” (Interview by Hern, 2018).

Image via: https://www.theguardian.com/news/2018/may/06/cambridge-analytica-how-turn-clicks-into-votes-christopher-wylie

In the same interview, Wylie goes on to say that conscientious people were likely to see ads with pictures of walls: “Conscientious people like structure, so for them, a solution to immigration should be orderly, and a wall embodied that. You can create messaging that doesn’t make sense to some people but makes so much sense to other people. If you show that image, some people wouldn’t get that that’s about immigration, and others immediately would get that” (Interview by Hern, 2018). The personality traits people scored high or low on would influence the type of ads they saw on the internet. The goal was to target the right people with the right ads, ones that would resonate with them and influence them to think about a candidate in a certain way.

Image via: https://www.theguardian.com/news/2018/may/06/cambridge-analytica-how-turn-clicks-into-votes-christopher-wylie
Image via: https://towardsdatascience.com/effect-of-cambridge-analyticas-facebook-ads-on-the-2016-us-presidential-election-dacb5462155d
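The trait-to-ad matching Wylie describes can be sketched as a simple lookup on a person's predicted OCEAN scores. This is only an illustration under stated assumptions: the 0-to-1 score scale, the 0.7 threshold, the `pick_ad` function, and the catalogue structure are all hypothetical, and the single ad description is paraphrased from Wylie's interview rather than taken from any real campaign system.

```python
# Hypothetical catalogue of ad creatives keyed by the trait they target.
# Only the conscientiousness entry is described in the source interview.
AD_VARIANTS = {
    "conscientiousness": "structured imagery (a dictionary or a wall)",
}
DEFAULT_AD = "generic campaign ad"


def pick_ad(scores, threshold=0.7):
    """Return the ad for a person's strongest trait.

    `scores` maps trait names to predicted scores on an assumed 0-1 scale.
    If the strongest trait is below `threshold`, or no creative exists
    for it, fall back to the generic ad.
    """
    trait, score = max(scores.items(), key=lambda item: item[1])
    if score >= threshold and trait in AD_VARIANTS:
        return AD_VARIANTS[trait]
    return DEFAULT_AD


# Made-up predicted scores for two people.
orderly_person = {"openness": 0.4, "conscientiousness": 0.9}
mixed_person = {"openness": 0.6, "conscientiousness": 0.5}

ad_for_orderly = pick_ad(orderly_person)
ad_for_mixed = pick_ad(mixed_person)
print(ad_for_orderly)
print(ad_for_mixed)
```

A real system would presumably weigh several traits at once and test many creatives per trait; the sketch only shows the basic idea of routing different messages to different personality profiles.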

Cambridge Analytica didn’t just use personality data; it also used other collected data to target ads to different voters. On election day, the Trump campaign ran two different ads on Google’s video-hosting platform, and geographical data on a user determined which ad they saw. If a user was likely to be in a pro-Trump area, they would see an ad with Trump smiling and information on how to vote. If a user was in a swing state or an area that was not pro-Trump, they would see the ad below. That ad uses pictures of high-profile celebrities instead of focusing on Trump. The hope was that seeing these celebrities standing behind Trump would make a user more inclined to vote for him (Lewis & Hilder, 2018).

Image via: https://www.theguardian.com/uk-news/2018/mar/23/leaked-cambridge-analyticas-blueprint-for-trump-victory
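The election-day rule described above amounts to a two-way branch on where a viewer is located. The sketch below assumes a hypothetical `area_type` label has already been assigned to each viewer; the label names and the function are invented for illustration, and only the two ad descriptions come from the reporting by Lewis and Hilder.

```python
# The two election-day ads, as described in the article.
SMILING_AD = "Trump smiling, with information on how to vote"
CELEBRITY_AD = "high-profile supporters, not focused on Trump"

# Hypothetical labels for the kind of area a viewer is in.
AREA_TYPES = {"pro_trump", "swing", "not_pro_trump"}


def pick_election_day_ad(area_type):
    """Choose which of the two ads a viewer sees based on their area."""
    if area_type not in AREA_TYPES:
        raise ValueError(f"unknown area type: {area_type}")
    if area_type == "pro_trump":
        return SMILING_AD
    # Swing states and areas that were not pro-Trump got the other ad.
    return CELEBRITY_AD


print(pick_election_day_ad("pro_trump"))
print(pick_election_day_ad("swing"))
```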

Looking Forward

Cambridge Analytica changed the way that political campaigns use data. Although political campaigns have been using data to target voters with different ads for a while, Cambridge Analytica was able to target voters in the most individualized way seen yet by using data voters were unaware the company had on them. The 2016 election was evidence of how powerful the use and manipulation of data can be. With the increasing amount of data that companies collect on individuals today, it will be interesting to see how political campaigns use data in the future and whether this will become the new norm.

References:

Hern, A. (2018, May 06). Cambridge Analytica: How did it turn clicks into votes? Retrieved from https://www.theguardian.com/news/2018/may/06/cambridge-analytica-how-turn-clicks-into-votes-christopher-wylie

Kosinski, M., Stillwell, D., & Graepel, T. (2013, April). Digital records of behavior expose personal traits. Proceedings of the National Academy of Sciences, 110(15), 5802–5805. DOI: 10.1073/pnas.1218772110

Lewis, P., & Hilder, P. (2018, March 23). Leaked: Cambridge Analytica’s blueprint for Trump victory. Retrieved from https://www.theguardian.com/uk-news/2018/mar/23/leaked-cambridge-analyticas-blueprint-for-trump-victory

Rosenberg, M., Confessore, N., & Cadwalladr, C. (2018, March 17). How Trump Consultants Exploited the Facebook Data of Millions. Retrieved from https://www.nytimes.com/2018/03/17/us/politics/cambridge-analytica-trump-campaign.html

Youyou, W., Kosinski, M., & Stillwell, D. (2015, January). Computers judge personalities better than humans. Proceedings of the National Academy of Sciences, 112(4), 1036–1040. DOI: 10.1073/pnas.1418680112
