Last year, the politically aware world was taken by surprise, not just once but twice. First, most people assumed Brexit would be a wash, only to wake up and find out that Britain would be leaving the EU. Then it was assumed that Clinton would be the next president, until the polls started reporting in and Trump became the leader of the States. There have been numerous articles about what was done right and wrong in the various campaigns, but it appears that Big Data played a role in both, and it has me more than a little bit concerned.
For those who haven’t played Watch Dogs 2 or read about Big Data, it’s a fairly straightforward concept. Big Data tracks all your data across multiple sources. This isn’t just your credit rating or racial background. Big Data tracks everything from your purchasing history at your local supermarket (thanks to loyalty cards) to your browsing patterns online. Previously, people thought this would mainly be used so that Google could show you ads for medications after you search for what your symptoms might mean, or so that e-commerce sites could keep showing you ads for those shoes you only meant to look at once. However, the reality seems a whole lot more sinister.
Watch Dogs 2 predicted that Big Data could be used to profile people as criminals, to boost insurance premiums or other things that I won’t mention because spoilers. But thanks to this article, it appears that it’s actually being used to sway elections.
Let’s be clear, this isn’t rigging or illegal. Instead, Big Data is being put to full use. In 2008, Michal Kosinski started work on his PhD in Psychometrics. He decided to work on something common in the field: classifying people across the “Big Five” personality traits (openness, conscientiousness, extraversion, agreeableness and neuroticism), which can then be used to predict behavior. If someone is generally more extroverted, less agreeable or more neurotic, their further beliefs and actions become much easier to predict for those in this field of study. In the past, people had to take a long questionnaire to determine their personality, something most people didn’t want to do. But, thanks to the dawn of social media, the personality test Kosinski and his research partner David Stillwell created was shared and spread across Facebook, answered by millions of people.
The approach that Kosinski and his colleagues developed over the next few years was actually quite simple. First, they provided test subjects with a questionnaire in the form of an online quiz. From their responses, the psychologists calculated the personal Big Five values of respondents. Kosinski’s team then compared the results with all sorts of other online data from the subjects: what they “liked,” shared or posted on Facebook, or what gender, age, place of residence they specified, for example. This enabled the researchers to connect the dots and make correlations.
Remarkably reliable deductions could be drawn from simple online actions. For example, men who “liked” the cosmetics brand MAC were slightly more likely to be gay; one of the best indicators for heterosexuality was “liking” Wu-Tang Clan. Followers of Lady Gaga were most probably extroverts, while those who “liked” philosophy tended to be introverts. While each piece of such information is too weak to produce a reliable prediction, when tens, hundreds, or thousands of individual data points are combined, the resulting predictions become really accurate.
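To get a feel for why a pile of individually weak hints can add up to a confident guess, here is a minimal sketch. It is not Kosinski's actual model; it is a simple naive Bayes style calculation that assumes every "like" is an independent signal, which real likes are not. The function name, the 50/50 starting point and the 1.1 likelihood ratio are all made up for illustration.

```python
import math


def combine_weak_signals(prior, likelihood_ratios):
    """Combine independent weak signals into a single probability.

    prior: starting probability of the trait before seeing any signal.
    likelihood_ratios: one ratio per signal; 1.1 means the signal is only
    10% more common among people with the trait than among people without.
    (Hypothetical numbers, and independence is a strong assumption.)
    """
    # Work in log-odds so that each signal simply adds its contribution.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert the accumulated log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# One weak "like" barely moves a 50/50 guess...
one_like = combine_weak_signals(0.5, [1.1])

# ...but 68 equally weak likes pointing the same way move it a lot.
many_likes = combine_weak_signals(0.5, [1.1] * 68)

print(f"one like:  {one_like:.3f}")
print(f"68 likes:  {many_likes:.3f}")
```

With these invented numbers a single like nudges the estimate to roughly 52%, while 68 of them push it above 99%. Real signals overlap and sometimes contradict each other, so real predictions are less tidy, but the compounding effect is the same idea.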
Thanks to this research, Kosinski was eventually able to predict people’s skin color, sexual orientation, party affiliation, intelligence, drug use and even whether their parents were divorced, all from an average of 68 Facebook “likes.” It really is incredible how much he was able to learn about people, but he also realized the inherent danger of his work. He rejected some offers to work with various institutions and refused to sell his database. But it was already too late – other Big Data analysts realized what was possible and put it to work.
Cambridge Analytica is one such Big Data organization that promises better return on investment for those who use their services.
At Cambridge Analytica we use data modeling and psychographic profiling to grow audiences, identify key influencers, and connect with people in ways that move them to action. Our unique data sets and unparalleled modeling techniques help organizations across America build better relationships with their target audience across all media platforms.
Cambridge Analytica was involved with the Brexit campaign, although it is still unclear just how much. What is clear, however, is some of the work that they did for Senator Cruz’s campaign and later Donald Trump’s:
“At Cambridge,” he [Alexander Nix] said, “we were able to form a model to predict the personality of every single adult in the United States of America.”
Thanks to this information, each bit of messaging could be uniquely tailored for the specific audience. An example he gave is that of gun rights. For a highly neurotic and conscientious audience, it’s all about the threat of a burglary, so the ad might show the hand of an intruder smashing a window. Alternatively, a closed and agreeable audience would care more about tradition, habits and family, thus their ad shows a man and a child standing in a field at sunset, shooting ducks. Sounds simple enough, but done consistently and on a large enough scale, you can sway the way people might vote.
Let’s pause for a moment and look at the scale. According to Nix, “Pretty much every message that Trump put out was data-driven”, with Trump’s team testing 175,000 ad variations for his arguments on the day of his third presidential debate with Clinton.
The messages differed for the most part only in microscopic details, in order to target the recipients in the optimal psychological way: different headings, colors, captions, with a photo or video. This fine-tuning reaches all the way down to the smallest groups, Nix explained in an interview with us. “We can address villages or apartment blocks in a targeted way. Even individuals.”
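A number like 175,000 sounds impossible until you remember how quickly small choices multiply. The sketch below is purely illustrative and says nothing about how the campaign's tooling actually worked; the option lists are invented. It just crosses a handful of headings, colors, captions and media types to show how many distinct ads fall out.

```python
from itertools import product

# Hypothetical building blocks for a single ad; none of these come from
# the campaign, they only stand in for "microscopic details".
headings = ["Protect your home", "Keep the tradition", "Your right", "Stand firm", "Safe at night"]
colors = ["red", "blue", "white", "black"]
captions = ["Learn more", "Share this", "Join us"]
media = ["photo", "video"]

# Every combination of one heading, one color, one caption and one media
# type is a distinct ad variation.
variations = [
    {"heading": h, "color": c, "caption": cap, "media": m}
    for h, c, cap, m in product(headings, colors, captions, media)
]

print(len(variations))  # 5 * 4 * 3 * 2 = 120
print(variations[0])
```

Four short lists already give 120 ads for a single message. Add a few more dimensions or a few more options per list, repeat it across many messages, and six-figure totals stop looking surprising.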
It’s scary enough when this is done with Facebook advertising, but it can go so much deeper. In any political campaign, one of the biggest assets is people on the ground. These are people who literally go door to door, talking to undecided voters to try and get them to vote a certain way. While demographics have been used to help such canvassers for years, Big Data has made it much more specific:
From July 2016, Trump’s canvassers were provided with an app with which they could identify the political views and personality types of the inhabitants of a house. It was the same app provider used by Brexit campaigners. Trump’s people only rang at the doors of houses that the app rated as receptive to his messages. The canvassers came prepared with guidelines for conversations tailored to the personality type of the resident. In turn, the canvassers fed the reactions into the app, and the new data flowed back to the dashboards of the Trump campaign.
Again, this is nothing new. The Democrats did similar things, but there is no evidence that they relied on psychometric profiling. Cambridge Analytica, however, divided the US population into 32 personality types, and focused on just 17 states. And just as Kosinski had established that men who like MAC cosmetics are slightly more likely to be gay, the company discovered that a preference for cars made in the US was a great indication of a potential Trump voter. Among other things, these findings now showed Trump which messages worked best and where. The decision to focus on Michigan and Wisconsin in the final weeks of the campaign was made on the basis of data analysis. The candidate became the instrument for implementing a big data model.
Of course it is hard to put all the credit (or blame, depending on who you speak to) on Big Data. There were many factors that led to the Brexit vote and the election of Trump. However, the fact that messaging could be targeted in such a specific way is both inspiring and terrifying. This isn’t the propaganda of old, where messaging was blatant and might only sway a small portion of society. It may also explain why many were shocked at the results – they weren’t even seeing the same ads, articles or messages that others were. This is individually customized messaging, delivered on a grand scale, and it makes me wonder how all of us will be manipulated in future decisions – from elections to purchasing choices, it seems we are all susceptible to influence based on the massive amounts of data we’ve already shared.