Big Data is used to spy on us, hire and fire us, and sell us things we don’t need. In Dataclysm, Christian Rudder, cofounder of OkCupid, one of the world’s biggest dating websites, puts this flood of information to an entirely different use: understanding human nature.
Drawing on terabytes of data from Twitter, Facebook, Reddit, OkCupid, and many other sites, Rudder examines the terrain of human experience to answer a range of questions: Does it matter where you went to school? How racist are we? Rudder shows that in today’s era of social media, a powerful new approach is possible, one that reveals how we actually behave when we think no one’s looking.
Here we put five questions to Christian.
Q) What makes this moment in time—and this set of data—different from the massive data surveys of the past, such as Pew, Gallup, or the Kinsey Institute?
The data in my book is almost all passively observed—there’s no questionnaire, no contrived experiment to simulate “real life.” This data is real life. Online you have friends, lovers, enemies, and intense moments of truth without a thought for who’s watching, because ostensibly no one is—except of course the computers recording it all. This is how digital data circumvents that old research obstacle: people’s inability to be honest when the truth makes them look bad. You could never ask people these days if they like racist jokes and get a real answer. Yet lo and behold the country’s most notorious slur for black people is incredibly popular as a Google Search term; it still appears in a half-million searches a month in the United States. As I say in the book, the epithet is more American than “apple pie”—we search for it about 30 percent more often. Digital data’s ability to get at the private mind like this is unprecedented and very powerful.
Q) As more of our social interaction happens on social media, how much can researchers learn about us from our online interactions?
Well, they can only learn what we tell them, but in the age of Facebook and Google, that’s become pretty much everything. To the extent that friendship, anger, sex, love, and whatever else happen online, we can investigate them.
Your search history tells us what kind of jokes you like. Your Facebook network reveals not just your friendships, but in some cases the state of your marriage. Your preferences on OkCupid tell us what you find sexy, and your reaction to the strangers the site offers up tells us how you judge people. The articles you “like” tell us not just about your politics, but even predict your intelligence.
You fold in data points like these for millions and millions of people, and you start to get a whole new picture of humankind.
Q) You have a lot to say about race in the book, and you use data to shed light on the many ways it affects the way we interact with one another. What surprised you about your research in this area? Did you find anything unsurprising?
The data on race was surprising only in its stubborn predictability: for all the glitzy technology, the results could’ve been from the 1950s. I grew up in Little Rock and graduated from Central High, the first school in the South to be integrated: Eisenhower, the National Guard, mobs of white people screaming at nine black children, that’s Central. The school embraces its history and is now over half black. I’m no brave crusader, but race (and racism) were part of my education. So when, in researching the book, I unpacked three separate databases and found that in every one white people gave black people short shrift, I wasn’t shocked, you know? Asians and Latinos apply the same penalty to African Americans that white folks do, which says something about how even (relatively) recent additions to the “American experience” have acquired its biases.
Q) In Dataclysm you’re taking this flood of information and putting it to an entirely new use: understanding human nature. So what have you found?
I tried really hard to avoid the numerical dog and pony show. There are of course lots of interesting one-off factoids, but I mostly found what I (and probably you) have always known: that people are gentle, mean, stupid, lusty, lonely, kind, foolish, shrewd, shallow, and endlessly complex. Dataclysm’s central idea isn’t necessarily what we can see using big data; it’s the fact of the vision itself. That we can get real data on even the most private moments in people’s lives is an astounding thing. It’s like the second advent of reality television, but this time without the television part. Just the reality.
Q) Are you worried about any of this?
I have mixed feelings about the implications. I myself almost never tweet, post, or share anything about my personal life. At the same time, I’ve just spent three years writing about how interesting all this data is, and I cofounded OkCupid. My hope is that this ambivalence makes me a trustworthy guide through the thickets of technology and data. I admire the knowledge that social data can bring us; I also fear the consequences.
11th September 2014