This week I read a remarkable book by a 30-something named Seth Stephens-Davidowitz, Everybody Lies. Seth, as I shall call him for obvious reasons, has a graduate degree in economics and writes for the New York Times, but he is no longer mainly interested in purely economic issues. Instead he is interested in what “Big Data” (the enormous amounts of digital data that go along with our new way of life) can tell us about various aspects of our lives. Seth has a very quick mind, a flair for simple, punchy sentences, and a sense of humor. His work reminds me somewhat of the early work of Bill James, the founder of modern sabermetrics, or statistical baseball analysis, with whom he clearly feels some affinity. The book has a lot to say about different areas of American life, but I will focus on what he has to say about two sensitive topics, sex and politics. (Religion, apparently, will have to wait at least until his next book.)
Seth’s findings about sex illustrate the difficulties of matching public morality and real human behavior. Both the political right and the political left have very definite ideas nowadays about relations between the sexes (and between consenting adults of the same sex), but neither set of ideas seems to match what the bulk of the population is doing. Seth uses two main sources: statistics on Google searches, which we can all access, it turns out, using a readily available tool called Google Trends, and statistics he acquired from pornhub.com, one of the nation’s leading pornography sites. He points out that Google searches are especially useful because people will ask Google questions that they literally would not ask any human being face to face. (Thinking about this carefully, I realized I had done this more than once myself.) People lie on Facebook all the time, which is one reason for the title of the book, but their Google searches give an accurate idea of what they are worried about. That leads to interesting findings.
Using search data from pornhub.com, Seth estimates the gay male population of the country at 4-5% of all men, a much lower estimate than Alfred Kinsey’s 10%, which was probably skewed, he explains, by a sample that included a good many prison inmates. This is a very straightforward calculation: about 5% of all porn searches by men on pornhub ask for gay porn. The search data also suggests that there are roughly as many gay men in red states as in blue. Other data from Google, as we shall see, tends to confirm that there are far more men in the closet in red states than in blue.
What do people worry about concerning sex? Men, few will be surprised to learn, are very concerned about the size of their penises, and often wonder how they might be increased. Straight women, on the other hand, are much more likely to complain about excessively large penises than excessively small ones. Women’s concerns about their own appearance, meanwhile, have shifted. The posterior has replaced the breasts as the main source of anxiety during the last decade, and women are more likely to worry that their rear end isn’t big enough. I couldn’t help but wonder, although Seth never brings this up, whether this might have had something to do with one of the most visible women in the United States, the last First Lady, who was well endowed in this respect. A great many women are also worried about certain bodily odors, but I shall leave the details of those concerns to those who want to read the book for themselves. There is one very encouraging piece of data from porn searches that more people should take to heart. Just as the New Yorker Facebook movie page, where I now contribute, tells us that there is no movie so bad that it does not have at least a few fans out there, it turns out that there is no aspect of physical appearance, including obesity, that does not excite some measurable portion of the population. In fact, Seth has commented in an interview I read that if men and women were willing to use dating sites to search for what they really wanted, instead of what society tells them they should want, they would be a lot happier.
I was very struck, however, by the data Google provides on sex within relationships. Relatively few people seem to ask Google why their partners want so much sex; instead, they (and especially women) often ask why their partner seems to want so little. A surprisingly popular question among women is, “how do I tell if my husband is gay?” That question is especially popular in the deep red South, perhaps (see above) because there are quite a few closeted gay men there. I did get the feeling that the sexual revolution and new courtship customs for 20-somethings may have done a lot of harm to marital sex. People, especially better-educated people, now tend not to get married until the most intensely sexual phase of their relationship is over. I also wondered about the effect of infinite, free, readily available hard-core pornography on the sex lives of young Americans. It does not seem to have improved them, and indeed, I know from other sources that there is a measurable population now, including some women, who are so addicted to masturbation while watching it that they can no longer do the real thing.
For whatever reason, Seth has much less to say about what big data tells us about women’s sexuality, but he does report one very politically incorrect fact. I quote: “Fully 25 percent of female searches for straight porn emphasize the pain and/or humiliation of the woman: ‘painful anal crying,’ ‘public disgrace,’ and ‘extreme brutal gangbang,’ for example. Five percent look for nonconsensual sex (‘rape’ or ‘forced’ sex) even though those videos are banned on PornHub. And search rates for all these terms are at least twice as common among women as among men.” I will leave it to my readers to absorb that data on their own, except to say that those findings would not have surprised my favorite author on relationships and sex, the late Nancy Friday, whose compilations of female sexual fantasies include quite a few along those lines.
Now let us turn to politics—where Seth gives us only one major finding, but one which has really set me thinking.
That finding has to do with race in politics, and its apparent impact in each of the last three presidential elections. During the last year and a half we have all had occasion to ask whether racism is in fact a much more powerful force in the US than we had come to believe. The answer is yes.
Although Seth is an Xer, not a Boomer, he shares the Boomer view of 50 years ago that words do not kill, and does not sugar coat his message. I won’t either. The main instrument he uses to measure racism in the US is the number of searches using the word “nigger,” of which there are about seven million every year in the US. That is comparable to the number of searches for “economist” or “migraine.” Sadly, 150 years after the Civil War and 50 after the great civil rights acts, no other racial or ethnic prejudice in the US remotely compares to racism against black Americans. I quote: “Searches for ‘nigger jokes’ are seventeen times more common than searches for ‘kike jokes,’ ‘gook jokes,’ ‘spic jokes,’ ‘chink jokes,’ and ‘fag jokes’ combined.” [emphasis added.] Such searches tend to spike whenever black Americans are in the news, most notably, of course, during the presidential elections of 2008 and 2012. And when the searches were broken down geographically, a clear pattern emerged. There was a negative correlation in 2008 and 2012 between the number of searches for “nigger” in a given area and the vote for Barack Obama, and a positive correlation in 2016 between the number of those searches and the vote for Donald Trump. The areas with the most such searches included upstate New York, western Pennsylvania, eastern Ohio, West Virginia (all of them, actually, part of one contiguous region), industrial Michigan, and rural Illinois, as well as Appalachia and parts of Louisiana and Mississippi. Overall, comparing Obama’s vote in 2008 and 2012 to John Kerry’s in 2004, Seth estimated that racism had cost Obama about four percentage points in the national vote. When he and a collaborator first wrote an article presenting those results, quite a few academic journals refused to publish it, but one eventually did.
Now there are some possibly complicating factors which Seth might have paid more attention to. In these mostly white areas, Obama’s race cost him perhaps 4% of the national vote. But his race seems to have earned some additional votes elsewhere, not only from his fellow black Americans, but also from young people who were inspired to vote by his status as the first major non-white presidential candidate. Analysis of the 2016 vote has shown that many of those voters did not turn out to vote for Hillary Clinton. We are dealing here, it seems to me, with an effect of partisanship. A candidate who appeals to his or her party’s activists and seems out of the mainstream, such as Obama or Trump, will draw more votes from his own side, but may be less effective at winning votes from the other side. That dynamic, if it exists, will make it very difficult for any candidate to be elected with more than a narrow majority.
The other huge question this data raised in my mind, however, was the role of sexism in the 2016 election. I went to Google Trends myself, and I did searches for “Hillary Clinton” and “bitch.” Searches for Clinton’s name alone, clearly, would not distinguish supporters from opponents. Now the time line for those searches showed huge spikes in two recent years, 2008 and 2016, when Clinton was running for President. The 2008 spike took place in the early spring, at the height of the primary season, and the 2016 one took place in the fall. The largest number of 2016 hits occurred in some of the same places as the searches indicating racism: in Mississippi, Louisiana, Kentucky, Pennsylvania, and Indiana, but also in Wisconsin and Minnesota. But when I used another feature of the program and compared the numbers to searches for “migraines,” the term that had been about as frequent as the racist searches when Obama was running, they were much less numerous. A few other abusive search terms turned up even fewer hits. Unless I simply failed to think of the most revealing search, it seems that overt sexism is now much less of a factor in our politics than overt racism. Clinton seems to have lost critical states not because so many people hated her, but because too few people were moved to come out and vote for her.
A good deal of Everybody Lies is devoted to exploring uses and potential uses of big data and what it might do to us. Seth effectively describes one of its pitfalls, the curse of dimensionality, which has led to some misleading articles tying particular genes to particular human characteristics or diseases. Essentially, there are so many thousands of variables within our chromosomes that, given a large sample of human beings with a particular problem, chance alone will produce an apparent match or two. But those will be random matches with no predictive value for the future. Everybody Lies does not seem to have been very widely reviewed. I recommend it.
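For readers who want to see the dimensionality problem in action, here is a minimal sketch in Python. It is my own illustration, not anything taken from the book. It invents a purely random yes/no trait for a group of people and thousands of equally random yes/no "gene variants," then counts how many variants appear to be correlated with the trait. The function names, the sample sizes, and the 0.2 correlation cutoff are all assumptions chosen for illustration, and every number is a simulated coin flip, not real genetic data.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def count_chance_matches(n_people=200, n_variables=5000, threshold=0.2, seed=0):
    """Count random variables whose correlation with a random trait exceeds the cutoff.

    Nothing here is related to anything else by construction, so every
    "match" that is counted is a product of chance alone.
    """
    rng = random.Random(seed)
    # Hypothetical trait: each person has it or not, decided by a coin flip.
    trait = [int(rng.random() < 0.5) for _ in range(n_people)]
    hits = 0
    for _ in range(n_variables):
        # Hypothetical gene variant: also a coin flip, independent of the trait.
        variant = [int(rng.random() < 0.5) for _ in range(n_people)]
        if abs(pearson(variant, trait)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    print("apparent matches produced by chance:", count_chance_matches())
```

With these settings a few of the 5,000 meaningless variables should clear the cutoff, and raising n_variables produces more of them, which is the heart of the problem: the more things you test, the more false matches you find.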