Monday 14 October 2013

How'd they reckon this?

Americans surveyed in 2011 substantially overestimated the proportion of Americans identifying as homosexual. Where most estimates reckon about 3.5% of the population are homosexual, Americans surveyed thought that somewhere between 20% and 25% of the population are gay or lesbian.

Some candidate hypotheses for the overestimation:
  1. Availability bias where observations of people you know carry less weight than observations from TV shows or movies: if people take pop culture as more representative of average reality than their own personal circumstances, and if homosexual characters are over-represented on TV, then this could do it. 
    1. In that case, we would expect overestimation particularly among lower-IQ cohorts. 
    2. This alone shouldn't account for it: how many popular TV series other than Modern Family have at least 20% gay characters?
  2. Availability estimation of proportions where individuals of different characteristics are more or less likely to have friends or acquaintances who are gay. This would predict dispersion of estimates but shouldn't affect estimates of the population mean unless it's combined with downward bias in the number of people you know. If you're asked "What proportion of the population is gay or lesbian", and you think about how many homosexuals you know, and you then underestimate the number of heterosexuals you know, you'd bias your estimate upwards. I still can't see how that gets you to a 6-times overestimate.
  3. Ideology doesn't give clear-cut predictions, or at least not to me. You could build a story where social conservatives' fear of the 'gay agenda' is driven by their overestimation of that group's proportion in the population, or you could build an equally plausible story where social conservatives' dismissal of gay rights is founded on the view that the needs of a tiny proportion of the population should not drive changes in the definition of institutions that have persisted for thousands of years.
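To put a rough number on hypothesis #2, here is a back-of-envelope sketch. It assumes, purely for illustration, that a respondent's circle of acquaintances mirrors the population (3.5% gay or lesbian), that every gay acquaintance is recalled, and that only some fraction of heterosexual acquaintances comes to mind. The function names and the recall model are my own illustrative assumptions, not anything from the Gallup data.

```python
def recalled_share(true_share: float, recall_hetero: float) -> float:
    """Share of gay acquaintances among those a respondent recalls.

    Assumption: all gay acquaintances are recalled, but only a fraction
    `recall_hetero` of heterosexual acquaintances are.
    """
    gay = true_share
    hetero = (1.0 - true_share) * recall_hetero
    return gay / (gay + hetero)


def recall_needed(true_share: float, perceived_share: float) -> float:
    """Fraction of heterosexual acquaintances that must be recalled for the
    recalled share to equal `perceived_share` (inverse of recalled_share)."""
    return true_share * (1.0 - perceived_share) / (
        perceived_share * (1.0 - true_share)
    )


TRUE_SHARE = 0.035  # the roughly 3.5% figure cited in the post

for perceived in (0.20, 0.25):
    needed = recall_needed(TRUE_SHARE, perceived)
    print(f"perceived {perceived:.0%}: recall {needed:.1%} of heterosexual acquaintances")
```

Under those assumptions, a respondent would have to bring to mind only about 11% to 15% of the heterosexuals they know to land at 20-25%, which is why undercounting alone looks like a stretch.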
I'm putting most weight on #1. Gallup provided some population cross-tabs that can help:

Everyone overestimates, substantially. It's so far out of whack with reality that you wonder whether it's just a wonky survey. But the numbers are apparently consistent with the overestimates in a similar 2001 survey. 

Smarter and richer people, and men, have far more accurate estimates - this isn't out of line with fairly standard findings on other kinds of knowledge. Older cohorts were more accurate. 

Republicans, conservatives, and social conservatives were more accurate, which is inconsistent with the ideological hypothesis that "gay terror" would lead to overestimating the proportion of homosexuals in the population. And while "we shouldn't change everything for a small minority" would be consistent with social conservatives having a lower estimate than social liberals, which is true, it is not consistent with social conservatives still overestimating the population proportion more than five times over. 

Intriguingly, while social liberals overestimate population proportions by more, those favouring bans on gay and lesbian relations give higher estimates than those believing that gay and lesbian relations should be legal. This is likely (hopefully) an artefact of very small proportions of the population believing that gay and lesbian relations (not marriage, but relations) should not be legal. 

The data seems to give weak support to my candidate hypothesis #1, though it is completely indistinguishable from a dozen potential alternative hypotheses about intelligence, education, and accuracy in estimating things. It would be interesting to partial out the effects of education, age, gender, income, partisanship and ideology; alas, they give cross-tabs instead of regression coefficients.

Suppose that you favour gay rights, as I do. Would accurate perceptions of population proportions tend to increase or decrease support for gay rights? The estimate among those favouring same-sex marriage is just a titch higher than that among those opposing it, but at the same time college grads and postgrads have a smaller degree of overestimation and, I would expect, are more likely to support same-sex marriage. Only the partial derivative of the overestimate on the likelihood of supporting same-sex marriage in a probit would tell for sure.
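As a sketch of what that probit would look like (the notation is mine, and the control set is simply the list of cross-tab variables mentioned above): let $S_i = 1$ if respondent $i$ supports same-sex marriage, let $e_i$ be the respondent's estimate of the gay and lesbian population share, and let $x_i$ collect education, age, gender, income, partisanship and ideology.

```latex
\Pr(S_i = 1 \mid e_i, x_i) = \Phi\!\left(\alpha + \beta e_i + x_i'\gamma\right),
\qquad
\frac{\partial \Pr(S_i = 1 \mid e_i, x_i)}{\partial e_i}
  = \phi\!\left(\alpha + \beta e_i + x_i'\gamma\right)\beta .
```

Here $\Phi$ and $\phi$ are the standard normal CDF and density. Since $\phi > 0$, the sign of $\beta$ answers the question: it says whether a higher estimate goes with more or less support once education and the rest are held fixed, which the raw cross-tabs cannot separate.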

Update: Chris Auld very helpfully points to work suggesting that correcting for under-reporting could roughly double the number. The sample in the paper is not representative, so we shouldn't extrapolate from their reported levels, but the magnitude of under-reporting is plausible. But even if under-reporting got us all the way to 10% in the full sample (7% seems more likely), that's still miles away from 20%.
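The arithmetic behind "still miles away" can be checked quickly. This is only a sketch using the figures quoted in this post (3.5% reported, 7% if under-reporting doubles it, 10% as a generous upper bound, against perceived shares of 20-25%); the function name is my own.

```python
def overestimate_factor(perceived: float, true_share: float) -> float:
    """How many times larger the perceived share is than the assumed true share."""
    return perceived / true_share


PERCEIVED = (0.20, 0.25)  # range of survey respondents' estimates

# Assumed true shares: as reported, doubled for under-reporting, generous bound.
SCENARIOS = {
    "as reported (3.5%)": 0.035,
    "doubled (7%)": 0.07,
    "generous bound (10%)": 0.10,
}

for label, true_share in SCENARIOS.items():
    low, high = (overestimate_factor(p, true_share) for p in PERCEIVED)
    print(f"{label}: overestimated {low:.1f}x to {high:.1f}x")
```

Even under the generous 10% scenario, the survey estimates are at least double the assumed true share.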


  1. A wild guess: The way the question is worded ("Just your best guess, what percent of Americans today would you say are gay or lesbian?") might make some people answer with reference to people they do not consider "real men" or "real women" (as in "Justin Bieber is totally gay"). I wonder, however, what people who know U.S. usage better than I do think about this.

    As for the availability heuristic, the fact that people in the news are sometimes explicitly identified as homosexual, but rarely as heterosexual, may help explain the finding. This would be a specific instance of the general tendency of media consumption to make you believe that rare phenomena are more common than they are, because journalists like reporting stuff that's out of the ordinary (again, that's just a plausible hypothesis; I've got nothing to back this up).

  2. I wonder whether the population stats are "identify as" while the popular stats are "what proportion has ever even had homosexual thoughts because that makes you gay too".

  3. Framing effects may explain some of the overestimation. The question gives respondents six ranges, the lowest of which is "less than 5%", the highest "more than 25%." There's a lot of bunching in the middle category (10-15%). But I have no good explanation for all the people piled up on "more than 25%." See page 2 of-

    It may also partly just reflect innumeracy (it's bad form to ask for percentages on surveys), which is also consistent with the most overestimation being among the least educated.

    Tangentially relevant interesting paper:

  4. Awesome, thanks!

    I'm keeping an eye out for all this stuff as I've a Masters student finishing up a thesis on the lesbian wage premium and to what extent lesbians' lower fertility rates (forward-looking measure, not current # of kids) affects that wage premium.

  5. That's what I was thinking.

    If people believe the 10% figure (which is an overestimate, but not ridiculously so) for 'identify as', they could easily believe 'not completely 100% straight' as 20-30%.

  6. Also interesting for comparison is the proportion of people, 60%, who say they personally know someone who is gay, also from Gallup:

  7. Combine the two surveys and you get a very strange model of what must go on in the heads of some folks in estimating population proportions. "Nobody I know is gay, but the big cities must be like 50%."

  8. Which could actually make sense for racial minorities, since parts of the US are that segregated, but not for something so weakly heritable.

  9. Think of it more via migration: liberals, of all orientations, move to the cities, for cultural reasons; homosexuals, whether liberal or conservative, move to the big cities for a greater potential pool of partners - thin markets in small towns aren't great.

  10. Yes, but their families would still know them. You'd need mass memory-suppression about gay kids as well as migration.

  11. True. Wonder how many don't come out 'till they've left town so only their families know.

  12. I followed the link in the Gallup report to this

    which shows they aren't good at estimating the percentage of the US population that is Black or Hispanic (compared to the census). Way over again. Does the fact that it happens in this case too tell us anything else about what might be going wrong?