Thursday 15 July 2010

The lies folks tell

OK Cupid turns its statistical eye to the lies its members tell. Of course, it's also possible that the site just draws its members from the taller and higher-income parts of the overall distribution, but that seems unlikely.

Hit the link above for the pretty graphs; here's a summary:
  • Self-reported heights follow the US distribution, but with a level shift of two inches. Self-reported height also correlates reasonably well with the number of unsolicited messages received: a 6'1" male gets about 1 unsolicited message per week; a 5'4" male gets one every two weeks. Women over 5'10" see a sharp decline in unsolicited messages. Recall of course that unsolicited messages here are date requests, not spam.
  • Reported incomes are about 20% higher than you'd expect given age, gender, and zip code. Men overstate by more than women, but not by much more. Young folks don't overstate much; older folks, by a lot. All of this also lines up well with the frequency of unsolicited messages: after age 22, there's a sharp drop-off in unsolicited messages to low-income folks.
  • "Hot" pictures are more likely to be out of date than average pictures: the median picture rated average is 2 months old; the median "hot" picture is 5 months old. 80% of average pictures are no more than a year old; 80% of "hot" pictures are no more than 2 years old.
  • 80% of self-reported bisexuals are interested in only one gender, but women so-reporting are more likely than men so-reporting to send messages to both men and women. Half of young bisexual males send messages only to men; half of old bisexual males send messages only to women. There's no age trend among women.
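
For anyone wondering how you'd tell a level shift apart from a site that simply attracts taller people, here's a rough sketch on simulated data. Everything in it is an assumption for illustration: the mean and standard deviation of the hypothetical height distribution, the cutoff used for the selection scenario, and all variable names. None of it comes from OK Cupid's data. The idea is only that adding two inches to everyone moves the mean and leaves the spread alone, while drawing from the upper part of the distribution also squeezes the spread.

```python
import random
import statistics

random.seed(0)

# Assumed, illustrative parameters (not taken from the OK Cupid post).
POP_MEAN, POP_SD = 70.0, 3.0   # hypothetical male heights, in inches
N = 20_000

population = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]

# Scenario A: everyone reports two inches more than the truth (a level shift).
shifted = [h + 2.0 for h in population]

# Scenario B: everyone is honest, but only people above an arbitrary
# cutoff join the site (selection from the taller part of the distribution).
CUTOFF = 69.4  # arbitrary value chosen for illustration
selected = [h for h in population if h > CUTOFF]

mean_pop, sd_pop = statistics.mean(population), statistics.stdev(population)
mean_shift, sd_shift = statistics.mean(shifted), statistics.stdev(shifted)
mean_sel, sd_sel = statistics.mean(selected), statistics.stdev(selected)

print(f"population: mean={mean_pop:.2f} sd={sd_pop:.2f}")
print(f"shifted:    mean={mean_shift:.2f} sd={sd_shift:.2f}")
print(f"selected:   mean={mean_sel:.2f} sd={sd_sel:.2f}")
```

Both scenarios raise the mean, but only the level shift keeps the standard deviation the same as the population's. Reported heights that follow the overall distribution with the same shape, just moved over, point toward the level shift rather than selection.
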
I continue to await the first PhD dissertation written making use of OK Cupid data.

3 comments:

  1. "I continue to await the first PhD dissertation written making use of OK Cupid data"

    Try to convince one of your students to do it for sure.
  2. I wonder whether they'd ever make their data available. I'd expect not absent some pretty serious privacy disclaimers.
  3. I'm sure that it would be possible to get them to release data under privacy agreements AND giving them a substantial chunk of the IP :D