Wednesday 31 March 2010

Dating Data

I'd pointed previously to some of the excellent data analysis work going on over at OkTrends: the data blog of internet dating site OK Cupid.

They're now taking on ideology. They seem to have a version of the World's Smallest Political Quiz running: they have ideology data on 172,853 people. THEY HAVE IDEOLOGY DATA ON 172,853 PEOPLE. The heart races!

They use the data to plot out the age-path of average position in two-dimensional ideological space (libertarian from 18-22, leftie from 23-31, rightie from 32-53, then authoritarian thereafter), match it with the relative importance of economic and social issues by age to get correlates of partisanship, then show some effects on dating. On the latter, economic (or social) conservatives [note: US terminology] tend to have high agreement with other economic (or social) conservatives on non-political issues; economic (or social) liberals, not so much. So it's easy for OK Cupid to make matches among conservatives, but harder to get good matches for liberals. I'd expect this to be due to a stronger religious dimension to social conservative views that yields common answers on other questions; if economic and social conservatism are correlated, then we'd see (and do see) similar but slightly weaker results for economic conservatives.

I'll leech one of their graphs: go to their post for the full set.

I'd love to see a variant on their age-slider graph that would show densities across the space rather than just average position.
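A toy sketch of why densities would add something to the average: the scores below are entirely made up (not OK Cupid's data), and `mean_position` and `grid_density` are hypothetical helper names. With two equal clusters on opposite sides of the economic axis, the average lands in a cell of the plane where nobody actually sits.

```python
from collections import Counter

CELL = 2.0  # width of each square cell in the (economic, social) plane


def mean_position(points):
    """Average (economic, social) score across respondents."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def cell_of(point, cell=CELL):
    """Index of the grid cell containing a point."""
    x, y = point
    return (int(x // cell), int(y // cell))


def grid_density(points, cell=CELL):
    """Count respondents per grid cell; a crude stand-in for a density plot."""
    return Counter(cell_of(p, cell) for p in points)


# Made-up scores: one cluster on the economic left, one on the right.
points = [
    (-5.0, 1.0), (-4.5, 0.5), (-5.5, 1.5),
    (5.0, 1.0), (4.5, 0.5), (5.5, 1.5),
]

mean = mean_position(points)
density = grid_density(points)

# Number of respondents in the cell where the average falls.
at_mean = density[cell_of(mean)]
```

Here the average sits at the centre of the economic axis while every respondent is in one of the two outlying cells, which is exactly the kind of thing a density view would reveal and an average-position slider would hide.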

I also wonder how badly sample selection issues come in for the 30 and up bracket. Presumably folks drop from OK Cupid when they marry and only re-enter if they divorce (or are at high risk of divorce); the sample of those in their 50s who are dating is going to be different from the full sample.

Massive props to OK Cupid for sharing some of this summary data. Awesome work, guys!


  1. Very interesting stuff. I guess I haven't grown up yet. :P

  2. Wow, you are right about the depth of that post. Now I'm trying to figure out whether the work on that blog exists as an advertising medium, or it's just someone who loves graphs.

  3. There's at least a few PhDs in the data they're sitting on.