Friday 17 February 2012

Trusting econometrics

One of my profs at Mason told the story of how he'd been offered a new boat if he could get the coefficient in a regression to be below two - which would have allowed a merger to proceed. He turned it down, but not everybody does. Unfortunately, in a whole pile of empirical work, you either have to really trust the guy doing the study, or make sure that his data's available for anybody to run robustness checks, or check that a bunch of people have found kinda the same thing. Degrees of freedom available in setting the specifications can sometimes let you pick your conclusion, like getting a coefficient that hits the right parameter value or the right t-stat.

David Levy and Susan Feigenbaum worried a lot about this in "The Technological Obsolescence of Scientific Fraud". Where investigators have preferences over outcomes, it's possible to achieve those outcomes through appropriate use of identifying restrictions or method - especially since there are lots of line calls in which techniques to use in different cases. They note that outright fraud makes results non-replicable while biased research winds up instead being fragile - the relationships break down when people change the set of covariates, or the time period, or the technique.
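A rough sketch of what that fragility looks like, under assumptions of my own rather than anything from Levy and Feigenbaum: x and y are pure noise, the "specification choice" is just which 40-period window of a 120-period sample to report, and |t| > 2 counts as a finding. The function names, window sizes, and seed are all made up for illustration.

```python
import math
import random


def slope_t_stat(x, y):
    """t-statistic on the slope from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    # Residual variance uses n - 2 degrees of freedom (intercept and slope).
    rss = sum((yi - mean_y - slope * (xi - mean_x)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


def search_rates(trials=1000, periods=120, window=40, step=10, seed=2012):
    """Share of pure-noise datasets with |t| > 2: full sample vs. best window."""
    rng = random.Random(seed)
    full_hits = 0
    searched_hits = 0
    for _ in range(trials):
        # x and y are independent by construction, so the true slope is zero.
        x = [rng.gauss(0, 1) for _ in range(periods)]
        y = [rng.gauss(0, 1) for _ in range(periods)]
        if abs(slope_t_stat(x, y)) > 2:
            full_hits += 1
        # The searching researcher tries every window and keeps the best one.
        starts = range(0, periods - window + 1, step)
        best = max(abs(slope_t_stat(x[s:s + window], y[s:s + window])) for s in starts)
        if best > 2:
            searched_hits += 1
    return full_hits / trials, searched_hits / trials


if __name__ == "__main__":
    full_rate, searched_rate = search_rates()
    print(f"full sample: {full_rate:.3f}  best window: {searched_rate:.3f}")
```

The full-sample rate should sit near the nominal five percent, while the best-window rate comes out well above it. The "significant" window is also exactly the kind of result that falls apart when somebody else changes the time period.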

Note that none of this has to come through financial corruption either: simple publish-or-perish incentives are enough where journals are more interested in significant findings than in insignificant ones; DeLong and Lang jumped up and down about this twenty years ago. Ed Leamer made similar points even earlier (recent podcast). And then there's all the work by McCloskey.

Thomas Lumley today points to a nice piece in Psychological Science demonstrating the point.
"In this article, we accomplish two things. First, we show that despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process."
Degrees of freedom available to the researcher make it "unacceptably easy to publish 'statistically significant' evidence consistent with any hypothesis." They demonstrate it by "proving" statistically that hearing "When I'm Sixty-Four" rather than a control song made people a year-and-a-half younger.
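For a feel of how quickly the nominal five percent inflates, here's a minimal simulation. It is not the authors' code, and the setup is my own assumption: there's no true effect, the researcher has three independent outcome measures to pick from (the paper's dependent variables were correlated), and if nothing is significant at 20 observations per group, ten more get collected and everything is tested again. A z-test with known unit variance stands in for the usual t-test to keep it stdlib-only.

```python
import math
import random


def p_value(a, b):
    """Two-sided p-value for a difference in means, with variance known to be 1."""
    std_err = math.sqrt(1 / len(a) + 1 / len(b))
    z = (sum(a) / len(a) - sum(b) / len(b)) / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def draw(rng, n):
    """n standard-normal observations: no treatment effect anywhere."""
    return [rng.gauss(0, 1) for _ in range(n)]


def study_finds_effect(rng, flexible, n=20, extra=10, outcomes=3):
    """Run one study where the true effect is zero; True means p < .05 got reported."""
    treated = [draw(rng, n) for _ in range(outcomes)]
    control = [draw(rng, n) for _ in range(outcomes)]
    if not flexible:
        # Pre-committed analysis: one outcome, one sample size, one test.
        return p_value(treated[0], control[0]) < 0.05
    # Freedom 1: report whichever outcome measure comes up significant.
    if any(p_value(t, c) < 0.05 for t, c in zip(treated, control)):
        return True
    # Freedom 2: nothing yet, so collect more observations and test again.
    for t, c in zip(treated, control):
        t.extend(draw(rng, extra))
        c.extend(draw(rng, extra))
    return any(p_value(t, c) < 0.05 for t, c in zip(treated, control))


def false_positive_rate(flexible, trials=4000, seed=64):
    """Share of null studies that end up reporting a significant effect."""
    rng = random.Random(seed)
    return sum(study_finds_effect(rng, flexible) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("pre-committed:", false_positive_rate(False))
    print("flexible:     ", false_positive_rate(True))
```

The pre-committed researcher lands near five percent, as advertised. The flexible one, who never fabricates anything and only makes two defensible-looking choices, reports a false effect a good deal more often.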

The lesson isn't radical skepticism of all statistical results, but rather a caution against overreliance on any one finding and an argument for discounting findings coming from folks whose work proves to be fragile.


  1. "The lesson isn't radical skepticism of all statistical results, but rather a caution against overreliance on any one finding and an argument for discounting findings coming from folks whose work proves to be fragile."

    Indeed. However, I think the realisation of just how important unobservables are in most circumstances - and how it is often difficult to deal with them - has made me discount empirical results more over time.

  2. Christian Bjørnskov, Sun Feb 19, 01:13:00 am GMT+13

    "none of this has to come through financial corruption" - of course. But it's not just publish-or-perish, but also personal beliefs. The literature on aid effectiveness is a great case. Doucouliagos and Paldam, in their meta-analysis of the 140 published studies in the field, for example find that people working with or financed by donor organisations are way more likely to 'find' that foreign aid works. Virtually no studies by really independent scholars find any positive effects of aid.

    1. Doucouliagos is great for poking holes in empirical literatures, isn't he?