Tuesday 20 September 2011

Hyslop and Stillman [updated]

Dean Hyslop and Steve Stillman have updated their prior work on the youth minimum wage in New Zealand to look at the most recent changes.

Here's the briefest synopsis of why I think we find divergent results on unemployment. Where I have everywhere been using the unemployment rate - the fraction of those in the labour force who are unable to find work - they are instead using the percentage unemployed - the fraction of the population cohort who are unable to find work, regardless of what proportion of that population wishes to be in work. As the labour force participation rate among sixteen and seventeen year olds over the period did not drop as quickly as did employment, the unemployment rate increased greatly relative to the percentage unemployed. The two measures answer very different questions. But skip straight to the end for the graphs showing this.

Recall that their prior study found no particularly bad outcomes consequent to the year 2000 changes to the youth minimum wage that brought 18 and 19 year olds up to the adult rate, despite some evidence of employment decreases among that group by 2003.

In the current study, they find that bringing 16 and 17 year olds up to the adult minimum wage resulted in substantial decreases in employment - they say 20-40% of the drop in employment among that age cohort, or between 4,500 and 9,000 job losses, can be chalked up to the regulatory change. But they argue this had no significant effect on the percentage of 16 and 17 year olds who are unemployed because most of the employment losses were among students combining study and part time work. They've a rather more complicated econometric model than the simple one I've been using; my simple one finds substantial increases in unemployment among 16 and 17 year olds as well as decreases in employment.

First, a quick tour through the main results I've been finding and posting here on the blog before going through Hyslop and Stillman's.

Until very recently, I was using HLFS data on the 15-19 year old cohort for youth unemployment; I hadn't access to more finely grained data. But, StatsNZ kindly sent over data splitting each age group in that cohort. Here's what the unemployment numbers look like.

The red line hits at 2008Q2 - the first quarter in which 16 & 17 year olds are subject to the same minimum wage as that facing workers in all older cohorts. The blue line traces the unemployment rate for that group. Do note that the gap between the blue and red lines - divergent outcomes between 16 & 17 year olds and 18 & 19 year olds - only became persistently large starting around 2010Q3. Since that quarter, 16 and 17 year olds' unemployment rate has been ten points larger than that experienced by 18 and 19 year olds; the largest gap prior to 2008Q2 was about eight points in 1986. This will matter later when we look at the period of analysis in Hyslop and Stillman's paper. Note also that, according to the numbers Stats NZ gave me, the current unemployment rate for 16 & 17 year olds is higher than 30%.

What about employment rates? 

Youth employment rates tank after 2008Q2. Some of this is just the recession. But note how little the adult employment rate has moved compared to that for those aged 16 and 17. 

The very very simple model I've been running has taken unemployment outcomes for youths as a function of adult unemployment rates and the square of adult unemployment rates. I estimate the model over the period from 1986 through and including first quarter 2008. After that point, sixteen and seventeen year olds become subject to the adult minimum wage. I then ask Stata to predict the youth unemployment rate given the adult unemployment rate, both for the period of estimation and for the post-estimation period. The gap between the estimated and the actual unemployment rate is the residual. I do the same again for employment rates.
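As a rough illustration of the procedure just described, here is a minimal Python sketch. The analysis in this post was run in Stata on HLFS data; the series below are synthetic, and every number in them (the period lengths, the coefficients, the 8 point post-change shift) is invented purely so the example runs.

```python
import numpy as np

# Synthetic quarterly data, invented for illustration only (NOT HLFS figures).
rng = np.random.default_rng(0)
n_pre, n_post = 89, 13   # hypothetical: quarters up to 2008Q1, then 2008Q2 onwards
adult = np.concatenate([rng.uniform(3.0, 9.0, n_pre), rng.uniform(4.0, 6.0, n_post)])

# Youth rate follows a quadratic in the adult rate, plus noise...
youth = 5.0 + 2.0 * adult + 0.05 * adult**2 + rng.normal(0.0, 0.5, n_pre + n_post)
# ...plus a made-up 8 point shift after the policy change.
youth[n_pre:] += 8.0

def design(x):
    """Regressors: constant, adult rate, adult rate squared."""
    return np.column_stack([np.ones_like(x), x, x**2])

# Estimate on the pre-change period only.
beta, *_ = np.linalg.lstsq(design(adult[:n_pre]), youth[:n_pre], rcond=None)

# Predict for every quarter, in and out of sample; residual = actual - predicted.
residual = youth - design(adult) @ beta

print("mean in-sample residual:     %.2f" % residual[:n_pre].mean())
print("mean out-of-sample residual: %.2f" % residual[n_pre:].mean())
```

The in-sample residuals average to zero by construction (the regression includes a constant), so the quantity of interest is how far the out-of-sample residuals sit from zero.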

Now there can be a few problems with this kind of very simple model. First off, out-of-sample prediction is always a bit of a mess; we need to check that the method isn't throwing up spurious results. I do this by taking, in turn, each age cohort's unemployment rate as the dependent variable and putting the "everybody except for that cohort" unemployment rate (and its square) over on the right hand side. If the predicted unemployment rate diverges wildly from that observed for the post-2008 period across the board, then I have a problem with my method. If the prediction only goes haywire for the group affected by the minimum wage changes, that lends weight to my method. If the prediction goes most haywire for the 16-17 year olds, with the gap between actual and predicted rates rising less for 18-19 year olds and less again for 20-24 year olds, that suggests, to me, that two things are going on: the move to the adult minimum wage has worsened unemployment outcomes for the 16-17 cohort, and groups with the highest proportion of members on the minimum wage have worse outcomes when the recession hits late in 2008. While 18 and 19 year olds have been subject to the adult minimum wage since 2001, overall unemployment rates were very low through most of the 2000s. Once unemployment rose, the previously non-binding minimum wage on 18-19 year olds became binding.
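The check described above can be sketched along the following lines. This is a toy version in Python, not the Stata code behind this post: the cohort labour force sizes, cyclical sensitivities and the 8 point shift applied to the 16-17 cohort are all assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
cohorts = ["16-17", "18-19", "20-24", "25+"]
n_pre, n_post = 89, 13            # hypothetical period lengths, in quarters
T = n_pre + n_post

# Made-up labour force sizes (thousands) and cyclical sensitivities; NOT HLFS data.
labour_force = np.tile([60.0, 90.0, 250.0, 1800.0], (T, 1))
cycle = np.concatenate([rng.uniform(3.0, 9.0, n_pre), rng.uniform(4.0, 7.0, n_post)])
level = np.array([6.0, 4.0, 2.0, 0.0])
slope = np.array([2.0, 1.6, 1.3, 1.0])
rate = level + np.outer(cycle, slope) + rng.normal(0.0, 0.4, (T, 4))
rate[n_pre:, 0] += 8.0            # invented post-change shift for 16-17 year olds only
unemployed = rate / 100.0 * labour_force

def everyone_else_rate(k):
    """Unemployment rate (%) pooled over every cohort except cohort k."""
    others = [j for j in range(len(cohorts)) if j != k]
    return 100.0 * unemployed[:, others].sum(axis=1) / labour_force[:, others].sum(axis=1)

def post_residual(k):
    """Fit on the pre-change quarters; return the mean out-of-sample residual."""
    x = everyone_else_rate(k)
    X = np.column_stack([np.ones(T), x, x**2])
    beta, *_ = np.linalg.lstsq(X[:n_pre], rate[:n_pre, k], rcond=None)
    return (rate[:, k] - X @ beta)[n_pre:].mean()

placebo = {name: post_residual(k) for k, name in enumerate(cohorts)}
for name, r in placebo.items():
    print(f"{name:>5}: mean post-change residual {r:+.2f}")
```

One thing the toy version makes visible: the shifted cohort is part of every other cohort's "everyone else" rate, so the other cohorts' residuals need not be exactly zero even when nothing has happened to them.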

What happens when I check? Here's a plot of the residuals for each age cohort. The red line marks the start of the out-of-sample prediction period - 2008Q2 onwards. The blue line that reaches for the sky is the residual on the 16-17 year old unemployment rate. The red line that also tracks upward, albeit not dramatically, is the residual on the unemployment rate for 18-19 year olds. There's a slight increase in the residual for 20-24 year olds. If the blue and red lines weren't there, you would really not be able to tell that the red line marked the start of an out-of-sample prediction. So I'm pretty sure that the method I'm using isn't throwing up artefacts. 

Hyslop and Stillman use the unemployment rate among 20-21 year olds as the basis for their difference-in-difference estimation technique; I'm using the unemployment rate among everyone who isn't 16-17. Is that what's driving differences? No. Or, at least, I don't think so. I'm not sitting in a StatsNZ Data Centre, as I expect Dean Hyslop was for quite some time while putting this study together, and so I don't have access to data on the unemployment rate facing 20 and 21 year olds. But I can run a set of other potential baselines for the simple regressions: the unemployment rate among everyone who isn't 16 or 17, the unemployment rate among everyone over the age of 19, the unemployment rate among 20-24 year olds, and the unemployment rate among 18-19 year olds. They all track pretty similarly, though the residuals are smaller in the post-2008 period when I use younger reference cohorts.

It's really not going to matter much which non-youth unemployment rate I use to predict the unemployment rate experienced by 16 and 17 year olds.

It's also worth noting that my simple technique is, nevertheless, a difference-in-difference technique. I'm looking at what happens to the youth unemployment rate relative to the adult rate (or various older cohort rates) subsequent to a policy change particularly affecting 16 and 17 year olds.

What happens when I do all the same fooferah for employment rates rather than unemployment rates? Recall that employment rates aren't just the inverse of unemployment rates; rather, the denominator is cohort population including those outside of the labour force while the unemployment rate counts only those in the labour force in the denominator. Well, here the choice of comparison group starts to matter. Here are the residuals:

Here, when I use employment rates among everyone else or among adults as baseline, relative employment rate outcomes for youths are worse in the post-2008 period than when I'm using younger cohorts as baseline. Either way, though, we get big declines in employment rates among 16 and 17 year olds, even relative to 18 and 19 year olds, in the period from 2008Q2 onwards.

So all my cards are on the table. Here's my .do file. And here's my .dta file. I don't think Hyslop and Stillman can put theirs up since they're using confidential HLFS individual-level data.

What do Hyslop and Stillman do? Instead of running a cohort's unemployment rate as the dependent variable the way I have, they set things up as a panel. Then, the unit of observation is the cohort-quarter with one observation for 16-17 year olds, one for 18-19 year olds, one for 20-21 year olds, and observations on others used to get business cycle effects. They then run panel techniques with age fixed effects, quarter fixed effects, and an indicator variable for whether the cohort was subject to the adult minimum wage. That's a lot of fixed effects to be throwing around when there are only twelve quarters of treatment period in their study. [No it isn't. They're using individual level data on thousands and thousands of individuals.]
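For readers who haven't seen this kind of setup, here is a stylised Python sketch of a cohort-quarter panel with age fixed effects, quarter fixed effects and an adult-minimum-wage indicator. It is emphatically not Hyslop and Stillman's specification or data (they work with confidential individual-level HLFS records); the cohort levels, common shocks, window lengths and the "true" effect of -3 points are all invented so that the regression has something to recover.

```python
import numpy as np

rng = np.random.default_rng(2)
cohorts = ["16-17", "18-19", "20-21"]
n_pre, n_post = 12, 12                        # hypothetical window lengths, in quarters
T, K = n_pre + n_post, len(cohorts)
true_effect = -3.0                            # invented effect on the outcome, in points

age_effect = np.array([40.0, 60.0, 70.0])     # made-up cohort levels
quarter_effect = rng.normal(0.0, 2.0, T)      # made-up shocks common to all cohorts

# Long format: one row per cohort-quarter cell.
cohort_id = np.repeat(np.arange(K), T)
quarter_id = np.tile(np.arange(T), K)

# Indicator: is this cohort subject to the adult minimum wage in this quarter?
# Older cohorts always are; 16-17 year olds only from the policy change onwards.
adult_mw = ((cohort_id > 0) | (quarter_id >= n_pre)).astype(float)

y = (age_effect[cohort_id] + quarter_effect[quarter_id]
     + true_effect * adult_mw + rng.normal(0.0, 0.3, K * T))

# Design matrix: constant, age dummies, quarter dummies (one of each dropped), indicator.
age_dummies = (cohort_id[:, None] == np.arange(1, K)).astype(float)
quarter_dummies = (quarter_id[:, None] == np.arange(1, T)).astype(float)
X = np.column_stack([np.ones(K * T), age_dummies, quarter_dummies, adult_mw])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
estimated_effect = beta[-1]
print(f"estimated effect of the adult minimum wage indicator: {estimated_effect:+.2f}")
```

The indicator's coefficient is identified only by the 16-17 cohort switching status part way through: the quarter dummies soak up whatever hits all cohorts at once, and the age dummies soak up permanent differences between cohorts.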

But, as best I can tell, Hyslop and Stillman aren't testing the unemployment rate in any of their work. They're testing the fraction of unemployed in the cohort population. Those are not the same thing. The unemployment rate takes as denominator the number of people of the age cohort that are in the labour force. They're instead using the ratio of the number of cohort unemployed to the total number of people in that cohort. The difference matters a lot. Here's a short plot of the two series.

The unemployment rate among 16 and 17 year olds spiked massively after 2008Q2 but the cohort's percentage of unemployed persons did not climb very much. Honestly, the only way I noticed that they were using the percentage unemployed rather than the unemployment rate was that the summary stats reported at page 10 were so far out of line with the dataset I've been using. They report an increase in the percentage unemployed from 8.1% to 13.5%; meanwhile, the unemployment rate increases from 14% to 27% over the same period. How do we get the divergent series? The labour force participation rate among 16 and 17 year olds had to have been dropping less quickly than was the number of kids in employment.
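The link between the two measures is an identity: percentage unemployed = unemployment rate × labour force participation rate. A back-of-the-envelope Python sketch using only the four rounded figures quoted above shows what they imply. The participation and employment rates it produces are approximations derived here, not numbers reported by Hyslop and Stillman or by StatsNZ, and they assume the two pairs of figures describe the same cohort over the same period.

```python
def implied_rates(pct_unemployed, unemployment_rate):
    """Back out participation and employment rates (both % of cohort population).

    pct_unemployed    = unemployed / population   * 100
    unemployment_rate = unemployed / labour force * 100
    so participation  = labour force / population = pct_unemployed / unemployment_rate.
    """
    participation = 100.0 * pct_unemployed / unemployment_rate
    employment = participation - pct_unemployed   # employed = labour force - unemployed
    return participation, employment

before = implied_rates(8.1, 14.0)    # rounded figures quoted in the post
after = implied_rates(13.5, 27.0)

print("participation: %.1f%% -> %.1f%%" % (before[0], after[0]))
print("employment:    %.1f%% -> %.1f%%" % (before[1], after[1]))
```

On these rounded inputs the implied participation rate falls by less than the implied employment rate does, which is the pattern needed for the unemployment rate to rise faster than the percentage unemployed.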

If I re-run stuff using the percentage unemployed as dependent variable rather than the unemployment rate, and take the 20-24 cohort as the basis for predicting outcomes, here are the comparative residual plots:

I've added in a second red vertical line here. Why? Because Hyslop and Stillman only consider a two year window subsequent to the law change. The red lines mark the start and end of that period, inclusively. The red line traces residuals using the Hyslop and Stillman specification that has the percent unemployed as the outcome variable of interest. [Update: They run things through 2010Q4; their window is wider than I'd thought on a first reading.] The blue line does the same for unemployment rates. After the second red line, outside the period of their analysis, the youth unemployment rate continues to skyrocket relative to expectations given the unemployment rate among 20-24 year olds. The percent unemployed climbs back up to the high levels experienced for some, but not all, of the period inside the red lines.

And that's why we get different results. I don't think it has anything to do with their fancier econometric techniques. If I thought that "number of unemployed over total population" were something more economically relevant than "number of unemployed over total labour force", then I'd also conclude that there wasn't a big effect. The residual jumps up, but hardly enough to make anything of. The residual over their estimation period is 2.2 points - the percentage of 16 and 17 year olds unemployed in that two year window is two percentage points higher than we would have expected over the prior period. If we extend the window to include all the potential observations (I have no clue why they truncate to a two year window either side when sufficient data is available for a three year window), the residual increases to 2.7 points.

I really am not sure why Hyslop and Stillman chose to use the percent unemployed rather than the unemployment rate. They're top notch guys and must have had a good reason for it. [Updated post follows here: they had good reason.] The two measures answer different questions. Their measure tells us "What is the effect of increasing the youth minimum wage on the percentage of sixteen and seventeen year olds who are unemployed?" My measure tells us "What is the effect of increasing the youth minimum wage on the percentage of sixteen and seventeen year olds who are unable to find work, among those who wish to be in work?" The latter tends, I would have thought, to be the more interesting question, as the expectation of a higher potential wage will increase the number of kids (or attenuate the decline in the number of kids) wishing to be in the labour force. The unemployment rate tells you the fraction of those whose wishes for employment are thwarted. The percent unemployed tells you the fraction of those in an age bracket who are unemployed, but without any measure of what portion of those in that cohort wish to be in employment.

And now I expect political debate about the youth minimum wage to turn into quibbles about which definition of unemployment matters most: the one that StatsNZ regularly reports, or the one Hyslop and Stillman were commissioned to use. 


  1. And the Greens are suggesting that scrapping the minimum wage pushed more young people into continuing and/or higher education rather than entering the workforce. I imagine this is an unintended consequence, although the Greens seem to be claiming this as some sort of victory for the policy. I'm not knocking higher education, it's a good thing, but I suspect this only happened because a whole chunk of young people couldn't find work. I don't doubt the Greens' passion and commitment, but they do at times seem to be blinded by ideology and their sense of fairness.

  2. There's good reason for Hyslop & Stillman's choice of method - it makes a lot of sense. It just answers a different question. Post to follow.

    Apologies for the munged template. Lousy Android Blogger app killed it. Egads.

  3. No doubt, and my beef is more with Gareth Hughes and co cherry picking out the studies which support their preconceptions. Confirmation bias is alive and well, but I know it isn't only the Greens that do this.