A recent contribution has been from the NZ Initiative’s report Tomorrow’s Schools: Data and Evidence. [EC note: I've updated the link to the Initiative's site rather than Scoop] Unfortunately it is only a note of six pages, which does not meet the standards of a research report, so I can hardly comment on the quality or veracity of the findings.
The note observes there are performance differences among schools (it uses NCEA attainment as a measure). No one is surprised that higher-decile schools outperform lower-decile schools by a large margin (on average). However, once the NZ Initiative adjusted for the effect of family background (they don't explain how), they found that the average differences in education outcomes across school deciles disappear. The report concludes that the inequality in education outcomes evident in school league tables is not a result of large differences in school quality, but rather of large differences in family background, particularly differences in parental education.
The NZ Initiative concludes that their ‘research’ demonstrates that the current schooling system is working and should be retained. Maybe; one wants to see the research first, especially as it contradicts the international literature. (I can think of a number of ways one could do the exercise – not all of them would be valid.)
What strikes me is that the NZ Initiative barely observes that the research suggests that the main source of educational inequality (and a whole lot of life opportunities which follow on from it) is ‘family background’, whatever they mean by that. The implications for inequality are hardly explored. As far as I can infer, the NZ Initiative is so besotted with defending the competitive model of schooling that it is uninterested in the wider questions of the sources of, and policies for, children’s opportunities; issues central to the egalitarian society. That, I think, captures a deep attitude of the elite right: ‘who cares about social inequality providing we are doing all right’.
Indeed there is celebration of inequality when the rich display their wealth. Of course there was inequality in the egalitarian society before 1985, but it was rare for the rich to show it, to display what Thorstein Veblen called ‘conspicuous consumption’. After 1985 it became common to flaunt how rich you were.
Our analyst, Joel Hernandez, spent about a year in the IDI lab on this one. The mission we gave him: start by figuring out how much of the variability in school performance is due to things outside the school's control, like family and student background. Current league tables could easily be mostly picking up parents' education or income. We need to be able to find the schools that are doing a superb job despite difficult circumstances, so that we can learn from them. The measures out there just aren't up to spec for doing that kind of work.
So he spent the last year merging a ton of administrative data sets and cleaning the data. It is not a small job.* For the population of students who completed NCEA from 2008 through 2017, there's a link through to their parents. From their parents, to their parents' income. And their education. And their benefit histories. And criminal and prison records. And Child, Youth, and Family notifications. And a pile more. Everything we could think of that might mean one school has a tougher job than another, we threw onto the right-hand side of the regressions.
It's student-level observations with a ridiculous number of control variables enabled by the data linkages in the lab.
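For readers who want the mechanics, the linkage step is conceptually something like the sketch below. The table and column names here are purely illustrative, not the actual IDI tables or Joel's code:

```python
import pandas as pd

# Hypothetical stand-ins for the linked administrative tables - the real IDI
# tables, variable names, and linkage keys will differ.
students = pd.read_csv("ncea_cohort.csv")    # one row per student: student_id, school_id, mother_id, father_id, NCEA results...
parents = pd.read_csv("parent_records.csv")  # one row per parent: parent_id, income, highest_qual, benefit_months...

# Attach each parent's record to the student, once for mothers, once for fathers.
df = (students
      .merge(parents.add_prefix("mother_"),
             left_on="mother_id", right_on="mother_parent_id", how="left")
      .merge(parents.add_prefix("father_"),
             left_on="father_id", right_on="father_parent_id", how="left"))
```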
The point of the exercise wasn't to precisely identify coefficients on each of the independent variables. The point rather was to mop up all of the variation that comes from family circumstances. There's no structural equation modelling here or any attempt at getting at causality among those variables - just a giant reduced-form kitchen sink.
Plus an indicator variable for each of the country's five hundred or so secondary schools.
On the left-hand side - a few measures of performance at NCEA. But that's just the starting point. We're going broader. Employment after graduation. Income after graduation. Progression to tertiary. NEET status (Not in Education, Employment or Training). Benefit uptake. We could even put future criminal activity in there. So far, it's just NCEA though.
After separating out all of the variability that comes from family background, the coefficient on each of the schools' indicator variables tells you the average effect of that school on outcomes.
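In code, the setup is conceptually something like the sketch below. Again, the variable names are illustrative rather than the actual specification, and `df` is the linked student-parent dataset from the earlier sketch:

```python
import statsmodels.formula.api as smf

# Reduced-form "kitchen sink": a student-level NCEA outcome regressed on the
# family-background controls plus an indicator for every school.
model = smf.ols(
    "ncea_level2_attained ~ mother_income + father_income"
    " + mother_highest_qual + father_highest_qual"
    " + single_parent + benefit_history + cyf_notifications"
    " + C(school_id)",
    data=df,
).fit()

# The coefficient on each school's indicator is that school's average effect
# on the outcome (relative to the omitted baseline school) once family
# circumstances have been soaked up by the controls.
school_effects = model.params.filter(like="C(school_id)")
print(school_effects.sort_values().tail(10))  # schools with the largest estimated effects
```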
Our plan had been to put up the big report in July(ish) with all of the method and the first set of results. Then, short reports would follow regularly on different outcomes.
But then the Bali Haque report came out. The report said that there are huge differences in student outcomes across schools, that those differences showed up as differences by decile, that decile differences were inequitable, and that the entire school system needed to be overturned because of it. Currently self-managed schools operating under school boards would be replaced with hubs managing dozens of schools.
There are indeed very real problems in board governance in some failing schools. It's something that features regularly in stories of persistent school failure. But if the justification for abolishing school boards and putting in place a big new governance structure is that there are strong differences in school outcomes by decile, well, we now know that isn't the case.
So we moved, because we're a think tank. We brought forward the short report on the variability in outcomes by decile after separating out the family background effects.
And then Brian thinks we're hiding stuff or trying to downplay the family circumstances, perhaps trying to hide the evidence that big income redistribution schemes are warranted.
It would have been irresponsible of us to put up the coefficients on the other correlates. We have a kitchen sink of variables to mop up effects, not to precisely identify the coefficients on any of them. Putting up all of those coefficients without checking their sensitivity would have been premature. We control for whether the child is from a single-parent household. Whatever the sign on the coefficient, it would feed into culture wars around divorce and the desirability of two-parent families. We control separately for mothers' and fathers' incomes, and mothers' and fathers' educations. Results could feed into arguments around two-earning families. Most of those control variables would fuel one interest group or another. And even if we were sure we had the numbers right, they're still not causal. If you find that kids of well educated parents perform better at NCEA, that doesn't mean you should start giving degrees to parents to boost their kids' chances.
We'll have more in the full report. What we have so far though lends zero support to arguments around redistribution. Parents' education matters a ton. Income - not so much when education is controlled for. But we need to play with it more before we say anything more. If we have to suffer Easton's grouchiness for it, oh well.
But our object here is the exact opposite of Easton's imaginings. If we can identify the schools that do a fantastic job with kids that other schools have a hard time helping, that means the Ministry or ERO could go into those schools and see whether they're doing anything differently from schools that aren't doing such a good job. Sure, it would take a policy change around operational use of the IDI. But it is entirely doable. Learning from that can help lift performance for the kids that too many schools are currently failing.
And there are all kinds of ways of handling it.
Within the current model, you could get reports from the Ministry to every school board telling them where they're doing well, where they're doing poorly, and which schools they might want to learn from (and which might need their help). The reports would help empower school boards that cannot tell whether poor outcomes are because the community is disadvantaged, or because the Principal is failing. And if the data were available to parents, that could encourage them to take a more active role in board governance in places where there is underperformance. Both voice within the school, and exit from underperforming schools, could help encourage better performance. And don't pretend that this is bad because the status quo is some paradise where all the schools are doing great and everyone sends their kid to the local school. Right now, parents use worse proxies for school performance and will happily walk by an excellent low-decile school to get their kid into a higher-decile school that's further away. The local school might be the one that could do more to help their kid. But we can't tell without better data.
Within a hub-based model, the reports provided by the Ministry could help the overarching structure to manage performance among their schools, to send investigators in to figure out why one school is doing particularly well in ways that nobody had noticed before, and to use what they learn to help others. They could use it to test the effects of different kinds of practice on outcomes. In the data lab, what goes on in the schools is a black box. We just don't know. But the hubs might know that one school never shifted to modern learning environments and the other one shifted to them 6 years ago. It could look at whether those kinds of policies had any effect.
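As a toy illustration of that kind of check - made-up numbers and a hypothetical practice flag, nothing more:

```python
import pandas as pd

# Hypothetical illustration: the hub knows which of its schools shifted to
# modern learning environments; the data lab doesn't. All numbers are made up.
hub_schools = pd.DataFrame({
    "school_id": [101, 102, 103, 104, 105, 106],
    "modern_learning_env": [True, True, True, False, False, False],
    "school_effect": [0.03, -0.01, 0.02, 0.04, 0.00, 0.01],  # estimated school effects from the regression above
})

# Crude first pass: compare average estimated school effects across the two groups.
print(hub_schools.groupby("modern_learning_env")["school_effect"].mean())
```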
Either way, it would also help the Ministry in similar ways. It could help ERO check whether any of their interventions improve student outcomes.
There is just so much that can yet be done with better use of the data we have. I've been pointing to it for years. Nothing's being done about it. The Ministry has a staff of 3000; we have Joel. We don't have time to do all of it. That's one reason we're opening up all our code in the lab for others to build on.
Imagine if every guidance counsellor in every school received a report from the Ministry for every student. The report for each student finishing Year 10 would say "Here are a thousand kids who looked a lot like you 5 years ago, and another thousand who looked a lot like you 10 years ago. Here are the choices they made about paths through school, through to tertiary or vocational training, and their later employment outcomes. Here's what the kids like you who chose a Bachelor's Degree are doing now. Here's what's happened for those who chose vocational training. Many of these choices may never have occurred to you." It is entirely feasible to do this right now. It would take a few months' coding. After that, it's just push-the-button. And it isn't being done.
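A rough sketch of how such a report could be built, assuming a simple nearest-neighbour match on hypothetical variables (the real thing would need care over which features to use and how to weight them):

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Hypothetical sketch of the "kids who looked a lot like you" report.
# Column names are illustrative, not actual IDI variables.
features = ["year10_gpa", "attendance_rate", "school_decile", "mother_years_of_education"]

past = pd.read_csv("past_cohorts.csv")        # students from 5 and 10 years ago, with later outcomes
current = pd.read_csv("current_year10.csv")   # this year's Year 10 students

scaler = StandardScaler().fit(past[features])
nn = NearestNeighbors(n_neighbors=1000).fit(scaler.transform(past[features]))
_, idx = nn.kneighbors(scaler.transform(current[features]))

# For one Year 10 student: what did the thousand most similar past students
# choose, and how did it work out for them?
peers = past.iloc[idx[0]]
print(peers["post_school_path"].value_counts(normalize=True))
print(peers.groupby("post_school_path")[["employed_at_25", "income_at_25"]].mean())
```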
What difference could it make for a kid who never considered university a possibility, because of the community she grew up in, to see that other kids with similar academic records had done brilliantly at uni and that they'd better push to do the UE courses? What difference could it make for a kid whose parents were pushing university to see that kids with comparable records did far better pursuing a trade? Better information has to help.
We have a pretty big work programme here on deck. Once Joel's code is up in the data lab, I'll put up a note about it. I've already been in touch with friends back at Canty. One substantial barrier to assigning IDI projects for Masters thesis work is that you spend a year in data cleaning and matching (and just learning your way around) for any big project. You can't dump that on a Masters student without strong risk that the project falls over. But you can assign projects that build on an existing codebase.
Academics won't put up their code because their incentive is the opposite of ours. They'll want to get the vita lines on every possible way of dicing the data after fronting the fixed cost of merging it. Just look at access around the Dunedin Longitudinal survey, or some of the others out there. Tons of publicly funded work locked up for the benefit of those who ran the surveys.
We want ours to be as open as possible within the constraints set by StatsNZ around the data lab. We want way more people using that code base to see what's going on in education. And if any of them find ways of improving the code to improve match rates, even better!
Our work here is a starting point.
* Even worse, it seems to be a much-repeated job, with anyone doing work in the area duplicating efforts. Joel will be getting all of his code up into the StatsNZ wiki for others to build on - the process for getting it in there isn't trivial though.