Wednesday, 26 September 2012

Education regressions

Luis at Quantum Forest has been doing some great work with the schools data. I'm picking up on it with a few regressions. Raw performance data for the schools isn't all that instructive as schools have very different raw materials with which to work. But it would be nice to know how well a school does given the decile and ethnic mix of students coming into the classroom. So let's check that.

I'm using Luis's data here, modified slightly to work in Stata: I replaced the NA cells with . so that Stata would read things as numeric variables rather than strings. My do file and dta file are up at Dropbox.
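For anyone repeating that conversion step, here is a minimal Python sketch of the NA-to-. swap. It is not the script used for this post: the function name `na_to_stata_missing` and the sample column names are made up for illustration, and it only touches cells that are exactly `NA`.

```python
import csv
import io


def na_to_stata_missing(csv_text: str) -> str:
    """Replace cells that are exactly 'NA' with '.', Stata's numeric missing value."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Only whole-cell matches are swapped, so a name containing "NA" is left alone.
        writer.writerow(["." if cell == "NA" else cell for cell in row])
    return out.getvalue()


# Hypothetical sample rows, not taken from the real file.
sample = "school,decile,reading\nA,3,0.71\nB,NA,0.64\nC,7,NA\n"
print(na_to_stata_missing(sample))
```

Doing the swap cell by cell, rather than with a blanket find-and-replace, avoids mangling school names that happen to contain the letters NA.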

Like Luis, I started by generating a variable giving the proportion of students either meeting or exceeding the standard in each of reading, writing, and maths. I then ran a few simple linear regressions with analytic weights equal to the total school roll: the dependent variable is an average and the schools average over different numbers of students.
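To make the weighting concrete, here is a minimal sketch of weighted least squares with a single covariate, where each school's pass rate counts in proportion to its roll. It is an illustration in Python rather than the Stata do file: the function name and the toy numbers are hypothetical, and it reproduces only the point estimates of a weighted fit, not the standard errors.

```python
def weighted_line_fit(x, y, w):
    """Fit y = a + b*x by weighted least squares; return (a, b)."""
    total = sum(w)
    # Weighted means of the covariate and the outcome.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variance (the common 1/total factor cancels).
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical toy data: decile, proportion at or above standard, school roll.
decile = [1, 3, 5, 8, 10]
pass_rate = [0.55, 0.68, 0.70, 0.82, 0.85]
roll = [60, 250, 400, 320, 900]

a, b = weighted_line_fit(decile, pass_rate, roll)
print(f"intercept={a:.3f}, slope per decile={b:.4f}")
```

A school of 900 students pulls the fitted line toward itself far more than a school of 60, which is the point of weighting an average by the number of students behind it.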

Covariates are decile, decile squared, the number of students per full time teacher equivalent, proportion of each of {Maori, Pacific, Asian, International, Melanesian [MELAA, which could be an acronym for Middle-East, Latin America and Africa, says Kiwi Poll Guy in comments; he's likely right], Other} students (European dropped), indicator variables for each of {minor urban area (town), secondary urban area (suburb), main urban area (rural dropped)}, indicator variables for single sex boys and girls schools (co-ed dropped), an indicator variable for state schools (integrated schools dropped), an indicator for boarding schools, and indicator variables for each of the main types of school {composite, contributing, intermediate, secondary (full primary dropped)}.

Results are in Table 1, below. But first, a caveat. As best I understand things, these grades are not moderated. So any effects here could be saying either that some schools do a better job in teaching, or that some schools engage in grade inflation.

Table 1: Full sample: Reading, Writing, and Math

                                                    (1)          (2)          (3)
                                                    Reading      Writing      Math

Decile                                              0.0485***    0.0363***    0.0437***
                                                    (6.53)       (3.67)       (5.27)
Decile squared                                      -0.00224***  -0.00110     -0.00195***
                                                    (-4.39)      (-1.62)      (-3.41)
Students per teacher                                0.00282*     0.00362*     0.00215
                                                    (2.41)       (2.35)       (1.65)
Proportion of Maori students                        -0.0664*     -0.0848*     -0.112***
                                                    (-2.20)      (-2.13)      (-3.34)
Proportion of Pacific students                      -0.115***    -0.114**     -0.0676
                                                    (-3.52)      (-2.63)      (-1.86)
Proportion of Asian students                        -0.0796**    -0.0787*     0.00448
                                                    (-2.63)      (-1.97)      (0.13)
Proportion of International students                0.290        0.863**      1.510***
                                                    (1.23)       (2.79)       (5.76)
Proportion of MELAA students                        0.0952       -0.0934      -0.0722
                                                    (0.61)       (-0.45)      (-0.41)
Proportion of students Other ethnicity              0.428        1.043*       1.040**
                                                    (1.35)       (2.50)       (2.92)
Minor Urban Area (Rural dropped)                    0.00817      -0.0104      -0.0257
                                                    (0.62)       (-0.60)      (-1.76)
Secondary Urban Area (Rural dropped)                -0.0189      -0.0283      -0.0416**
                                                    (-1.33)      (-1.52)      (-2.63)
Major Urban Area (Rural dropped)                    0.00890      -0.00366     -0.0186
                                                    (0.80)       (-0.25)      (-1.50)
Boys school (co-ed dropped)                         0.101***     0.118***     0.205***
                                                    (3.90)       (3.48)       (6.87)
Girls school (co-ed dropped)                        0.125***     0.181***     0.142***
                                                    (5.60)       (6.16)       (5.44)
State school (integrated schools dropped)           -0.0246*     -0.0594***   -0.0219
                                                    (-2.49)      (-4.58)      (-1.96)
Composite (Year 1-15) (Full Primary dropped)        -0.0221      -0.0566*     -0.0384
                                                    (-1.14)      (-2.23)      (-1.79)
Contributing (Year 1-6) (Full Primary dropped)      0.0127       0.0345***    0.0338***
                                                    (1.77)       (3.64)       (4.23)
Intermediate (Year 7 and 8) (Full Primary dropped)  -0.0673***   -0.0974***   -0.107***
                                                    (-6.07)      (-6.64)      (-8.65)
Secondary (Year 7-15) (Full Primary dropped)        -0.0939***   -0.110***    -0.158***
                                                    (-6.78)      (-6.06)      (-10.15)
Boarding school                                     -0.00755     -0.0716**    -0.0475*
                                                    (-0.39)      (-2.79)      (-2.01)
Constant                                            0.574***     0.527***     0.569***
                                                    (15.48)      (10.75)      (13.78)

Observations                                        1006         996          1000
Adjusted R²                                         0.528        0.467        0.532

t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001

Decile matters greatly. All else equal, a school one decile higher has about a four percentage point increase in pass rates. But, decile matters at a decreasing rate: moving from Decile 2 to Decile 3 correlates with a 3.3 percentage point increase in maths pass rates while moving from Decile 8 to Decile 9 only improves pass rates by one percentage point.
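The diminishing effect of decile falls straight out of the quadratic. As a rough sketch, the snippet below plugs the rounded maths coefficients from Table 1 into the predicted change between two deciles. The function name is made up for illustration, and because the table coefficients are rounded, the output may differ in the last digit from figures computed on the unrounded estimates.

```python
# Rounded maths coefficients from Table 1, column (3).
B_DECILE = 0.0437
B_DECILE_SQ = -0.00195


def decile_step(d_from, d_to, b1=B_DECILE, b2=B_DECILE_SQ):
    """Predicted change in pass rate (as a proportion) moving between two deciles,
    holding every other covariate fixed."""
    return b1 * (d_to - d_from) + b2 * (d_to ** 2 - d_from ** 2)


for low in (2, 8):
    change = decile_step(low, low + 1)
    print(f"Decile {low} -> {low + 1}: {100 * change:.1f} percentage points")
```

The linear term adds the same amount at every step, while the squared term subtracts more the higher up the scale you start, so each extra decile buys less.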

Class size matters: schools with more students per teacher have higher pass rates. I suspect reverse causation here: for a fixed budget, those schools that are able to run larger classes are likely those that have fewer discipline problems and so are able to put those resources to other uses.

Ethnicity matters. A standard deviation increase in the proportion of Maori students reduces aggregate pass rates by 1.3 percentage points in reading and 2.2 percentage points in math. Similar trends exist for Pacific Island student ratios. I'd be pretty cautious in interpreting this one: if you run things decile-by-decile, the effects mostly disappear. The biggest negative effect seems to hold in high decile schools, but by the time you get to Decile 10 schools, the median school has only 5.9% Maori students. Results then may be a bit sensitive to a few outliers on the right hand side. Like Luis, I'll refrain from doing much more until the official results come out.

Single sex schools seem to do well; boarding schools seem to do poorly.

I generated residuals from each of the three specifications above. The residual tells us whether a school had a higher or lower pass rate than we would have expected given its characteristics. This either tells us how good (or bad) the school is at teaching, or how good (or bad) it is at grade inflation. Without external moderation, it's hard to tell. The residuals from the three specifications correlate strongly with each other: schools that are good (or grade inflate) tend to do so across the board. The lowest pairwise correlation was 0.57; the highest was 0.63. I averaged the residuals to get a composite score. A high residual means that the school's actual pass rate was higher than what we would have expected given its characteristics.
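The composite step can be sketched in a few lines. This is an illustration under stated assumptions rather than the do file itself: the residual values are invented, the helper names are hypothetical, and it assumes every school has a residual in all three subjects (the observation counts in Table 1 differ slightly, so a real version would need a rule for schools missing one).

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def composite_residual(reading, writing, maths):
    """Unweighted mean of the three residuals, school by school."""
    return [(r + w + m) / 3 for r, w, m in zip(reading, writing, maths)]


# Hypothetical residuals for four schools (actual minus predicted pass rate).
reading = [0.05, -0.10, 0.12, -0.02]
writing = [0.08, -0.06, 0.03, -0.04]
maths = [0.02, -0.12, 0.09, 0.01]

print(round(pearson(reading, writing), 2))
print([round(v, 3) for v in composite_residual(reading, writing, maths)])
```

A positive composite means the school beat its predicted pass rate on average across the three subjects; a negative one means it fell short.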

I'm not confident enough in the model to put up my own league table of residuals. But I will put this up. This is a scatterplot of the residuals showing just how much school performance varies after we have corrected for decile, ethnicity, and everything else in the above model. That can point to its being a bad model, the underlying data being bad, strong differences in teaching quality across schools, or a combination of all three.


There are decile 1 schools providing pass rates twenty percentage points or more above what we'd expect, given their characteristics (that's the 0.2 number on the y-axis); there is one decile ten school providing pass rates more than twenty percentage points below what we would expect given its characteristics. Differences in school performance simply do not come down only to decile. Decile's the most important thing. But differences in performance among schools of the same decile by definition have to be about something other than decile. I can't tell from this data whether it's differences in stat-juking, differences in unobserved characteristics of entering students, differences in school pedagogy, or something else. But there's something here that bears explaining. 

Update: Note also Luis's very sensible cautions about selection into the dataset with the preliminary data. Schools chose whether or not to release their results to Fairfax. I'd guess the bottom tail isn't in this distribution. 

10 comments:

  1. Is 'boarding school' just a binary variable? If so it could be very misleading as almost all, if not all, boarding schools take day students as well. For example my high school was technically a boarding school but only around 10% of the students boarded there.

  2. It is an indicator variable for schools that provide residential services not including special schools which were excluded.

  3. Eric,


    How does the residual variance compare to the binomial variance? Luis is getting residual standard errors about 1.5 (percentage points) from simpler models, and at p=0.75, that would correspond to n of about 750. Is the typical school really large enough that there is a lot of extra-binomial variation to explain on top of this model?


    [What I wanted to do first with the data was some funnel plots to see how much unexplained variation there was, but I didn't because the file didn't have sizes for each school.]

  4. Luis and I used different weightings. I used Stata's aweight command with school's total enrolment because the dependent variable is an average where reporting units are of different size; Luis did something in R that weighted observations by their total enrolment to put more weight on larger schools. So we'll get different results partially from that. I'm using Luis's data though, which does have that school size variable.

  5. Excellent work! Can't wait to see what people come up with when we have the full data sets.


    P.S. I think "MELAA" stands for "Middle-East, Latin America and Africa", not "Melanesian".

  6. Hi Thomas,


    I'm hoping to move to more detailed analyses once we get the full dataset. total.roll in the csv file contains the school size and looks like this:


    # summary(standards$total.roll)
    #    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
    #    11.0   112.0   219.0   270.3   372.0  1978.0       1


    so the typical school is somewhere near 250.

  7. Really nice analysis. Coupla things. The parameters on single-sex are interesting -- they are equivalent to more than a full decile gain at the upper end, right? On the residual plot: are the upper deciles a little tighter than the lower ones (heteroscedasticity)? If so, there's an element of reducing uncertainty in the quality of an unknown good.

  8. The residuals have heterogeneous variances and you can see that they reflect the much larger variance for low deciles in the raw data. There is plenty of variation that needs more explaining and many of us are looking forward to having access to the full data set.

  9. Next time around, I will add the cancer-curing magic command ", robust" to my regressions. Hoorah!


    Bill: note that there are fewer observations in the lower decile groups too.
