In advance of the semi-final between the Crusaders and the Sharks this evening, it is timely to look at the fairness of the Super 15 schedule. The Crusaders are playing at home, a massive advantage that they earned by virtue of finishing one point ahead of the Sharks in the regular season. But was that a fair reflection of the two teams?
The Super 15 rugby competition is a bit unusual in its imbalance. There
are five teams from each of three countries. Each team plays the other four
teams in its country twice, home and away; it plays four of the five teams from
each of the other two countries once, two games at home and two away; and it
doesn’t play the remaining two teams at all. This leads to three reasons why a
schedule may favour some teams over others: first, teams from stronger
countries have to play more games against each other, making it harder for the
best teams from those countries to finish ahead of the best teams from weaker
countries; second, a team is favoured if the two teams it doesn’t have to play
are relatively strong; and third, for the best teams, there is an advantage to
playing the stronger teams from other countries at home to get the benefit of
home-field advantage, and playing away against weaker teams who can be expected to
lose in any location.
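To put rough numbers on the imbalance, here is a minimal sketch that counts games under the format described above and compares them with a fully balanced home-and-away competition. The variable names are my own, and it assumes nothing beyond the schedule rules just stated.

```python
# Count games under the schedule described above versus a balanced one.
TEAMS_PER_COUNTRY = 5
COUNTRIES = 3
total_teams = TEAMS_PER_COUNTRY * COUNTRIES

# Own country: the other four teams, home and away.
domestic_games = (TEAMS_PER_COUNTRY - 1) * 2
# Other two countries: four of the five teams from each, once.
cross_games = 4 * (COUNTRIES - 1)
games_per_team = domestic_games + cross_games

# Balanced double round robin: every other team, home and away.
balanced_games_per_team = (total_teams - 1) * 2

# Every game involves two teams and is one ordered (home, away) pairing.
games_played = total_teams * games_per_team // 2
possible_pairings = total_teams * (total_teams - 1)

print(f"games per team: {games_per_team} of {balanced_games_per_team}")
print(f"pairings played: {games_played} of {possible_pairings}")
```

So each team plays 16 of the 28 games a balanced schedule would require, and only 120 of the 210 possible home/away pairings are ever observed.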
Mark Reason recently had an article in the Dominion Post suggesting that
these factors favoured the Crusaders (who finished the regular season in second
place overall) in this year’s competition and
penalised the Hurricanes (who finished seventh and out of the playoffs). His
logic seemed impeccable to me; certainly it seemed that the Crusaders benefited
from the luck of the draw this year relative to recent years when they had to
play the best South African teams in South Africa.
I am currently doing some research constructing rankings for
international cricket, and thought it would be fun to use the same method to
infer how teams would have finished in the Super 15 had they had a balanced
schedule. Kirdan Lees has beaten me to it, in a welcome new blog: Sport Loves Data. Kirdan has reevaluated the ranking of the 15 teams, taking into account the imbalance in the schedule, and has posted his results here. Given that Kirdan’s method is very different
from mine, I decided to see how the two methods would compare. The table below
gives the actual points table, and my revised points table adjusted for schedule
unfairness. (The TL;DR explanation of my method is detailed at the bottom of this post.)
| Team | Actual | Predicted |
|---|---|---|
| Waratahs | 58 | 58.5 |
| Crusaders | 51 | 50.1 |
| Sharks | 50 | 52.4 |
| Brumbies | 45 | 45.0 |
| Chiefs | 44 | 42.6 |
| Highlanders | 42 | 37.7 |
| Hurricanes | 41 | 40.7 |
| Western Force | 40 | 40.2 |
| Bulls | 38 | 38.3 |
| Blues | 37 | 38.1 |
| Stormers | 32 | 33.4 |
| Lions | 31 | 33.8 |
| Reds | 28 | 24.5 |
| Cheetahs | 24 | 23.9 |
| Rebels | 21 | 23.1 |
Kirdan's method gives rankings rather than points, so the following table shows just the assumed finishing position:
| Team | Actual | Predicted | Kirdan |
|---|---|---|---|
| Waratahs | 1 | 1 | 1 |
| Crusaders | 2 | 3 | 2 |
| Sharks | 3 | 2 | 4 |
| Brumbies | 4 | 4 | 3 |
| Chiefs | 5 | 5 | 6 |
| Highlanders | 6 | 10 | 7 |
| Hurricanes | 7 | 6 | 5 |
| Western Force | 8 | 7 | 11= |
| Bulls | 9 | 8 | 9 |
| Blues | 10 | 9 | 10 |
| Stormers | 11 | 12 | 8 |
| Lions | 12 | 11 | 11= |
| Reds | 13 | 13 | 14 |
| Cheetahs | 14 | 14 | 13 |
| Rebels | 15 | 15 | 15 |
The interesting thing is that my and Mark Reason’s intuition about how
much the Crusaders were favoured this year turns out to have been overblown,
although my method does rank the Crusaders just behind
the Sharks rather than slightly ahead. And yes, the Hurricanes would have
qualified for the playoffs as one of the top six teams under either my or Kirdan's rankings. Under my method, however, that is not because the adjustment pushed them up; it is because the big mover was the
Highlanders, who appear to have been hugely favoured by the schedule this year.
Postscript: Kirdan has another post looking at home field advantage in
the Super 15. My probit regression method would require a lot more data to
analyse team-specific home field advantage, but in a model which assumes that
the advantage is constant across teams, the result is that home-field matters
so much in the Super 15 that, in a match between two teams of equal ability,
the one playing at home has a 75% chance of winning. It is no surprise that the
Super rugby competition has almost always been won by the team that finished
first in the regular season, which is therefore not only likely to be the
strongest team but also earns home-field advantage throughout the playoffs.
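To see why home advantage matters most in close games, here is a minimal sketch using a simple binary win/lose probit. That is a simplification I am assuming for illustration: it ignores draws and bonus points. The home-advantage shift is backed out from the 75% figure above, and the strength gaps of 0.0 and 2.0 are hypothetical values, not estimates.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1

# Home advantage h on the latent scale, backed out from the 75% figure:
# P(home win | equal teams) = Phi(h) = 0.75.
home_advantage = std_normal.inv_cdf(0.75)


def win_probability(strength_gap: float, at_home: bool) -> float:
    """P(win) for a team whose latent strength exceeds its opponent's
    by strength_gap, playing either at home or away."""
    shift = home_advantage if at_home else -home_advantage
    return std_normal.cdf(strength_gap + shift)


# Hypothetical strength gaps: an even match and a clear mismatch.
for label, gap in [("even match", 0.0), ("mismatch", 2.0)]:
    home = win_probability(gap, at_home=True)
    away = win_probability(gap, at_home=False)
    print(f"{label}: home {home:.2f}, away {away:.2f}, swing {home - away:.2f}")
```

The swing between playing at home and playing away is largest when the teams are evenly matched, because that is where the normal CDF is steepest. Against a much weaker opponent the same shift barely moves the probability.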
TL;DR Explanation of Method:
- There are two separate LHS variables, each estimated by an ordered probit regression: table points scored by home team, table points scored by away team. Each can take the values 0, 1, 2, 3, 4, 5.
- My database only included the scores, not the bonus points scored. The actual points earned by each team for winning, tying, or losing by 7 points or less, can be inferred from the scores, but not bonus points for scoring 4 tries or more. I proxied this by assigning a bonus point if the team scored 30 points or more. The method proceeds as follows:
- Generate a dummy for each of the 15 teams that equals 1 if that team was the home team, and -1 if it was the away team.
- Run two ordered probits, one for points scored by the home team, and one for points scored by the away team, in each case run on the 15 dummies (one dropped) and a constant.
- Predict the probability of scoring 0, 1, 2, 3, 4 or 5 points for each of the 210 possible matchups (each team playing each other team, home or away), and find the expected points for each.
- Then sum these to get the total points in a balanced competition where every team plays every other twice, home and away.
- Finally, normalise these by a linear transformation to get the same mean and s.d. as the actual Super 15 points table.
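The last three steps can be sketched in a few lines of Python. This is an illustration only, not the code behind the tables above: it skips the estimation step and starts from made-up coefficients and cutpoints for three hypothetical teams (A, B, C), with the constant absorbed into the cutpoints. The function names and the "actual" points used for rescaling are likewise assumptions.

```python
from itertools import permutations
from statistics import NormalDist, mean, pstdev

PHI = NormalDist().cdf  # standard normal CDF


def category_probs(latent, cutpoints):
    """Ordered-probit probabilities of scoring 0..len(cutpoints) table points."""
    cdf = [PHI(c - latent) for c in cutpoints] + [1.0]
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


def expected_points(latent, cutpoints):
    """Expected table points implied by the category probabilities."""
    return sum(k * p for k, p in enumerate(category_probs(latent, cutpoints)))


# Hypothetical fitted values, NOT estimates from the post.
# Each dummy is +1 for the home team and -1 for the away team,
# so the latent index is coef[home] - coef[away]. Team A is the dropped dummy.
home_coef = {"A": 0.0, "B": 0.4, "C": -0.5}   # home-points equation
away_coef = {"A": 0.0, "B": -0.3, "C": 0.6}   # away-points equation
home_cuts = [-1.2, -0.8, -0.6, -0.4, 0.6]     # five cutpoints -> 0..5 points
away_cuts = [-0.4, 0.0, 0.2, 0.4, 1.4]

# Balanced competition: every ordered (home, away) pairing is played once.
teams = list(home_coef)
totals = dict.fromkeys(teams, 0.0)
for home, away in permutations(teams, 2):
    totals[home] += expected_points(home_coef[home] - home_coef[away], home_cuts)
    totals[away] += expected_points(away_coef[home] - away_coef[away], away_cuts)


def rescale(values, target_mean, target_sd):
    """Linear transformation to a given mean and (population) s.d."""
    m, s = mean(values), pstdev(values)
    return [target_mean + (v - m) * target_sd / s for v in values]


actual = [40, 50, 30]  # hypothetical actual table points for A, B, C
adjusted = rescale([totals[t] for t in teams], mean(actual), pstdev(actual))
for team, pts in zip(teams, adjusted):
    print(f"{team}: {pts:.1f}")
```

With all 15 teams, `permutations(teams, 2)` yields the 210 matchups mentioned above.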
You say a team is favoured if the teams it doesn't have to play are relatively weak. Does the point ranking then count a win against a weak team for much less than a win against an average team?
Crusader’s ranked behind the Sharks. Crusaders 38, Sharks 6. Say no more.
Dear economists,
If I have a statistics blog where a substantial fraction of the traffic is driven by rugby, and there is an increase in the number of other NZ blogs posting intelligent data-based things about rugby, am I correct in thinking this is a Bad Thing for me?
Are there recognised strategies for encouraging market failure so as to reduce competition? Do any of them work without lots of money? Or am I allowed to just think of the benefits to society as a whole?
Yours sincerely,
Statistically Troubled And Tending Slightly to Concerned Hypochondria About Traffic
Hmm. I realised my prior reply potentially broke your careful anonymity and so I have deleted it.
I note that the NZ econ blogosphere does much better now that it's more than just TVHE, AntiDismal and us.
Does it not depend on whether the other blogs are a complement or substitute for your blog?
Dear STATSCHAT,
Naturally, a naive economist like me would think only of the social good. But a public choice expert like Eric might be inclined to suggest that the best course of action would be to encourage the government to introduce a licensing regime such that it would be prohibitively expensive for outsiders like me to enter into your space!
Best of all, however, would be to note that there are positive spillover benefits in blogging, so that more arrivals in this space is not necessarily a bad thing. Actually, I genuinely had intended to reference the rugby rankings on a blog coincidentally called Statschat. Their and my rankings are asking a different question and hence use a different method but do complement each other.
See my point about home-field advantage. My model had the Sharks ranked ahead of the Crusaders in a balanced competition, but the Crusaders heavily favoured on Saturday night, simply because they were playing at home. And that was without the model estimating the additional home-field advantage that comes when the opposition has had to travel between South Africa and New Zealand.
No, the point ranking doesn't make a distinction. But home field advantage will have a bigger impact on the probability of winning, the closer the two teams are in ranking. It is to a good team's advantage to reduce its chance of winning against a weak team from 95% to 90% while increasing its chance against a strong team from 25% to 75%, simply by changing which one it plays at home.
It would be interesting to split that home-field advantage into a) the amount that the home team plays better, and b) the amount that the referee favours the home team. It could probably be done by looking at the key stats and isolating the impact of penalties on the final result, then seeing if home teams receive more penalties than the other stats would suggest that they should.
But Kirdan tells us
"And the Crusaders? Don’t put down your house on them winning at home. Home form has been woeful and the Sharks classy away "
So the Crusaders don't have much of a home-field advantage. So that can't be the story.
You need to explain to me slow-like why it's better not to have racked up more wins by playing the real easy teams.
That's not the right thought experiment. Yes, it is better to have a schedule that has you playing weaker teams. That is one of the sources of imbalance. But now, let's say that you are a top NZ team and it is given that you will be playing, say, both the Waratahs (Australia's top team) and the Rebels (their weakest team), one at home and one away. Which one would you rather have your home game against? Now think probit. Gaining home field advantage pushes you a distance to the right on the latent variable axis (horizontal), losing it pushes you the same distance to the left. If you are near the mean (which is the case when two teams are of roughly equal strength), the slope of the cumulative normal is high, so the change in probability from a given change in latent variable is high; when you are far from the mean (one team is much stronger than the other), the change in the probability is small. So you want the evenly matched game at home and the mismatch on the road.
So I should have read "second, a team is favoured if the two teams it doesn’t have to play are relatively weak" to mean "have to play AT HOME are relatively weak"?
Ahhrgh. I completely misread your original question. Now I see the typo in my original post. Now corrected.