In advance of the semi-final between the Crusaders and the Sharks this evening, it is timely to look at the fairness of the Super 15 schedule. The Crusaders are playing at home, a massive advantage that they earned by virtue of finishing one point ahead of the Sharks in the regular season. But was that a fair reflection of the two teams?
The Super 15 rugby competition is a bit unusual in its imbalance. There
are five teams from each of three countries. Each team plays the other four
teams in its country twice, home and away; it plays four of the five teams from
each of the other two countries once, two games at home and two away; and it
doesn’t play the remaining two teams at all. This leads to three reasons why a
schedule may favour some teams over others: First, teams from stronger
countries have to play more games against each other making it harder for the
best teams from those countries to finish ahead of the best teams from weaker
countries; second, a team is favoured if the two teams it doesn’t have to play
are relatively strong; and third, for the best teams, there is an advantage to
playing the stronger teams from other countries at home to get the benefit of
home-field advantage, and play away against weaker teams who can be expected to
lose in any location.
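The arithmetic of the schedule described above can be checked with a few lines of Python. This is only a sketch of the counting; the variable names are mine, and the comparison with a full home-and-away round robin is added for illustration.

```python
# Counting games under the Super 15 format described above.
COUNTRIES = 3
TEAMS_PER_COUNTRY = 5
TOTAL_TEAMS = COUNTRIES * TEAMS_PER_COUNTRY           # 15 teams

# Four domestic rivals, each played home and away.
domestic_games = (TEAMS_PER_COUNTRY - 1) * 2
# Four of the five teams from each of the other two countries, once each.
cross_border_games = 4 * (COUNTRIES - 1)
games_per_team = domestic_games + cross_border_games

# Foreign teams never faced: one from each of the other two countries.
teams_not_played = (TEAMS_PER_COUNTRY - 4) * (COUNTRIES - 1)

# A fully balanced schedule would have every team play every other twice.
balanced_games_per_team = (TOTAL_TEAMS - 1) * 2

print(games_per_team, teams_not_played, balanced_games_per_team)
```

So each team plays 16 games, half of them against its own countrymen, compared with the 28 it would play in a balanced home-and-away competition.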
Mark Reason recently had an article in the Dominion Post suggesting that
these factors favoured the Crusaders (who finished the regular season in second
place overall) in this year’s competition and
penalised the Hurricanes (who finished seventh and out of the playoffs). His
logic seemed impeccable to me; certainly it seemed that the Crusaders benefited
from the luck of the draw this year relative to recent years when they had to
play the best South African teams in South Africa.
I am currently doing some research constructing rankings for
international cricket, and thought it would be fun to use the same method to
infer how teams would have finished in the Super 15 had they had a balanced
schedule. Kirdan Lees has beaten me to it, in a welcome new blog: Sport Loves Data. Kirdan has reevaluated the ranking of the 15 teams, taking into account the imbalance in the schedule, and has posted his results here. Given that Kirdan’s method is very different
from mine, I decided to see how the two methods would compare. The table below
gives the actual points table, and my revised points table adjusted for schedule
unfairness. (The TL;DR explanation of my method is detailed at the bottom of this post.)
| Team | Actual points | Predicted points |
|---|---|---|
| Waratahs | 58 | 58.5 |
| Crusaders | 51 | 50.1 |
| Sharks | 50 | 52.4 |
| Brumbies | 45 | 45.0 |
| Chiefs | 44 | 42.6 |
| Highlanders | 42 | 37.7 |
| Hurricanes | 41 | 40.7 |
| Western Force | 40 | 40.2 |
| Bulls | 38 | 38.3 |
| Blues | 37 | 38.1 |
| Stormers | 32 | 33.4 |
| Lions | 31 | 33.8 |
| Reds | 28 | 24.5 |
| Cheetahs | 24 | 23.9 |
| Rebels | 21 | 23.1 |
Kirdan's method gives rankings rather than points, so the following table shows just the assumed finishing position:
| Team | Actual | Predicted | Kirdan |
|---|---|---|---|
| Waratahs | 1 | 1 | 1 |
| Crusaders | 2 | 3 | 2 |
| Sharks | 3 | 2 | 4 |
| Brumbies | 4 | 4 | 3 |
| Chiefs | 5 | 5 | 6 |
| Highlanders | 6 | 10 | 7 |
| Hurricanes | 7 | 6 | 5 |
| Western Force | 8 | 7 | 11= |
| Bulls | 9 | 8 | 9 |
| Blues | 10 | 9 | 10 |
| Stormers | 11 | 12 | 8 |
| Lions | 12 | 11 | 11= |
| Reds | 13 | 13 | 14 |
| Cheetahs | 14 | 14 | 13 |
| Rebels | 15 | 15 | 15 |
The interesting thing is that my and Mark Reason’s intuition about how
much the Crusaders were favoured this year turns out to have been overblown,
although my method does rank the Crusaders just behind
the Sharks rather than slightly ahead. And yes, the Hurricanes would have
qualified for the playoffs as one of the top six teams under either my rankings or Kirdan's. Under my method, though, the reason is not that the adjustment pushed them up but that the big mover was the
Highlanders, who appear to have been hugely favoured by the schedule this year.
Postscript: Kirdan has another post looking at home field advantage in
the Super 15. My probit regression method would require a lot more data to
analyse team-specific home field advantage, but in a model which assumes that
the advantage is constant across teams, the result is that home-field matters
so much in the Super 15 that, in a match between two teams of equal ability,
the one playing at home has a 75% chance of winning. It is no surprise that the
Super rugby competition has almost always been won by the team that finished
first in the regular season, which is therefore not only likely the
strongest team but also earns home-field advantage throughout the playoffs.
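To give a feel for how large a 75% home win rate is on the probit scale, here is a rough sketch. It assumes a simplified binary win/lose probit with a single home-advantage term, which is not the pair of ordered probits on table points that I actually estimated; the function and variable names are purely illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and s.d. 1

# In a binary probit, P(home win) = Phi(home_strength - away_strength + home_edge).
# For equally matched teams, a 75% home win rate pins down the home edge:
home_edge = STD_NORMAL.inv_cdf(0.75)

def home_win_prob(home_strength, away_strength, edge=home_edge):
    """Probability that the home team wins under the simplified binary probit."""
    return STD_NORMAL.cdf(home_strength - away_strength + edge)

print(round(home_edge, 3))               # home edge in latent standard deviations
print(round(home_win_prob(0.0, 0.0), 2)) # equal teams, home side favoured
```

Under this simplification the home edge works out to roughly two-thirds of a standard deviation of the unobserved match noise, so an away team needs to be better than its host by about that much just to make the game a coin toss.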
TL;DR Explanation of Method:
- There are two separate LHS variables, each estimated by an ordered probit regression: table points scored by the home team, and table points scored by the away team. Each can take the values 0, 1, 2, 3, 4, 5.
- My database only included the scores, not the bonus points earned. The table points for winning, tying, or losing by 7 points or fewer can be inferred from the scores, but the bonus point for scoring 4 tries or more cannot. I proxied this by assigning a bonus point if the team scored 30 points or more. The method proceeds as follows:
- Generate a dummy for each of the 15 teams that equals 1 if that team was the home team, -1 if it was the away team, and 0 otherwise.
- Run two ordered probits, one for points scored by the home team and one for points scored by the away team, in each case regressed on the 15 dummies (one dropped) and a constant.
- Predict the probability of scoring 0, 1, 2, 3, 4, 5 points for each of the 210 possible matchups (each team playing each other home or away), and find the expected points for each.
- Then sum these to get the total points in a balanced competition where every team plays every other twice, home and away.
- Finally, normalise these by a linear transformation to get the same mean and s.d. as the actual Super 15 points table.
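The steps above can be sketched in Python roughly as follows. This is not the code I ran: the estimation step is left out, so the coefficients and cutpoints passed in are placeholders for fitted ordered probit output, all function names are made up for illustration, and the 4 points for a win and 2 for a draw are the standard Super Rugby values rather than anything stored in my database.

```python
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()

def table_points(own, opp, try_proxy=30):
    """Table points inferred from the score alone (4 win, 2 draw, 1 for a narrow loss)."""
    if own > opp:
        pts = 4
    elif own == opp:
        pts = 2
    else:
        pts = 1 if opp - own <= 7 else 0   # losing bonus point
    # Proxy for the four-try bonus: scoring 30 points or more.
    return pts + (1 if own >= try_proxy else 0)

def design_row(home, away, teams):
    """One dummy per team: +1 if home, -1 if away, 0 otherwise."""
    return [1 if t == home else -1 if t == away else 0 for t in teams]

def ordered_probit_probs(index, cutpoints):
    """P(points = k) for k = 0..len(cutpoints), given a latent index."""
    cdf = [STD_NORMAL.cdf(c - index) for c in cutpoints]
    return [hi - lo for hi, lo in zip(cdf + [1.0], [0.0] + cdf)]

def expected_points(index, cutpoints):
    return sum(k * p for k, p in enumerate(ordered_probit_probs(index, cutpoints)))

def balanced_totals(teams, beta_home, beta_away, cuts_home, cuts_away):
    """Expected points when every team hosts every other team once."""
    totals = {t: 0.0 for t in teams}
    for h in teams:
        for a in teams:
            if h == a:
                continue
            # The +1/-1 dummies make each index a difference of team coefficients.
            totals[h] += expected_points(beta_home[h] - beta_home[a], cuts_home)
            totals[a] += expected_points(beta_away[h] - beta_away[a], cuts_away)
    return totals

def rescale(values, target_mean, target_sd):
    """Linear transformation to a chosen mean and standard deviation."""
    m, s = mean(values), pstdev(values)
    return [target_mean + (v - m) * target_sd / s for v in values]
```

In use, `beta_home` and `beta_away` would be dictionaries of the team coefficients from the two regressions (with the dropped team set to zero), and the five cutpoints for each regression would absorb the constant. The final call to `rescale` would take the mean and s.d. of the actual points table as its targets.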