As rugby fans, we get given lots of rugby stats but just how useful are they?
I should warn you there are no pretty pictures. It would just be lines of coloured dots that aren’t really helpful. Sorry.
On screen we see some other stats, like points per entry into the 22, and so on. The stats made publicly available are better than they used to be, but are any of them useful?
I went through all the matches in the RWC for which stats are available from their website where the points margin was 14 or lower. This let me include all the quarter finals, but it ruled out some of the absolutely crazy results where I’m not sure the results would be helpful. One match, Wales v Fiji, was excluded because there are no stats.
For almost all the stats I took the winning team’s shown value and subtracted the losing team’s. So for metres carried in South Africa v France, I looked at the Bokke total (424), the French total (524) and subtracted the second from the first to give -100.
For discipline results I did it the other way around: France 0 YC minus SA 1 YC gives -1 YC to SA. In this category you want fewer penalties and fewer cards, after all.
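As a minimal sketch of that subtraction, here is how the two orientations could be coded. The function name and the parameter labels are my own illustrative choices, not the Match Centre's; the numbers are the South Africa v France figures quoted above.

```python
# Parameters where a lower number is better, so the subtraction is flipped.
# These labels are illustrative, not the official Match Centre names.
LOWER_IS_BETTER = {"penalties_conceded", "yellow_cards", "red_cards"}


def differential(parameter, winner_value, loser_value):
    """Return the A-B value, oriented so that a positive number favours the winner."""
    if parameter in LOWER_IS_BETTER:
        return loser_value - winner_value
    return winner_value - loser_value


# South Africa (winner) v France (loser), using the figures quoted in the text.
metres = differential("metres_carried", 424, 524)
yellow_cards = differential("yellow_cards", 1, 0)
print(metres, yellow_cards)
```

Keeping the flip in one place means every parameter reads the same way afterwards: above zero is "good for the winner", below zero is "good for the loser".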
A Quick Sidebar On The Stats
When you do A-B type stats, in this kind of setting, you’re looking for a bias. If a particular stat has a lot of occurrences where A (or B) is higher, you want to spot that.
Imagine possession is important, more possession is better than less. So you count the cases where the winner’s possession is higher than the loser’s and you’d expect there to be a lot more of those than cases where the loser’s possession is higher than the winner’s.
Mean and standard deviation are a bit trickier. Normally for something like this, you’d look at the winner’s values and the loser’s values, perform a t-test (if appropriate) on each pair of pools. But there are pools (cards for example) where I can tell you the t-test will have a hissy fit, and others where the data don’t fit the assumptions so either they need transforming or a different test. I can certainly do that, but explaining it here gets complicated. If this was for a longer article, or a stats paper, I would, but I’m guessing most of you don’t want to read all of that.
But, taking the mean and the standard deviation of A-B gives some useful measure of what’s going on. If there’s no difference due to that measure you’d expect a mean of 0. To use possession again, if possession is not important, then the average difference should be zero. You’ll rarely get that perfectly with noisy data, but as a quick and dirty measure, if the standard deviation is larger than the size of the mean, then the mean can’t really be distinguished from zero. (There are formal ways to test this, but this is close enough for what we’re doing here.) If that’s the case, the measure is not telling you anything useful, at least as a first approximation. If the standard deviation was particularly high, I calculated the range, and have quoted that instead. Range is the highest value minus the lowest, and sometimes it’s more intuitive to grasp that.
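A rough sketch of that quick and dirty check, using only the Python standard library. The numbers are made up for illustration and are not real match data, and I have assumed the sample standard deviation (the text does not say which version was used).

```python
from statistics import mean, stdev


def summarise(differences):
    """Mean, sample standard deviation and range of winner-minus-loser values."""
    m = mean(differences)
    sd = stdev(differences)  # assumption: sample (n-1) standard deviation
    spread = max(differences) - min(differences)
    # Rule of thumb from the text: if the standard deviation is larger than
    # the size of the mean, treat the mean as indistinguishable from zero.
    looks_like_zero = sd > abs(m)
    return m, sd, spread, looks_like_zero


# Made-up possession differences in percentage points, NOT real match data.
example = [22, -20, 4, -6, 10, -3, 8, -12, 5, 1]
m, sd, spread, looks_like_zero = summarise(example)
print(f"mean={m:.1f} sd={sd:.1f} range={spread} looks_like_zero={looks_like_zero}")
```

This is only the rule of thumb, not a formal test; it is meant to flag parameters that are obviously noise.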
Rugby Stats considered
I’m going to try and call the stats from the games “parameters” from here on, because having statistics from the games and doing statistics on them gets confusing.
For all the parameters I used the numbers provided on the official World Cup website via the Match Centre for each match. This isn’t necessarily the best place, but it’s a consistent set of parameters.
I looked at the old standbys of possession and territory.
For running, I looked at total metres carried, total runs, number of carries over the gainline, defenders beaten and offloads.
As a measure of forward success (and because it was there) I looked at total set pieces won, and percentage scrum and lineout success.
To measure discipline, I looked at penalties conceded and yellow and red cards conceded.
For kicking, I looked at total number of kicks, kick metres and kicks regathered (this could be important, we see some teams kicking to regather kicks, others less so). I didn’t look at numbers of kicks to touch or charged down.
For defensive effectiveness I looked at tackle percentage, total tackles and turnovers won.
I carried out a very simple test, and looked at the sign of the values. There are 15 for most of these statistics (not for cards, and a few have no difference so there’s a 0 and only 14 signed values). If you have basically the same number of results above 0 as below, that will suggest this parameter is of little use, certainly on its own.
I also took the mean, standard deviation and range of the results.
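For anyone who wants to reproduce the sign count, a minimal sketch follows. The ± score is the number of positive differences minus the number of negative ones, with zeros ignored. The exact sign test in the second function is my addition, not something used in this article; it is one of the formal options and assumes that, if a parameter is useless, each match is a coin flip. The data are made up.

```python
from math import comb


def plus_minus(differences):
    """Count positives and negatives (zeros ignored) and return their difference."""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    return positives, negatives, positives - negatives


def sign_test_p(positives, negatives):
    """Two-sided exact sign test: how likely is a split at least this lopsided by chance?"""
    n = positives + negatives
    if n == 0:
        return 1.0
    k = max(positives, negatives)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up differences with 10 positive and 5 negative values, NOT real match data.
example = [3, 1, 4, 2, 6, 1, 2, 5, 3, 1, -2, -1, -4, -3, -1]
pos, neg, score = plus_minus(example)
print(pos, neg, score, round(sign_test_p(pos, neg), 3))
```

A 10 v 5 split gives a ± of +5, and with only 15 matches the sign test suggests a split that lopsided turns up by chance roughly three times in ten.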
Possession and Territory
Possession and territory are meaningless. Comparing their ± values gives a score of 1 and -1, so they both have essentially the same number above and below zero. Their averages are 2.2 and 1.1 respectively, but their ranges are huge, at 42 and 50. Possession varies from +22% to -20% and territory from +24% to -26%. Remember that these values will normally be shown split between the teams, so +22% would be one team 11% over 50, or 61%, and the other 11% under 50, or 39%. But, on average you’re looking at 51% to 49% and it could go either way.
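The conversion between a difference and the split you see on screen is simple arithmetic: half the difference goes above 50 and half below. A small sketch (the function name is my own):

```python
def split_from_difference(diff_points):
    """Turn a winner-minus-loser difference (percentage points) into the two shares."""
    winner_share = 50 + diff_points / 2
    loser_share = 50 - diff_points / 2
    return winner_share, loser_share


# The largest possession difference quoted above, then the average one.
print(split_from_difference(22))
print(split_from_difference(2.2))
```

The two shares always add up to 100, which is a handy sanity check when copying numbers off a broadcast graphic.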
Running with the ball
Carrying that theme on, metres carried has a ± of -1, an average of -29.5m and a range of 450m. It’s slightly better to let your opponents outrun you… but there’s no value to looking at this. The same is true for total runs, ± is -1, average is -6.9 and the range is 146. Offloads also clearly give us nothing to see, ± is -1, mean is -1.5, range is 19.
Carries over gainline, the ± is -3, so there’s definitely a lean, although not the way you might expect, the average is -4.8 and the range is still huge at 92. I wouldn’t read anything into this.
For defenders beaten we have our first really strong trend, the ± is -7, again not the way you’d expect it to be. Teams that lose beat more defenders, the average is 3.1±13.7 and the range is 57. This is still pretty shaky “evidence” and not something you’re going to base a winning strategy on.
Winning your set pieces is good. The ± is +5, the mean is 2.1±5.8 and the range is 18. Having a ± of +5 means 5 sides won despite their pack losing more set pieces than they won, 10 sides won, to some extent based on winning up front. But the forwards winning, overall, is an indicator that you’re likely to win.
For scrum success the ± is 0, as it is for lineouts. In different matches the scrum% and lineout% were identical, so there are only 14 non-zero numbers. The means are really close to 0 but the ranges are up in the 60s (69.4% and 66.1% respectively).
For penalties, the ± is 7 and the mean is 3.1±3.8. Giving away fewer penalties is good, but not as strongly good as you might think. The range is 13, from +11 to -2. So you can be a bit behind, but not too far. Note that 2 sides ended up level on penalties here.
For yellow cards the ± is +5, but the mean is 0.25±1.2. This is mostly because a couple of teams won despite giving away two yellow cards. Definitely better not to, but not a disaster. There are only 11 data points here, four matches had no yellow cards.
For red cards there were only two matches affected by them, and the other side always won. (NOTE: England v Argentina, the example where the side penalised was victorious was also a victory by more than 14 so isn’t included.)
Kicking more is a benefit. The ± for kicking is +6 (one match had identical numbers of kicks). The mean difference was 3.3±5.5 and the range was 18.
Metres from kicks was slightly positive (+2, again one match had exactly equal kick metres, but not the one with equal numbers of kicks). The average was 74.3m but the range is 859m.
Kicks regathered looks odd at first. There are only 12 numbers, because in three of the 15 matches the two sides matched each other. The 12 also split six and six, so the ± is 0. Unsurprisingly this gives us a mean of almost exactly 0 (0.5) and there is quite a high range (9.0).
Tackle% is, surprisingly, virtually even. The ± is -2, but when you look at the mean, -0.07, perhaps it’s less shocking. Essentially that means there’s no difference between the winners and losers in tackle%. The range of 23, from +12 to -11, says it can be quite different in individual matches though.
Total tackles has a ± of 1 and no meaningful difference. There’s a mean of 10.5 and a range of 187. Pooling the data is hiding something here.
With Turnovers Won we are out to a ± of +5. Finally something we might expect, winning turnovers is good! There is a mean of 1.2±3.4 and a range of 11, from 7 to -4.
Although I largely haven’t quoted the standard deviation (this is a measure of how spread out your data are) they are all much larger than the means. The smallest is about twice the mean, the largest is over 100x the mean. This means that, for all the parameters given, we can’t tell the difference between the mean value and zero. I can’t guarantee that performing t-tests or similar on the underlying data wouldn’t identify some significant differences, but I’d be quite surprised.
I could have increased the sample size by including the blow out matches, or setting my cut-off higher. We do see matches with 30 point margins more routinely, and 50 point margins between tier 1 sides are not unknown. That would still cut out France v Namibia and the like, which are unusual results. However, from this dataset there are so few statistics that show any sign of being meaningful, I’m not convinced it would be useful to look at more data in this way.
I would have expected at least some of these to be somewhat useful. Winning set pieces, giving up fewer penalties and fewer cards seems pretty obvious? Winning more turnovers than your opponent seems good but obvious too.
The penalties one makes me wonder which way causation lies – defending teams seem to get penalised more (offside at the ruck, hands in the ruck, no clear release and the like) so that might simply be a measure that “if you are defending there are more ways to be penalised.”
Total kicks seems odd to me. This might obscure a “style of play” factor though. As a very quick, crude measure, I worked out if total kicks and kicks regathered were both positive and then again if they were both negative. If they out kick AND out regather you, you’re in trouble – that’s got a ± of -9! This would suggest that a “good kicking game” is what matters. (I’m not going to define what a good kicking game is, these data really don’t let me, but that’s really strong.) This might not be the best measure of it, but it’s what it would seem to measure.
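A crude version of that combined check could look like the sketch below. It counts matches where both differences are positive and matches where both are negative, and ignores the rest. The data are made up and this is not an attempt to reproduce the exact figure quoted above.

```python
def combined_plus_minus(kick_diffs, regather_diffs):
    """Count matches where both differences share a sign; mixed or zero cases are skipped."""
    both_positive = 0
    both_negative = 0
    for kicks, regathers in zip(kick_diffs, regather_diffs):
        if kicks > 0 and regathers > 0:
            both_positive += 1
        elif kicks < 0 and regathers < 0:
            both_negative += 1
    return both_positive, both_negative, both_positive - both_negative


# Made-up winner-minus-loser differences for six matches, NOT real match data.
kick_diffs = [5, 3, -2, 0, 4, -1]
regather_diffs = [2, -1, -3, 1, 1, 0]
print(combined_plus_minus(kick_diffs, regather_diffs))
```

Requiring both signs to agree throws away a lot of matches, so with a dataset this small the combined count rests on very few games.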
This brings us on to the idea of pooling some of these parameters to get a measure of a good strategy.
If you think of how Fiji and England play, Wales and Argentina, France and New Zealand, South Africa and just about anyone… do they pull the averages in different directions? You have teams that kick to regather, teams that kick to apply pressure, teams that rarely kick. There ought to be teams that kick to touch, but that is not as much of a tactic any more, numerically speaking. You have teams with massive, well-drilled packs that focus that way and teams with weaker, less dominant packs that can still win enough ball for their backs to be effective.
To try and look at the play style values without diving back into the whole dataset, I looked at all the values I had already, but just for the games where Argentina, England and Fiji (separately) won. These are the only three sides with multiple close wins in the data already.
Obviously these are really small samples, but they do seem to suggest that there are different styles of successful play that you can observe. For example, England tend to play more territory than Fiji. Fiji offload a lot more than England. England’s kick distance differential to their opponents (this is games they won remember) is really tight and positive, 61.5±14.8m, while Fiji’s is really variable but negative -144.5±105.4m. Fiji turn the ball over more than their opponents while England don’t. But for everything else they’re actually hard to separate. I’m not going to give statistics simply because the sample size is so small.
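If you have the per-match differences in a list, splitting them by winning side takes only a few lines. This is a sketch with placeholder team names and made-up numbers; the field names `winner` and `kick_metres_diff` are my own.

```python
from collections import defaultdict
from statistics import mean, stdev


def by_winner(rows, parameter):
    """Group one parameter's differences by winning team: (mean, sd or None, matches)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["winner"]].append(row[parameter])
    summary = {}
    for team, values in grouped.items():
        # A standard deviation needs at least two matches.
        sd = stdev(values) if len(values) > 1 else None
        summary[team] = (mean(values), sd, len(values))
    return summary


# Placeholder teams and made-up kick-metre differences, NOT real match data.
rows = [
    {"winner": "Team A", "kick_metres_diff": 50},
    {"winner": "Team A", "kick_metres_diff": 70},
    {"winner": "Team B", "kick_metres_diff": -100},
    {"winner": "Team B", "kick_metres_diff": -200},
    {"winner": "Team C", "kick_metres_diff": 30},
]
summary = by_winner(rows, "kick_metres_diff")
for team, (m, sd, n) in summary.items():
    print(team, m, sd, n)
```

With two or three matches per side the standard deviations are barely meaningful, which is why the match count is returned alongside them.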
We can also compare winning Fiji with losing Fiji (two of each) and while we can’t do meaningful stats on this, there are changes in how they run (if they win they run more, make more metres, if they lose, both of these switch). When they lose they also cross the gainline a lot less, the penalty count comes back to closer to even and they kick more than their opponents and for more metres.
Defenders beaten being a negative result, suggesting that the losing team tends to beat more defenders, still surprises me. I think there’s a play style interaction that I can’t quite tease out with the parameters we’re given. The signs of Runs, Carries Over Gainline and Defenders Beaten tend to be the same more often than not. So if you’re a running side, you win if you carry over the gainline more and beat more defenders. But there’s a strategy to win if you let the other side do it too. Things we’re not told, like “runs originating in your half” vs “runs originating in their half” and “carries over the gainline in their half” might be quite powerful. Although France and New Zealand only appear in this list twice each, there’s a suggestion that they both tend to kick from their half and run in the other half. It’s a bit more complicated than that, but as a one-sentence summary, that’s close enough. The parameters we’ve got don’t show that at all, and it might be really interesting to see.
If you want to follow a side, or a group of sides, simply looking at the rugby stats you’re shown really won’t tell you much. Sometimes they appear to reveal which is the better side, who is the best player, but all too often the eyeball test, or the scoreboard, just disagree.
However, with enough time and effort you might be able to find key indicators that say when they do well, we see this pattern. I have little confidence in the data from the Fiji results, there are only two samples in each pool, but it does suggest the kind of thing you could do. That might sound pointless but the data suggest that Fiji win when they run more and win on the penalty count; when they lose, they kick more than their opponents. You could watch a match and get a sense of all of those pretty easily.
Are any of these useful for a coach? Probably not. As a coach you want some tactical advice. You want to be able to say “if they’re doing this, these are your options” and “we want to do this in attack, here are the moves.” But we know coaches look at parameters like these to set up strategies. Why do England kick so much and so quickly? Because Borthwick has it in his mind that successful teams kick for over 1,000m in a game. (I can tell you that losing teams typically kick for over 1,000m too, at least at international level.)
But, as a coach, you’d have more specific things like ‘at the lineout, they leave a gap here, so we’ll work a move to attack it’. The try that Jordan scored against Ireland would be an example of that. Someone said ‘there’s a gap here’ whether that was spotted during the match or in analysis before the match, and Mo’unga ran the game, offloaded and the rest was history. These high level stats don’t give you that detail.
However, as a fan, if you can see what your side do when they win, and when they lose, and if you have the patience what the opposition do when they win and when they lose, you can make some informed criticisms. If the Australian Way is to run it from everywhere, how does that work out against different sides? How does it make their stats change from when they play other teams?
Ultimately, of course, there’s only one statistic that matters: that’s the scoreline. None of these match statistics were anything like strong enough for me to do a plot and look at in more detail to see if they were at all predictive of a result. It’s possible, if you look at the results for a single team, you might get something that lets you do that. Honestly, I think you’ll get a complex linear model, something that, at its simplest, says ‘we need to have 3 fewer penalties, 20 more carries, 150m more kick metres’ or something like that. (Properly you’ll have an equation that plots this out so you can plug the numbers of these things in to predict the result.) I’m not going to do that for you though. A word to the wise, developing linear regression models is typically at least a three year long process, and I think you could reasonably expect it to change (at least a bit) every World Cup cycle as teams, coaches, strategies and so on change. But if you fancy doing it, don’t let me stop you!
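To show only the shape of such an equation, here is a sketch of what "plugging the numbers in" would look like. The coefficients and intercept are entirely hypothetical, invented for illustration; nothing here has been fitted to any data, and a real model would need its values estimated from a team's matches.

```python
# HYPOTHETICAL coefficients, invented purely to show the form of the equation.
# They are NOT fitted to any data and should not be read as findings.
COEFFICIENTS = {
    "penalties": 1.0,     # points per penalty fewer than the opponent
    "carries": 0.1,       # points per extra carry
    "kick_metres": 0.01,  # points per extra kick metre
}
INTERCEPT = 0.0


def predicted_margin(differentials):
    """Linear model: intercept plus the sum of coefficient times differential."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * differentials.get(name, 0.0) for name in COEFFICIENTS
    )


# The example from the text: 3 fewer penalties, 20 more carries, 150m more kick metres.
example = {"penalties": 3, "carries": 20, "kick_metres": 150}
print(predicted_margin(example))
```

The hard part is not this function but estimating the coefficients, and checking they still hold once teams, coaches and strategies change.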
Without a full linear model, there’s clearly no single stat that we get shown you can use as the be-all and end-all indicator of a good performance. Rugby might be a simple game, but there are lots of ways to express that simplicity, lots of moving parts, and no single stat shows that.