Mean & Median at Midway: part iv – streaks

The third type of parity I’ve looked at is the small-picture, game-to-game kind. In a league with perfect parity, winning and losing streaks would still exist, but we would expect them to appear with a predictable frequency. A simple statistical test of randomness holds that if nine consecutive passes or fails occur, the results are likely not random; this analysis is built on the same principle.

In contrast to the other posts in this series, for simplicity’s sake I had to use 41 GP for all teams as the midway point. Below is the table of winning and losing streaks up to the halfway point this season.

Streak Length   Expected Frequency   Winning Streaks   Losing Streaks   Games Represented
1               161.25               173               160              333
2               78.75                77                89               332
3               38.44                40                45               255
4               18.75                13                14               108
5               9.14                 7                 7                70
6               4.45                 5                 7                72
7               2.17                 2                 0                14
8               1.05                 0                 1                8
9               0.51                 3                 0                27
10              0.25                 0                 0                0
11              0.12                 1                 0                11
12              0.06                 0                 0                0
13              0.03                 0                 0                0
14              0.01                 0                 0                0
15              0.01                 0                 0                0
Total                                                                   1230

The losing streaks column isn’t too far off what would be expected, and is in keeping with what we see most seasons. The winning streaks column is another matter entirely: it has far too many 1-game winning streaks and far too many winning streaks longer than 8 games. I grant that the flaw in this parity metric is that it lets what would otherwise be dismissed as anecdotes and outliers skew the value of the rest of the data. That said, it’s hard to claim unprecedented parity when Florida can win 11 games in a row (game #41, the midpoint of their season, was the 11th win of their 12-game win streak) and Montreal, the New York Rangers, and Washington can each win 9 games in a row.
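
For the curious, the Expected Frequency column can be reproduced with a short script. This is a minimal sketch of the null model, assuming every game is a fair coin flip and treating each team’s 41 games as an independent sequence; the team and game counts come from this post, and the function name is just mine.

```python
# Expected number of streaks of exactly length k across 30 teams playing
# 41 games each, if every game were a 50/50 coin flip (perfect parity).
TEAMS, GAMES, P = 30, 41, 0.5

def expected_streaks(k, n=GAMES, p=P):
    q = 1 - p
    interior = (n - k - 1) * p**k * q**2  # streak bracketed by opposite results
    boundary = 2 * p**k * q               # streak touching game 1 or game 41
    return TEAMS * (interior + boundary)

for k in range(1, 16):
    print(f"{k:2d}: {expected_streaks(k):6.2f}")
# -> 1: 161.25, 2: 78.75, 3: 38.44, 4: 18.75, ... matching the table above
```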

Mean & Median at Midway: part iii – correlation coefficient

If the only parity metric is the standard deviation of point totals, other details can be missed. Even if the teams are consistently packed relatively tightly together, you can hardly say you’ve achieved parity if the same few teams are consistently at the same extremes of the curve. One way to check for this different kind of parity is with a correlation coefficient.

The premise behind this test is that in a league with perfect parity, there should be no connection between how a team performs in one preset span of time and how it performs in another span of similar length. We generally know this is not the case, because certain teams seem to perform consistently well. The correlation coefficient allows us to quantify it.

The way it works is that you take two sets of data (e.g., one season’s winning percentages and another season’s), and the coefficient tells you how closely those two datasets are related. The three key reference values are 1, 0, and -1. Using consecutive NHL seasons as an example, a score of 1 would mean the winning percentages were exactly the same, i.e., good teams stay good and bad teams stay bad. A score of 0 would mean there was no connection at all between the two seasons; the results were entirely random. A score of -1 would mean the teams had completely switched places, i.e., good teams had become bad and bad teams had become good.
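
As a quick illustration, here’s how such a score can be computed; a minimal sketch using NumPy’s Pearson correlation, with made-up winning percentages rather than real NHL data.

```python
import numpy as np

# Hypothetical winning percentages for the same five teams over two
# consecutive spans (illustrative numbers only, not real standings).
span1 = np.array([0.650, 0.550, 0.480, 0.520, 0.400])
span2 = np.array([0.500, 0.600, 0.450, 0.380, 0.570])

# Pearson correlation: +1 = good stays good, 0 = no connection, -1 = flipped.
r = np.corrcoef(span1, span2)[0, 1]
print(f"r = {r:.4f}")
```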

The correlation coefficient of adjusted winning percentages between the first and second quarters of the 2015-16 NHL season was -0.0899, which means the connection between the two quarters was essentially random.

It isn’t uncommon for this to happen between different quarters within an NHL season, but it’s quite rare for it to happen with consecutive quarters. The only other instance since 2004 is between Q2 and Q3 of 2010-11, with a score of -0.0529.

Mean & Median at Midway: part ii – bell curves

Early this year, many people were commenting on how high the level of parity was. Brian Costello of The Hockey News did so here. Writing slightly after the quarter mark of the season, he argued that the league had seen unprecedented parity to that point. What parity metric was he using? He simply pointed out that the last-place Calgary Flames had more points at that stage than any last-place team had ever had at the same stage. He writes, “Parity, of course, can have a variety of definitions, but the quality of the lowest-place team is one of them.”

My parity analysis has to be better (it’s all I do), so I was curious to see if other metrics showed the same results. To measure this whole-season kind of parity, I use a bell curve. Others use standard deviation, which generally follows the same logic as the “a league is only as good as its worst team” idea that Costello employs.

The trouble is that the standard (deviation) metrics assume a parity spectrum with “competitive” on one side and “non-competitive” on the other. What I’ve come to understand is that there are three key points on the parity spectrum: non-competitive, competitive, and fair. Before I explain the three points, let me draw you a pretty picture.

[Chart: win totals in a hypothetical 21-team league under the non-competitive, competitive, and fair scenarios]

To explain the difference, let’s imagine a league with an arbitrarily chosen 21 teams. After they’ve each played 20 games, there are almost infinite ways the standings could look, but the three key points on the parity spectrum can be illustrated on the above bar and line graph. In a non-competitive league, the best team wins every game, the second-best team wins every game except against the best team, and so on. Those are the yellow bars in the graph. Depending on how the schedule is drawn up, this outcome produces the highest possible standard deviation. At this end of the parity spectrum, the standard metrics and I are in agreement.

But what does competitive look like? Some will argue that a truly competitive league would have all teams at .500, represented by the blue bar in the graph. Not only is this statistically almost impossible, but if you saw this kind of result in any other lab test, you would be certain there was outside interference. Still, this would produce the “best” possible standard deviation score of 0. In this league, the league’s worst team would be as good as the league’s best team, but it would be hard to call this competitive.

The red line indicates what we in math circles call a binomial distribution, which draws on Pascal’s triangle. In simpler terms, imagine a pegboard with 21 rows: you would expect the pieces to fall randomly here and there but more or less stack up in this kind of curve. This curve is also probably impossible to achieve, and it doesn’t produce a tidy standard deviation score, but statistically speaking, this is what parity would/should/could look like.
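
For the statistically inclined, the red line is easy to generate. Here is a minimal sketch of that binomial null model for the 21-team, 20-game thought experiment above, assuming no ties and a 50/50 chance in every game; the probabilities come straight from row 20 of Pascal’s triangle.

```python
from math import comb

# Expected number of teams at each win total in a 21-team league after
# 20 games, if every game were a coin flip (no ties, p = 0.5).
TEAMS, GAMES = 21, 20

for wins in range(GAMES + 1):
    prob = comb(GAMES, wins) * 0.5 ** GAMES  # row 20 of Pascal's triangle / 2^20
    print(f"{wins:2d} wins: {TEAMS * prob:5.2f} expected teams")
```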

So, after all that, let’s see how the first half and first two quarters of 2015-16 stack up.

[Histogram: 2015-16 Q1 and Q2 adjusted winning percentages]

This is a histogram of team results by quarter. The first quarter is displayed in blue and the second quarter in yellow, with a red line showing what can be expected using Pascal’s triangle. Both quarters had two teams in group 16, with an adjusted winning percentage between 0.714 and 0.762, which is interesting both because only 0.44 teams are expected there per quarter and because those are four different teams.

Using a deviation score I’ve developed, the first quarter has a score of 0.545, which is lower (i.e., more competitive) than any season since the needless 2004-05 lockout, but only marginally lower than the 0.576 score from the 2014-15 season. The second quarter gave a score of 0.897, which is pretty much average.

[Histogram: 2015-16 first-half adjusted winning percentages]

Looking at the first half, a greater level of competition is “expected” with more games played, which again is consistent with Pascal’s triangle. The league never quite catches up with these “expectations,” but the NHL this year has done quite well. Using the same deviation score, this season’s first half gives a score of 0.618, which is easily the best score since the 2004-05 season that never happened.

So, using my whole-season analysis methods, the 2015-16 season is indeed shaping up to be the most competitive one we’ve ever seen. Time, and other tests, will tell whether that remains the case.


Mean & Median at Midway: part i – point systems

We’re in a buffer zone between the midway point of the NHL season and the All-Star game, and since I am the self-appointed expert on NHL parity, I thought it was time for a few posts reflecting on the level of parity so far this season.

While others are happy simply to calculate the standard deviation of all teams’ point totals or points percentages, I see the parity conversation as needing more nuance than that. If I were to write a book on the subject, my working title so far is “Anecdotes of Parity.” Failing that, I could use it as an indie band name.

For all of these posts, I’m using game #615 as the midway point of the season, which was when the Hurricanes beat the Jackets 4-3 in OT in Columbus on January 9th. It’s a little awkward for analysis purposes, 1) because it is now over a week and a half ago, and 2) because other games played that day will technically count as part of the second half of the season, but half is half, and so I’m happy to use #615 as the divider. I could take all teams’ results after their 41st game, or add in the rest of the games from January 9th, but those results would be artificial and uneven, respectively. This way some teams have more than 41 GP and some have fewer, but that’s the way the schedule crumbles.

While point systems aren’t necessarily a parity metric, they do demonstrate the spread between teams in different ways. I’ve written elsewhere about these different systems (here), but I’ll summarize them briefly. LP is the Loser Point system the NHL currently uses: 2 points for any win and 1 point for an overtime or shootout loss. WLP is the wins and losses percentage: the percentage of games won. 3PG means 3 points for a regulation win, 2 points for an overtime or shootout win, and 1 point for an overtime or shootout loss. Finally, 2PTB is a 2-point system with a tie-breaker: 2 points for a win in regulation or overtime, 1 point to each team in any game that goes to a shootout (basically treating it like a tie), and a bonus tie-breaker point to the shootout winner, used only for breaking ties in the standings.
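
For clarity, here is how the four columns below can be computed from a team’s record. A minimal sketch: since the RW-OW-OL-RL format doesn’t separate overtime from shootout results, the so_w/so_l split needed for 2PTB is supplied as an assumption; the values used for Florida are one split consistent with its row below, not published figures.

```python
def point_systems(rw, ow, ol, rl, so_w=0, so_l=0):
    """rw/ow/ol/rl = regulation wins, OT/SO wins, OT/SO losses, regulation
    losses; so_w/so_l = how many of ow/ol went to a shootout (assumed)."""
    gp = rw + ow + ol + rl
    lp = 2 * (rw + ow) + ol              # Loser Point: 2 any win, 1 OT/SO loss
    wlp = (rw + ow) / gp                 # wins and losses percentage
    p3g = 3 * rw + 2 * ow + ol           # 3 reg win, 2 OT/SO win, 1 OT/SO loss
    ptb2 = 2 * (rw + ow - so_w) + so_w + so_l  # 2 reg/OT win, 1 each for a shootout
    return lp, wlp, p3g, ptb2

lp, wlp, p3g, ptb2 = point_systems(19, 6, 4, 12, so_w=4, so_l=0)
print(lp, f"{wlp:.3f}", p3g, ptb2)       # 54 0.610 73 46 -- Florida's row
```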

Teams are arranged in order of their Loser Point totals; in each column, an asterisk (*) marks the teams that would hold a playoff spot under that system.

Eastern Conference
Atlantic RW-OW-OL-RL LP WLP 3PG 2PTB
 Florida 19-6-4-12 54 * 0.610 * 73 * 46 *
 Montreal 20-3-3-16 49 * 0.548 * 69 * 45 *
 Detroit 14-7-7-13 49 * 0.512 * 63 * 43 *
 Boston 18-3-4-14 46 * 0.538 * 64 * 41 *
 Tampa Bay 15-5-4-17 44 0.488 * 59 * 41
 Ottawa 12-7-6-16 44 0.463 56 39
 Toronto 12-4-7-16 39 0.410 51 32
 Buffalo 12-3-4-22 34 0.366 46 32
Metropolitan RW-OW-OL-RL LP WLP 3PG 2PTB
 Washington 26-4-3-7 63 * 0.750 * 89 * 59 *
 NY Islanders 17-5-5-14 49 * 0.537 * 66 * 43 *
 NY Rangers 19-3-4-14 48 * 0.550 * 67 * 45 *
 New Jersey 12-8-5-17 45 * 0.476 57 42 *
 Pittsburgh 15-4-5-16 43 0.475 58 40
 Carolina 13-5-7-18 43 0.419 56 37
 Philadelphia 9-8-7-15 41 0.436 50 36
 Columbus 12-3-4-24 34 0.349 46 29
Western Conference
Central RW-OW-OL-RL LP WLP 3PG 2PTB
 Dallas 24-5-4-10 62 * 0.674 * 86 * 58 *
 Chicago 17-9-4-13 56 * 0.605 * 73 * 51 *
 St. Louis 18-5-7-14 53 * 0.523 * 71 * 46 *
 Minnesota 20-1-8-11 50 * 0.525 * 70 * 43 *
 Colorado 19-2-3-18 45 * 0.500 * 64 * 42 *
 Nashville 16-3-7-16 45 0.452 61 37
 Winnipeg 18-1-3-19 41 0.463 59 40
Pacific RW-OW-OL-RL LP WLP 3PG 2PTB
 Los Angeles 18-8-2-12 54 * 0.650 * 72 * 51 *
 Arizona 17-4-4-16 46 * 0.512 * 63 * 43 *
 Anaheim 15-2-7-16 41 * 0.425 56 * 34
 Vancouver 12-4-9-16 41 0.390 53 31
 Calgary 10-9-2-19 40 0.475 * 50 38 *
 San Jose 14-4-2-18 38 0.474 52 35
 Edmonton 8-9-3-22 37 0.405 45 31

You can read a little or a lot out of a table like this. There isn’t a lot of variation in results, but the logjams we see with the Loser Point system get spread out better in the 3PG system.

Tomorrow I’ll look at how this year’s teams are spread out, and compare that with previous years and what a season with perfect parity would look like.

New Analytical Tool: X-plots

I’ve been wanting for a while to find a way to differentiate between teams with the same record. At the end of the season two teams might arrive with the same number of points, but how did they get there?  I’m not looking for a new way to break ties, or even to argue which team was better, but just to paint a picture of a season. I came up with X-plots.

Ten games is a critical mark in the NHL. It’s the amount of time some young players get to prove they deserve to stay or should get sent back down to junior, but it’s also the time when teams and their fans are unofficially allowed to panic about their poor starts.

X-plots use 10-game segments as a base unit (or at least periods of time in which a team could reasonably be expected to play ten games), and they borrow heavily from the concept of a box-plot (although the math is different).

[Chart: 2014-15 X-plots by division]

In the chart, the plus sign in the middle is the team’s Points Percentage (points earned / possible points, or Pts/(GP×2)). The blue Min line indicates the points percentage the team achieved in its least productive 10-game stretch of the season, and the purple Max line its most productive 10-game stretch. Then the whole season was divided into eight segments (i.e., from game #1 to game #153, etc.), and I took the average of the team’s best two segment PPs as the top marker (light blue down triangle) and the average of its worst two segments as the bottom (green triangle).
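
In code terms, the ingredients look something like this. A minimal sketch, assuming a list of per-game points (2 = win, 1 = OT/SO loss, 0 = regulation loss) for one team; note the post slices the league-wide schedule into eighths, which is approximated here by slicing the team’s own schedule, so segment boundaries may differ slightly from the chart.

```python
def xplot_stats(pts, window=10, segments=8):
    """Return (season PP, min/max rolling 10-game PP,
    average of the worst two and best two segment PPs)."""
    n = len(pts)
    season_pp = sum(pts) / (2 * n)
    rolling = [sum(pts[i:i + window]) / (2 * window)   # 10-game points %
               for i in range(n - window + 1)]
    seg_pp = []
    for s in range(segments):                          # ~8 equal slices of the season
        seg = pts[s * n // segments:(s + 1) * n // segments]
        seg_pp.append(sum(seg) / (2 * len(seg)))
    seg_pp.sort()
    return (season_pp, min(rolling), max(rolling),
            sum(seg_pp[:2]) / 2, sum(seg_pp[-2:]) / 2)
```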

In the chart, the teams are sorted only by division. I plan to repeat the idea with different groupings (i.e., playoff teams, the same franchise over the years, etc.), but a few interesting teams already jump out.

  • Colorado and Florida both showed the smallest range of performance. In their best 10-game stretches, they scored only 7 more points than in their worst 10-game stretches.
  • Both Toronto and Columbus showed the biggest range of performance. When they were at their best, they were among the league’s best, and when they were at their worst, they were among the league’s worst. Both teams were 16 points better in their best 10-game stretch than in their worst 10-game stretch.
  • Interestingly, all four of these teams missed the playoffs. I’ll look over other seasons to see if this is a trend that continues.
  • Anaheim seemed to point out a flaw in the design which may warrant a re-design. In their best 10-game stretch, they scored 16 points for a PP of .800, but in their two best segments, they had PPs of .800 and .818, the latter only possible because they played fewer than 10 games in that segment.

More to come.

2015-16 NHL predictions

A big part of what makes hockey analytics interesting is that it provides at least a speculative way to predict future success. This also means that analysts themselves can be measured and analyzed based on the accuracy of their predictions. The bad ones can be written off as hacks, and the good ones can find fame, fortune, and, perhaps most lucratively, an NHL job. Studying team parity as I do affords me neither the perils nor the luxury of player analytics. The one exception might be predicting what parity has to say about what we can expect the next season’s standings to look like. Since I haven’t developed that kind of tool, my own personal assessment will have to do, but watch this space this year as I test out parity metrics to forecast future parity.

Perhaps there won’t be any correlations worth noting at all, but here are a few parity-related assertions that I will test. All other things being equal:

  • Teams that go deep into the playoffs start the season with less energy and can be expected to under-perform
  • Teams that saw the greatest variation in their winning percentages from one quarter to the next in one season are more likely to see improvement the following season
  • Teams that see the greatest turnover of starting lineup personnel are less likely to improve on the previous season’s record
  • Success (and failure) can only last for so long

So, in the meantime, here are my 2015-16 NHL projections, based on my personal feeling, not on my analytic prowess.

Atlantic         Metropolitan       Central          Pacific
1. Tampa Bay *   1. Washington *    1. Dallas *      1. Anaheim *
2. Montreal *    2. Columbus *      2. St. Louis *   2. San Jose *
3. Ottawa *      3. NY Islanders *  3. Winnipeg *    3. Calgary *
4. Florida       4. Pittsburgh *    4. Nashville *   4. Edmonton *
5. Buffalo       5. NY Rangers *    5. Minnesota     5. Vancouver
6. Detroit       6. Philadelphia    6. Chicago       6. Los Angeles
7. Boston        7. Carolina        7. Colorado      7. Arizona
8. Toronto       8. New Jersey

Playoff teams are indicated with an asterisk (*)

And my Stanley Cup final four playoff picks are:

Tampa Bay defeats the NY Islanders; St. Louis defeats Anaheim

and then St. Louis defeats Tampa Bay

Finally, my trophy picks:

  1. Art Ross -> John Tavares
  2. Hart -> Steven Stamkos
  3. Norris -> Victor Hedman
  4. Vezina -> Frederik Andersen
  5. Rocket Richard -> Phil Kessel
  6. Jack Adams -> Lindy Ruff
  7. Calder -> Connor McDavid

Winning Percentage vs. Points Percentage

While it’s probably impossible to expect fans of any team that doesn’t win the championship to admit satisfaction with their team, one can reasonably expect a fan to be content if their team has given them as much joy as sorrow. Not even the most advanced analytics writer will pretend to be able to quantify those emotions, but that is essentially the idea behind a metric like .500.

.500 almost seems too simple to call a statistical metric, both because it is easy to calculate and because it has been used for so long. It’s also used in the three other traditionally recognized major North American sports. In baseball, basketball, and football, .500 means having the same number of wins as losses, except when a football game ends in a tie, but even then it still works.

In this context, .500 carries a lot of meaning. Perhaps the most important is the qualitative understanding that a team playing at or above .500 is respectable. Numerically speaking, besides winning half of its games, a .500 team is almost certainly better than half of the other teams in the league, and if half of all teams make the playoffs, a .500 record should be good enough to qualify. There will always be variation in where individual teams finish, but in all of these other sports, .500 will always be the mean average of all teams’ winning percentages (assuming they have all played the same number of games).

Hockey analysts use .500 in their conversations too, but .500 means something else in hockey than it does in all of the other sports. Since all of the other major leagues rely mostly on wins, for them .500 is a winning percentage (WP). In hockey, .500 is a points percentage (PP). These are not the same thing.

One could argue that, because of the possibility of ties, football’s is a PP too, but in football, just as in the NHL before the extra point was introduced in the 1999-2000 season, a tie was worth half of a win. In those settings, the league either awarded a win at the end of a game or awarded two half-wins.

For instance, if the 2014 Carolina Panthers had finished with 7 wins, 7 losses, and 2 ties instead of going 7-8-1, it would be debatable whether they were as good as a team that finished 8-8, and certainly debatable whether they deserved to make the playoffs with that record, but it would be indisputable that they had a .500 record. That hypothetical .500 would be both a WP and a PP.

WP is calculated by dividing the total number of wins by the total number of games played. In leagues with a WP, a team with the same number of wins and losses will have a WP of .500. PP is calculated by dividing the number of points earned by the number of points available. The two percentages are only slightly different in calculation, but they can be very different in meaning.

As a test case, let’s use the Atlantic Division in the current 2014-15 season up to the All-Star break. The teams’ W-L-OTL records are as follows: Tampa Bay 30-14-4, Detroit 27-11-9, Montreal 29-13-3, Boston 25-16-7, Florida 20-14-10, Ottawa 19-18-9, Toronto 22-23-3, Buffalo 14-30-3. So which teams are under .500 and which are over? By both measures, the top four teams are over .500: Tampa Bay, Detroit, Montreal, and Boston are all averaging more than a point a game and have more total wins than total losses. Toronto and Buffalo are below .500 by both definitions, but Florida and Ottawa are each averaging more than a point per game while having more combined losses than wins. So, of these eight teams, four are over .500, two are under, and two are sort of.
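
A minimal sketch of that comparison, using the records quoted above:

```python
# 2014-15 Atlantic records (W, L, OTL) up to the All-Star break, from the text.
atlantic = {
    "Tampa Bay": (30, 14, 4), "Detroit": (27, 11, 9),
    "Montreal": (29, 13, 3),  "Boston": (25, 16, 7),
    "Florida": (20, 14, 10),  "Ottawa": (19, 18, 9),
    "Toronto": (22, 23, 3),   "Buffalo": (14, 30, 3),
}
for team, (w, l, otl) in atlantic.items():
    gp = w + l + otl
    wp = w / gp                     # winning percentage: wins / games played
    pp = (2 * w + otl) / (2 * gp)   # points percentage: points / possible points
    print(f"{team:10s} WP = {wp:.3f}  PP = {pp:.3f}")
# Florida and Ottawa land above .500 by PP but below .500 by WP.
```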

At first glance, it would simplify things to use the WP .500 as the metric of success. It fits the criteria of what .500 should be: mean, median, and mode. No matter what happens, if all teams have played the same number of games, the mean average of all winning percentages will be .500. We can expect half of all teams to finish with a WP above .500 and half to finish below (there will always be variation, but the median should never be far off .500). Ideally, as long as an even number of games were played, we could expect .500 to be the most frequent WP, but the NHL doesn’t follow a binomial distribution, and that’s a little beyond the scope of this article anyway. The main problem with WP .500 is that two teams with the same number of wins and losses can have vastly different point totals. For example, in the lockout-shortened 2012-13 season, four teams finished with a WP of .500. Two of them made the playoffs, Detroit with 56 points and the New York Islanders with 55, and two missed the playoffs, Columbus with 55 points and Winnipeg with 51.

What is needed is a kind of “.500” that satisfies the mean and median criteria but is consistent with the current NHL point system.

A few simple formulae will produce a benchmark that is in keeping with our previous understandings of .500.

  • (Total points awarded) / (Total possible points), where the possible points are 2 per team per game

    • There is an important distinction here: since each game involves two teams, the possible points work out to (total GP for all teams) × 2, or equivalently (total games) × 4

  • .500 + (Total overtime and shootout games) / (Total possible points)

Both of these produce a league-average PP that more accurately serves as a benchmark of success. That number for the 2013-14 season was 0.562, and for the 2014-15 season before the All-Star break it was 0.565. Also, if we know the number of overtime games, we can easily express the benchmark in points. Up to the All-Star break in 2014-15, there were 179 games decided in extra time, meaning that roughly 6 points over a point a game is a better benchmark. In the 2013-14 season, 307 games were decided in extra time, so the benchmark was roughly 10 points over a point a game.
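
A minimal sketch of the benchmark calculation, using the 2013-14 totals quoted above (1,230 league games, 307 decided in extra time):

```python
league_games = 1230                  # 2013-14 NHL regular season
ot_games = 307                       # games decided in OT or a shootout

team_games = 2 * league_games        # each game involves two teams
points_awarded = 2 * league_games + ot_games   # 2 pts/game, +1 per OT/SO game
possible_points = 2 * team_games     # 2 points available per team per game

benchmark_pp = points_awarded / possible_points
print(f"{benchmark_pp:.3f}")         # 0.562, as stated above
# equivalently: 0.5 + ot_games / possible_points
```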

This may not solve any real problems, but our current use of .500 is either incorrect or insignificant. Consider that in the 2013-14 NHL season, only 5 teams finished below 82 points (a PP of .500). 82 points was 9 short of a playoff spot in the West and 11 short of a playoff spot in the East. Whereas .500 once represented the 50th percentile (i.e., a .500 team was better than 50% of all other teams), last season it represented the 18th percentile (i.e., a .500 team was better than only 18% of teams). No fan base would be satisfied to know that their team was merely in the 18th percentile.