BigSoccer regular and fellow SKC fan vividox started a blog earlier this year where he posts his Elo ratings and corresponding points predictions for MLS teams (http://wsasu.blogspot.com). His site includes a link to a paper in which he explains his methodology in some detail. There is also a decent introductory article at Wikipedia (http://en.wikipedia.org/wiki/Elo_ratings). One shortcoming of the Elo system is that there is no obvious or intuitive way to relate a team's rating directly to its strength. I thought that it would be helpful to transform the numbers such that a rating of 100 (rather than 1500) would be average, a rating of 150 would be roughly 50% better than average, and a rating of 50 would be roughly 50% worse than average. This is similar to the notion behind "advanced" baseball statistics like OPS+ for hitters and ERA+ for pitchers. With that in mind, I have developed a new rating system derived from Elo's model called ELO+. The key concept is that a team's ELO+ rating should represent how much better it is expected to perform against an average opponent on a neutral field than another average team would do. For example, a 150-rated team should perform 50% better against a 100-rated team than another 100-rated team. Likewise, a 150-rated team should do 25% better against a 100-rated team than a 120-rated team (150/120=1.25). If the 150-rated team plays the 120-rated team, the expected outcome is the same as when a 130-rated team plays a 100-rated team; as with the Elo system, the difference between ratings is what actually goes into the calculations. With that brief introduction, here are the final ELO+ ratings for the 2011 MLS season, along with the predicted and actual points standings; playoffs are not included. 
Code:
Rank  Team  ELO+    Rank  Team  Predict    Rank  Team  Actual
1     SEA   139     1     LAG   65.2       1     LAG   67
2     LAG   132     2     SEA   61.6       2     SEA   63
3     SKC   125     3     SKC   53.9       3     RSL   53
4     HOU   120     4     PHL   52.1       4     FCD   52
5     PHL   111     5     RSL   51.8       5     SKC   51
6     CHI   110     6     HOU   49.7       6     COL   49
7     COL   105     7     NYR   49.6       7     HOU   49
8     NYR   104     8     COL   48.9       8     PHL   48
9     CLB   97      9     FCD   48.4       9     CLB   47
10    RSL   97      10    CLB   45.7       10    NYR   46
11    FCD   96      11    CHI   45.6       11    CHI   43
12    SJE   94      12    DCU   42.4       12    POR   42
13    POR   90      13    SJE   41.9       13    DCU   39
14    DCU   87      14    POR   41.0       14    SJE   38
15    CHV   86      15    CHV   39.7       15    CHV   36
16    TOR   78      16    TOR   32.5       16    TOR   33
17    VAN   69      17    VAN   29.4       17    NER   28
18    NER   60      18    NER   29.1       18    VAN   28

These numbers come from going through the entire schedule twice: first with all teams starting at 100, and then with all teams starting at their ratings from the end of the first run. The ratings themselves did not change much the second time, but the points predictions improved considerably; the largest discrepancy is 4.1 points (less than 10%), which seems pretty good to me. Home-field advantage is incorporated by always increasing the home team's rating by 22, since home teams did 22% better than average overall. Note that the top-rated team at the end of the year (SEA) was not the one projected with the most points (LAG); this is because, again like Elo ratings, ELO+ ratings reflect variations in form over time, as well as overall quality. My next post will provide the current 2012 ratings and corresponding points projections using each team's actual points earned to date and remaining schedule. I welcome any and all feedback here, including specific questions about how ELO+ ratings are calculated relative to conventional Elo ratings.
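As a quick check on the 4.1-point figure, here is a minimal Python sketch that compares the predicted and actual 2011 points listed in the post. The numbers are copied from the table; the variable names are only for illustration.

```python
# Predicted and actual 2011 points, copied from the table in the post.
predicted = {
    "LAG": 65.2, "SEA": 61.6, "SKC": 53.9, "PHL": 52.1, "RSL": 51.8,
    "HOU": 49.7, "NYR": 49.6, "COL": 48.9, "FCD": 48.4, "CLB": 45.7,
    "CHI": 45.6, "DCU": 42.4, "SJE": 41.9, "POR": 41.0, "CHV": 39.7,
    "TOR": 32.5, "VAN": 29.4, "NER": 29.1,
}
actual = {
    "LAG": 67, "SEA": 63, "RSL": 53, "FCD": 52, "SKC": 51, "COL": 49,
    "HOU": 49, "PHL": 48, "CLB": 47, "NYR": 46, "CHI": 43, "POR": 42,
    "DCU": 39, "SJE": 38, "CHV": 36, "TOR": 33, "NER": 28, "VAN": 28,
}

# Absolute gap between predicted and actual points for each team.
errors = {team: abs(predicted[team] - actual[team]) for team in predicted}

worst_team = max(errors, key=errors.get)
mean_error = sum(errors.values()) / len(errors)

print(worst_team, round(errors[worst_team], 1))  # PHL 4.1
print(round(mean_error, 2))
```

The largest miss is Philadelphia (52.1 predicted against 48 actual), which is the 4.1-point discrepancy mentioned in the post.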
There are different schools of thought on what starting point to use when calculating Elo ratings for each new season. Some, including vividox, carry over the ratings from the previous year; he even includes the post-season, despite the fact that this penalizes teams that make the playoffs and then lose, especially if they do so at home. Another approach is simply to start all teams at the average rating (1500 for Elo, 100 for ELO+) and let the ratings sort themselves out from there. Still others split the difference; for example, a team that finishes one year at 1600 would start the next year at 1550. The same decision arises with ELO+ ratings. For now, I will be providing two sets of numbers, which should eventually converge: Regular ELO+, which started each team at its final 2011 rating as posted above (playoffs not included); and Neutral ELO+, which started each team at 100. My thought here is that Regular ELO+ may better capture a team's long-term strength, while Neutral ELO+ reflects only its performance so far this year. As an expansion team, Montreal started at 100 in both versions.

Here are the 2012 MLS Regular ELO+ ratings and points projections for 04/29.

Code:
Rank  Team  ELO+    Rank  Team  Points
1     SEA   142     1     SKC   73.2
2     SKC   138     2     SEA   69.5
3     SJE   121     3     SJE   62.1
4     HOU   118     4     HOU   57.3
5     LAG   116     5     RSL   54.3
6T    NYR   110     6     DCU   52.9
6T    RSL   110     7     NYR   52.1
8     CHI   107     8     LAG   51.8
9     DCU   106     9     CHI   50.1
10    COL   105     10    COL   48.9
11    PHL   98      11    PHL   43.1
12    FCD   94      12    VAN   42.2
13    MON   88      13    FCD   42.0
14    VAN   86      14    MON   37.8
15    CLB   85      15    CLB   37.4
16T   CHV   79      16    CHV   34.0
17T   POR   79      17    POR   32.0
18    NER   64      18    NER   29.2
19    TOR   54      19    TOR   17.7

Here are the 2012 MLS Neutral ELO+ ratings and points projections for 04/29.
Code:
Rank  Team  ELO+    Rank  Team  Points
1     SJE   129     1     SKC   67.6
2     SKC   124     2     SJE   65.9
3     SEA   113     3     SEA   55.3
4T    DCU   112     4     RSL   55.1
4T    RSL   112     5     DCU   54.5
6     NYR   109     6     NYR   52.3
7     VAN   106     7     VAN   50.9
8     HOU   103     8     HOU   50.1
9     COL   101     9     COL   46.4
10T   CHI   97      10    CHI   44.7
10T   FCD   97      11    FCD   43.6
10T   LAG   97      12    LAG   43.4
13    PHL   93      13    PHL   40.8
14    NER   90      14    NER   40.1
15T   CHV   89      15    MON   38.3
15T   MON   89      16    CHV   38.2
17T   CLB   86      17    CLB   37.5
18T   POR   86      18    POR   34.9
19    TOR   67      19    TOR   22.7

Which of these four rankings looks the most correct to you, and why?
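The three season-start options described in this post (carry over, reset to average, split the difference) can all be written as one blend between last year's final rating and the league average. The sketch below is only an illustration; the function name and the `carryover` parameter are made up here and do not come from vividox's paper or from the ELO+ spreadsheet.

```python
def season_start_rating(prev_final, average=100.0, carryover=1.0):
    """Blend last season's final rating with the league average.

    carryover=1.0 keeps the old rating unchanged (Regular ELO+),
    carryover=0.0 resets every team to the average (Neutral ELO+),
    carryover=0.5 splits the difference.
    `carryover` is a hypothetical knob, not part of the original system.
    """
    return average + carryover * (prev_final - average)


# Conventional Elo example from the post: 1600 becomes 1550.
print(season_start_rating(1600, average=1500, carryover=0.5))  # 1550.0

# Seattle finished 2011 at 139 on the ELO+ scale.
print(season_start_rating(139, carryover=1.0))  # 139.0 (Regular ELO+)
print(season_start_rating(139, carryover=0.0))  # 100.0 (Neutral ELO+)
```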
I'm certainly no Elo expert, but in chess, isn't it scaled so a point differential implies an expected outcome--e.g., 200 point difference means the stronger opponent would be expected to win 75% of the time (draws are abstracted from). Does vividox's MLS Elo have some sort of scaling along these lines? Whereas...what does it mean for a team to perform 50% better than another on a neutral field...does this mean winning 2/3 of the time? I also wonder if the soccer stats forum might provide some useful feedback. Off-topic, but just in case you're interested in baseball stats: ERA+ does not mean percentage better (or worse) than park-neutral league average. The pitcher's ERA is the denominator, that is: ERA+ = (100*league ERA)/pitcher's ERA. Many people have suggested changing the equation but it's not happened...probably due to inertia (the stat is rarely referred to on broadcasts but has been well-known on the intertubes for a while). Tango's Inside the Book blog has had some good discussions on the topic, as has Fangraphs, I'm sure.
Yes, Elo rating differences translate to expected outcome values, denoted E[X], based on win=1, draw=0.5, and loss=0. However, the equation for this is non-linear, so the conversion is not obvious. With ELO+, after adding 22 to the home team, you simply divide the rating difference by 200 and add 0.5 to get the higher-rated team's E[X]. Close; an Elo rating difference of 200 translates to E[X]=0.760 for the higher-rated team, which corresponds to an ELO+ rating difference of 200*0.760-100=52. We then have to assume something for the probability of a draw, P(D), in order to estimate the probability of a win or loss: P(W)=E[X]-0.5*P(D) and P(L)=1-E[X]-0.5*P(D). The boundary conditions are that P(D)=0 for E[X]=0 and E[X]=1, and P(D)=1/3 for E[X]=0.5. I use P(D)=4/3*E[X]*(1-E[X]); vividox has his own equation, but the results are not terribly different. For a 100 team playing a 100 team on a neutral field, E[X]=0.500. For a 150 team playing a 100 team on a neutral field, E[X]=(150-100)/200+0.5=0.750, which is 50% higher. Using my P(D) equation, P(W)=0.333 for 100 vs. 100 and P(W)=0.625 for 150 vs. 100. I actually was not aware of that forum and will have to check it out. Thanks! Right, ERA+ is a bit upside-down since it is better to have a lower ERA. Still, an ERA+ of 150 implies (at least to me) that someone is 50% better than a league-average pitcher; mathematically, it means that his ERA is 2/3 of the league average.
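For anyone who wants to play with these numbers, here is a minimal Python sketch of the conversions described in this post. The function names are made up for illustration. Two things are assumptions rather than something stated in the thread: clamping E[X] to the range 0 to 1, and the 3-1-0 points scheme used to turn probabilities into expected points. The conventional Elo curve is the standard logistic formula with a 400-point scale.

```python
HOME_BONUS = 22  # ELO+ points added to the home team, per the post


def expected_value(home_rating, away_rating, neutral=False):
    """ELO+ E[X] for the first-listed (home) team: rating difference / 200 + 0.5."""
    diff = home_rating - away_rating + (0 if neutral else HOME_BONUS)
    # Clamping to [0, 1] is an assumption; the post does not cover huge gaps.
    return min(1.0, max(0.0, 0.5 + diff / 200.0))


def outcome_probabilities(e):
    """Split E[X] into (win, draw, loss) using P(D) = 4/3 * E[X] * (1 - E[X])."""
    p_draw = 4.0 / 3.0 * e * (1.0 - e)
    p_win = e - 0.5 * p_draw
    p_loss = 1.0 - e - 0.5 * p_draw
    return p_win, p_draw, p_loss


def expected_points(e, win_points=3, draw_points=1):
    """Expected league points from one match (3-1-0 scoring is assumed)."""
    p_win, p_draw, _ = outcome_probabilities(e)
    return win_points * p_win + draw_points * p_draw


def elo_expected(elo_diff):
    """Standard Elo expected score for the higher-rated side."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_to_elo_plus_diff(elo_diff):
    """ELO+ rating gap that gives the same E[X] as a conventional Elo gap."""
    return 200.0 * (elo_expected(elo_diff) - 0.5)


e = expected_value(150, 100, neutral=True)
print(round(e, 3))                                        # 0.75
print([round(p, 3) for p in outcome_probabilities(e)])    # [0.625, 0.25, 0.125]
print([round(p, 3) for p in outcome_probabilities(0.5)])  # [0.333, 0.333, 0.333]
print(round(elo_expected(200), 3))                        # 0.76
print(round(elo_diff_to_elo_plus_diff(200)))              # 52
```

The printed values reproduce the worked numbers in the post: P(W)=0.333 for 100 vs. 100, P(W)=0.625 for 150 vs. 100, and an Elo gap of 200 mapping to an ELO+ gap of about 52.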
Great...now this is stuck in my head: ELO, Don't Bring Me Down (http://www.youtube.com/watch?v=Bo0RpBGHjwA). Bonus tie-in: The Regular 2012 ELO ratings have DCU at 9, versus (a tie for) fourth in the Neutral ratings. To which I say, "Don't bring me down."
ELO ratings for football clubs: 1) Birmingham City 2) Everyone else. Jeff Lynne, Keep Right On To The End Of The Road (http://www.youtube.com/watch?v=JlP8onVXtD8)
Just wanted to clarify: you have modeled expected outcomes based on your own formulas; Elo ratings do not inherently translate to expected outcomes when you include draws. Furthermore, E[X] is not based on draw=0.5. An E[X]=1 means a team is guaranteed to win, E[X]=0 means a team is guaranteed to lose, and E[X]=0.5 means a team has a 50% chance to win. There is absolutely no notion of drawing in the expected value; that is a concept we have to artificially create.
Agreed, for contests in which there can only be a winner and a loser. However, a soccer team (or chess player) with E[X]=0.5 does not actually have a 50% chance to win due to the mere possibility of a draw. What E[X]=0.5 always means is that the two competitors are evenly matched. Right, we have to assume a separate model for the probability of a draw when that is one of the potential outcomes, and it affects only the points projections--not the Elo or ELO+ ratings themselves. Thanks for the clarification!
First I had to adjust to SUM. Then they whipped out the casual use of DP. Then MLSsoccer.com confused me. Now you're taking ELO? What's next? LMFAO? Land Marketing for Football Association Organizations
Hey -- my chess rating in high school was higher than Colorado's current rating! Awesome! (Seriously -- this is pretty cool. But I'd be remiss if I didn't point out that ratings pioneer Jeff Sagarin has been doing this for my former employer for a few years: http://www.usatoday.com/sports/sagarin/mls12.htm )
I don't much care for ELOs for clubs, because it includes data from previous seasons which are largely irrelevant and can confound rather than enhance the insight. For instance, the ELOs still had the Galaxy number one, until they lost three out of four games and were obviously playing badly. And at that point, they were still third. ELOs for chess players work because over time, the player evolves, but it's still the same guy. Even for National Teams, the ability to replace your roster wholesale is constrained by the available national pool. With a club, you can turn over (or be deprived of, through injury) big parts of your roster in one offseason. I saw a regression analysis done at a blog (maybe Sounder at Heart) that said last season's results only explain about 10% of this season's performance, and anecdotally, it seems valid to me, so I'm pretty skeptical of the level to which that formula takes prior years' results into account.
Yes, but he refuses to reveal his methodology; in particular, how he converts Elo ratings to a number that can be used to predict point/goal spreads. Show your work, Jeff!
It is a valid concern; but in general, the teams that were good (or bad) last year are the ones most likely to be good (or bad) this year, as well. In any case, by the end of the season, things pretty much work themselves out such that the remaining influence of the previous season's data is minimal. Then you should definitely prefer the Neutral ELO+ ratings, which do not include any data from the previous season except the assumed value of home-field advantage.
True for conventional Elo ratings, but I went with ELO+ for my system as an homage to the baseball metrics that helped inspire the whole concept. Besides, maybe at some point I will come up with a suitable acronym!
That's good. Hopefully so, as like I said, from what I've seen, the statistical correlation of previous seasons to the present one in MLS has been something like 10%.
That seemed a bit hard to believe, so I pulled the data from 2007-2012 (thus far into the season). The historic (2007-2012) correlation coefficient (R) is 0.395 for all teams, which gives an R^2 of 0.156. So the previous season's results explain only about 16% of the variation in a team's current performance. Interestingly, for the 2010 to 2011 seasons, there was much higher correlation (R = 0.703, R^2 = 0.494), while for 2009-2010 there was basically no correlation (R = 0.026, R^2 = 0.001).
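For reference, this is roughly how an R and R^2 like the ones above can be computed. It is a minimal Python sketch, not the spreadsheet actually used for this post. The two points lists are made-up placeholder values for five imaginary teams, since the underlying 2007-2012 data is not posted in the thread; only the three quoted R values at the end come from the post.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder data only: points for five imaginary teams in back-to-back seasons.
prev_season = [40, 45, 50, 55, 60]
this_season = [48, 41, 55, 47, 59]

r = pearson_r(prev_season, this_season)
print(round(r, 3), round(r * r, 3))

# Squaring the R values quoted in the post reproduces the quoted R^2 values.
for r_quoted in (0.395, 0.703, 0.026):
    print(r_quoted, round(r_quoted ** 2, 3))  # 0.156, 0.494, 0.001
```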
Too early to say but that higher correlation could be the start of a trend as expansion slows down and franchises begin to fall into a natural order as their ambition and spending begin to impact more and more on results. I'm sure there is a much higher correlation season on season in the EPL for example and even with the wage cap I expect the disparity in spending between MLS franchises to grow over time.
One of the most attractive features of Elo (and ELO+) ratings is that they are self-correcting over time. Starting all of the teams at "average" (1500 for Elo, 100 for ELO+) at the beginning of 2011 gives you ratings at the end of that season that are very similar to those obtained by going all the way back to 1996. I will be surprised if the Regular ELO+ and Neutral ELO+ numbers are much different from each other on 10/29.
Here are the 2012 MLS Regular ELO+ ratings and points predictions for 05/03.

Code:
Rank  Team  ELO+    Rank  Team  Points
1     SEA   145     1     SKC   73.1
2     SKC   138     2     SEA   71.8
3     SJE   125     3     SJE   65.1
4     HOU   118     4     HOU   57.4
5     LAG   113     5     RSL   54.3
6     RSL   110     6     NYR   52.0
7     NYR   110     7     DCU   50.5
8     CHI   107     8     CHI   50.0
9     DCU   102     9     LAG   49.8
10    COL   100     10    COL   45.3
11    PHL   98      11    PHL   43.0
12    FCD   94      12    VAN   42.2
13    MON   88      13    FCD   42.0
14    VAN   86      14    MON   37.6
15    CLB   85      15    CLB   37.3
16    CHV   79      16    CHV   34.0
17    POR   79      17    NER   33.0
18    NER   69      18    POR   32.1
19    TOR   54      19    TOR   17.8

Here are the 2012 MLS Neutral ELO+ ratings and points predictions for 05/03.

Code:
Rank  Team  ELO+    Rank  Team  Points
1     SJE   133     1     SJE   68.9
2     SKC   124     2     SKC   67.4
3     SEA   117     3     SEA   58.1
4     RSL   112     4     RSL   55.0
5     NYR   109     5     NYR   52.3
6     DCU   108     6     DCU   52.1
7     VAN   106     7     VAN   50.9
8     HOU   103     8     HOU   50.2
9     CHI   97      9     CHI   44.7
10    COL   97      10    COL   43.6
11    FCD   97      11    FCD   43.5
12    NER   94      12    NER   43.2
13    LAG   93      13    LAG   41.0
14    PHL   93      14    PHL   40.8
15    MON   89      15    MON   38.2
16    CHV   89      16    CHV   38.2
17    CLB   86      17    CLB   37.4
18    POR   86      18    POR   35.0
19    TOR   67      19    TOR   22.9