Soccer rankings

guhbahl · Nov 5, 2003

I've been playing around with determining the "optimal" RPI-like soccer rankings formula to augment the coaches poll to rank high schools since those polls only rank top teams. Any other stat geeks out there with ideas or links to research on:

1. Best home field adjustments
2. Best way to include draws in winning percentage (1/2 win seems to be standard)
3. Best weighting for strength-of-schedule measures
4. Best weighting of recent scores to all scores

I'm against using goals in my calculations for "ethical" reasons, but if you have done some work showing they help, I'll be open-minded.

Currently, I am getting best results with

1. No home field adjustment
2. Draws = 1/3 win
3. .3 win pct + .4 opp win pct + .3 opp opp win pct, with opp and opp opp win pct given a lower bound of .333 so a poor opponent can't ruin your ranking.
4. .67 to last 50% of matches, .33 to first 50% of matches
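
A minimal sketch of items 2 and 3 above, assuming matches arrive as (team_a, team_b, outcome) tuples. The function and variable names are made up for illustration, the .333 floor is applied to each team's averaged opponent figures (flooring each opponent individually is another possible reading), and the recency weighting of item 4 is left out. Unlike NCAA-style RPI, games against the team being rated are not removed from its opponents' records.

```python
from collections import defaultdict

DRAW_VALUE = 1 / 3          # item 2: a draw counts as 1/3 of a win
FLOOR = 0.333               # item 3: lower bound on the opponent-based terms
WEIGHTS = (0.3, 0.4, 0.3)   # own, opponents', opponents' opponents' win pct


def win_pct(results):
    """results: list of 'W', 'D' or 'L' for one team."""
    points = {"W": 1.0, "D": DRAW_VALUE, "L": 0.0}
    return sum(points[r] for r in results) / len(results)


def modified_rpi(matches):
    """matches: list of (team_a, team_b, outcome), outcome 'A', 'B' or 'D'."""
    results = defaultdict(list)     # team -> its W/D/L results
    opponents = defaultdict(list)   # team -> opponents faced (with repeats)
    for a, b, outcome in matches:
        results[a].append({"A": "W", "B": "L", "D": "D"}[outcome])
        results[b].append({"A": "L", "B": "W", "D": "D"}[outcome])
        opponents[a].append(b)
        opponents[b].append(a)

    wp = {t: win_pct(r) for t, r in results.items()}
    owp = {t: sum(wp[o] for o in opps) / len(opps)
           for t, opps in opponents.items()}
    oowp = {t: sum(owp[o] for o in opps) / len(opps)
            for t, opps in opponents.items()}

    w_own, w_opp, w_oppopp = WEIGHTS
    return {
        t: w_own * wp[t]
           + w_opp * max(owp[t], FLOOR)
           + w_oppopp * max(oowp[t], FLOOR)
        for t in wp
    }
```

Calling `modified_rpi([("A", "B", "A"), ("B", "C", "D"), ("A", "C", "A")])` returns one rating per team, with A on top and B and C level.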

My criterion for "best" weighting is the cumulative winning percentage of higher-ranked over lower-ranked teams (based on the rankings on the day each match was played).
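
That criterion can be sketched roughly as below. This is an illustration under assumptions, not the exact scoring used here: matches are taken to be in chronological order, ratings are recomputed from earlier matches only, matches where either team is unrated or the ratings are equal are skipped, and a draw gives the favourite `draw_value` credit. `plain_win_pct` is a hypothetical stand-in for whatever rating formula is being tested.

```python
def plain_win_pct(matches):
    """Toy rating for demonstration: fraction of games won."""
    wins, games = {}, {}
    for a, b, outcome in matches:
        for team, won in ((a, outcome == "A"), (b, outcome == "B")):
            games[team] = games.get(team, 0) + 1
            wins[team] = wins.get(team, 0) + (1 if won else 0)
    return {t: wins[t] / games[t] for t in games}


def higher_ranked_win_pct(matches, rate, draw_value=1 / 3):
    """matches: chronological list of (team_a, team_b, outcome),
    outcome 'A', 'B' or 'D'. rate: function from a list of matches
    to a {team: rating} dict."""
    credit = 0.0
    counted = 0
    for i, (a, b, outcome) in enumerate(matches):
        ratings = rate(matches[:i])   # only what was known before kickoff
        if a not in ratings or b not in ratings or ratings[a] == ratings[b]:
            continue                  # no ranking to compare against
        favourite = "A" if ratings[a] > ratings[b] else "B"
        if outcome == "D":
            credit += draw_value
        elif outcome == favourite:
            credit += 1.0
        counted += 1
    return credit / counted if counted else float("nan")
```

Different weightings can then be compared by passing each candidate formula as `rate` over the same season of results.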

Kevin in Louisiana · Nov 5, 2003

I'd say it would make most sense to give draws the worth they have in your area. But are you talking nationwide rankings? That could get sticky.

I'd recommend figuring out a way to involve goals. It's gonna be tricky and you're gonna have to figure out a way to reward teams for playing well against good opponents and not for running up the score against lousy teams (set a cap of a couple goals or something).

Craig P · Nov 5, 2003

Originally posted by guhbahl
I've been playing around with determining the "optimal" RPI-like soccer rankings formula to augment the coaches poll to rank high schools since those polls only rank top teams.

RPI is a poor model. It's completely ad hoc, and since it doesn't actually model anything, it's impossible to test it and see how it's doing. All you get is a vague ranking assigning each team some numbers -- those numbers don't really predict anything, except that theoretically a team ranked higher should be favored to beat a team ranked lower. (The NCAA doesn't even seem to know how the factors that go into RPI were originally obtained -- reportedly, the statisticians who made it up are dead.)

If there's enough interplay, I'd recommend using something recursive. The fact of the matter is, even RPI needs a computer to calculate, so if you're going to do a computer calculation, why half-ass it? If all you want to use is W/L/D, I'm personally fond of Bradley-Terry in the college hockey arena, but YMMV. I'm not sure how well Bradley-Terry would respond to the point variations in soccer.

beineke · Nov 5, 2003

Re: Re: Soccer rankings

I'd like to second Craig P's advice. RPI is a dinosaur with a lot of flaws, and it's worth taking advantage of something more modern.

Here's an option that might be practical:
1. Use Voros' algorithm to create your ratings ... he has offered to share his macro in his "By Request" thread on this forum.

2. As Kevin suggests, you can modify his rating in the following way: whenever a game is decided by more than N goals, record it as an N-goal win. I would guess that N=4 is a reasonable choice.
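
The cap in step 2 is a simple clamp. A small sketch, with a made-up function name and N=4 as the default:

```python
def capped_margin(goals_for, goals_against, n=4):
    """Goal difference from one team's point of view, limited to +/- n."""
    margin = goals_for - goals_against
    return max(-n, min(n, margin))
```

With this, a 7-0 win is recorded the same as a 4-0 win, while a 2-1 win is unchanged.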

guhbahl · Nov 6, 2003

You will see that most of my reply is unique to the fact that I want to apply this to only high school (and maybe club) soccer.

I had seen Voros's and discarded it, along with any other method that includes goal differential, since no high school application would get much support given the incentive to run up the score and substitute less. Also, high school teams change personnel more from year to year, so the data set is more limited, and the ranking method must include some smoothing component since teams are not at "steady-state" early in the year.

If I understand the Bradley-Terry method correctly, it only works when you have repeated matches between opponents which is not often the case in high school.

I agree about the "dinosaur" nature of RPI and the poor statistical foundations. But, and I hate typing this as much as you hate reading it, if the algorithm is something that the average fan, player or coach cannot understand it may not be accepted. I would rather have a good rating system that is used rather than the best system that is ignored. And the fact that it is one of the factors considered in NCAA tournament selection and seeding means it is already institutionalized in the mind of coaches to some extent.

I don't think the goal of RPI is too flawed in that it simply tries to weight outcomes by the quality of the opponents involved. The flaw seems to be only in the arbitrary assignment of the various weights used, boundary problems, lag problems and linearity assumptions. Statistical techniques for properly fitting weights that best explain past results might prove interesting. But with only 1 season of data, I was looking for others that might have been down some "modified RPI" dead ends so I can avoid them.

By the way, the purpose here is not to predict future specific outcomes as much as it is to give ALL teams objective, statistical feedback on their relative performance to date, performance trend and strength of otherwise unknown opponents. Top 10 polls don't serve that purpose.

Kevin in Louisiana · Nov 6, 2003

If the cap on goal differential is set at, say, 4, as beineke suggested, it's not like running up the score will be a huge problem. I've seen plenty worse than a 4 goal difference. 4-0 isn't really running up the score.

Craig P · Nov 6, 2003

Originally posted by guhbahl
If I understand the Bradley-Terry method correctly, it only works when you have repeated matches between opponents which is not often the case in high school.

Not true. It only works when you have a chain of both positive and negative results connecting schools, but then again, if no such chain exists, do you really have any basis for comparing them on the basis of RPI?

Originally posted by guhbahl
I agree about the "dinosaur" nature of RPI and the poor statistical foundations. But, and I hate typing this as much as you hate reading it, if the algorithm is something that the average fan, player or coach cannot understand it may not be accepted. I would rather have a good rating system that is used rather than the best system that is ignored. And the fact that it is one of the factors considered in NCAA tournament selection and seeding means it is already institutionalized in the mind of coaches to some extent.

My argument is that the RPI isn't good and the NCAA is foolish to continue to use it. Unfortunately, it's too entrenched among people who think it tells them something useful (highly debatable) and that it's easy to calculate (it's deceptively difficult).

I happen to think that Bradley-Terry is easy to explain, in the implementation that I know. If two teams are playing, with the rating of one being A and the other being B, then A / (A + B) is A's predicted chance of winning. The ratings are calculated by figuring out what ratings give the best fit to the games that have taken place.
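
One common way to do that fit is a fixed-point iteration in which each team's rating is set to its win total divided by the sum, over its games, of 1 / (own rating + opponent rating). The sketch below uses that iteration as an assumption and is not necessarily the implementation described above; the names are made up, a draw is passed in as a score of 0.5, and it assumes every team has at least one win (or draw) and one loss (or draw), otherwise a rating collapses to zero.

```python
from collections import defaultdict


def fit_bradley_terry(matches, iterations=200):
    """matches: list of (team_a, team_b, score_a), where score_a is
    1.0 for a team_a win, 0.5 for a draw, 0.0 for a team_a loss."""
    wins = defaultdict(float)                     # team -> total win credit
    games = defaultdict(lambda: defaultdict(int)) # team -> opponent -> games
    for a, b, score_a in matches:
        wins[a] += score_a
        wins[b] += 1.0 - score_a
        games[a][b] += 1
        games[b][a] += 1

    ratings = {t: 1.0 for t in games}
    for _ in range(iterations):
        new = {}
        for t in ratings:
            denom = sum(n / (ratings[t] + ratings[o])
                        for o, n in games[t].items())
            new[t] = wins[t] / denom
        # Rescale so the average rating stays at 1; only ratios matter.
        total = sum(new.values())
        ratings = {t: v * len(new) / total for t, v in new.items()}
    return ratings


def win_probability(rating_a, rating_b):
    """Predicted chance that the team rated rating_a wins: A / (A + B)."""
    return rating_a / (rating_a + rating_b)
```

Note that the teams never need to meet more than once; A and C below are compared purely through their results against B: `fit_bradley_terry([("A", "B", 1.0), ("A", "B", 1.0), ("B", "A", 1.0), ("B", "C", 1.0), ("B", "C", 1.0), ("C", "B", 1.0)])`.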

Originally posted by guhbahl
I don't think the goal of RPI is too flawed in that it simply tries to weight outcomes by the quality of the opponents involved. The flaw seems to be only in the arbitrary assignment of the various weights used, boundary problems, lag problems and linearity assumptions.

That sounds about right (I don't have the background to be sure).

As far as I'm concerned, the lack of recursion is really the big killer for RPI. It produces a "quality" rating for each team (which is presumably better than winning percentage), but prefers winning percentage in the actual calculation. When you really think about it, it doesn't make a whole lot of sense.

Originally posted by guhbahl
By the way, the purpose here is not to predict future specific outcomes as much as it is to give ALL teams objective, statistical feedback on their relative performance to date, performance trend and strength of otherwise unknown opponents. Top 10 polls don't serve that purpose.

Bradley-Terry does an excellent job of this, I can assure you.

Also, Bradley-Terry can be modified to give weight to margin of victory. Just for fun, for college hockey I did up a rating that counted 1 goal as 0.77 win, 2 goals as 0.9 win, 3 goals as 0.98 win, anything higher as a full win (IIRC). In concert with a tie counting as 0.5 win, this produced an aesthetically pleasing curve.
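
As a sketch, those fractional wins amount to a lookup on the goal margin. The values are the ones recalled above (so treat them as approximate), the names are made up, and giving the loser one minus the winner's share is an assumption, not something stated here.

```python
# Winner's share of a "win" by goal margin; anything above 3 is a full win.
WIN_SHARE = {0: 0.5, 1: 0.77, 2: 0.90, 3: 0.98}


def win_share(goals_for, goals_against):
    """Fraction of a win credited to the team that scored goals_for."""
    margin = goals_for - goals_against
    share = WIN_SHARE.get(abs(margin), 1.0)
    # Assumption: the two teams' shares always add up to one full game.
    return share if margin >= 0 else 1.0 - share
```

These shares would replace the 1 / 0.5 / 0 results fed into a win/loss/draw rating.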

Bradley-Terry, as applied to college hockey:
http://www.mscs.dal.ca/~butler/krachexp.htm
http://slack.net/~whelan/tbrw/2003/krach.shtml

voros · Nov 10, 2003

Originally posted by Kevin in Louisiana
If the cap on goal differential is set at, say, 4, as beineke suggested, it's not like running up the score will be a huge problem. I've seen plenty worse than a 4 goal difference. 4-0 isn't really running up the score.

And the reality is, in my model, that running up the score has limited value because while it was increasing your goal differential, it would also be decreasing the strength of your schedule. For example, Australia pasted American Samoa 31-0. If they had instead stopped at 4-0, their rank wouldn't have changed from 54th. American Samoa's rank changes slightly (209th to 208th), but that's only fair as losing 31-0 is much more of a sign of a lack of quality than losing 4-0.

In other words, the quality of your opponents is important in the rankings, and if you paste someone 31-0, the quality of that opponent is going to suffer mightily. Hence the beating will be of little value to you.

But yeah, you could always limit the goal differential.

But in my opinion, if you want the system to _determine_ rank as opposed to _estimate_ it, like I do, then IMO you're 100% correct in only counting wins, losses and draws. You could still use my system, but simply count all wins and losses at 2.3 diff, and all draws at 0.
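
In code, that substitution is just a preprocessing step on each score before it goes into the ratings. A sketch with a made-up function name; the rating system itself is not shown:

```python
def wld_goal_diff(goals_for, goals_against, win_value=2.3):
    """Replace the real goal difference with a fixed value per result."""
    if goals_for > goals_against:
        return win_value
    if goals_for < goals_against:
        return -win_value
    return 0.0
```

A 1-0 win and a 9-0 win then count exactly the same.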

If you're interested, I could send you the file and you could try it out. I have no clue at estimating home field at that level though.

The big issue is that it's a very dense system, and it's occasionally very difficult to explain why and how the system works.