Old 05 Nov 2003, 11:52 AM   #1
guhbahl
BigSoccer Newbie
 
Join Date: Nov 2003
Default Soccer rankings

I've been playing around with determining the "optimal" RPI-like soccer rankings formula to augment the coaches poll to rank high schools since those polls only rank top teams. Any other stat geeks out there with ideas or links to research on:

1. Best home field adjustments
2. Best way to include draws in winning percentage (1/2 win seems to be standard)
3. Best weighting for strength-of-schedule measures
4. Best weighting of recent scores to all scores

I'm against using goals in my calculations for "ethical" reasons, but if you have done some work showing that goals improve the rankings, I'll be open-minded.

Currently, I am getting best results with

1. No home field adjustment
2. Draws = 1/3 win
3. .3 win pct + .4 opp win pct + .3 opp opp win pct, but opp and opp opp win pct have a lower bound of .333 to prevent a poor opponent from ruining your ranking.
4. .67 to last 50% of matches, .33 to first 50% of matches
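
A minimal Python sketch of the four settings above, not the poster's actual spreadsheet or code. Several readings are assumptions: the recency weights are applied per match (0.33 for each match in the first half, 0.67 for each in the second half, with the middle match of an odd count treated as late), the .333 floor is applied to each opponent's win pct before averaging, and games against the rated team are not excluded from its opponents' records. The names `win_pct` and `ratings` are hypothetical.

```python
from collections import defaultdict

DRAW_VALUE = 1 / 3            # item 2: a draw counts as 1/3 of a win
FLOOR = 1 / 3                 # item 3: lower bound on each opponent's win pct
W_WP, W_OWP, W_OOWP = 0.3, 0.4, 0.3
EARLY_W, LATE_W = 0.33, 0.67  # item 4: per-match weights (assumed reading)

def win_pct(results):
    """Recency-weighted win pct for a date-ordered list of 'W', 'D', 'L'."""
    if not results:
        return 0.0
    value = {"W": 1.0, "D": DRAW_VALUE, "L": 0.0}
    half = len(results) // 2  # with an odd count the middle match counts as late
    weights = [EARLY_W] * half + [LATE_W] * (len(results) - half)
    return sum(w * value[r] for w, r in zip(weights, results)) / sum(weights)

def ratings(matches):
    """matches: date-ordered (team_a, team_b, result_for_team_a) tuples."""
    flip = {"W": "L", "L": "W", "D": "D"}
    results, opponents = defaultdict(list), defaultdict(list)
    for a, b, r in matches:
        results[a].append(r)
        opponents[a].append(b)
        results[b].append(flip[r])
        opponents[b].append(a)
    wp = {t: win_pct(rs) for t, rs in results.items()}
    # Opponents' win pct, flooring each opponent at FLOOR (item 3).
    owp = {t: sum(max(FLOOR, wp[o]) for o in opps) / len(opps)
           for t, opps in opponents.items()}
    # Opponents' opponents' win pct; already >= FLOOR because every owp is.
    oowp = {t: sum(owp[o] for o in opps) / len(opps)
            for t, opps in opponents.items()}
    return {t: W_WP * wp[t] + W_OWP * owp[t] + W_OOWP * oowp[t] for t in wp}
```

Calling `ratings([("A", "B", "W"), ("A", "C", "W"), ("B", "C", "W")])` returns one number per team, and sorting by that number gives the ranking.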

My criterion for "best" weighting is cumulative winning percentage of higher ranked over lower ranked (based on rankings on the day the match was played).
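
The "best weighting" criterion can be sketched the same way. This assumes each match record already carries both teams' ratings as of match day; how draws count toward the percentage is not stated in the post, so the `draw_value` parameter (default 0.5) is an assumption, as is skipping matches between equally rated teams. The function name is hypothetical.

```python
def higher_ranked_win_pct(played, draw_value=0.5):
    """played: (rating_a, rating_b, result_for_a) tuples, ratings as of match day."""
    flip = {"W": "L", "L": "W", "D": "D"}
    credit, counted = 0.0, 0
    for rating_a, rating_b, result in played:
        if rating_a == rating_b:
            continue  # no higher-ranked side, so the match is skipped (assumption)
        if rating_a < rating_b:
            result = flip[result]  # express the result from the higher-ranked side
        credit += {"W": 1.0, "D": draw_value, "L": 0.0}[result]
        counted += 1
    return credit / counted if counted else float("nan")
```

Running this once per candidate set of weights and keeping the set with the highest value reproduces the comparison described above.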
Old 05 Nov 2003, 02:06 PM   #2
Kevin in Louisiana
BigSoccer Member
 
Join Date: Feb 2003
Location: Metairie, LA
Default

I'd say it would make most sense to give draws the worth they have in your area. But are you talking nationwide rankings? That could get sticky.

I'd recommend figuring out a way to involve goals. It's gonna be tricky and you're gonna have to figure out a way to reward teams for playing well against good opponents and not for running up the score against lousy teams (set a cap of a couple goals or something).
Old 05 Nov 2003, 03:47 PM   #3
Craig P
BigSoccer Member+
 
Join Date: Mar 1999
Location: South Bend, IN
Default Re: Soccer rankings

Quote:
Originally posted by guhbahl
I've been playing around with determining the "optimal" RPI-like soccer rankings formula to augment the coaches poll to rank high schools since those polls only rank top teams.
RPI is a poor model. It's completely ad hoc, and since it doesn't actually model anything, it's impossible to test it and see how it's doing. All you get is a vague ranking assigning each team some numbers -- those numbers don't really predict anything, except that theoretically a team ranked higher should be favored to beat a team ranked lower. (The NCAA doesn't even seem to know how the factors that go into RPI were originally obtained -- reportedly, the statisticians who made it up are dead.)

If there's enough interplay, I'd recommend using something recursive. The fact of the matter is, even RPI needs a computer to calculate, so if you're going to do a computer calculation, why half-ass it? If all you want to use is W/L/D, I'm personally fond of Bradley-Terry in the college hockey arena, but YMMV. I'm not sure how well Bradley-Terry would respond to the point variations in soccer.
Old 05 Nov 2003, 08:04 PM   #4
beineke
BigSoccer Member
 
Join Date: Sep 2000
Default Re: Re: Soccer rankings

I'd like to second Craig P's advice. RPI is a dinosaur with a lot of flaws, and it's worth taking advantage of something more modern.

Here's an option that might be practical:
1. Use Voros' algorithm to create your ratings ... he has offered to share his macro in his "By Request" thread on this forum.

2. As Kevin suggests, you can modify his rating in the following way: whenever a game is decided by more than N goals, record it as an N-goal win. I would guess that N=4 is a reasonable choice.
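
The cap in step 2 is a one-line clamp. A sketch, with N=4 as the default guessed above; the function name is hypothetical and this is independent of how Voros' macro consumes the margin:

```python
def capped_margin(goals_for, goals_against, cap=4):
    """Goal margin from one team's point of view, limited to +/- cap."""
    margin = goals_for - goals_against
    return max(-cap, min(cap, margin))
```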
Old 06 Nov 2003, 01:55 PM   #5
guhbahl
BigSoccer Newbie
 
Join Date: Nov 2003
Default

You will see that most of my reply stems from the fact that I want to apply this only to high school (and maybe club) soccer.

I had seen Voros's method and discarded it, along with any other method that includes goal differential, since no high school application would get much support given the incentive to run up the score and substitute less. Also, high school teams change personnel more from year to year, so the data set is more limited, and the ranking method must include some smoothing component since teams are not at "steady-state" earlier in the year.

If I understand the Bradley-Terry method correctly, it only works when you have repeated matches between opponents which is not often the case in high school.

I agree about the "dinosaur" nature of RPI and the poor statistical foundations. But, and I hate typing this as much as you hate reading it, if the algorithm is something that the average fan, player or coach cannot understand it may not be accepted. I would rather have a good rating system that is used rather than the best system that is ignored. And the fact that it is one of the factors considered in NCAA tournament selection and seeding means it is already institutionalized in the mind of coaches to some extent.

I don't think the goal of RPI is too flawed in that it simply tries to weight outcomes by the quality of the opponents involved. The flaw seems to be only in the arbitrary assignment of the various weights used, boundary problems, lag problems and linearity assumptions. Statistical techniques for properly fitting weights that best explain past results might prove interesting. But with only 1 season of data, I was looking for others that might have been down some "modified RPI" dead ends so I can avoid them.

By the way, the purpose here is not to predict future specific outcomes as much as it is to give ALL teams objective, statistical feedback on their relative performance to date, performance trend and strength of otherwise unknown opponents. Top 10 polls don't serve that purpose.
Old 06 Nov 2003, 02:21 PM   #6
Kevin in Louisiana
BigSoccer Member
 
Join Date: Feb 2003
Location: Metairie, LA
Default

If the cap on goal differential is set at, say, 4, as beineke suggested, it's not like running up the score will be a huge problem. I've seen plenty worse than a 4 goal difference. 4-0 isn't really running up the score.
Old 06 Nov 2003, 04:43 PM   #7
Craig P
BigSoccer Member+
 
Join Date: Mar 1999
Location: South Bend, IN
Default

Quote:
Originally posted by guhbahl
If I understand the Bradley-Terry method correctly, it only works when you have repeated matches between opponents which is not often the case in high school.
Not true. It only works when you have a chain of both positive and negative results connecting schools, but then again, if no such chain exists, do you really have any basis for comparing them on the basis of RPI?

Quote:
I agree about the "dinosaur" nature of RPI and the poor statistical foundations. But, and I hate typing this as much as you hate reading it, if the algorithm is something that the average fan, player or coach cannot understand it may not be accepted. I would rather have a good rating system that is used rather than the best system that is ignored. And the fact that it is one of the factors considered in NCAA tournament selection and seeding means it is already institutionalized in the mind of coaches to some extent.
My argument is that the RPI isn't good and the NCAA is foolish to continue to use it. Unfortunately, it's too entrenched among people who think it tells them something useful (highly debatable) and that it's easy to calculate (it's deceptively difficult).

I happen to think that Bradley-Terry is easy to explain, in the implementation that I know. If two teams are playing, with the rating of one being A and the other being B, then A / (A + B) is A's predicted chance of winning. The ratings are calculated by figuring out what ratings give the best fit to the games that have taken place.
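
A minimal Python sketch of that fit, using the standard iterative update for Bradley-Terry (each rating is reset to the team's win total divided by the sum, over its games, of 1 / (own rating + opponent rating)). It is not the KRACH code linked later in this post; the function names, the fixed iteration count, and counting a draw as 0.5 win are assumptions. It also assumes every team has some win credit and some loss credit, in line with the chain condition mentioned above; without that, ratings collapse toward zero or grow without bound.

```python
from collections import defaultdict

def bradley_terry(matches, iterations=200):
    """matches: (team_a, team_b, score_a) with score_a = 1.0 win, 0.5 draw, 0.0 loss."""
    wins = defaultdict(float)                         # win credit per team
    games = defaultdict(lambda: defaultdict(float))   # games[i][j] = games played
    for a, b, score_a in matches:
        wins[a] += score_a
        wins[b] += 1.0 - score_a
        games[a][b] += 1
        games[b][a] += 1
    rating = {t: 1.0 for t in games}
    for _ in range(iterations):
        new = {}
        for i in games:
            denom = sum(n / (rating[i] + rating[j]) for j, n in games[i].items())
            new[i] = wins[i] / denom
        # Rescale so the average rating stays at 1; only ratios matter.
        scale = len(new) / sum(new.values())
        rating = {t: v * scale for t, v in new.items()}
    return rating

def win_probability(rating_a, rating_b):
    """A's predicted chance of winning: A / (A + B)."""
    return rating_a / (rating_a + rating_b)
```

For example, with `r = bradley_terry([("A", "B", 1.0), ("B", "C", 1.0), ("C", "A", 1.0), ("A", "B", 1.0)])`, `win_probability(r["A"], r["B"])` is the fitted chance of A beating B. Note that no pair here needs more than the games they happened to play; teams that never met are compared through common opponents.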

Quote:
I don't think the goal of RPI is too flawed in that it simply tries to weight outcomes by the quality of the opponents involved. The flaw seems to be only in the arbitrary assignment of the various weights used, boundary problems, lag problems and linearity assumptions.
That sounds about right (I don't have the background to be sure).

As far as I'm concerned, the lack of recursion is really the big killer for RPI. It produces a "quality" rating for each team (which is presumably better than winning percentage), but prefers winning percentage in the actual calculation. When you really think about it, it doesn't make a whole lot of sense.

Quote:
By the way, the purpose here is not to predict future specific outcomes as much as it is to give ALL teams objective, statistical feedback on their relative performance to date, performance trend and strength of otherwise unknown opponents. Top 10 polls don't serve that purpose.
Bradley-Terry does an excellent job of this, I can assure you.

Also, Bradley-Terry can be modified to give weight to margin of victory. Just for fun, for college hockey I did up a rating that counted 1 goal as 0.77 win, 2 goals as 0.9 win, 3 goals as 0.98 win, anything higher as a full win (IIRC). In concert with a tie counting as 0.5 win, this produced an aesthetically pleasing curve.
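
As a sketch, that margin-of-victory table amounts to a lookup that replaces the 1 / 0.5 / 0 game score fed into the fit. The figures are the ones recalled above (so treat them as approximate); crediting the loser with 1 minus the winner's share is an assumption, and the names are hypothetical.

```python
MARGIN_CREDIT = {1: 0.77, 2: 0.90, 3: 0.98}  # larger margins count as a full win

def win_credit(goals_for, goals_against):
    """Fraction of a win credited to the team that scored goals_for."""
    margin = goals_for - goals_against
    if margin == 0:
        return 0.5  # a tie counts as half a win
    winner_share = MARGIN_CREDIT.get(abs(margin), 1.0)
    # Assumption: the loser receives whatever the winner does not.
    return winner_share if margin > 0 else 1.0 - winner_share
```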

Bradley-Terry, as applied to college hockey:
http://www.mscs.dal.ca/~butler/krachexp.htm
http://slack.net/~whelan/tbrw/2003/krach.shtml
Old 10 Nov 2003, 11:13 PM   #8
voros
Totalled Football
 
Join Date: Jun 2002
Location: Parts Unknown
Default Soccer rankings

Quote:
Originally posted by Kevin in Louisiana
If the cap on goal differential is set at, say, 4, as beineke suggested, it's not like running up the score will be a huge problem. I've seen plenty worse than a 4 goal difference. 4-0 isn't really running up the score.
And the reality is, in my model, that running up the score has limited value because while it is increasing your goal differential, it is also decreasing the strength of your schedule. For example, Australia pasted American Samoa 31-0. If they had instead stopped at 4-0, their rank wouldn't have changed from 54th. American Samoa's rank changes slightly (209th to 208th), but that's only fair as losing 31-0 is much more of a sign of a lack of quality than losing 4-0.

In other words, the quality of your opponents is important in the rankings, and if you paste someone 31-0, the quality of that opponent is going to suffer mightily. Hence the beating will be of little value to you.

But yeah, you could always limit the goal differential.

But in my opinion, if you want the system to _determine_ rank as opposed to _estimate_ it, like I do, then IMO you're 100% correct in only counting wins, losses and draws. You could still use my system, but simply count all wins and losses at 2.3 diff, and all draws at 0.

If you're interested, I could send you the file and you could try it out. I have no clue at estimating home field at that level though.

The big issue is that it's a very dense system, and it is occasionally very difficult to explain why and how the system works.