2012 rpi

Discussion in 'Women's College' started by cpthomas, Jan 9, 2012.

  1. Cliveworshipper

    Cliveworshipper Member+

    Dec 3, 2006
    Huh? The top four teams in the current adjusted RPI all sent players to the U20 World Cup.


    It wasn't adjusted in 2008 or 2004, the last two times World Cups had major impacts on the season, nor had it been adjusted in any previous year. It's just part of the whole package you buy into when you recruit players you know will probably make commitments to national teams.

    Good teams play past those issues. Let's look at your list. (I'll note at the start that you omitted Stanford.)


    UNC sent two players to the U20 World Cup DURING THE TOURNAMENT in 2008 and won a national championship. The U20 team won the World Cup as well that year. This year, two of their three losses are conference losses. It is true that they also have a U17 absence. I think that is the price you pay for having top players. Lots of teams would love to make that bargain. UNC has lost games both before and after the first three weeks; they have two conference losses and are still looking at a #2 seed with a shot at a #1 seed. Hardly an argument for fixing the RPI.

    It's hard to say UCLA was weakened. They are 13-0-2. They have a high RPI because all their opponents except two have only double-digit RPIs.
    The worst RPI team they played in non-conference has a current RPI of 132. They are currently a #1 seed. If they drop, it won't be because they sent players to the U20s.


    Duke has lost four games, only one of which was in the first three weeks, and that loss was to Florida, who had the same number of players committed to national teams. Their other losses came in the conference season. You can hardly blame national team commitments for that.

    Wake Forest likewise lost one game in the first three weeks and two games after that period. They have played 7 games in the latter part of their season in which they have scored one goal or fewer. I'm not sure you can blame that on the World Cup. That they played a team with an RPI of 277 and four other teams with RPIs over 123 (UNC Greensboro, their non-conference loss) is, I submit, a bigger factor in their RPI than losses that can be attributed to the absence of national team players.

    Penn State's two losses were from before 9/8, it is true. Again, their first loss was to Stanford, who also had commitments to the U20s. They also lost to BYU, which is a formidable team. They gave up 3 goals in that game, which looks like a defensive breakdown to me. Penn State is as much a lock for a #1 seed as you can get anyway. Their RPI is fine.


    But the biggest argument against cooking the RPI is Stanford. They sent as many players as anyone to the U20s and are still #1 in the RPI by a long shot. They played the hardest schedule and have only the one loss (in that 3-week period, BTW). Stanford's high RPI is because they didn't schedule cupcakes and did well anyway, national team absences or no.
     
    kolabear repped this.
  2. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    Nope. Not in RPI, not in an Elo rating. Any tinkering like that would either involve too much subjective judgment (as to what the effect is of these players' absence) - or introduce a highly imprecise variable that would appear to be subjective.

    It's a hard effect to measure anyway.* Look at the WNBA finals right now. Indiana has won 2 out of 3 so far without their 2nd leading scorer Katie Douglas. A lot of people wrote them off when Lefty went down with a sprained ankle.

    This is territory for the Sagarins and Masseys, if they choose to accept it - when they're trying to play the prediction game for fans... and gamblers.

    *CliveW beat me to the point - and using examples from the soccer season, too.
     
  3. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    I'm still trying to give the most useful answer I can to Soccerhunter. Maybe it helps to "play along" with the Elo system for a minute, ignore the obvious (and legit) question "how do they do it for soccer?" -- and see what an Elo system is doing, what it's purporting to do, for either chess or soccer.

    I suppose it's still easier to start off with chess because you have established players with established ratings to compare to. This part is pretty basic. If the average rating of who you play against is, say, 1500 and you score 2.5 points out of 4 on average, your rating will be 100 points higher: 1600. If instead you only score 1.5 points out of 4, your rating will be 100 points lower: 1400.

    Albyn Jones used a different scale for his ratings - a 100 point difference corresponded to a .667 winning percentage.

    But in both cases you have players (or teams) with ratings designed to correspond to the results they've obtained against their opponents, whose ratings determine what your rating is. And in turn, their ratings are determined by, among their other results, the result of your game and your rating.

    A key presumption (or feature) of the system is shown in this example - (using the Albyn Jones scale), let's say Team A has a .667 winning percentage against opponents with an average rating of 1500. So Team A's rating is 1600. Team B has an .800 percentage against teams with an average rating of 1400. Their rating is also 1600. Team C has a losing record of .333 against teams with an average rating of 1700. Their rating is also 1600. The concept is this - that Teams A, B, and C are all equal in strength and would be expected, when they play each other and other teams rated 1600 to have 50-50 records in those games against each other.

    And, generally speaking, that's what happens. It's part of the internal checks of a system like Massey, Sagarin or Albyn Jones. Teams with almost the same ratings as others have 50-50 records against each other. Games between opponents rated 100 points apart have the higher-rated team winning .667 of the time (or whatever the expected winning percentage is according to the scale chosen by Massey, Sagarin, Jones, etc.). There are some small anomalies of course because real life is messy, but in general these relationships and winning percentages have to be there or otherwise the algorithm is wrong and unacceptable and it's back to the drawing board.
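    To make that concrete, here's a quick sketch of the arithmetic (my own illustration, nothing from Jones or Sagarin themselves), assuming the usual logistic curve and the Albyn Jones convention that a 100-point rating edge corresponds to a .667 expected winning percentage. The function names are just mine; the point is that Teams A, B, and C from the example all come out at 1600.

        import math

        # Albyn Jones scale: a 100-point edge corresponds to a .667 expected
        # winning pct; on a logistic curve that means a divisor of 100/log10(2) ~ 332.
        SCALE = 100 / math.log10(2)

        def expected_score(rating_diff):
            """Expected winning percentage for the higher-rated side."""
            return 1 / (1 + 10 ** (-rating_diff / SCALE))

        def performance_rating(opp_avg, win_pct):
            """Rating implied by a winning pct against opponents with a given average rating."""
            return opp_avg + SCALE * math.log10(win_pct / (1 - win_pct))

        print(round(performance_rating(1500, 0.667)))  # Team A -> ~1600
        print(round(performance_rating(1400, 0.800)))  # Team B -> ~1600
        print(round(performance_rating(1700, 0.333)))  # Team C -> ~1600
        print(round(expected_score(100), 3))           # the internal check: ~0.667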
     
    Soccerhunter repped this.
  4. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    No. It's no different than teams having injured players, players who miss games for family reasons, and so on. All games are counted the same no matter what roster issues teams had. Further, for at large selection purposes, the Women's Soccer Committee must make its decisions based on defined considerations that do not include missing players. For seeding, the Committee uses the considerations but is not limited to them so that, in theory, it could consider missing players. I don't believe, however, that it historically has done that. Part of the reason is, how would they get a list of missing players since, presumably, the list would not be limited to those missing for the U20 World Cup, and how would they evaluate the impact on game outcomes of the players being missing? It would open up a tremendous can of worms. So, basically, I think the answer is pretty clearly "No."

    Further, from a technical RPI perspective, the first three weeks of the year are critical. The RPI depends on teams from different regions playing each other. The first three weeks of the season are when a great number of the inter-regional games are played. There already are not enough inter-regional games, so that the RPI has issues rating all the teams nationally in a single system. If one were to disregard the first three weeks of the season, the problem would be much worse.

    One misconception sometimes held about the NCAA Tournament at-large selection process is that it is intended to select, for the 34 at-large positions, the 34 teams that are the best at the end of the season. That is not correct. It is intended to select the 34 teams that have performed the best over the course of the entire season, with no game counting any differently than any other game.
     
  5. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Kolabear, I could give similar information for the RPI. In other words, I could tell you that on average, a team with X rating superiority over its opponent, will win a game Y% of the time (and I could give you specific rating adjustments to make depending on whether the game is home or away). Of course, the NCAA itself doesn't provide the relationship between rating difference and win likelihood, but that's because the NCAA doesn't use the RPI to predict game outcomes. Nevertheless, it can be done.
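    Roughly, the calculation looks something like this (just a sketch to show the idea; the game tuples and bin width below are made up, and my actual numbers also include the home/away adjustments):

        def win_pct_by_rating_gap(games, bin_width=0.010):
            """games: (higher_rpi, lower_rpi, result) tuples, where result is 1 if the
            higher-RPI team won, 0.5 for a tie, 0 if it lost.  Groups finished games
            by RPI difference and reports the higher-rated team's winning pct per bin."""
            bins = {}
            for higher, lower, result in games:
                b = int((higher - lower) / bin_width)
                wins, count = bins.get(b, (0.0, 0))
                bins[b] = (wins + result, count + 1)
            return {(b * bin_width, (b + 1) * bin_width): round(wins / count, 3)
                    for b, (wins, count) in sorted(bins.items())}

        # Hypothetical games, just to show the shape of the output:
        sample = [(0.620, 0.605, 1), (0.598, 0.590, 0.5), (0.655, 0.610, 1)]
        print(win_pct_by_rating_gap(sample))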

    Also, there is at least one anomaly in the systems you referred to and in the RPI that is not simply because life is messy. It is because of the playing pool problem that I write about regularly. Thus, for example, looking at years 2007 through 2011, I don't think any of the systems over that combined period of time would have teams from the Division I women's west regional playing pool having 50-50 records against teams with the same ratings from the other four playing pools. Rather, they all would show the west regional playing pool, in those games, doing better than 50-50. This is because the west pool is a good deal stronger on average than the other pools and therefore is underrated. The only reason this might not occur would be if the creator of the system introduced a specific regional adjustment "tweak" to deal with the regional pool problem. Once when I talked to Albyn Jones and advised him that according to my tests his ratings had a regional pool problem, he told me that he had not recently done any regional comparisons and so had not "tweaked" his system. This suggested to me the possibility that Elo-like systems, like the RPI, also can have bells and whistles to deal with specific rating problems. I don't know if this is really what he meant, but it's how I interpreted what he said.
     
    Soccerhunter repped this.
  6. Carolina92

    Carolina92 Member

    Sep 26, 2008
    Can someone break out what the current strength of schedule rankings are after this weekend's results? That seems to always be something the committee pays particular attention to in seeding and selection.
     
  7. Carolina92

    Carolina92 Member

    Sep 26, 2008
    Seems like the Elo ratings are not that great. They had Missouri at #11, who then lost to Arkansas and LSU this weekend (and #78 LSU topped them 3-1).
     
  8. Cliveworshipper

    Cliveworshipper Member+

    Dec 3, 2006
    You can do it yourself anytime.


    https://www.nc-soccer.com/wsoccer/2012/index.php

    The last column is strength of schedule. Click on the header to order them.

    Click again to order from best to worst.
     
  9. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    It's different. Yes, you could calculate some correlation in the RPI, but there are bound to be far more deviations from it. In an Elo system, it's an integral part of the system. It's how the ratings are calculated and defined.*

    *i.e., the system as a whole has to have these correlations, and the individual teams have to show these correlations to the maximum degree possible. It isn't incidental, in other words. It's an integral feature of the system.
     
  10. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    All systems are limited in how well they predict. But I'll look at this example later when I'm at a computer.
     
  11. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    And prepare to be shocked. Simply shocked, maybe not with the first ten but as you go down the line past that.

    * Actually, ignore what I said. I'm on a cellphone and probably misread.
     
  12. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    I know that Massey isn't an Elo-based system, but after the season is over, I'll run calculations for both Massey and the RPI to see whether, at comparable rating difference levels, the RPI has more deviations than Massey. I know that when I ran Jones and Massey through a comparison several years ago, overall there were about equal deviations. However, at that time I did not break the number of deviations down into comparable rating difference levels (or, I didn't save them if I did). I don't know, but you may be surprised (again) about the similarities in results. It will be fun to see.

    (I run my comparisons by seeing how Massey does with his most closely rated 750 games compared to how the RPI does with its 750 most closely rated games; how they do for the next 750; and so on through all the games.)
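    For what it's worth, the comparison amounts to something like this (a bare-bones sketch of the idea; here a "deviation" just means the lower-rated team won, and ties would need their own handling):

        def deviations_by_closeness(games, chunk=750):
            """games: (rating_diff, result) pairs, rating_diff >= 0, result 'W', 'L',
            or 'T' from the higher-rated team's point of view.  Sorts games from most
            closely rated to least and counts upsets ('L') in each block of 750."""
            ordered = sorted(games, key=lambda g: g[0])
            blocks = []
            for start in range(0, len(ordered), chunk):
                block = ordered[start:start + chunk]
                upsets = sum(1 for _, result in block if result == 'L')
                blocks.append((start + 1, start + len(block), upsets))
            return blocks

        # Run the same tally once with Massey's rating differences and once with the
        # RPI's, then compare the upset counts block by block.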
     
  13. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    Cpthomas is right that usually RPI and Elo/Albyn/Massey* wind up closer than you'd think based on the criticisms RPI is subject to.

    * The asterisk is because Massey has one rating which is supposed to be based only on win/loss/tie, while he has another rating which incorporates goals scored.
     
  14. Carolina92

    Carolina92 Member

    Sep 26, 2008
  15. Carolina92

    Carolina92 Member

    Sep 26, 2008
    Nope, I'm definitely shocked. I'm not sure this is right though. Approx Strength is the strength of schedule rating? I thought it was the overall rating for that team. I think the link pointed to the wrong page. Should it have pointed here instead?....

    https://www.nc-soccer.com/wsoccer/2012/?sched

    There is no way BYU and San Diego State have a top 5 hardest schedule.
     
  16. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    I think you're right. Approx Strength is approximately what a team contributes to its opponents' strengths of schedule. SS is a team's strength of schedule and is on the page to which you provided the link.
     
  17. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    I now have posted my weekly RPI report covering games through Sunday, October 21, 2012, here: https://sites.google.com/site/rpifordivisioniwomenssoccer/rpi-reports. The report is in the form of a downloadable Excel workbook attached at the bottom of the page and titled RPI Report 10.21.12.xls. It contains three pages, one of which is devoted to teams' RPIs, the second to conferences' RPIs and the third to regions' RPIs.

    As a reminder, the teams' page shows which teams, based on the last five years' experience, are reasonable possibilities for #1 seeds, #2 seeds, #3 seeds, and #4 seeds in the NCAA Tournament; which teams relatively high in the RPI rankings are reasonable possibilities for not getting at large selections for the Tournament; and which teams farther down in the rankings are reasonable possibilities for getting at large selections.
     
  18. Soccerhunter

    Soccerhunter Member+

    Sep 12, 2009
    Hi Kolabear,

    Thanks for taking the time to get back with the above post on the differences between Elo systems and the RPI. Let me test my "big picture" understanding at this point: while both systems rank teams by their results as modified by the strength of their opponents (and their opponents' opponents, etc.), the RPI uses a blunt, seemingly arbitrary swag at framing the calculation (25%-50%-25%), while Elo's more sophisticated math is based on a system which presumably embeds a recursive function whereby the system smooths out the calculated results and automatically checks their validity (e.g., your 1600 example above). Also, Elo uses its rating determinations to give predictive percentages based on rating, while the RPI makes no effort at results prediction.

    Would this be a fundamentally correct view?

    Somewhat different subject....
    Noting how you (Cliveworshipper and others) seem always to come back to the chess or tennis examples when explaining Elo, I have the impression that the Elo system is more suited to the chess (or tennis) environment than it is to single-season college soccer. In other words, Elo's accuracy and usefulness are more evident when it can track the same player over her/his entire career and build a much larger (multi-year) and more nuanced database. Applying Elo to single-season college sports teams does indeed seem problematic. It would be like trying to track a chess or tennis player who each season miraculously has lost 25% (sometimes more, sometimes less) of his former skills and knowledge and replaced them with a potentially weaker set of skills and savvy, except that sometimes the incoming skills and knowledge are not so bad as thought, or maybe worse. (Perhaps like trying to rank a chess player with mild Alzheimer's that comes and goes on an annual basis.)

    The end result is that, based on a small sample size in a single season combined with the season-by-season shift in skills and knowledge, an arbitrary swag at a calculation method (the RPI) seems less suspect. Is there any real evidence that Elo could do a better job at identifying the top teams for an end-of-season tournament than the RPI? (What I see cpthomas saying is that for such purposes, the RPI just might have as good a track record as an Elo system.) No?
     
  19. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
    Surely you can't blame any rating system for having Missouri over both Arkansas and LSU. In games against common opponents prior to this weekend:

    Missouri: 7-2 (.778)
    Arkansas: 4-4-1 (.500)

    Missouri: 8-2 (.800)
    LSU: 3-5-2 (.400)

    Based on these games alone, you'd expect Missouri to be rated about 180 points higher than Arkansas and about 260 points above LSU (using the old Albyn Jones scale).

    *
    Missouri and Arkansas shared 9 common opponents before the weekend. Against these common opponents, Missouri's winning pct was (.778) while Arkansas' was (.500). Whatever the average rating of the opponents is, Arkansas' performance rating was equal to it of course (.500) while Missouri's (.778) corresponds to about 180 points higher in the Albyn Jones scale.

    Missouri and LSU shared 10 common opponents. Against them, Missouri's winning pct was (.800) while LSU's was (.400). Whatever their average rating is, LSU's losing record corresponds to a performance rating about 60 points below it, while Missouri's is 200 points above it.

    For simplicity's sake I didn't take home field into account.
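    If anyone wants to check the arithmetic, it's just the winning percentage converted back into rating points (a quick sketch on the old Albyn Jones scale, where the ~332 divisor is what makes a 100-point edge equal .667; the function name is just mine):

        import math

        # Old Albyn Jones scale: 100 rating points corresponds to a .667 expected
        # winning pct; on a logistic curve that means a divisor of 100/log10(2) ~ 332.
        SCALE = 100 / math.log10(2)

        def points_vs_common_opponents(win_pct):
            """Rating points above (or below) the average rating of the common
            opponents, implied by a winning percentage against them."""
            return SCALE * math.log10(win_pct / (1 - win_pct))

        print(round(points_vs_common_opponents(0.778)))  # ~ +181 (Missouri vs the Arkansas pool)
        print(round(points_vs_common_opponents(0.500)))  #     0 (Arkansas)
        print(round(points_vs_common_opponents(0.800)))  #  +200 (Missouri vs the LSU pool)
        print(round(points_vs_common_opponents(0.400)))  #  ~-58 (LSU)
        # Gaps: about 180 points over Arkansas; 200 - (-58) is roughly 260 points over LSU.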
     
  20. kolabear

    kolabear Member+

    Nov 10, 2006
    los angeles
    Nat'l Team:
    United States
     
  21. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Soccerhunter, one thing: The RPI's 25-50-25 weighting is not arbitrary, although at first glance it seems that way.

    If you were to look at the top to bottom rating spread for teams' winning percentages (RPI Element 1, weighted at 25%), you would see that it is a pretty wide spread. The rating spread for the average of teams' opponents' winning percentages (against other teams) (RPI Element 2, weighted at 50%) is narrower. And, the rating spread for the average of teams' opponents' opponents' winning percentages (RPI Element 3, weighted at 25%) is even narrower. Thus although RPI Element 2 has twice the formula weight of Element 1, the weight is applied to a narrower spread; and similarly although RPI Element 3 has the same formula weight as Element 1, the weight is applied to an even narrower spread.

    Looking at the top to bottom spread of the three Elements for games through yesterday, October 21, the spread for Element 1 is 0.938; the spread for Element 2 is 0.431; and the spread for Element 3 is 0.218. What this means is that Element 1's effective weight is 46.5%; Element 2's is 42.7%; and Element 3's is 10.8%. There will be more compression of the numbers over the next few weeks, with Elements 2 and 3 likely compressing more than Element 1. The end result is that, although there is some variation from year to year, in general the effective weight of the three Elements at the end of the regular season (which is the only time that matters so far as the RPI is concerned) is 50% for Element 1, 40% for Element 2, and 10% for Element 3. Since Elements 2 and 3 combined are strength of schedule, this means that a team's winning percentage counts for roughly half of its rating and its strength of schedule counts for the other half.
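    For anyone who wants to see the arithmetic behind those effective weights, here is a simple sketch using the spreads quoted above (the function names and the hypothetical team are just mine for illustration):

        def rpi(wp, owp, oowp):
            """Nominal RPI: 25% team winning pct (Element 1), 50% opponents'
            winning pct (Element 2), 25% opponents' opponents' winning pct (Element 3)."""
            return 0.25 * wp + 0.50 * owp + 0.25 * oowp

        def effective_weights(spreads, formula_weights=(0.25, 0.50, 0.25)):
            """Share of the overall rating spread each Element actually contributes:
            formula weight times that Element's top-to-bottom spread, normalized."""
            contributions = [w * s for w, s in zip(formula_weights, spreads)]
            total = sum(contributions)
            return [round(100 * c / total, 1) for c in contributions]

        # Hypothetical team: winning pct .750, opponents' .600, opponents' opponents' .550
        print(round(rpi(0.750, 0.600, 0.550), 4))        # -> 0.625

        # Top-to-bottom spreads for games through October 21:
        print(effective_weights([0.938, 0.431, 0.218]))  # -> [46.5, 42.7, 10.8]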

    The NCAA, over the years, has experimented some with different weightings for the three Elements. Although I do not know it for sure, I believe that they do studies to see how well ratings correlate with how teams actually performed over the course of the season from which the ratings were generated. I also believe that the current weighting (which is the one they originally started out with, I believe) is the one they have found to correlate best with the season's performance.
     
  22. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    I now have posted, at the RPI for Division I Women's Soccer website, my weekly report comparing the RPI's ratings and rankings to those of the "Improved" RPI I developed to address the RPI's problem rating teams from different playing pools in a single system. You can find it here: https://sites.google.com/site/rpifordivisioniwomenssoccer/rpi-reports. It is in the form of a downloadable Excel workbook and is an attachment at the bottom of the webpage, titled RPI to Improved RPI Comparison 10.21.12.xls.
     
  23. Cliveworshipper

    Cliveworshipper Member+

    Dec 3, 2006
    It should probably be noted here that the RPI has changed over time, sometimes drastically, to better reflect what selection committees wanted it to reflect. It started for basketball in 1981 with FOUR elements, weighted with wins being 40% of the total, and is now implemented differently in different sports. At one time basketball used the secret adjustments, but it no longer does. Basketball and men's soccer weigh home and away games differently. Baseball does also, but using different percentages than the other two sports.

    Women's soccer doesn't weigh home/away differently except in the mystery bonus/penalty system.
    Until the end of the 1992 season, the women's soccer RPI was 20-40-40. It would be interesting and possibly instructive to see the current standings using the 20-40-40 weights.
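    Re-running the formula with the old weights would be trivial if you had the three Elements for each team. A toy sketch (the team's numbers here are made up just to show the swap):

        def rpi(wp, owp, oowp, weights):
            w1, w2, w3 = weights
            return w1 * wp + w2 * owp + w3 * oowp

        # Same hypothetical team under both weightings:
        team = (0.750, 0.600, 0.550)   # winning pct, opps' pct, opps' opps' pct
        print(round(rpi(*team, (0.25, 0.50, 0.25)), 4))  # current 25-50-25 -> 0.625
        print(round(rpi(*team, (0.20, 0.40, 0.40)), 4))  # old 20-40-40     -> 0.61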

    Here is a clip from a memo last spring from Gary K. Johnson, Associate Director of Statistics, to the men's basketball selection committee, which I found recently on the ESPN website and gives a history specific to basketball. The evolution of the women's soccer RPI is somewhat parallel, but has evolved slightly differently.


    There is also a brief mention of one of the reasons the RPI doesn't use scores in the formula: coaches manipulating the rankings by running up the score. Not mentioned were point-shaving scandals.

    http://espn.go.com/photo/preview/!pdfs/120305/espn_04_rpi_explanation.pdf
     
  24. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Clive, thanks for posting Gary Johnson's report -- I know I have it somewhere, but it's helpful to see it as a reminder.

    One thing I always read with caution is the NCAA's explanation of why they made a change. Each sport has its own RPI issues depending on numbers of games played, the extent of home field advantage, the size of strength differences among conferences and among regions, and the amount of inter-conference and inter-region play. The one area of problems that the NCAA is reluctant to acknowledge is the problem caused by significant strength differences among regions combined with limited inter-region play. I am confident the NCAA is well aware this is a problem area, but it is a hard one to solve using any of the RPI variants the NCAA historically has been willing to consider -- thus the NCAA's reluctance to admit there is a problem. Nevertheless, it wouldn't surprise me if some of the formula changes that specific sports have adopted, notwithstanding how they are explained to the public and even to the sport committee members that approve them, are intended at least in part from an NCAA staff perspective to do the best they can with mitigating the RPI's region problem; and it definitely wouldn't surprise me if the staff works behind the scenes at least to be sure that committees don't approve changes that will make the problem worse.

    By the way, although the bonus and penalty adjustments historically have been secret, and although so far as I know the NCAA still has not announced the current adjustment amounts publicly, they now are easily discoverable, with a little work, through the Team Sheets that the NCAA began publishing this year at its RPI Archive.
     
  25. Cliveworshipper

    Cliveworshipper Member+

    Dec 3, 2006
    Another issue that the NCAA refuses to acknowledge or even discuss, as far as I have seen, is standard error due to small numbers of games. They treat differences in ranking as meaningful over ranges far smaller than is mathematically sound in any ranking system. They hint at it by saying, for instance, that "other factors" should also be used for teams closer than 20 games in women's soccer. What they don't say is that the RPI has little meaning across just about that range in the middle of the field (at about the tail of the selection field).
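    To put a rough number on it (my own back-of-the-envelope, not anything the NCAA publishes), treat each game as an independent trial and the standard error on a bare winning percentage over a season of roughly 19 games is large:

        import math

        def winning_pct_std_error(p, n_games):
            """Binomial standard error of a winning percentage estimated from n games,
            treating each game as an independent trial (a rough approximation)."""
            return math.sqrt(p * (1 - p) / n_games)

        # A .600 team over a 19-game regular season:
        print(round(winning_pct_std_error(0.600, 19), 3))  # ~0.112, i.e. about +/- 11 points of winning pct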
     
