As some of you may have read in other threads, I have a strong interest in the NCAA’s Ratings Percentage Index (RPI) as applied to Division I women’s soccer. The Division I Women’s Soccer Committee makes all decisions about at-large selections for the NCAA tournament, all seeding decisions, and all game-siting decisions. The RPI is the only statistical rating system, and the only ranking system of any kind, that the Committee uses in making those decisions.

Starting about two years ago, while trying to understand some of the Women’s Soccer Committee’s at-large selections and seeds, I began to wonder whether the RPI can accurately rank teams from different regions in relation to each other, given the limited number of inter-regional games. I felt it would be possible to develop hypothetical scenarios to answer that question. To run these scenarios, I programmed the RPI formula in Excel, entered the individual game data for a number of hypothetical scenarios, and then determined the RPI ratings for the teams in those scenarios. In addition, I entered all the real individual game data for the 2007 season so that I could determine the RPI the Women’s Soccer Committee used in making its tournament decisions. (The NCAA published some RPI rankings during the season and also published RPI rankings based on all games, including the tournament games. It did not, however, publish either its RPI ratings, as distinguished from rankings, or the RPI rankings the Women’s Soccer Committee actually used.) Finally, I ran a further hypothetical scenario, based on the 2007 real game data with modifications, to test whether what my other hypothetical scenarios showed would apply to a real NCAA Division I women’s soccer season.

Based on this work, I prepared a paper on the RPI. For those interested in this topic, I am providing a copy of the paper in PDF format as an attachment to this entry.
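For readers unfamiliar with the formula: the basic RPI is commonly described as 25% the team’s own winning percentage, 50% its opponents’ winning percentage (excluding their games against the team itself), and 25% its opponents’ opponents’ winning percentage, with a tie counted as half a win in soccer. The sketch below is my own minimal Python version of that basic computation, using hypothetical results and none of the NCAA’s further adjustments, just to show what the spreadsheet has to do:

```python
# A minimal sketch of the basic RPI computation (hypothetical results;
# ties count as half a win; none of the NCAA's further adjustments).

# Each game is (team_a, team_b, result): "A" = team_a won,
# "B" = team_b won, "T" = tie.
games = [
    ("Portland", "Stanford", "A"),
    ("Stanford", "UCLA", "T"),
    ("UCLA", "Portland", "B"),
    ("Portland", "Santa Clara", "A"),
]

def win_pct(team, game_list):
    """Winning percentage, counting a tie as half a win."""
    pts, n = 0.0, 0
    for a, b, r in game_list:
        if team not in (a, b):
            continue
        n += 1
        if r == "T":
            pts += 0.5
        elif (r == "A" and team == a) or (r == "B" and team == b):
            pts += 1.0
    return pts / n if n else 0.0

def opponents(team):
    return [b if a == team else a for a, b, _ in games if team in (a, b)]

def owp(team):
    """Opponents' winning percentage, excluding their games against `team`."""
    others = [g for g in games if team not in (g[0], g[1])]
    vals = [win_pct(o, others) for o in opponents(team)]
    return sum(vals) / len(vals) if vals else 0.0

def oowp(team):
    """Opponents' opponents' winning percentage."""
    vals = [owp(o) for o in opponents(team)]
    return sum(vals) / len(vals) if vals else 0.0

def rpi(team):
    return 0.25 * win_pct(team, games) + 0.50 * owp(team) + 0.25 * oowp(team)

for t in ("Portland", "Stanford", "UCLA", "Santa Clara"):
    print(t, round(rpi(t), 4))
```

Incidentally, with this tiny four-game schedule, winless Santa Clara comes out with the same RPI (7/12 ≈ 0.583) as unbeaten Portland, because the schedule-strength components carry 75% of the weight. That is a small-sample caricature, but it foreshadows the inexactness the paper discusses.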
I also am attaching, as a separate PDF file, the NCAA staff’s rationale for using the RPI, issued in a Frequently Asked Questions format. I include that FAQ because I refer to it in the paper.

What I hope to generate in this thread is serious conversation about the NCAA’s use of the RPI for Division I women’s soccer. I hope people will read the paper, comment on it, critique it, suggest added or modified substance for it, and add substance of their own showing the paper’s conclusions to be right or wrong. I also hope people will pass the paper on to others interested in the Women’s Soccer Committee’s decision-making process. And I hope people will not use this thread simply to vent their feelings (for or against the RPI, the Women’s Soccer Committee’s decision-making, or the paper’s conclusions), although I realize that anyone can say whatever they want.

I have provided the paper to all members of the Women’s Soccer Committee; to all members of the NCAA’s Championships/Competition Cabinet (which oversees the Committee); to the NCAA in-house statistical staff involved with the RPI as used for Division I women’s soccer; to all Division I women’s soccer coaches of West Region teams (I live in the West Region and am a Portland Pilots fan); to the publishers of the SoccerRatings and Massey ratings; and to various soccer media. My hope is to prompt a serious look by the NCAA, as well as by the Division I women’s soccer “public,” at the way the NCAA uses the RPI for Division I women’s soccer, with a view toward the Women’s Soccer Committee improving its decision-making process.

The following is a short summary of the paper. (The paper itself is 53 pages, and the NCAA’s FAQ is several additional pages.) In Part 1 of the paper, I describe how the Division I Women’s Soccer Committee uses the RPI in making decisions for the Championship tournament, how the NCAA computes the RPI, and what the NCAA’s rationale is for using the RPI.
In Part 2, I describe two serious problems with the RPI. In Part 3, I describe the limitations the Committee should impose on itself in using the RPI, given those problems. In an appendix, I describe some lesser problems with the RPI and suggest changes the NCAA could make to address them.

The central part of the paper is Part 2. It demonstrates that the RPI, as a rating system, has two very serious limitations.

First, the RPI is not able to compare teams from one region of the country to teams from other regions if there are differences in strength among regions. And, in fact, there are significant differences in strength among regions. As a result, the RPI significantly and unfairly discriminates against teams from strong regions and in favor of teams from other regions. In 2007, the data demonstrate that the West Region was significantly stronger, on average, than the other regions, and the RPI caused the Committee to discriminate against it significantly in its selection of at-large teams to participate in the Championship tournament. This does not mean the RPI always is biased against the West. Rather, it is biased against the strongest region or regions, whichever they may be from time to time. In 2007, the data simply demonstrated that the West was the strongest on average, to a significant degree, so it was the region that suffered.

Second, the RPI is quite inexact. For example, it cannot reliably distinguish between the team it rates #16 and the team it rates #40. It cannot reliably distinguish between the team it rates #47, which is roughly the “bubble” point for selection to an at-large position in the Championship tournament, and the team it rates #90. Even expecting this level of accuracy may be overly optimistic.

Since the NCAA appears committed to continued use of the RPI, I do not suggest in the paper that the Committee stop using it.
However, given the RPI’s serious problems, I urge the Committee to understand them and to be very disciplined in limiting how it uses the RPI. Part 1 of the paper suggests that the Committee’s present practice is to rely heavily on the RPI, much more heavily than is appropriate and in a way that has made its decision-making process unfair. If that is true, the Committee will need to make major changes in its decision-making process if it is committed to fairness.
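The first limitation above can be seen in miniature with a toy example of my own (this is not one of the paper’s scenarios): give two regions identical internal results and no inter-regional games at all, and the RPI assigns the two regions identical ratings, no matter how much stronger one region’s teams really are. A handful of cross-regional games can then shift those numbers only slightly, which is the heart of the problem. A self-contained Python sketch of the basic (unadjusted) RPI, with invented team names:

```python
# Toy illustration (mine, not the paper's): with zero inter-regional
# games, the RPI cannot detect any strength difference between regions.

def make_region(prefix):
    # Identical round-robin results in each region: team 1 beats
    # teams 2, 3, 4; team 2 beats 3 and 4; team 3 beats 4.
    t = [f"{prefix}{i}" for i in range(1, 5)]
    return [(t[i], t[j], "A") for i in range(4) for j in range(i + 1, 4)]

games = make_region("West") + make_region("East")  # no cross-region games

def win_pct(team, gs):
    """Winning percentage; result "A"/"B" = first/second team won, "T" = tie."""
    res = [0.5 if r == "T" else (1.0 if (r == "A") == (team == a) else 0.0)
           for a, b, r in gs if team in (a, b)]
    return sum(res) / len(res) if res else 0.0

def opps(team):
    return [b if a == team else a for a, b, _ in games if team in (a, b)]

def owp(team):
    # Each opponent's winning pct, excluding its games against `team`.
    others = [g for g in games if team not in g[:2]]
    vals = [win_pct(o, others) for o in opps(team)]
    return sum(vals) / len(vals)

def oowp(team):
    return sum(owp(o) for o in opps(team)) / len(opps(team))

def rpi(team):
    return 0.25 * win_pct(team, games) + 0.50 * owp(team) + 0.25 * oowp(team)

# Mirror-image schedules produce exactly identical RPIs, even if in
# truth every West team would beat every East team head to head.
for i in range(1, 5):
    assert rpi(f"West{i}") == rpi(f"East{i}")
```

Within each region the ordering comes out as expected (West1 rates above West4), but across regions the formula is blind until cross-regional games are added, and a small number of such games moves the ratings only a little.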