SCHEDULING AND THE NCAA TOURNAMENT

Discussion in 'Women's College' started by cpthomas, Feb 4, 2022.

  1. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    How teams schedule is very important in terms of their RPI ratings and ranks as well as in terms of other factors related to NCAA Tournament at large selections and seeds. I am hoping this thread can be a forum for discussion of scheduling in relation to the Tournament.

    As a starting point, I have two resources I have developed, in the form of Excel workbooks. One is a Team Histories and Simulated 2022 Ranks workbook, intended as a coach resource for any scheduling not yet completed for the upcoming 2022 season as well as for scheduling for 2023. The other is a Team Scheduling Tool that I custom make for any team that wants one. It lets the team try out various schedule options, to see how each option likely would affect the team in terms of its RPI rating and rank and also in terms of other NCAA Tournament-related factors. (These are free -- no charge.) If anyone is interested in either workbook, you can find information on how to get it using this link: Scheduling Resources
     
  2. cpthomas

    #2 cpthomas, Feb 13, 2022
    Last edited: Feb 13, 2022
    One of the best ways to provide good information related to scheduling is to use specific teams as examples. So, in my posts here, I am going to do a good deal of that. I will try to spread around which teams I use. My apologies in advance to those who would rather I had left them out of the picture I am painting. With that said ....

    ******************************************************************************************************

    In evaluating a team as a possible opponent, it is important to know three things:

    How strong is the team?

    How strong will the RPI tell the Committee the team is?

    How strong will the RPI treat the team as being, when it computes the strength of schedule portion of my RPI? Under the RPI formula, my strength of schedule as the RPI computes it has an effective weight of 50% of my overall RPI. My own winning percentage counts for the other 50%.
    If the RPI were perfect, all three answers would be the same so that coaches would not have to worry about the answers to all three questions. They simply could look at the RPI and say, "That’s how strong they actually are, that’s how strong the RPI will tell the Committee they are, and that’s the strength they will contribute to my strength of schedule." Unfortunately, however, the RPI is not perfect. In fact, the answers to the three questions can be very different. This is an unfortunate reality that coaches will have to deal with so long as the NCAA insists on using the current RPI formula.

    Here are two charts as examples of what I am talking about. They provide answers to the three questions for each of two teams: Northwestern State and Miami, FL. I picked these two teams because according to the RPI, in 2021 Northwestern State was the #153 team and Miami, FL was the #154 team. Thus according to the RPI they are equal in strength.

    upload_2022-2-12_16-49-20.png

    upload_2022-2-12_16-49-56.png


    For purposes of this explanation, the portions of these charts to focus on are the blue, red, and orange data points and solid lines:

    The red data points represent a team’s RPI rank for each year since 2007 (excluding the Fall 2020 to Spring 2021 season).

    The orange data points represent a team’s rank, under the RPI formula, as a contributor to its opponents’ strengths of schedule.

    The blue data points represent a team’s rank, according to the Kenneth Massey rating system. I consider the Massey ranks to be a good indicator of teams’ actual strength -- or, at least, a significantly better indicator than the RPI ranks. In a bit, I will explain why. But for now, I ask you to assume I am right and that blue is an indicator of true strength.
    The solid lines are computer generated trend lines. For purposes of this explanation, it does not matter why the Northwestern State trend line is a curved line (polynomial order 2) and the Miami FL line is straight. It has to do with rules I follow in forecasting what teams’ ranks will be next year.

    So, what do the charts show? For Northwestern State, the blue being at poorer ranks than the red says that the team’s actual strength is weaker than the RPI ranks tell the Committee it is. Furthermore, the red being at poorer ranks than the orange says that the team’s ranks as a strength of schedule contributor are better than the RPI says they should be, which makes them much better than what the team’s actual strength says they should be.

    On the other hand, for Miami it is just the opposite. For Miami, the blue being at better ranks than the red says that the team actually is stronger than the RPI ranks tell the Committee it is. Furthermore, the orange being at poorer ranks than the red says that the team’s ranks as a strength of schedule contributor are poorer than the RPI says they should be, which makes them much poorer than what the team’s actual strength says they should be.

    In terms of scheduling, what does this mean from an RPI and appealing-to-the-Women’s-Soccer-Committee perspective? Suppose I want to schedule a 2022 opponent in the 150 RPI range. Which would be better for me to play, Northwestern State or Miami? Remember, the RPI says that in 2021 they were next to each other in that range. The answer is: Northwestern State. Northwestern State will be weaker than its RPI rank will tell the Committee it is, whereas Miami will be stronger than the RPI will tell the Committee it is. And even more, Northwestern State will contribute much more to my RPI as a strength of schedule contributor than even its RPI says it should and Miami will contribute much less to my RPI than its RPI says it should.

    Why do I prefer Massey ranks as indicators of teams’ strength? Here are two charts, completely different than the above team charts, that show the main reason for my preference:

    upload_2022-2-12_18-43-59.png

    upload_2022-2-12_18-45-4.png

    These charts show how well teams from the different conferences perform, in non-conference games, in relation to their ratings. The charts run from the conferences with the best average ratings, on the left, to the conferences with the poorest average ratings. (I exclude SWAC because it would make the trends artificially steep.) The red data points and trend lines show how well the teams perform in games where opponents are the most closely rated (the most closely rated 20% of games). The yellow data points and trend lines show performance across all games. If teams are performing in accord with their ratings, their data points will be at 100%. If they are performing better than their ratings say they should, they will be above 100%. And if they are underperforming they will be below 100%. The blue data points show the conference average RPI ratings.

    The RPI chart shows that teams from the strong conferences perform better than their ratings say they should when playing non-conference opponents and teams from the weaker conferences perform more poorly. In other words, the RPI discriminates against strong conferences and in favor of weak ones. Red is the best indicator of the extent of the discrimination, since it is in games between closely rated opponents where the effects of the discrimination are most likely to show up in game results. Yellow is an indicator of the overall effect of the discrimination.

    The Massey chart, on the other hand, shows virtually no discrimination in relation to conference strength.

    Thus Massey does a much better job of getting teams from conferences in the proper order in relation to teams from other conferences. It simply does not have the discrimination the RPI has, nor does it have any other discrimination I have detected.

    Both rating systems have a significant number of games where the actual game results do not match the ratings, but they both have about the same number of misses. (The RPI misses a few more overall, whereas Massey misses a few more when looking only at Top 60 teams.) The difference is that a good number of the RPI misses are systematically related to conference strength whereas the Massey misses appear to be randomly distributed. Thus I prefer Massey as better at getting teams in the proper rank order and therefore being a better indicator of teams’ true strength.

    Going back to Northwestern State and Miami, here are two more charts:

    upload_2022-2-13_11-7-20.png

    upload_2022-2-13_11-7-48.png

    In these charts, the red line is RPI ranks starting at 1 on the left and rising rank-by-rank across the chart. The orange data points are the team’s ranks as strength of schedule contributors, with each point set at the position in the chart that matches the RPI rank at which the team had that strength of schedule contributor rank. The solid orange line is a computer generated trend line based on the data points. It shows the expected rank as a strength of schedule contributor at any point along the RPI rank line. These two charts reinforce that when you play Northwestern State, your RPI strength of schedule will get a better contribution than the RPI says it should and when you play Miami you will get a poorer contribution.
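
    For anyone who wants to reproduce the trend line idea, here is a minimal sketch. It is not the method used in my workbooks; it simply fits a straight least-squares line to (RPI rank, contributor rank) pairs and reads off the expected contributor rank at a chosen RPI rank. The data points and function names are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expected_contributor_rank(rpi_rank, slope, intercept):
    """Read the trend line at a given RPI rank."""
    return slope * rpi_rank + intercept


# Hypothetical (RPI rank, strength of schedule contributor rank) pairs for one team.
# A contributor rank that is a smaller number than the RPI rank means the team
# helps opponents more than its RPI rank suggests.
rpi_ranks = [120, 140, 160, 180, 200]
contributor_ranks = [90, 105, 120, 135, 150]

slope, intercept = fit_line(rpi_ranks, contributor_ranks)
print(expected_contributor_rank(153, slope, intercept))
```

    A curved (polynomial order 2) trend line, as in some of the team charts, would need a quadratic fit instead of the straight line used here.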

    Finally, here are two more charts:

    upload_2022-2-13_11-21-19.png

    upload_2022-2-13_11-21-51.png

    These charts show how Northwestern State and Miami have done their non-conference scheduling over time. The color coding is the same, but the data points are the average ranks of the Northwestern State and Miami non-conference opponents.

    The Northwestern State chart shows that it tends to play teams with the same pattern it has: The RPI overstates their strength and their ranks as strength of schedule contributors overstate their strength even more. Since Northwestern State is getting credit, in its strength of schedule, for playing stronger opponents than it actually is playing, it is no wonder that its own RPI overstates its strength -- that is exactly what you would expect. In addition, the chart suggests that Northwestern State has been slightly increasing the difficulty of its schedules over time, which is something the coach of another team might want to know in deciding whether to schedule them.

    The Miami chart is more mixed. It suggests, however, that Miami has been making its schedules easier over time -- although maybe it recently has been heading back in the other direction as suggested by the data points on the right-hand side of the chart.

    Final Comment: If you are a coach hoping to get an at large position in the NCAA Tournament, these are the kinds of things you must pay attention to. You cannot simply make your best guess as to where an opponent will end up within the RPI rankings and select them as an opponent based on that. Instead, you also must pay attention to how strong they really will be and to how strong the strength of schedule part of the RPI formula will treat them as being.
     
    Rank Cleats, BigBear and Val1 repped this.
  3. cpthomas

    Here are some comparisons of teams adjacent to each other in the RPI rank order, to show the importance of looking at potential opponents’ true strength and ranks as contributors to strength of schedule, in addition to looking at their RPI ranks.

    The first pair is UNC Wilmington (2021 RPI #63) and Pittsburgh (#64). I picked these because teams ranked #58 or poorer appear to be outside the bubble for NCAA Tournament at large consideration. I say this because since 2007, no team ranked #58 or poorer has gotten an at large selection.

    upload_2022-2-14_15-40-18.png

    upload_2022-2-14_15-40-46.png

    From the UNC Wilmington chart, you can see that most of the time, its RPI rank, its true strength (Massey), and its rank as a contributor to your strength of schedule are pretty close: What you see is what you get. One might wonder, though, if more recently the RPI has tended to overrate it and its rank as a strength of schedule contributor has been better than it should be.

    From the Pittsburgh chart, on the other hand, you can see that its RPI rank is poorer than it should be and its rank as a strength of schedule contributor is even worse.

    This makes UNC Wilmington the better opponent from an RPI and NCAA Tournament perspective: You will be playing a significantly weaker team than Pittsburgh, will get strength of schedule credit for playing a significantly stronger team than Pittsburgh, and the RPI will tell the Committee the two teams are equal.

    This is especially unfortunate for Pittsburgh since it appears that its true strength, at least in 2021, should have made it a bubble team for NCAA Tournament at large consideration.

    The following two charts reinforce that UNC Wilmington is the better choice for an opponent:

    upload_2022-2-14_15-56-5.png

    upload_2022-2-14_15-56-54.png

    My next pair are teams from the 2021 #51 to #57 RPI rank range: Old Dominion (#53) and Oregon (#52). I picked from this range because it is at the outside edge of what appear to be the bubble teams for Tournament at large consideration.

    upload_2022-2-14_16-15-47.png

    upload_2022-2-14_16-13-7.png

    For Old Dominion, as you can see, the RPI consistently ranks it better than its true strength. Its rank as a strength of schedule contributor, however, is more variable, although as its strength has improved, it has crossed over into an area where its rank as a strength of schedule contributor is better than it should be. You might note that for Old Dominion the chart starts with 2013 rather than 2007. Their current coach, Angie Hind, began there in 2014. When a coach has arrived after 2007, but has been there at least four years, then I produce two charts: one starting at 2007 and the other starting the year before the coach arrived. I then apply certain rules to decide which chart best represents the team’s trend for purposes of forecasting its RPI rating for next year. In this case, the rules said to use the chart from 2013, so that is why I have used it here. (For practical purposes in this discussion, there is not a big difference between the two charts.)

    On the other hand, for Oregon, the RPI ranks it more poorly than its true strength, especially when its teams are in the poorer ranking ranges. Plus, its rank as a strength of schedule contributor is much poorer than it should be.

    So, in this case Old Dominion is the better opponent to play. These two charts reinforce this:

    upload_2022-2-14_16-37-19.png

    upload_2022-2-14_16-37-59.png

    NOTE: To be continued in the next post. (I have maxed out the number of files I can attach to a post.)
     

    Klingo3034 and upprv repped this.
  4. cpthomas

    Next are two teams in the #41 to #50 range, Loyola Chicago (#45) and West Virginia (#44):

    upload_2022-2-14_18-12-59.png

    upload_2022-2-14_18-13-51.png

    As you can see, the Loyola Chicago chart is mixed. At its current RPI rank level, however, which is slightly poorer than its actual strength as indicated by Massey, its rank as a strength of schedule contributor is a lot better than it should be.

    For West Virginia, on the other hand, its RPI rank and its true strength consistently are close. Its rank as a strength of schedule contributor, however, is poorer than it should be.

    Thus if choosing between these two teams, because of their ranks as strength of schedule contributors, Loyola Chicago is the better team to play. The following charts reinforce this:

    upload_2022-2-14_18-21-57.png

    upload_2022-2-14_18-22-29.png

    Continuing into the better ranked teams, from the RPI rank #31 to #40 group are Grand Canyon (#37) and North Carolina State (#38). This rank area is just outside the #30 and better area, where teams always have gotten at large selections from 2007 to the present.

    upload_2022-2-14_18-34-48.png

    upload_2022-2-14_18-37-43.png

    The Grand Canyon chart suggests that with its improvement, the RPI now ranks it better than it really should and its strength of schedule contributor rank is even better than its RPI rank by a little.

    On the other hand, the North Carolina State chart shows its RPI rank being reasonably consistent with its true strength. But its rank as a strength of schedule contributor is much poorer than it should be.

    Thus Grand Canyon is the better choice as an opponent. These charts reinforce this:

    upload_2022-2-14_18-43-29.png

    upload_2022-2-14_18-44-2.png

    NOTE: To be continued in the next post.
     
    L'orange and Rank Cleats repped this.
  5. cpthomas

    Next are two teams in the #21 to #30 rank range, Hofstra (#24) and Ole Miss (#23):

    upload_2022-2-15_11-25-52.png

    upload_2022-2-15_11-20-55.png

    As you can see, the RPI ranks Hofstra as better than its true strength, and at least recently its rank as a strength of schedule contributor has been even better than its RPI rank.

    The RPI ranks Mississippi pretty close to its true strength, although perhaps recently it has over-ranked Mississippi. On the other hand, the rank as a strength of schedule contributor is much poorer than it should be.

    Thus as between Hofstra and Mississippi, Hofstra is the better opponent to play. The following charts reinforce this:

    upload_2022-2-15_11-31-6.png

    upload_2022-2-15_11-31-53.png

    And finally, for teams in the #11 to #20 range, here are Harvard (#12) and UCLA (#13):

    upload_2022-2-15_11-34-7.png

    upload_2022-2-15_11-34-58.png

    The Harvard chart is mixed, although at its 2021 rank area it appears that its rank as a strength of schedule contributor is better than it should be. And, if you look at the data points for 2021 rather than the trend lines, it is possible it actually is quite a bit weaker than its RPI rank and strength of schedule contributor rank indicate.

    For UCLA, there are not big divergences among the different types of rankings, although its rank as a strength of schedule contributor tends to run about 10 positions poorer than it should run.

    As between the two teams, the charts indicate that Harvard is the better opponent to play. On the other hand, the two charts below suggest that in their current ranking area, at least so far as strength of schedule contribution goes, the two teams are about the same as opponents.

    upload_2022-2-15_11-40-57.png

    upload_2022-2-15_11-41-55.png
     

    L'orange repped this.
  6. cpthomas

    Hopefully, the information in the preceding posts shows the kind of thinking a coach must go through, if he or she is scheduling with a view towards an at large selection (or seed) in the NCAA Tournament.

    Is there a larger lesson one can draw from the examples? In every pair of teams, one was from a Top Tier conference and the other, adjacently ranked by the RPI, was from a Lower Tier. In every case (except possibly Harvard and UCLA), the better team to play was the Lower Tier team. Is this just a coincidence or is it showing us something?

    To understand what is going on, it helps to look at the three RPI elements of the teams: Element 1 (their own winning percentages), Element 2 (the average of their opponents’ winning percentages), and Element 3 (the average of their opponents’ opponents’ winning percentages). These are their three elements in 2021:

    Northwestern State (RPI #153): Element 1: .6579 -- Element 2: .4578 -- Element 3: .4506
    Miami FL (#154): .2500 -- .6009 -- .5684

    What you can see with these two teams is that as compared to each other Northwestern State has built its RPI through its own winning percentage whereas Miami has built its RPI through its strength of schedule. In the following pairings, I will use bold face to show where the strength of each team is as compared to the other.

    UNC Wilmington (#63): .7188 -- .5214 -- .5217
    Pittsburgh (#64): .6111 -- .5502 -- .5569

    Old Dominion (#53): .7500 -- .5143 -- .5229
    Oregon (#54): .6316 -- .5593 -- .5470

    Loyola Chicago (#45): .8325 -- .5093 -- .4992
    West Virginia (#44): .6250 -- .5853 -- .5442

    Grand Canyon (#37): .8095 -- .5326 -- .4994
    NC State (#38): .5000 -- .6492 -- .5679

    Hofstra (#24): .8250 -- .5417 -- .5367
    Mississippi (#23): .6750 -- .6102 -- .5654

    Harvard (#12): .8333 -- .5834 -- .5461
    UCLA (#13): .9211 -- .5292 -- .5533 (Note: UCLA played a very weak non-conference schedule for a team in this rank area.)
    Looking at all of these pairings, you can see that except near the very top of the rankings (Harvard and UCLA), each Lower Tier team has built its ranking, as compared to its Top Tier counterpart, through its own winning percentage. On the other hand, the Top Tier team relies on its strength of schedule as compared to its Lower Tier counterpart.
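
    The comparison can be checked mechanically. The short sketch below uses the 2021 element values listed above and reports, for each pair, which team has the higher value for each element. The function name and the data layout are my own choices for illustration; in each pair I have listed first the team from the lower-rated conference.

```python
# Each entry: (team, Element 1, Element 2, Element 3), 2021 values from this post.
# In each pair the team from the lower-rated conference is listed first.
PAIRS = [
    (("Northwestern State", .6579, .4578, .4506), ("Miami FL", .2500, .6009, .5684)),
    (("UNC Wilmington", .7188, .5214, .5217), ("Pittsburgh", .6111, .5502, .5569)),
    (("Old Dominion", .7500, .5143, .5229), ("Oregon", .6316, .5593, .5470)),
    (("Loyola Chicago", .8325, .5093, .4992), ("West Virginia", .6250, .5853, .5442)),
    (("Grand Canyon", .8095, .5326, .4994), ("NC State", .5000, .6492, .5679)),
    (("Hofstra", .8250, .5417, .5367), ("Mississippi", .6750, .6102, .5654)),
    (("Harvard", .8333, .5834, .5461), ("UCLA", .9211, .5292, .5533)),
]


def stronger_in(pair, element):
    """Return the name of the team with the higher value for element 1, 2, or 3."""
    first, second = pair
    return first[0] if first[element] > second[element] else second[0]


for pair in PAIRS:
    print(pair[0][0], "vs", pair[1][0],
          "| Element 1:", stronger_in(pair, 1),
          "| Element 2:", stronger_in(pair, 2),
          "| Element 3:", stronger_in(pair, 3))
```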

    This takes us to the guts of the RPI formula. In terms of its visible math, it combines the three RPI elements as follows:

    1 x Element 1 (Winning Percentage), plus

    2 x Element 2 (Average of Opponents’ Winning Percentages against other teams), plus

    1 x Element 3 (Average of Opponents’ Opponents’ Winning Percentages)

    This makes it look like strength of schedule (Elements 2 and 3) accounts for 75% of a team’s rating, but that is not the case. In terms of effective weights in determining a team’s RPI, the effective weights are 50% Winning Percentage, 40% Opponents’ Winning Percentages, and 10% Opponents’ Opponents’ Winning Percentages. This means that when I am your opponent, my two RPI elements that go into your strength of schedule are my Element 1 Winning Percentage and my Element 2 Opponents’ Winning Percentage, with Element 1 having four times the impact (40% effective weight) on your RPI of my Element 2 (10% effective weight). Thus apart from the result of our game, the main contribution I make to your RPI is my own Element 1 Winning Percentage. (If anyone wants an explanation of why the effective weights are 50-40-10, let me know and I will post the math.)
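
    As a quick illustration of the visible math only, here is a sketch that combines the three elements with the 1-2-1 weights. It assumes the weighted sum is divided by 4 and it ignores any bonus or penalty adjustments, so treat the outputs as unadjusted approximations. The function name is hypothetical; the element values are the 2021 numbers listed earlier in this post.

```python
def unadjusted_rpi(element1, element2, element3):
    """Visible RPI math: 1 x Element 1 + 2 x Element 2 + 1 x Element 3, divided by 4.

    Assumption: no bonus or penalty adjustments are applied.
    """
    return (1 * element1 + 2 * element2 + 1 * element3) / 4


teams = {
    "Northwestern State": (.6579, .4578, .4506),
    "Miami FL": (.2500, .6009, .5684),
    "Hofstra": (.8250, .5417, .5367),
    "Mississippi": (.6750, .6102, .5654),
}

for name, elements in teams.items():
    print(name, round(unadjusted_rpi(*elements), 4))
```

    The paired teams come out very close to each other even though they get there through quite different mixes of the three elements.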

    Thus using Hofstra and Mississippi as an example, for Hofstra its main contribution to an opponent’s RPI is its .8250 Winning Percentage. Its secondary contribution is its .5417 Opponents’ Winning Percentage, but within the combination of the two, the .8250 Winning Percentage has four times the weight of the .5417 Opponents’ Winning Percentage. For Mississippi, on the other hand, the two numbers are .6750 and .6102, with the .6750 having four times the weight of the .6102. The effect of all of this is to make Hofstra, which relies on its Winning Percentage for its ranking as compared to Mississippi, which relies on its Opponents’ Winning Percentages and Opponents’ Opponents’ Winning Percentages, a better opponent to play of the two. Hofstra, with the better Winning Percentage, will make a better contribution to my RPI strength of schedule -- even though the RPI itself ranks the teams as essentially equal.
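
    To put rough numbers on this, here is a sketch of a simple contribution index. It is only an illustration of the four-to-one effective weighting described above, not the calculation behind the strength of schedule contributor ranks in the charts. The index and its name are hypothetical: it weights an opponent's Element 1 at 0.8 and its Element 2 at 0.2.

```python
def sos_contribution_index(element1, element2):
    """Illustrative index: Element 1 gets four times the weight of Element 2.

    Assumption: a 0.8 / 0.2 split stands in for the 40% / 10% effective weights.
    """
    return 0.8 * element1 + 0.2 * element2


hofstra = sos_contribution_index(.8250, .5417)
mississippi = sos_contribution_index(.6750, .6102)

print("Hofstra:", round(hofstra, 4))
print("Mississippi:", round(mississippi, 4))
```

    On this index Hofstra comes out well ahead of Mississippi, which matches the direction of the argument above.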

    The bottom line is: If you are choosing between two potential opponents you think will be in the same rank area, with one being from a Top Tier conference and the other from a Lower Tier conference, from an RPI perspective you in almost all cases should pick the Lower Tier. This is because if the two are in the same rank area, the Lower Tier team almost always will give you a better contribution to your RPI strength of schedule. This is inherent in the RPI formula the way it presently is constructed.
     
    Rank Cleats and SpeakeroftheHouse repped this.
  7. cpthomas

    All of the above information is well and good, but how much does "smart scheduling" matter?

    I decided to run a scheduling experiment. I chose Michigan State as the experiment subject. And, since I had pairs of teams I compared as opponents in my previous posts, I used them as the potential opponents. As a data base, I used the games and results from the 2021 season, except for Michigan State’s non-conference schedule. I chose Michigan State simply because their actual 2021 RPI rank was #58, which history says for practical purposes put them as the first team outside the bubble for NCAA Tournament at large selection purposes.

    I tested three sets of non-conference opponents (with all games at neutral sites):

    Actual 2021 opponents
    Power 5 opponents
    Mid-Major opponents

    Remember, according to the RPI, the Power 5 and Mid-Major opponents are equal to each other (i.e., in adjacent rank positions).

    For each set of opponents, I needed to assign game results. I wanted the assignment of game results to be statistically appropriate. To accomplish that, I did the following for Michigan State and its opponents:

    1. I used the Massey 2021 ranks as the indications of teams’ true strength. I explained in previous posts why I use Massey.

    2. I then converted the Massey ranks to RPI ratings. I did this by assigning to each rank level the historic average RPI rating for that rank level.

    3. I then computed the RPI rating difference between Michigan State and each of its opponents.

    4. For each opponent, using the computed rating difference, I determined Michigan State’s win-tie-loss likelihood against that opponent (using a table derived from all games played since 2007, roughly 43,000 games).

    5. I then added up the cumulative non-conference win likelihoods, tie likelihoods, and loss likelihoods to get totals that show Michigan State’s likely overall non-conference record.

    6. I then used the individual game likelihoods to set the game results of the non-conference games, so as to make them consistent with the likely overall non-conference record in a manner most consistent with the individual game likelihoods.
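
    Steps 5 and 6 can be sketched in code. This is one possible reading of the process, not the exact procedure I used: it sums the per-game likelihoods, rounds them to a whole-number record that adds up to the number of games, then gives the wins to the games with the highest win likelihoods and the losses to the remaining games with the highest loss likelihoods. The opponents and likelihoods below are hypothetical stand-ins for the values that would come from the win-tie-loss table in step 4.

```python
import math


def round_to_total(values, total):
    """Round values to whole numbers that sum to total (largest-remainder method)."""
    counts = [math.floor(v) for v in values]
    leftover = total - sum(counts)
    by_fraction = sorted(range(len(values)),
                         key=lambda i: values[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


def assign_results(games):
    """games: list of (opponent, p_win, p_tie, p_loss). Returns (record, results)."""
    expected = [sum(g[k] for g in games) for k in (1, 2, 3)]
    wins, ties, losses = round_to_total(expected, len(games))
    results = {}
    # Wins go to the games with the highest win likelihoods.
    remaining = sorted(games, key=lambda g: g[1], reverse=True)
    for g in remaining[:wins]:
        results[g[0]] = "W"
    # Losses go to the remaining games with the highest loss likelihoods.
    remaining = sorted(remaining[wins:], key=lambda g: g[3], reverse=True)
    for g in remaining[:losses]:
        results[g[0]] = "L"
    # Whatever is left is a tie.
    for g in remaining[losses:]:
        results[g[0]] = "T"
    return (wins, ties, losses), results


# Hypothetical opponents and likelihoods, for illustration only.
games = [
    ("Opponent A", 0.70, 0.15, 0.15),
    ("Opponent B", 0.55, 0.20, 0.25),
    ("Opponent C", 0.45, 0.20, 0.35),
    ("Opponent D", 0.35, 0.20, 0.45),
    ("Opponent E", 0.30, 0.20, 0.50),
    ("Opponent F", 0.25, 0.20, 0.55),
]

record, results = assign_results(games)
print(record, results)
```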

    This process gave me the following schedules and results to test:

    Actual 2021 opponents: Detroit (W), Eastern Michigan (W), Central Michigan (W), Florida Atlantic (W), Bowling Green (L), Oakland (W), and Dayton (W). It is worth noting that Michigan State’s actual record in these games was 5 wins, 0 losses, and 2 ties, whereas the process I used assigned a record of 6 wins, 1 loss, and 0 ties. These two records are the same, for practical purposes.

    Power 5 opponents: Miami FL (W), Pittsburgh (T), Oregon (L), West Virginia (W), NC State (L), Mississippi (L). Overall record 2 wins, 1 tie, 3 losses.

    Mid-Major opponents: Northwestern State (W), UNC Wilmington (W), Old Dominion (W), Loyola Chicago (L), Grand Canyon (T), Hofstra (L). Overall record 3 wins, 1 tie, 2 losses.

    It is important to remember, regarding the differences in the Power 5 and Mid-Major results, that although the RPI ranks each pair of teams equally, this does not mean they in fact are equal. Rather, the Power 5 conference teams are likely to be stronger than their Mid-Major counterparts. The overall records reflect this.

    I entered each schedule, with its results, into the RPI data base and ran RPI rating and rank calculations. Here are the results for Michigan State:

    Actual 2021 opponents: RPI .5748 [as compared to the actual 2021 .5752]; Rank #59 [as compared to the actual 2021 #58]. The near equality of the actual numbers with my computed numbers provides a good validation of the system I used.

    Power 5 opponents: RPI .5098; Rank #70

    Mid-Major opponents: RPI .6071; Rank #29 [Note: Since 2007, every team ranked #30 or better has gotten an at large position in the NCAA Tournament.]

    Two things account for the Mid-Major schedule being much better than the Power 5 schedule, in relation to Michigan State’s NCAA Tournament prospects:

    1. The Mid-Major teams, although ranked equally with the Power 5 teams, actually tend to be weaker so that Michigan State is likely to have better game results against them; and

    2. The Mid-Major teams tend to have better ranks as strength of schedule contributors than their Power 5 counterparts.

    CONCLUSION: Smart scheduling matters a lot.
     
  8. Tash Deliganis

    Jan 16, 2022
    Thanks for all of this. It's a great contribution.
     
  9. cpthomas

    How do the potential opponent scheduling factors I emphasized in previous posts -- RPI Rank, true strength rank as represented by Massey, and rank as a strength of schedule contributor -- relate to conferences? I have suggested that if you have two teams in the same RPI rank area, of which one is from a very strong conference and the other from a weaker conference, then when you consider their true strength and strength of schedule contributor ranks, the team from the weaker conference likely will be the better team to play from an RPI and NCAA Tournament perspective. But is this just a matter of the teams I picked in my pairing examples or is it true across the board?

    To get at this question, I ran another experiment:

    1. For each year since 2013, I arranged each conference’s teams in the order of their RPI ranks, and within the conference ranked them in that order. I used 2013 as the beginning year since that was when the strong conferences had completed a major realignment so that from then until 2021 their membership has been pretty stable.

    2. I then identified, for each rank level within the conference:

    a. The average difference (from 2013 through 2021) between the RPI rank and the strength of schedule contributor rank; and

    b. The average difference between the true strength [Massey] rank and the RPI rank.

    3. I then highlighted in orange or green (I will explain the colors below) those places where the average difference was 10 positions or greater.

    This produced the following tables:

    upload_2022-2-18_15-25-52.png

    In this and the following tables, the orange highlighting is for those rank levels where (1) the team’s RPI rank is better than its rank as a strength of schedule contributor or (2) the team’s RPI rank is poorer than its true strength [Massey] rank -- or both. In other words, these are teams that are better than the RPI tells the Committee they are and whose strength of schedule contributions to opponents are poorer than the RPI ranks say they should be. From an RPI perspective, they are undesirable opponents.

    The green highlighting is the opposite. It is for rank levels where (1) the team’s rank as a strength of schedule contributor is better than its RPI rank or (2) the team’s RPI rank is better than its true strength [Massey] rank -- or both. In other words, these are teams that are weaker than the RPI tells the Committee they are and whose strength of schedule contributions to opponents are better than the RPI ranks say they should be. From an RPI perspective, they are desirable opponents.
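
    The averaging and highlighting rules can be sketched as follows. This is a simplified illustration under my own assumptions, not the workbook logic: it averages the two rank differences for each conference rank level and flags each average separately once it reaches 10 positions in either direction. The conference names, rank values, and function names are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 10  # positions, as in step 3


def average_differences(records):
    """records: (conference, in-conference rank level, RPI rank,
    strength of schedule contributor rank, Massey rank), one per team-year."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for conference, level, rpi, sos, massey in records:
        entry = sums[(conference, level)]
        entry[0] += sos - rpi      # positive: contributor rank poorer than RPI rank
        entry[1] += rpi - massey   # positive: RPI rank poorer than true strength
        entry[2] += 1
    return {key: (a / n, b / n) for key, (a, b, n) in sums.items()}


def highlight(average_difference):
    """Orange marks an undesirable opponent trait, green a desirable one."""
    if average_difference >= THRESHOLD:
        return "orange"
    if average_difference <= -THRESHOLD:
        return "green"
    return None


# Hypothetical team-years for illustration only.
records = [
    ("Conference A", 5, 60, 85, 45),
    ("Conference A", 5, 70, 95, 50),
    ("Conference B", 1, 60, 40, 80),
    ("Conference B", 1, 50, 36, 62),
]

averages = average_differences(records)
for key, (sos_diff, strength_diff) in averages.items():
    print(key, highlight(sos_diff), highlight(strength_diff))
```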

    The table is clear: Except at the very tops of these conferences, from an RPI perspective the teams are not desirable opponents.

    Although the table looks as if I picked the strongest conferences to be on it, that is not what I did. Rather, I put on it those conferences where, except at the very tops, all or nearly all of the teams were not desirable opponents from an RPI perspective. Except for the Big West, the other conferences in fact are the strongest conferences as measured by average RPIs over the years. The Big West is a bit of a surprise since in terms of average RPIs over the years, the Ivy League, Conference USA, and the Colonial AC are immediately ahead of it on the list. If you look at how well teams from those conferences and the Big West perform in non-conference games in relation to their RPI ratings, however, the Big West performs better than its ratings say it should whereas the other three perform more poorly than their ratings say they should (with the Colonial performance being somewhat mixed between performing more poorly and performing just about right). From that perspective, the Big West being on the list is less of a surprise.

    The next table is conferences at the other end of the spectrum:

    upload_2022-2-18_15-55-27.png

    Comparing teams on this table with teams on the first table: if you have two teams with similar RPI ranks, one with green highlighting on this table and the other with orange highlighting on the first table, the better team to play is the one with the green highlighting on this table. If you think of the arrangement of conferences within geographic regions as a food chain, and if you are in a conference at or near the top of your region’s food chain, the conferences on this second table are nutritious ones to have in your region. In regard to that, it is good to note that these conferences all are in either the North or the South geographic region.

    This leaves a third set of conferences that are more mixed:

    upload_2022-2-18_16-13-32.png

    As you can see, teams from these conferences can be a good opponent in one respect but a poor opponent in the other. Thus, when you are choosing between a potential opponent from this table and a potential opponent from one of the other tables, and the two have roughly equal RPI ranks, you have to look at the details to see which opponent you think would be better from an RPI perspective.
     
  10. Fanatic#88

    Fanatic#88 Member

    Nov 22, 2021
    Good stuff. Thank you.

    Do the equations take the fewer-games-played model into consideration as well? I think the past years have shown that inflated RPIs are awarded to some teams and leagues that play two or more fewer games prior to the postseason (American, Ivy, etc., for example).
     
  11. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    The Ivy League plays the fewest games per team, but the American is in about the range of most of the strong conferences -- actually playing more, on average, than the ACC for example.

    I do not think the absolute number of games they play is a factor. On the other hand, the proportion of their games that are conference games, as opposed to non-conference games, is a factor. This is for a couple of reasons. One is that conference games tend to pull all conference teams’ strengths of schedule towards 0.500. Think of it this way:

    1. My conference plays a full round robin.

    2. I am conference Team A.

    3. Conference Teams B and C play each other. B wins over C.

    4. I play Team B. Their win over Team C goes into my strength of schedule as part of their winning percentage. I play Team C. Their loss to Team B goes into my strength of schedule as part of their winning percentage. (And, these also go into my strength of schedule again as part of my opponents’ opponents’ winning percentages.) Thus their combined contribution to my strength of schedule is 0.500.

    5. As a result of step 4, conference games tend to pull my strength of schedule towards 0.500. If I am in a strong conference, this is not good because strong conference teams ordinarily have strengths of schedule above 0.500. If I am in a weak conference, it is good because weak conference teams ordinarily have strengths of schedule below 0.500. Thus from an RPI perspective, this hurts teams from strong conferences with high proportions of conference games as compared to teams from strong conferences with lower proportions of conference games. Although this effect is not huge, at the margins for NCAA Tournament purposes, it may matter.
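
    The round robin point in steps 1 through 5 can be checked with a small sketch in Python. The assumptions are mine: a ten-team conference with made-up team names, coin-flip results, no ties, and counting only the games my conference opponents play against each other (the games against me are left out). Under those assumptions, whatever the individual results are, the average of my opponents’ winning percentages in those games comes out to exactly 0.500.

```python
import random
from fractions import Fraction
from itertools import combinations

def opponents_conference_wp(teams, me, rng):
    """Average winning percentage of my conference opponents, counting only
    the round robin games they play against each other (no ties here)."""
    others = [t for t in teams if t != me]
    wins = {t: 0 for t in others}
    games = {t: 0 for t in others}
    for a, b in combinations(others, 2):
        winner = rng.choice((a, b))  # hypothetical coin-flip result
        wins[winner] += 1
        games[a] += 1
        games[b] += 1
    # Exact fractions avoid floating-point rounding in the average.
    return sum(Fraction(wins[t], games[t]) for t in others) / len(others)

teams = [f"Team {letter}" for letter in "ABCDEFGHIJ"]  # made-up conference
average = opponents_conference_wp(teams, "Team A", random.Random(0))
print(average)  # 1/2
```

    Every win by one opponent is a loss by another, so the wins and losses always cancel out. That is the pull towards 0.500, and the more of my schedule those games make up, the stronger the pull.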

    The second reason my proportion of non-conference games matters is that, if I have a large number of conference games, I have fewer schedule slots for non-conference games. And, it is in non-conference scheduling that I have some control that will let me match my schedule to my NCAA Tournament aspirations. The Michigan State example I used shows this.

    Of course, there may be non-RPI and non-NCAA Tournament reasons for having a high proportion of conference games.
     
    Fanatic#88 repped this.
  12. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Here is a bit of information about the extent of home field advantage:

    Across all teams, if you add together the percent of games a team wins when it has the better rating to the percent of games the team wins or ties when it has the poorer rating, on average the two percentages will add up to 100%. That is, this is what will happen if you adjust the ratings to take home field advantage into account.

    For the current version of the RPI, and not adjusting the ratings to take home field advantage into account, the home team performance percentage is 119.7% and the away team percentage is 80.3%. That shows how big home field advantage is on average.
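
    Here is a minimal sketch, in Python, of how a performance percentage like this can be computed. The eight games and their ratings are invented for illustration (they are not real results), the function name is hypothetical, and skipping games between teams with identical ratings is my assumption.

```python
def performance_pct(games, side):
    """Percent of games won with the better rating plus percent of games won
    or tied with the poorer rating, for the home ("H") or away ("A") side.

    games: list of (home_rating, away_rating, result), result in {"H", "A", "T"}.
    """
    better_n = better_wins = poorer_n = poorer_ok = 0
    for home_rating, away_rating, result in games:
        if home_rating == away_rating:
            continue  # assumption: equal ratings have no better or poorer side
        if side == "H":
            mine, theirs = home_rating, away_rating
        else:
            mine, theirs = away_rating, home_rating
        if mine > theirs:
            better_n += 1
            if result == side:
                better_wins += 1
        else:
            poorer_n += 1
            if result in (side, "T"):
                poorer_ok += 1
    return 100 * better_wins / better_n + 100 * poorer_ok / poorer_n

# Invented games, ratings NOT adjusted for home field advantage.
games = [
    (0.60, 0.55, "H"), (0.61, 0.50, "H"), (0.58, 0.52, "H"), (0.62, 0.51, "T"),
    (0.50, 0.60, "H"), (0.48, 0.55, "T"), (0.45, 0.58, "A"), (0.47, 0.61, "A"),
]
print(performance_pct(games, "H"))  # 125.0
print(performance_pct(games, "A"))  # 75.0
```

    The home and away numbers always add up to 200%, just as 119.7% and 80.3% do, so the distance of each from 100% is the size of the home field effect.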
     
    Rank Cleats repped this.
  13. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    #13 cpthomas, Dec 19, 2022
    Last edited: Dec 19, 2022
    If you are a glutton for punishment, here is a link to my just-released Team Histories and Simulated 2023 RPI Ranks Excel workbook. It is a resource for coaches as they are scheduling non-conference opponents. If you want to see it, you will need to download it, as it is too large a workbook to open directly in Google Drive. As a warning, it is very large -- with a page for each team plus additional summary and other pages.

    The workbook shows historic rank information tables and charts for the teams from 2007 to the present. The rank information includes NCAA RPI, Massey, and my Improved (Balanced) RPI ranks. It also includes team ranks as contributors to other teams’ strengths of schedule within the RPI formula. Generally speaking, the Massey and my Improved RPI ranks are reasonable representatives of teams’ actual strength from year to year and the RPI is the rating system the Committee is required to use in evaluating team strength. Strength as indicated by the RPI can be quite different than true strength. Further, how the RPI ranks teams as strength of schedule contributors to their opponents can be quite different than their true strength ranks or their RPI ranks. The workbook lets you see these differences, team by team. If you have seen the charts I sometimes post on the Hot Seat threads, this is where they come from.

    The pages for teams also let you see how their non-conference scheduling has changed over time, with average opponent ranks (RPI and my Improved RPI) and ranks as strength of schedule contributors.

    The workbook also includes other information such as who the team coaches are, coach longevity, numbers of coaches by gender, teams that will be changing conferences, and so on.
     
    TimB4Last and Klingo3034 repped this.
  14. SpeakeroftheHouse

    PSG
    Italy
    Nov 2, 2021
    I can’t believe you took the time to do all that! Really lays out all the details on each program. Trends. Direction they are headed. Overall strength. Really cool stuff. Only question I have is how you came up with 2023 sim schedules when you don’t have anyone’s schedule yet?
     
  15. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    They are 2023 simulated ranks and ratings, not schedules. These are not projected team ranks and ratings as of the end of the coming season. Rather, they are simulated ranks and ratings that show how strong my system says the teams are going into the coming year. The User Guide has a detailed explanation of how I derive those ranks and ratings.

    One of the interesting things is that if I treat the simulated ranks and ratings as absolute definitions of team strength and then, once I do have all the schedules for the coming season, apply the ratings game by game, as adjusted for home field advantage, to every game to come up with an entire simulated season and end-of-season RPI ratings and ranks, the end-of-season ranks will be quite different than the ranks based on team strength. This is due to the way the RPI formula works and illustrates that it does not measure team strength but rather a mix of team strength and strength of schedule. And, the way it measures strength of schedule, as the tables and charts show, is defective.
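
    To make the idea concrete, here is a rough sketch in Python. It is not the workbook’s method (the User Guide describes that). The assumptions are all mine: four made-up teams and ratings, a hypothetical `home_edge` added to the home team’s rating, the higher adjusted rating always winning with no ties, and the basic published RPI weighting of 25% winning percentage, 50% opponents’ winning percentage, and 25% opponents’ opponents’ winning percentage, with no bonus or penalty adjustments.

```python
from statistics import mean

def simulate_results(ratings, schedule, home_edge):
    """schedule: list of (home, away). Returns a list of (winner, loser)."""
    results = []
    for home, away in schedule:
        home_wins = ratings[home] + home_edge >= ratings[away]
        results.append((home, away) if home_wins else (away, home))
    return results

def winning_pct(team, results, exclude=None):
    """Team's winning percentage, optionally leaving out games against `exclude`."""
    games = [(w, l) for w, l in results if team in (w, l) and exclude not in (w, l)]
    if not games:
        return 0.0
    return sum(1 for w, _ in games if w == team) / len(games)

def opponents(team, results):
    """One entry per game played, so repeat opponents count more than once."""
    return [l if w == team else w for w, l in results if team in (w, l)]

def basic_rpi(team, results):
    opps = opponents(team, results)
    wp = winning_pct(team, results)
    # Opponents' records leave out their games against the team being rated.
    owp = mean(winning_pct(o, results, exclude=team) for o in opps)
    oowp = mean(
        mean(winning_pct(oo, results, exclude=o) for oo in opponents(o, results))
        for o in opps
    )
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp

# Made-up strength ratings and schedule, for illustration only.
ratings = {"A": 0.65, "B": 0.60, "C": 0.55, "D": 0.50}
schedule = [("B", "A"), ("A", "C"), ("C", "D"), ("D", "B")]
results = simulate_results(ratings, schedule, home_edge=0.02)

by_strength = sorted(ratings, key=ratings.get, reverse=True)
by_rpi = sorted(ratings, key=lambda t: basic_rpi(t, results), reverse=True)
print(by_strength)
print(by_rpi)
```

    With a real schedule in place of the toy one, comparing the two printed orderings shows how far the end-of-season RPI ranks drift from the strength ranks that produced every result.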

    If you really want to play around with the workbook, be sure to read the User Guide. It tells you how you can use the workbook to see who would be good opponents for your favorite team to play.

    I have been doing these for several years as a resource for coaches.
     
    TimB4Last and SpeakeroftheHouse repped this.
  16. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    In my Don Quixote quest to get the NCAA to change or replace the RPI formula, I have done a project that shows pretty starkly how discriminatory the RPI really is. If you are at all interested in who the RPI helps and who it hurts, this blog article is something you will want to see:
    ANOTHER WAY TO LOOK AT HOW RPI DEFECTS RELATE TO SCHEDULING

    It is getting harder and harder to understand why schools and coaches affiliated with strong conferences (and from the West) are not jumping up and down screaming at the NCAA to make a change.
     
    Carolina92, TimB4Last and ytrs repped this.
  17. Enzo the Prince

    Sep 9, 2007
    Club:
    CA River Plate
    Chris, I'd be curious if you have, off the top of your head, a handful of coaches who 'get' these scheduling maxims, and a handful who don't get it, and keep missing the tournament because of it?
     
  18. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    #18 cpthomas, Dec 27, 2022
    Last edited: Dec 27, 2022
    I do not have that. My educated guess is that at this point most coaches with NCAA Tournament aspirations have gotten pretty well educated on what to be thinking about when doing their non-conference scheduling. This may not always be apparent when you look at a team schedule, but that often can be because of factors such as:

    1. Except for the top and bottom of the rankings, there is a lot of variability in where teams end up in the rankings from one year to the next. A coach may schedule a non-conference opponent based on the best available information and then have the opponent bomb out as a good opponent to play.

    2. A team that historically has been a good opponent may become not so good because it schedules too many strong opponents who want to play it. I never have tracked likely good opponents to see how often this happens, but I always think it is something a coach with Tournament aspirations should watch out for by maybe asking who else the potential opponent will be playing.

    I think of non-conference scheduling as doing the best you can based on good statistical information combined with additional knowledge you have about potential opponents (who is coming in, who is going out, etc.) and knowing that sometimes things just do not work out the way you reasonably expect. It most definitely is an inexact science.

    If I get some time, I might take a look at teams that just missed NCAA Tournament slots and see if it looks like better scheduling would have made a difference. That would be a difficult but interesting project.
     
  19. whatagoodball

    whatagoodball Member

    Barcelona
    United States
    Dec 9, 2021
    Is it true that the NCAA pays for travel, food, and lodging costs for the teams in the NCAA tournament? If so, there would appear to be a significant financial incentive to favor the other regions at the expense of the West region. The geographic "density" of teams in the playoffs is much lower in the West region than in the others.
     
  20. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    The answer to your question is, Yes.
     
    whatagoodball repped this.
  21. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Not wanting to start a new thread ...

    As some of you know, I maintain the website RPI for Division I Women’s Soccer. The website over time has increasingly focused on being a coach resource, but also is a resource for fans. It covers two primary subjects:

    1. It provides information on the RPI, including but not limited to: the details of the RPI formula, an evaluation of how the RPI performs as a rating system, problems the RPI has rating teams from a conference in relation to teams from other conferences, problems it has rating teams from a geographic region in relation to teams from other geographic regions, and modified versions of the RPI that would perform better.

    2. It provides information about the NCAA Tournament, including but not limited to: the factors used for selecting at large participants and seeds, the procedures the Women’s Soccer Committee goes through over the course of the season, an analysis of which factors appear to be the most important in the at large selection and seeding process, how coaches and fans can evaluate their teams’ Tournament prospects as the season progresses, and resources coaches can use so that their non-conference scheduling is appropriate in relation to their NCAA Tournament aspirations.

    I have just completed my annual update of the website. Of particular importance this year, the information on the RPI shows how we can expect it to perform under the rule change that eliminates overtimes except during conference tournaments. The data related to NCAA Tournament at large selections and seeds also are updated to include the 2022 season.
     
  22. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
    Each year, I update my Team Histories and Simulated 20XX Balanced RPI Ranks Excel workbook. It is chock full of information about teams, including a page for each team showing its rank history and other information about the team as a potential opponent. It is primarily a resource for coaches in scheduling non-conference games, but also can be of interest to serious fans.

    It is a big workbook that you will need to download if you are interested in having it. To access it for downloading, go to the RPI Excel Workbook Library, read the instructions for downloading documents, scroll down to Library Catalog, 2024 Documents, and use the Team Histories and Simulated 2024 Balanced RPI Ranks link to begin the download process.
     
  23. cpthomas

    cpthomas BigSoccer Supporter

    Portland Thorns
    United States
    Jan 10, 2008
    Portland, Oregon
    Nat'l Team:
    United States
