Keeper stats

Discussion in 'Statistics and Analysis' started by beineke, Sep 27, 2003.

  1. beineke

    beineke New Member

    Sep 13, 2000
  2. beineke

    beineke New Member

    Sep 13, 2000
    Here are the results of the study put together by ChrisE and me. For more detailed discussion, see the link above.

    Offside-Adjusted Ratings for 2001-03

    1. Metros +4.17% (Howard, Walker)
    2. San Jose +2.31% (Cannon, Onstad)
    3. Fire +1.93% (Thornton)
    4. Crew +1.61% (Presthus, Busch)
    5. LA +1.24% (Hartman)
    6. NE -0.61% (Sommer, Brown)
    7. DC -0.78% (Rimando, Ammann)
    8. KC -1.28% (Meola)
    9. Colorado -1.61% (Garlick)
    10. Dallas -6.05% (Jordan, Countess)

    That is to say, the Metros saved 4.17% more shots than you would've expected from looking at the number of times they pulled their opponents offsides.

    A few more points:
    -- In addition to the numbers above, the 2001 Fusion were at +2.78%, and the 2001 Mutiny were at -5.65% ... note that those numbers are more scattered.

    -- If we hadn't adjusted for offsides, the Metros would still be #1, but their lead would be less dramatic. The Crew would be #2, and the Revs would be #9.

    -- In a couple of cases (Rimando at DC, Brown at NE), the current keeper has put up markedly better numbers than his predecessor. In other cases, there doesn't appear to be much difference.
  3. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    So, I'm resurrecting a very very old thread, and I'm going to do so rather poorly, but if I don't, it's going to languish here, unfinished, for god knows how long.

    So, months ago, I gathered up the 8 seasons of goalkeeper info (less a couple of unavailable TB/Miami seasons), but didn't really have much to do with it. Seeing as I collected these stats months ago, I can't be sure of their accuracy. For offsides, I simply took offsides/games (which is not as good as offsides/minutes, but I'm sorry, I'm not going to go back now and change it); for save %, I removed pk's, and simply took (sog-goals)/sog - that may not be exactly what we're trying to measure, but it should be measuring the same thing across all 8 years.

    Team	Year	Offsides	Save %
    Clb	1996	3.22	0.731
    Col	1996	6.38	0.725
    Dal	1996	2.56	0.820
    DC	1996	4.50	0.723
    KC	1996	3.13	0.692
    LA	1996	3.81	0.787
    Met	1996	3.28	0.786
    NE	1996	3.19	0.753
    SJ	1996	1.56	0.759
    TB	1996	2.69	0.790
    Clb	1997	2.75	0.803
    Col	1997	4.81	0.744
    Dal	1997	2.25	0.831
    DC	1997	6.34	0.757
    KC	1997	2.25	0.727
    LA	1997	5.03	0.755
    Met	1997	4.31	0.780
    NE	1997	4.34	0.717
    SJ	1997	3.59	0.719
    TB	1997	3.09	0.724
    clb	1998	2.16	0.752
    Chi	1998	2.09	0.744
    Col.	1998	3.72	0.722
    Dal.	1998	2.63	0.752
    DC	1998	6.25	0.727
    KC	1998	2.75	0.697
    LA	1998	4.16	0.735
    Met.	1998	4.84	0.755
    NE	1998	3.25	0.689
    SJ	1998	3.16	0.690
    clb	1999	1.00	0.777
    Chi	1999	2.13	0.778
    Col.	1999	2.47	0.802
    Dal.	1999	1.63	0.819
    DC	1999	5.25	0.760
    KC	1999	3.06	0.730
    LA	1999	2.63	0.823
    Met.	1999	3.47	0.702
    NE	1999	5.25	0.722
    SJ	1999	2.59	0.743
    clb	2000	1.97	0.741
    Chi.	2000	2.97	0.727
    Col.	2000	2.94	0.749
    Dal.	2000	2.69	0.738
    DC	2000	3.97	0.683
    KC	2000	2.44	0.836
    LA	2000	3.13	0.775
    Met.	2000	4.19	0.797
    Mia.	2000	3.84	0.741
    NE	2000	3.69	0.717
    SJ	2000	4.28	0.791
    TB	2000	4.03	0.804
    clb	2001	3.08	0.812
    Chi.	2001	3.31	0.810
    Col.	2001	4.23	0.772
    Dal.	2001	2.31	0.664
    DC	2001	3.65	0.691
    KC	2001	3.31	0.720
    LA	2001	1.77	0.755
    Met.	2001	3.42	0.829
    Mia.	2001	3.65	0.792
    NE	2001	3.35	0.751
    SJ	2001	2.96	0.807
    TB	2001	4.62	0.720
    clb	2002	2.07	0.770
    Chi.	2002	3.00	0.792
    Col.	2002	2.96	0.724
    Dal.	2002	1.82	0.761
    DC	2002	2.50	0.802
    KC	2002	3.46	0.732
    LA	2002	1.93	0.801
    Met.	2002	3.82	0.777
    NE	2002	4.43	0.751
    SJ	2002	3.61	0.796
    clb	2003	1.86	0.784
    Chi.	2003	2.69	0.788
    Col.	2003	3.17	0.746
    Dal.	2003	2.10	0.716
    DC	2003	2.03	0.808
    KC	2003	3.72	0.770
    LA	2003	2.83	0.830
    Met.	2003	1.83	0.817
    NE	2003	4.76	0.712
    SJ	2003	2.38	0.786
    For all 8 years, we therefore get totals of:

    Year	Offsides	Save %
    1996	3.431	0.757
    1997	3.878	0.756
    1998	3.500	0.726
    1999	2.947	0.766
    2000	3.344	0.758
    2001	3.304	0.760
    2002	2.961	0.771
    2003	2.738	0.776
    We see that offsides/game have been declining significantly over the years (r=-.77), while save percentage has been rising (r=.60).
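
    For anyone who wants to check those two trend figures, a minimal sketch in plain Python follows. It correlates each column of the yearly table above against the season. The helper name `pearson` is just an illustrative choice, not anything from the original spreadsheet.

```python
from math import sqrt

# Yearly league figures copied from the summary table above.
years = [1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003]
offsides = [3.431, 3.878, 3.500, 2.947, 3.344, 3.304, 2.961, 2.738]
save_pct = [0.757, 0.756, 0.726, 0.766, 0.758, 0.760, 0.771, 0.776]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_offsides = pearson(years, offsides)  # trend in offsides/game over time
r_save = pearson(years, save_pct)      # trend in save % over time
print(f"offsides vs. year: r = {r_offsides:.2f}")
print(f"save %   vs. year: r = {r_save:.2f}")
```

    With the rounded table values this comes out at roughly -0.77 for offsides and a little under 0.6 for save %, in line with the figures quoted.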

    Using my extremely meager linear regression abilities, I get, from the preceding list, a formula that looks like offsides = 79% - 1.009%*offsides/game.

    This, of course, is where I stop. I think the next logical step would be to adjust for the decrease in offsides over time (which I've done, and improves correlation of predicted offsides from .27 to .37), but I'm reluctant to do that without someone's approval. Obviously, I ought to test for significance, but I haven't got the faintest clue how to do that.
  4. mellon002

    mellon002 Member

    Jan 24, 2003
    Towson, MD
    I know you posted this a while ago but to answer your question the only thing I can think of is BigSoccer live which shows the most recent posts. Did you mean like a part of the blog section?
  5. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    New England Revolution
    Yeah that was taken care of by Huss during the switch which was great for him. Although I haven't been participating much recently, the forum appears to be really healthy and I will be redoubling my efforts shortly. Another nice thing from the switchover that you'll notice is the ability to have spreadsheets scroll left and right. It makes things we do around here infinitely more readable.

    In terms of Chris' numbers it's interesting that the save percentages leaguewide haven't changed all that much. The only dip you see is during an expansion year which would seem to make sense. Another interesting comparison might be to see how these numbers compare with other leagues. However, I think that the leaguewide goal save percentages could eventually be an interesting baseline as to looking for what are the chances on average of a shot going in at any given shot.
  6. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    The regression equation I get for that is slightly different (and by the way, I'm assuming you made a typo and were really trying to predict saves from offside calls/game):

    save%=0.8451 - .02646 (off/game)

    This equation's r-squared (i.e., the amount of the variance in save% that it explains) is 0.41, which is pretty good. Incidentally, take the square root of the r-squared, and you get r, the correlation coefficient of save% and off/game, which is about .65ish or so. Not too bad of a correlation, but we only have 8 data points.

    And here's what the equation means: Offside calls/game is a marginally significant predictor of save% (p<.09). This means that as offside calls/game go up by one, save percentage goes DOWN by about 2.6 percentage points. More offside calls lead to lower save percentage.

    As to why this is (more aggressive defenses lead to more 1 vs 1 opportunities?), I'm not entirely sure.

    You're right, however, that offside calls have been declining over time (the correlation is significant at p<.02). Interestingly, when you control for time (i.e., adjust for the decrease in offside calls over time), the relationship between offside calls and save% is no longer significant (p=.41). So any effect of offside on save% was really a function of offside calls going down over time and save% going up over time. That and the fact that this is a small sample, which is not ideal from an inferential standpoint.

    So, bottom line, it looks like there's not really any relationship (in this small sample) between offsides-trap-like defenses and save percentage, once you control for time.
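
    The fit on the eight season averages can be reproduced with a short ordinary least squares sketch. It assumes the yearly figures from ChrisE's summary table are the inputs; the function name `ols` is only illustrative.

```python
# Yearly league figures from ChrisE's summary table (1996-2003).
offsides = [3.431, 3.878, 3.500, 2.947, 3.344, 3.304, 2.961, 2.738]
save_pct = [0.757, 0.756, 0.726, 0.766, 0.758, 0.760, 0.771, 0.776]


def ols(x, y):
    """Fit y = intercept + slope * x; return (intercept, slope, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = ols(offsides, save_pct)
print(f"save% = {intercept:.4f} + ({slope:.5f}) * off/game, R^2 = {r_squared:.2f}")
```

    Run on those eight rows, the intercept, slope and r-squared should land very close to the 0.8451, -.02646 and 0.41 given above.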
  7. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    I guess. I sort of liked to have 70 entries on a single page. Maybe I just like things being unreadable.

    The change in the expansion year isn't surprising, but there's no reason I'd have predicted it to go down instead of up. I mean, people generally make the argument about expansion weakening pitching in the majors, but it ought to weaken hitting too - same here, while goalkeeping/defense may have weakened (although Chicago's keeper was Zach Thornton and Miami's was Jeff Cassar - not significantly weaker), I don't see any reason that it would weaken more than offense. What makes it especially strange is that there was no concurrent jump when the league contracted in 2002 (although 1998 was the first year Ian Feuer ever got significant minutes - maybe he can be blamed).

    I think it might be interesting to compare these numbers to other leagues, but I'm not sure exactly what it would show you. Higher save %'s don't necessarily mean a league has better goalkeepers or worse strikers - it might be the case that shot selection is different (I believe England takes a lot more low-percentage long-range shots), or defenses have different strategies, or a host of other factors. They would tell you 'for what are the chances on average of a shot going in at any given shot,' but I'm not sure what that tells you.
  8. JG

    JG Member+

    Jun 27, 1999
    Why would you need to adjust for the decrease in offside calls over time, unless there's an indication that this is an officiating change rather than a tactical change?
  9. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    Yeah, my bad on the typo. Thanks a lot for the input, ur_land. It looks to me that the difference between our numbers (certainly not slight!) is that you just used the 8 season averages; although I suspect I didn't make it clear, I used the 80 or so individual team-seasons. It doesn't make sense to me to use just the 8 seasons, since you eliminate the teams that would show the most distinct effect.

    That was exactly the theory (Marvin Fischer and beineke's, I believe). More offsides traps means more blown offsides traps means better shooting percentage for the shots that players actually get (probably produces fewer shots in general, also).

    Let me here apologize for screwing up and making my initial post unclear, and try to post some results from the regression that I did. R squared in this case is a measly .075 - offsides clearly don't have as significant an effect as a lot of other factors do. Adjusted r squared is even lower, .063, although I don't know what that means. Excel is kind enough to give me a whole lot of numbers I don't understand, but I think (significance F = .011881) means I can pull one of these (p<.02).
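
    On the adjusted r squared question: it is the ordinary r squared with a penalty for the number of predictors relative to the number of observations, so it always sits a little below the raw figure. A minimal sketch, assuming 84 team-seasons (the number of rows in the table above) and one predictor:

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Shrink R^2 to account for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R^2 as quoted in the post; 84 team-seasons and 1 predictor are assumptions
# read off the table and the description of the regression.
adj = adjusted_r_squared(0.075, n_obs=84, n_predictors=1)
print(f"adjusted R^2 = {adj:.3f}")
```

    This gives a value within about a thousandth of the .063 Excel reported; the small gap is presumably because .075 is itself a rounded number.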

    I'm still not gonna mess with adjusting for time unless someone gives me some help.
  10. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    One reason that I think it might be useful is that the yearly error in the regressed save percentages correlates very strongly to the average yearly difference from the mean. Maybe this is to be expected, I'm sort of lost, but here's the data:

    mean = .759

    Year	Save %	Diff. from mean
    1996	0.757	-0.002
    1997	0.756	-0.003
    1998	0.726	-0.032
    1999	0.766	0.007
    2000	0.758	0.000
    2001	0.760	0.002
    2002	0.771	0.012
    2003	0.776	0.017
    Total	0.759	0.000

    Year	Regression	Average
    1996	-0.002	0.002
    1997	-0.040	0.003
    1998	0.292	0.032
    1999	-0.043	-0.007
    2000	-0.014	0.000
    2001	-0.033	-0.002
    2002	-0.094	-0.012
    2003	-0.124	-0.017
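
    To put a number on "correlates very strongly", the two columns of the second table can be correlated directly. A minimal sketch, using only the eight rows above:

```python
from math import sqrt

# Columns of the second table: yearly regression error and yearly
# difference from the mean, as posted.
regression = [-0.002, -0.040, 0.292, -0.043, -0.014, -0.033, -0.094, -0.124]
average = [0.002, 0.003, 0.032, -0.007, 0.000, -0.002, -0.012, -0.017]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r_columns = pearson(regression, average)
print(f"r = {r_columns:.2f}")
```

    The result is well above 0.9, although most of it is driven by the single 1998 row.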
  11. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    New England Revolution
    There's a mental connection that I'm having trouble making here. If save percentage is going down, then that should mean that more goals are being scored, yes? It would seem to be the obvious inference. However, it might be interesting to see offside rate as it relates to goals per game or shots per game.

    It'd seem to me that the offside trap, while it can be an effective defensive tool, is usually more of a bail-out maneuver. How many times do you watch a game and see a team that is attacking, attacking and attacking, and they'll have a couple of big offsides calls go against them? I think it's possible that the correlation we're seeing between more offsides and fewer saves is that teams are doing the lion's share of attacking.

    Problem being that we're dealing with a ratio here. So back at the beginning, what's special about this situation where keepers aren't getting shots off? Essentially, I'd have no idea, and because this damn sport is so weak with independent events it'd be terribly hard to tell. However, I'd be willing to bet that teams who are getting more offsides calls against them are also getting more shots on goal.

    Problem again, so what's the recommendation to coaches? Have your guys get called offsides more :) ?
  12. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    To find the true relationship between offside calls and save percentage.

    When you control for time, the relationship goes away. So there is SOMETHING out there (changes in officials? changes in tactics?) that is affecting both offside calls and save percentage. I'm not sure what it is, but it is affecting both offside calls and save percentage. And when you don't control for it, you see a spurious relationship between offside calls and save%.
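
    One way to see what "controlling for time" does with the eight season averages is a first-order partial correlation, which gives the same t statistic as the offsides coefficient in a regression of save% on offsides and year. This is a sketch under that assumption, not necessarily the exact procedure used above; the function names are illustrative.

```python
from math import sqrt

years = [1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003]
offsides = [3.431, 3.878, 3.500, 2.947, 3.344, 3.304, 2.961, 2.738]
save_pct = [0.757, 0.756, 0.726, 0.766, 0.758, 0.760, 0.771, 0.776]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_os = pearson(offsides, save_pct)  # offsides vs. save %
r_oy = pearson(offsides, years)     # offsides vs. year
r_sy = pearson(save_pct, years)     # save % vs. year

partial = partial_corr(r_os, r_oy, r_sy)

# Degrees of freedom: 8 seasons, minus two for the pair being correlated
# and one for the control variable.
df = len(years) - 3
t_stat = partial * sqrt(df / (1 - partial ** 2))
print(f"raw r = {r_os:.2f}, partial r = {partial:.2f}, t = {t_stat:.2f} on {df} df")
```

    The raw correlation of about -0.64 shrinks to roughly -0.37 once year is partialed out, and a t of about -0.9 on 5 degrees of freedom is far short of the 2.57 needed for p<.05, which fits the p=.41 mentioned earlier in the thread.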
  13. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO

    You're right in that using just the 8 season averages has a few problems. It doesn't eliminate the teams that would show the most distinct effect, as they contribute to that year's average, but using all of the teams separately gives you much greater statistical power. I just used the 8 years because it was easy and because I didn't want to go to the trouble of doing the regression with all of the teams in the correct way.

    But since you used the 80 or so team seasons......well, I guess I'll have to explain it anyway.

    I would not place too much stock in the regression results you obtained, because the regression you did was done in a less than optimal manner. When you have two or more individual data points from the same group (same marriage, classroom, league season, person, etc.) you need to make sure that your data isn't corrupted by dependency. One of the assumptions of regression is that all of the errors of your observations are independent. When you have dependency, the errors are not independent--because of some grouping variable, some errors are correlated together.

    And another example would be looking at the effect of offside calls on save% for 10+ teams over 10 years. The errors are likely to be correlated with each other within a year more than across years, and this could cause your regression to be biased.

    So there's a couple things that can be done to correct for this: you can do a within-subjects regression, you can do a multilevel model, or you can average across all teams for a year. I did the third way because it was easy and quick. The other methods (which actually are much better, as they give you greater statistical power) take a little longer, and I'm supposed to be analyzing my dissertation data, not analyzing MLS stats! If you don't want to tackle this (and it's not something that can easily be done in Excel), I'll see if I can con one of my buddies into doing it for us.......
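
    The third option, averaging across all teams for a year, is simple enough to sketch. The example below collapses the ten 2003 rows from ChrisE's table into a single season-level observation; it assumes a plain unweighted mean of the team figures.

```python
# (team, offsides per game, save %) for the ten 2003 rows in ChrisE's table.
rows_2003 = [
    ("clb", 1.86, 0.784), ("Chi.", 2.69, 0.788), ("Col.", 3.17, 0.746),
    ("Dal.", 2.10, 0.716), ("DC", 2.03, 0.808), ("KC", 3.72, 0.770),
    ("LA", 2.83, 0.830), ("Met.", 1.83, 0.817), ("NE", 4.76, 0.712),
    ("SJ", 2.38, 0.786),
]

n_teams = len(rows_2003)
avg_offsides = sum(row[1] for row in rows_2003) / n_teams
avg_save = sum(row[2] for row in rows_2003) / n_teams
print(f"2003: offsides/game = {avg_offsides:.3f}, save % = {avg_save:.3f}")
```

    This matches the 2003 row of the summary table to within rounding. Repeating it for each season leaves 8 data points, one per year, so the within-year dependency is gone but so is most of the sample.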
  14. numerista

    numerista New Member

    Mar 21, 2004
    You're missing JG's point. The purpose of the study is to relate tactics (the offside trap) to save %. We know that teams have decreased their use of the offside trap over time, so by adjusting for time, you're adjusting out the signal of interest.

    I have other issues with some things you've claimed, but this one is probably the biggest.
  15. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    How do we know that teams have decreased their use of the offside trap over time? We know that offside calls have gone down, but does that mean trap usage has necessarily gone down too? The two are not necessarily related. If they have decreased their usage of the trap, then yes, that would be partialed out when controlling for time.

    What are your other issues?
  16. numerista

    numerista New Member

    Mar 21, 2004
    Take a look at individual teams over time. For instance, Bob Bradley came to New York last season and got rid of the offsides trap. They drew 50 fewer offsides calls in 2003 than 2002, by far the biggest change in the league.

    Other issues...
    1) Even after the adjustments you've made, errors are clearly correlated, due to the fact that we're observing some of the same players and coaches across multiple seasons.

    2) Correlated errors are a much bigger issue in significance testing than parameter estimation, so ChrisE's results are still the best we've got. To illustrate, even if within-year observations were perfectly correlated (this is the situation for which your adjustment is correct), then his fitted model would be (essentially) identical to yours. Because your model is different, this suggests that you're discarding valuable information.

    3) Given the way you're looking at within-season data (by averaging), I don't see how a reduction in significance implies that "you've seen a spurious relationship."
  17. numerista

    numerista New Member

    Mar 21, 2004
    Afraid I don't understand this chart, Chris. The righthand column is the difference between the yearly average and the global average. What's the lefthand column?
  18. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    You can only expect me to be so clear when I'm talking about things I don't understand. For the first column, I predicted the save percentage from offsides/game for all 84 teams. I then subtracted that number from the actual save percentage for that team, that year. I then summed them up by year.

    It's not actually the average, as I believe I said, it's the sum, but it hardly makes any difference.
  19. numerista

    numerista New Member

    Mar 21, 2004
    ... so if you had divided the left-hand column by the number of teams, you'd have the yearly average minus the predicted yearly average (in stats, we might call this the average residual for that year).

    Because the two columns are correlated, something other than offsides is changing from year to year to modify save percentage. What's striking to me is the low save %age in 1998. 1998 was an expansion year, meaning that two new keepers were needed. In addition, Friedel and Zenga had left the league, Dodd was getting old, and shotstoppers like Howard and Cannon (#1 and #2 in adjusted save percentage, 01-03) were not yet playing.

    My theory is that in 1998, there was a drop in the quality of goalkeeping, and that since then, keeping has improved. That's why you've found this pattern.
  20. numerista

    numerista New Member

    Mar 21, 2004
    A few more notes ...
    -- I fit a regression using offsides AND team id, but team was clearly not a significant predictor of save %age (min pval 0.11).

    -- Ignoring correlations between observations, the estimated effect is 1.01%, with a plus-minus (2 std errors) of 0.78%. This implies that even with optimistic assumptions, we're not sure how big the effect really is.
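
    The plus-minus figure can be unpacked with a few lines of arithmetic. This sketch takes the 1.01% estimate and the 0.78% half-width at face value and assumes, as stated, that the half-width is two standard errors.

```python
effect = 1.01      # drop in save % (percentage points) per extra offside call per game
half_width = 0.78  # quoted plus-minus, stated to be 2 standard errors

std_error = half_width / 2
t_ratio = effect / std_error  # estimate measured in standard errors
low, high = effect - half_width, effect + half_width
print(f"SE = {std_error:.2f}, t = {t_ratio:.2f}, interval = {low:.2f}% to {high:.2f}%")
```

    So the estimate sits about 2.6 standard errors from zero, but the plausible range runs from roughly a quarter of a point to nearly two points per offside call per game.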

    Top five seasons: adjusted save percentage
    1. 2001 Metros, Tim Howard +7.2%
    2. 2000 Wizards, Tony Meola +6.9%
    3. 2003 Galaxy, Kevin Hartman +6.7%
    4. 1997 Burn, Mark Dodd +6.2%
    5. 1999 Galaxy, Kevin Hartman +5.8%

    Bottom five (worst first)
    1. 2001 Burn, Matt Jordan -10.4%
    2. 1998 Revs, Ian Feuer -7.0%
    3. 1998 SJ, David Kramer/Andy Kirk -7.0%
    4. 2000 DC, Mark Simpson/Tom Presthus -6.9%
    5. 1996 Wizards, Garth Lagerwey -6.8%

    2003, by team: adjusted save percentage
    1. Galaxy, Hartman +6.7%
    2. Metros, Howard/Walker +4.4%
    3. DC, Rimando +3.7%
    4. Fire, Thornton +2.4%
    5. SJ, Onstad +1.8%
    6. KC, Meola +1.6%
    7. Clb, Busch +1.1%
    8. Clr, Garlick -1.4%
    9. NE, Brown -3.2%
    10. Dal, Countess -5.4%

    Hartman's numbers have varied quite a bit through the years, but here are Rimando's three full seasons ... 2003 +3.7, 2002 +3.7, 2001 +3.6.
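
    For readers wondering how an adjusted save percentage like these can be computed, here is a minimal sketch. It assumes the adjustment is simply actual save % minus the save % predicted from offsides/game, and it uses the rounded coefficients ChrisE posted earlier (79% and 1.009%), so it will not reproduce the lists above exactly. The function name is illustrative.

```python
# Rounded coefficients from ChrisE's team-season regression (an assumption
# that these are the ones behind the lists above).
INTERCEPT = 0.79
SLOPE = -0.01009


def adjusted_save_pct(offsides_per_game, save_pct):
    """Actual save % minus the save % predicted from offsides per game."""
    predicted = INTERCEPT + SLOPE * offsides_per_game
    return save_pct - predicted


# (offsides per game, save %) taken from ChrisE's table.
seasons = {
    "2001 Metros": (3.42, 0.829),
    "2000 KC": (2.44, 0.836),
    "2001 Dallas": (2.31, 0.664),
}
adjusted = {name: adjusted_save_pct(*vals) for name, vals in seasons.items()}
for name, value in adjusted.items():
    print(f"{name}: {value * 100:+.1f}%")
```

    These come out within a couple of tenths of a point of the +7.2%, +6.9% and -10.4% listed above; the gap is presumably down to the rounded 79% intercept.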
  21. JG

    JG Member+

    Jun 27, 1999
    So Howard saved about 13.5 goals compared to his expected save percentage...that's almost four wins due to his goalkeeping.
  22. numerista

    numerista New Member

    Mar 21, 2004
    To clarify, JG:
    The Metros had 42 points in 2001. Replacing Howard with an average keeper, we would've expected them to get only about 30 points?

    If so, wow ...
  23. JG

    JG Member+

    Jun 27, 1999
    I hadn't actually run the numbers before--it turns out that the extra 13 goals would only be a 9 point difference. But the Metros overachieved a bit in 2001 compared to their goal differential...a team with a GD of 38-48 (the Metros' GD with an "average" keeper) would be expected to get 30 points from a 26-game schedule.

    Presumably we could get "point values" for every goalie this way.
  24. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    New England Revolution
    JG, would you mind expanding on this? I don't understand how you're calculating expected goals and how you're relating that to points. Please forgive my ignorance with some of this stuff.
  25. ChrisE

    ChrisE Member

    Jul 1, 2002
    Nat'l Team:
    American Samoa
    Exactly why we need a website (;)). I had the same reaction when I read this a month ago, it's based on a thread JG posted here:
