Making Sense of Save Percentages

Discussion in 'Statistics and Analysis' started by numerista, Nov 25, 2005.

  1. numerista

    numerista New Member

    Mar 21, 2004
    Although a keeper's Save Percentage is frequently regarded as his most meaningful statistic, it also depends on the difficulty of shots he faces. Close-range attempts are much harder to stop than 40-yard prayers.

    Without a detailed shot chart, it's hard to measure shot difficulty; however, one related factor is the number of shots a keeper faces. If he faces a lot, this may mean that the opponents are getting forward in numbers and getting close to the goalmouth. To check this hypothesis, I downloaded stats on 109 college teams from 12 NCAA Div-I conferences. I found that the correlation between Shots Faced per 90 Minutes and Save Percentage was highly significant (p-val < 0.001).

    The formula for predicting save percentage from SF/90 was:
    SV% = 85.49 - 1.71 * SF/90

    In high-Div I soccer, for every additional shot you face per 90 minutes, your save percentage drops by about 1.7 percentage points.
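
    As a rough illustration, the fitted line above can be evaluated directly. This is only a sketch: the two coefficients come from the post, while the function name and the sample SF/90 values are made up for the example.

```python
# Coefficients quoted in the post (fit on 109 NCAA Div-I teams).
INTERCEPT = 85.49  # predicted SV% at zero shots faced per 90 minutes
SLOPE = -1.71      # change in SV% (percentage points) per extra shot per 90


def predicted_save_pct(shots_faced_per_90: float) -> float:
    """Predicted save percentage (0-100 scale) for a given SF/90."""
    return INTERCEPT + SLOPE * shots_faced_per_90


if __name__ == "__main__":
    # Illustrative SF/90 values only; these are not real teams.
    for sf90 in (4, 6, 8, 10):
        print(f"SF/90 = {sf90:>2}: predicted SV% = {predicted_save_pct(sf90):.2f}")
```

    Under this line, a keeper facing 10 shots per 90 minutes would be predicted to stop about 68.4% of them.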
     
  2. numerista

    numerista New Member

    Mar 21, 2004
    One implication here is that if you play behind a terrific defense, your save percentage gets inflated. This is particularly unfortunate, because keepers are also judged by their goals against average, which is likewise flattered by a team that doesn't allow openings.

    I went looking at past seasons for college keepers who were drafted by MLS, or who spent time on MLS rosters, and I adjusted their save percentages for the number of shots they faced...

    Adjusted Save Percentages

    Hannigan, Sr +9.9%
    Wells, Sr +8.9%
    Fulton, So +7.5%
    Hannigan, Jr +7.1%
    Kennedy, Sr +6.6% (USL Rookie of the Year, 2005)
    Behonick, Sr +6.2%
    Baumstark, Fr +6.2%
    Wells, So +5.7%
    Wells, Jr +5.5%
    Warren, Sr +5.1%
    Guzan, So +5.0% (went pro)
    Fulton, Jr +4.9%
    Singer, Sr +4.7%
    Terris, Jr +4.3%
    Countess, Fr +3.8% (went pro)
    Barragan, Sr +3.8% (in Div II)
    Perkins, Sr +3.5%
    Palmer, So +3.5%
    Hesmer, So +2.8%
    Kennedy, Jr +2.6%
    Warren, Jr +2.6%
    Gaudette, Sr +2.3%
    Saunders, Jr +2.1%
    Hesmer, Jr +1.9%
    Kennedy, So +1.8%
    Guzan, Fr +1.7%
    Palmer, Jr +1.6%
    Sawyer, Sr +1.4%
    Comfort, Jr +1.3%
    Gomez, Jr +1.2%
    Baumstark +0.1% (went pro)
    Gomez, Sr 0
    Sawyer, Jr 0
    Hesmer, Sr 0
    Cronin, Jr 0 (went pro)
    Pickens, Jr 0
    Pickens, Sr -0.1%
    Terris, Sr -0.1%
    Saunders, Sr -1.0%
    Nolly, Sr -2.1%
    Palmer, Fr -2.1%
    Behonick, Jr -2.4%
    Mahoney, Sr -3.7%
    Fulton, Sr -4.2%
    Palmer, Sr -4.9%
    Comfort, Sr -6.1%

    Some of the weaker numbers, it should be noted, came in seasons where a keeper wasn't fully healthy.
     
  3. numerista

    numerista New Member

    Mar 21, 2004
    In part, I'd be inclined to attribute Wells' numbers to the fact that he played behind a terrific UCLA defense. All the same, I think it bears mentioning that in 2005, there were five MLS keepers who earned significant minutes and managed to make at least 3 saves for every goal they allowed. They were ...

    1. Onstad, GK of the Year 2003, 2005
    2. Cannon, GK of the year 2004
    3. Reis, nat'l team pool
    4. Walker, nat'l team pool (16 starts)
    5. Wells (17 starts)

    Is it a coincidence that a keeper who had outstanding college numbers has also put together very good numbers as a pro?
     
  4. numerista

    numerista New Member

    Mar 21, 2004
    Along with Wells, Hannigan is the other keeper with the very best numbers, although it's worth adding that he played in a relatively low-level conference (Temple in the A-10).

    As of a few minutes ago, I knew nothing about him, so I was pleased to find that he has latched on with Bray Wanderers in Ireland. They've just finished their season, and he debuted in the final two games, winning them both and saving a penalty in the first. He did let in 3 goals total, but that's still better than the 1.8 goals per game that his team had been allowing.

    Also, I came across his sophomore numbers, and his adjusted save percentage was +7.1% that year, too.
     
  5. scaryice

    scaryice Member

    Jan 25, 2001
    Wells had one game this year with like 17 saves. He did not have a good year. Save percentage is ok, but goalkeeping stats aren't that great.
     
  6. numerista

    numerista New Member

    Mar 21, 2004
    I agree that the stats are fairly limited -- and now that I've looked more closely at a bunch of college numbers, I could say more -- however, I disagree about Wells' 14-save game.

    That game ended in a tie, so every last one of those saves was meaningful. In no sense was he padding his stats. Therefore, there's no reason to omit those saves from his accomplishments this season. Wells got dinged for letting in 4 goals against the Revs, so he should also get credit for shutting out DC.
     
  7. numerista

    numerista New Member

    Mar 21, 2004
    Adding my thoughts after looking at the college numbers (in the linked thread) ...
    https://www.bigsoccer.com/forum/showthread.php?t=271423

    1. I like the adjusted save percentage because it debunks some of the paper tiger goalkeepers who manage to post good GAA's without actually doing much. Ford Williams of UNC is my poster child, so it'll be interesting to see if he has any pro success.

    2. The players with the highest actual save percentages also tend to have the highest adjusted percentages. Well-regarded keepers behind bad defenses (e.g. Dunsheath at Bradley) do get a big boost, but this year at least, they don't catch up.

    3. My biggest concern is that I'm adjusting for the number of shots a team faces, which doesn't directly address the quality of those shots. A team that allows a larger number of low-quality shots (e.g. a bunker-and-counter team) can end up with a keeper who has artificially inflated numbers. On the opposite side, I think that this year's ACC keepers have artificially deflated numbers.

    4. It's not clear what mathematical adjustment is the "correct" one. I'm taking a keeper's save percentage and subtracting a prediction from it (which is the output of a linear model). This approach is fairly standard, but there are a great many alternatives, some of which may produce quite different results.

    5. Another concern is that save percentages vary quite a bit at random. Sometimes you get lucky (as with Wells) and a keeper's year-to-year numbers look consistent; other times, you don't get lucky.
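
    The adjustment described in point 4 (actual save percentage minus the linear model's prediction) might be sketched as below. The default coefficients are the ones posted earlier in the thread. Treating shots faced as saves plus goals allowed is an assumption, and the function name and sample keeper are hypothetical.

```python
def adjusted_save_pct(saves, goals_allowed, minutes,
                      intercept=85.49, slope=-1.71):
    """Actual SV% minus the SV% predicted from shots faced per 90 minutes.

    Assumption: shots faced = saves + goals allowed (shots on goal only).
    Positive values mean the keeper beat the linear prediction.
    """
    shots_faced = saves + goals_allowed
    actual = 100.0 * saves / shots_faced
    sf90 = 90.0 * shots_faced / minutes
    predicted = intercept + slope * sf90
    return actual - predicted


if __name__ == "__main__":
    # Hypothetical keeper: 80 saves, 20 goals allowed, 1800 minutes played.
    print(f"{adjusted_save_pct(80, 20, 1800):+.2f}")
```

    For that hypothetical keeper the actual figure is 80.0%, the prediction at 5 shots per 90 is 76.94%, and the adjusted value is +3.06.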
     
  8. scaryice

    scaryice Member

    Jan 25, 2001
    He was really good for one game, and when looking at save percentages that skews things for the entire season.
     
  9. numerista

    numerista New Member

    Mar 21, 2004
    Why are you assuming that he was different in that particular game? In total, Wells ended up making a string of 17 consecutive saves between goals allowed (including his final save before the 14-save game and his first two in the game after). That's good, but it's not inconsistent with his other performances.

    Suppose that the shots Wells faced this season had been in completely random order. You'd expect his "hottest" streak to include somewhere between 8 and 21 consecutive saves. It wouldn't be at all unusual for him to make 17 saves in a row (pval 0.28).
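
    One way to run this kind of random-order check is a shuffle simulation. The sketch below is not necessarily how the figures above were computed, and the season totals in it are placeholders, since the thread does not give Wells' actual save and goal counts.

```python
import random


def longest_save_streak(outcomes):
    """Length of the longest run of True (save) values in a sequence."""
    best = current = 0
    for saved in outcomes:
        current = current + 1 if saved else 0
        best = max(best, current)
    return best


def simulate_longest_streaks(saves, goals, trials=10_000, seed=0):
    """Shuffle a season's shots repeatedly; record the longest save streak each time."""
    rng = random.Random(seed)
    season = [True] * saves + [False] * goals
    results = []
    for _ in range(trials):
        rng.shuffle(season)
        results.append(longest_save_streak(season))
    return results


if __name__ == "__main__":
    # Placeholder totals (assumption): 90 saves and 30 goals allowed.
    streaks = simulate_longest_streaks(saves=90, goals=30)
    share = sum(s >= 17 for s in streaks) / len(streaks)
    print(f"share of shuffled seasons with a streak of 17+: {share:.3f}")
```

    The share printed at the end plays the role of the p-value: the fraction of shuffled seasons whose best streak is at least as long as the observed one.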
     
  10. scaryice

    scaryice Member

    Jan 25, 2001
    Because, I watched him play in the other games.
     
  11. numerista

    numerista New Member

    Mar 21, 2004
    I watched him play a fair bit, myself, yet I didn't see any reason to conclude that Wells was dramatically different from one day to the next.
     
  12. Serie Zed

    Serie Zed Member

    Jul 14, 2000
    Arlington
    If this is hyper obvious, apologies -- I'm partly thinking out loud here...

    If a keeper's save PERCENTAGE drops with shots faced/90 it means that the marginal extra shots surrendered by a defense are higher quality shots. Otherwise the percentage wouldn't drop, the GAA would just rise.

    [[I suppose you could make an argument that there's an increasing likelihood that a keeper loses his mental edge and/or makes more blunders if he's under fire, but it seems likelier (to me anyhow) that it's because those marginal shots are good ones.]]

    With that in mind, I'm wondering if the change to the Save% is linear or if it's an exponential curve (one that stays flatter at the lower end of SF/90 and then steepens as SF/90 goes up).

    If you had a big database of game data available you could group games together based on the number of shots faced by the keeper. So that you had a bunch of records for every number (5 shots faced, 6 shots faced, etc). Then you could graph the Save% for each group to see what the pattern looked like.

    If it's exponential, it might lend support to the idea that good defenses are good not only because they prevent shots, but especially because they prevent GOOD shots as a proportion of those shots. And vice versa.

    Additionally, if you could generate the predicted change to Save % for individual games, you could compare a keeper's actual Save % to the predicted Save % for each game across a season (or career) to refine the measure of his/her quality. Which is basically just building on Numerista's idea.

    But you could then also do that for a defense. Once you know how many goals for each game would be predicted based on the shots surrendered/game you could measure the worth of the defense independent of the GK.

    Or do I just need a 2nd cup of coffee?
     
  13. numerista

    numerista New Member

    Mar 21, 2004
    Even though this forum is far too quiet for my liking, it is a good place to think out loud. :)

    When I plotted the data I had, the linear fit looked reasonable; however, any number of other fits would also have looked pretty reasonable. Since the response ranges from 0% to 100%, a further option would be to use a logistic (s-shaped) curve.
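
    For anyone who wants to try the logistic option, here is a minimal sketch. It fits the s-shaped curve by regressing the log-odds of the save fraction on SF/90 with ordinary least squares, which is a simplification of a proper logistic regression (it ignores how many shots sit behind each data point). All names and the sample data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_logistic(sf90, save_frac):
    """Fit save_frac = 1 / (1 + exp(-(a + b*sf90))) via the log-odds.

    Save fractions must lie strictly between 0 and 1.
    """
    log_odds = [math.log(p / (1.0 - p)) for p in save_frac]
    return fit_line(sf90, log_odds)


def logistic_prediction(a, b, sf90):
    """Predicted save fraction (0-1 scale) on the fitted logistic curve."""
    return 1.0 / (1.0 + math.exp(-(a + b * sf90)))


if __name__ == "__main__":
    # Made-up team data: (SF/90, save fraction). Not from the thread.
    sf90 = [3.5, 4.2, 5.0, 6.1, 7.4, 8.8]
    frac = [0.81, 0.79, 0.77, 0.74, 0.73, 0.70]
    a, b = fit_logistic(sf90, frac)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

    Unlike the straight line, predictions from this curve can never fall below 0% or rise above 100%.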

    I think that this is a terrific idea, with a couple of caveats:
    1. In MLS (not sure about college), shot selection depends a great deal on the scoreline. Once a team is ahead, it tends to shoot less often, but perhaps with higher quality. This could be important to adjust for.

    2. Data like this tends to be hard to inspect visually (too much scatter), but there are plenty of other statistical ways to figure out the right shape of curve to use.

    I think I'm following but not sure. To give an example, if we know that Nick Rimando has been a +3% for his career, but that his team was a -1% in one year (and Nick seemed to be playing normally), then we rate his defense a -4%. Is that the kind of thing you have in mind?
     
  14. Serie Zed

    Serie Zed Member

    Jul 14, 2000
    Arlington
    Not sure how "shoots more when behind" affects my idea. Will have to think on it.

    But the graph I was suggesting would be very simple. Eg data:

    97 games where offense took five shots -- opposing keepers' average Save % = .893
    113 games where offense took six shots -- opposing keepers' average Save % = .879

    So, one data point for each number of shots per game...Eg.

    5 shots taken = .893 Average Save % plot those two numbers
    6 shots taken = .879 Average Save % plot those two numbers

    And the strength of keeper for a season would be a game-by-game reckoning of his actual save percentage vs the average save % for the number of shots he faced.

    Over 40-50 games in a season you'd see a trend, I think.

    Likewise...if you know the average Save % for any given number of shots faced by a keeper, you could generate an expected goals scored for each game based on the number of shots a defense surrendered.

    Do that over a season and you might see how solid a D was, independent of the keeper they had behind. Again, it assumes that good defenses give up not only fewer shots, but also PROPORTIONALLY fewer good chances.

    This would have to be refined (if it worked in the first place), but it might be a decent place to start?
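
    The bookkeeping described in this post could look something like the sketch below. It assumes each game is recorded as (shots on goal faced, goals allowed), so that saves are the difference. The function names and sample games are invented for illustration.

```python
from collections import defaultdict


def average_save_pct_by_shots(games):
    """games: iterable of (shots_faced, goals_allowed), one entry per game.

    Returns {shots_faced: mean per-game save fraction} across all games
    with that shot count. Games with zero shots are skipped.
    """
    buckets = defaultdict(list)
    for shots, goals in games:
        if shots > 0:
            buckets[shots].append((shots - goals) / shots)
    return {shots: sum(vals) / len(vals) for shots, vals in buckets.items()}


def keeper_vs_baseline(keeper_games, baseline):
    """Mean of (actual - baseline) save fraction over a keeper's games."""
    diffs = [(s - g) / s - baseline[s] for s, g in keeper_games if s in baseline]
    return sum(diffs) / len(diffs) if diffs else 0.0


def expected_goals(shots_per_game, baseline):
    """Goals a defense 'should' concede, given the shots it surrendered."""
    return sum(s * (1.0 - baseline[s]) for s in shots_per_game if s in baseline)


if __name__ == "__main__":
    league = [(5, 0), (5, 1), (6, 1), (6, 1)]  # made-up league games
    baseline = average_save_pct_by_shots(league)
    print(baseline)
    print(keeper_vs_baseline([(5, 0), (6, 1)], baseline))
    print(expected_goals([5, 6], baseline))
```

    Comparing a defense's expected goals with what it actually conceded separates the shots it gave up from what the keeper did with them.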
     
  15. numerista

    numerista New Member

    Mar 21, 2004
    Good point. I got ahead of myself imagining an adjustment for ahead/level/behind, so I missed that. I need better coffee. :)
     
  16. Steve Holroyd

    Steve Holroyd New Member

    Apr 19, 2003
    New Jersey
    Club:
    --other--
    Nat'l Team:
    United States
    I recall The Hockey Compendium voicing similar thoughts on the "unreliability" of save percentage, and coming up with a "durability factor" using shots faced, etc. I don't recall how similar (if at all) it was to the formula you've proposed, but it's obvious that some sort of calculation should be used to take into account poor defenses.

    By way of example, here are some of the NASL's all-time great goalkeepers' save percentages:

    Shep Messing (Bos/Cosmos/Oak/Roch) .832
    Hubert Birkenmeier (Cosmos) .809
    Tino Lettieri (Minn/Van) .802
    Ken Cooper (Dallas) .801
    Volkmar Gross (SD) .789
    Mick Poole (Den/Port) .789
    Bob Rigby (Phil/Cosmos/LA/Mont/SJ) .789
    Arnie Mausser (Den/TB/Van/Col/Ft.L/NE/TB/etc.) .783
    Phil Parkes (LA/Van/Chic/Tor) .778
    Jan Van Beveren (Ft.L) .762

    It is interesting to see the oft-vilified Messing as the NASL's all-time leader in save percentage. It is also interesting to see Van Beveren--who used to routinely top best GK polls at the time--as the lowest among this group.

    I'd love to be able to plug in the "durability formula" against some of these numbers: Mausser, who had over 200 saves several seasons because he played for some awful teams, would no doubt rate pretty highly, while Birkenmeier would drop, as would Parkes (who managed to lead the NASL in GAA one year with a .775 percentage--obviously, his defense did the work). However, the NASL did not keep shots totals until about 1982.

    Long story short...thanks for coming up with a logical way to add some "depth" to save percentages.
     
  17. numerista

    numerista New Member

    Mar 21, 2004
    Thanks for the pointer. According to the link below, the Hockey Compendium authors actually used the same basic formula I did; however, instead of learning the parameters from the data, they chose their adjustment factor arbitrarily ... as the linked text points out, that's not too reliable an approach.

    http://www.puckerings.com/research/persev.html
     
  18. RichardL

    RichardL BigSoccer Supporter

    May 2, 2001
    Berkshire
    Club:
    Reading FC
    Nat'l Team:
    England
    Save percentage is just a small part of the story though. Look at Tim Howard and Roy Carroll. Neither of them appeared to suffer through a poor save percentage, but if you make mistakes which lead to shots/goals then that's far more critical. Saving an extra shot for every thirty you face isn't much good if a keeper's mistakes mean he is helping create extra chances for the other team - either by dropped catches, bad positioning, not claiming crosses, not coming off his line etc. A keeper who doesn't command his area will be a liability, regardless of how good his reflexes are. The best thing a keeper can do is stop attacks before they lead to a shot.
     
  19. numerista

    numerista New Member

    Mar 21, 2004
    Most of the time, when a keeper's mistake leads directly to a shot attempt, it leads to a high-percentage shot attempt. As a result, mistakes like the ones you list do get reflected in a keeper's save percentage.

    As for whether Carroll and Howard's save percentages looked ok, bear in mind that you're talking about unadjusted numbers. Considering that Carroll and Howard faced fewer shots/game last year than any other keepers in the Premiership (including Cech), their save percentages are likely to have been artificially inflated by facing less attacking pressure.

    Now, unfortunately, I don't have enough Premiership data to do a good job of learning what adjustment factor to use (nor do I have game-by-game data like Serie Zed suggested). But when I apply the college soccer adjustment factor to the 2004-05 EPL, Carroll rates 19th out of 20 starters, while Howard rates worse than any EPL starter.

    Even if I use a much smaller adjustment factor (which appears appropriate), neither of them look good. When I reduce the adjustment factor by 75% (from 1.71%/shot to 0.43%/shot), Carroll still only moves up to 15th, and Howard still comes out worse than any Premiership starter.
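
    To see how the size of the adjustment factor can reshuffle a ranking, here is a small sketch. The keepers and their numbers are invented and are not EPL data. Because the model's intercept is the same for every keeper, it drops out of the ordering and is left out.

```python
FULL_FACTOR = 1.71                          # college-derived slope, %/shot
REDUCED_FACTOR = FULL_FACTOR * (1 - 0.75)   # 0.4275, quoted as 0.43 above


def rank_by_adjusted(keepers, factor_per_shot):
    """keepers: {name: (save_pct, shots_faced_per_game)}.

    Ranks best-first by save_pct + factor_per_shot * shots_faced_per_game,
    which orders keepers the same way as actual minus predicted SV%.
    """
    def score(item):
        save_pct, shots_per_game = item[1]
        return save_pct + factor_per_shot * shots_per_game

    ranked = sorted(keepers.items(), key=score, reverse=True)
    return [name for name, _ in ranked]


if __name__ == "__main__":
    # Hypothetical keepers: (unadjusted SV%, shots faced per game).
    sample = {"A": (75.0, 3.0), "B": (72.0, 6.0), "C": (73.0, 4.0)}
    print(rank_by_adjusted(sample, FULL_FACTOR))
    print(rank_by_adjusted(sample, REDUCED_FACTOR))
```

    With the full factor, busy keeper B comes out on top of this made-up trio; with the reduced factor, lightly tested keeper A moves ahead.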
     
  20. Steve Holroyd

    Steve Holroyd New Member

    Apr 19, 2003
    New Jersey
    Club:
    --other--
    Nat'l Team:
    United States
    To add to the discussion, I'd like to introduce some raw numbers.

    Here are the top 10 single season save percentages in the history of American D1 soccer (excepting the 1920s ASL, since no save stats were kept, nor can they be reconstructed from game reports):

    (Rank, goalkeeper, team, year, save percentage, GAA)
    1. Mirko Stojanovic Oakland 1967 .908 1.00
    2. Bob Rigby Philadelphia 1973 .907 0.62
    3. Ken Cooper Dallas 1972 .901 0.86
    4. Mike Hewitt San Jose 1976 .898 0.92
    5. (tie) Mirko Stojanovic Dallas 1971 .897 0.73
    5. (tie) Mike Winter St. Louis 1972 .897 1.00
    7. Shep Messing Boston 1975 .894 0.93
    8. Barry Watling Seattle 1974 .892 0.80
    9. Richard Blackmore New York 1972 .890 1.14
    10. Gernot Fradyl Philadelphia 1967 .889 1.43


    When I started going through the data, I had expected to see a few examples of the "Gilles Meloche Syndrome" (named for the long-suffering California Golden Seals goaltender who, critics said, would have been the second coming of Glenn Hall if he had a defense in front of him. See also, Herron, Dennis circa Kansas City Scouts)--goalkeepers who had phenomenal save percentages (for purposes of my example, higher than .800) and dismal GAAs. Much to my surprise, that has never happened, with one exception: Arnie Mausser logged a .817 percentage in 1975 with Hartford while also recording a league-worst 2.24 GAA. Of course, he went on to have a Hall of Fame career.

    Instead, the top seasons were all by goalkeepers who also had great GAA seasons. Stojanovic (both times), Rigby, Cooper, Watling and Messing all led the NASL in GAA in their seasons, and Hewitt and Winter were second. Perhaps not coincidentally, Stojanovic (both years), Blackmore, and Rigby's teams won the championship in their seasons.

    Not a single MLSer made the list. The closest was Brad Friedel's .868 in his abbreviated 1996 season, followed by Joe Cannon's .824 for Colorado in 2004. Tony Meola's monster 2000 season yielded only a .816 percentage. I suggest this could be attributable to either 1) a more offensive style of soccer in the 1970s leading to more shots and, thus, more saves; 2) dubious NASL stat-keeping; or 3) both.

    Does the fact that the goalkeepers with the best save percentages also happen to have had the best seasons for GAA render the percentage statistic meaningless? No...the list is populated with goalkeepers who were consistently among the NASL's best.

    But it just goes to show the need for a meaningful adjusted stat. Otherwise, it would seem that GAA is a reliable indicator of goalkeeper effectiveness, even though most people accept that it is not.
     
  21. numerista

    numerista New Member

    Mar 21, 2004
    Interesting stuff ... unfortunately, I think you're right that NASL-era saves numbers weren't tallied in the same way that modern ones are.

    Anyway, here's a bit of GAA trivia ... these are the US-capped goalkeepers from the all-time NCAA leaderboard in career GAA (p 14 of this doc; top 29 listed; earliest year is 1977):
    http://www.ncaa.org/library/records/soccer_records_book/2005/2005_soccer_records.pdf

    1. Tony Meola, UVa 0.34
    4. James Swanner, Clemson 0.43
    15. Brad Friedel, UCLA 0.60
    18. Kasey Keller, Portland 0.64
    25. Nick Rimando, UCLA 0.67

    Now, the big three obviously had some help in making this list ... they were all stud goalkeepers who therefore had the opportunity to play behind very strong teams. All the same, there have been plenty of keepers for powerhouse schools over the years, so it's still quite remarkable that they all came so near the top. As primitive a stat as GAA is, it must contain some information.

    As an aside, most of the other individual stats in the record book (save %age isn't included) don't seem to do nearly such a good job in predicting future success. In this respect, the only other stat that caught my eye was the 40-40 club for goals and assists. It includes names like John Kerr, Bruce Murray, and Brian McBride. Of the attackers who played four years, those have been some of the best.
     
  22. Elninho

    Elninho Member+

    Sacramento Republic FC
    United States
    Oct 30, 2000
    Sacramento, CA
    Club:
    Los Angeles Galaxy
    Nat'l Team:
    United States
    But that analysis misses the point of the argument: a keeper who can't control his box faces more shots, which contributes to the effect that was found (that keepers who face more shots save a lower percentage of them). Sure, the ability to control the box is reflected in save percentage, but it is also reflected in the number of shots that the keeper faces.

    A good keeper is able to reduce the number of shots he faces in several ways: by grabbing high balls before an attacking player gets to them, by using good foot skills to maintain possession by providing defenders with a safe pass (an alternative to hoofing the ball upfield), and by organizing the defense well. There are some goalkeepers who seem to always play behind a solid defense no matter where in the world they go. Peter Schmeichel, for one, had a reputation for being quick to shout at his defenders, and I don't doubt that it made life a lot easier for him.
     
  23. numerista

    numerista New Member

    Mar 21, 2004
    Empirically, this effect seems to be negligible.

    By reputation, Tim Howard is poor at controlling his box, and yet he faced the fewest shots per game of any EPL keeper last season. The only two who came close were his teammate Roy Carroll and Petr Cech of Chelsea. IIRC, fourth-fewest shots faced was Jens Lehmann of Arsenal.

    Shots faced per game is overwhelmingly a function of team strength.
     
