
Making Sense of Save Percentages



numerista
25 Nov 2005, 10:52 AM
Although a keeper's Save Percentage is frequently regarded as his most meaningful statistic, it also depends on the difficulty of shots he faces. Close-range attempts are much harder to stop than 40-yard prayers.

Without a detailed shot chart, it's hard to measure shot difficulty; however, one related factor is the number of shots a keeper faces. If he faces a lot, this may mean that the opponents are getting forward in numbers and getting close to the goalmouth. To check this hypothesis, I downloaded stats on 109 college teams from 12 NCAA Div-I conferences. I found that the correlation between Shots Faced per 90 Minutes and Save Percentage was highly significant (p-val < 0.001).

The formula for predicting save percentage from SF/90 was:
SV% = 85.49 - 1.71 * SF/90

In high-level Div-I soccer, for every additional shot you face per 90 minutes, your save percentage drops by about 1.7 percentage points.
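For anyone who wants to play with the formula, here is a minimal sketch in Python. The two coefficients come straight from the fit above; the function names and the example keeper are made up for illustration, and the "adjusted" figure follows the actual-minus-predicted definition given later in the thread.

```python
# Coefficients taken from the fitted formula in the post above.
INTERCEPT = 85.49   # predicted SV% at zero shots faced per 90 minutes
SLOPE = -1.71       # change in SV% (percentage points) per extra shot per 90


def predicted_save_pct(shots_per_90: float) -> float:
    """Save percentage the linear model expects for a given shot load."""
    return INTERCEPT + SLOPE * shots_per_90


def adjusted_save_pct(actual_save_pct: float, shots_per_90: float) -> float:
    """Actual minus predicted: positive means better than the model expects."""
    return actual_save_pct - predicted_save_pct(shots_per_90)


# Hypothetical keeper: 78.0% saves while facing 6.0 shots per 90 minutes.
example = adjusted_save_pct(78.0, 6.0)
print(f"predicted {predicted_save_pct(6.0):.2f}%, adjusted {example:+.2f}")
```

With these invented inputs the keeper comes out a bit under 3 points above what the model predicts for that workload.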

numerista
25 Nov 2005, 11:23 AM
One implication here is that if you play behind a terrific defense, your save percentage gets inflated. This is particularly unfortunate, because keepers are also judged by their goals against average, which is likewise flattered by a team that doesn't allow openings.

I went looking at past seasons for college keepers who were drafted by MLS, or who spent time on MLS rosters, and I adjusted their save percentages for the number of shots they faced...

Adjusted Save Percentages

Hannigan, Sr +9.9%
Wells, Sr +8.9%
Fulton, So +7.5%
Hannigan, Jr +7.1%
Kennedy, Sr +6.6% (USL Rookie of the Year, 2005)
Behonick, Sr +6.2%
Baumstark, Fr +6.2%
Wells, So +5.7%
Wells, Jr +5.5%
Warren, Sr +5.1%
Guzan, So +5.0% (went pro)
Fulton, Jr +4.9%
Singer, Sr +4.7%
Terris, Jr +4.3%
Countess, Fr +3.8% (went pro)
Barragan, Sr +3.8% (in Div II)
Perkins, Sr +3.5%
Palmer, So +3.5%
Hesmer, So +2.8%
Kennedy, Jr +2.6%
Warren, Jr +2.6%
Gaudette, Sr +2.3%
Saunders, Jr +2.1%
Hesmer, Jr +1.9%
Kennedy, So +1.8%
Guzan, Fr +1.7%
Palmer, Jr +1.6%
Sawyer, Sr +1.4%
Comfort, Jr +1.3%
Gomez, Jr +1.2%
Baumstark +0.1% (went pro)
Gomez, Sr 0
Sawyer, Jr 0
Hesmer, Sr 0
Cronin, Jr 0 (went pro)
Pickens, Jr 0
Pickens, Sr -0.1%
Terris, Sr -0.1%
Saunders, Sr -1.0%
Nolly, Sr -2.1%
Palmer, Fr -2.1%
Behonick, Jr -2.4%
Mahoney, Sr -3.7%
Fulton, Sr -4.2%
Palmer, Sr -4.9%
Comfort, Sr -6.1%

Some of the weaker numbers, it should be noted, came in seasons where a keeper wasn't fully healthy.

numerista
25 Nov 2005, 11:44 AM
Adjusted Save Percentages

Wells, Sr +8.9%
Wells, So +5.7%
Wells, Jr +5.5%

In part, I'd be inclined to attribute Wells' numbers to the fact that he played behind a terrific UCLA defense. All the same, I think it bears mentioning that in 2005, there were five MLS keepers who earned significant minutes and managed to make at least 3 saves for every goal they allowed. They were ...

1. Onstad, GK of the Year 2003, 2005
2. Cannon, GK of the Year 2004
3. Reis, nat'l team pool
4. Walker, nat'l team pool (16 starts)
5. Wells (17 starts)

Is it a coincidence that a keeper who had outstanding college numbers has also put together very good numbers as a pro?

numerista
25 Nov 2005, 10:40 PM
Adjusted Save Percentages

Hannigan, Sr +9.9%
Hannigan, Jr +7.1%

Along with Wells, Hannigan is the other keeper with the very best numbers, although it's worth adding that he played in a relatively low-level conference (Temple in the A-10).

As of a few minutes ago, I knew nothing about him, so I was pleased to find that he has latched on with Bray Wanderers in Ireland. They've just finished their season, and he debuted in the final two games, winning them both and saving a penalty in the first. He did let in 3 goals total, but that's still better than the 1.8 goals per game that his team had been allowing.

Also, I came across his sophomore numbers, and his adjusted save percentage was +7.1% that year, too.

scaryice
28 Nov 2005, 06:53 AM
Wells had one game this year with like 17 saves. He did not have a good year. Save percentage is ok, but goalkeeping stats aren't that great.

numerista
28 Nov 2005, 09:34 AM
Wells had one game this year with like 17 saves. He did not have a good year. Save percentage is ok, but goalkeeping stats aren't that great.

I agree that the stats are fairly limited -- and now that I've looked more closely at a bunch of college numbers, I could say more -- however, I disagree about Wells' 14-save game.

That game ended in a tie, so every last one of those saves was meaningful. In no sense was he padding his stats. Therefore, there's no reason to omit those saves from his accomplishments this season. Wells got dinged for letting in 4 goals against the Revs, so he should also get credit for shutting out DC.

numerista
28 Nov 2005, 12:37 PM
I agree that the stats are fairly limited -- and now that I've looked more closely at a bunch of college numbers, I could say more.

Adding my thoughts after looking at the college numbers (in the linked thread) ...
http://www.bigsoccer.com/forum/showthread.php?t=271423

1. I like the adjusted save percentage because it exposes some of the paper-tiger goalkeepers who manage to post good GAAs without actually doing much. Ford Williams of UNC is my poster child, so it'll be interesting to see if he has any pro success.

2. The players with the highest actual save percentages also tend to have the highest adjusted percentages. Well-regarded keepers behind bad defenses (e.g. Dunsheath at Bradley) do get a big boost, but this year at least, they don't catch up.

3. My biggest concern is that I'm adjusting for the number of shots a team faces, which doesn't directly address the quality of those shots. A team that allows a larger number of low-quality shots (e.g. a bunker-and-counter team) can end up with a keeper who has artificially inflated numbers. On the opposite side, I think that this year's ACC keepers have artificially deflated numbers.

4. It's not clear what mathematical adjustment is the "correct" one. I'm taking a keeper's save percentage and subtracting a prediction from it (which is the output of a linear model). This approach is fairly standard, but there are a great many alternatives, some of which may produce quite different results.

5. Another concern is that save percentages vary quite a bit at random. Sometimes you get lucky (as with Wells) and a keeper's year-to-year numbers look consistent; other times, you don't get lucky.
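As a rough illustration of point 4, the sketch below fits a straight line by ordinary least squares and treats each residual (actual minus predicted) as the adjusted save percentage. The five data points are invented, not the 109-team sample, and `fit_line` is just a hypothetical helper name; other model choices would give different adjustments.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up team data, NOT the sample from the thread:
# (shots faced per 90, save percentage)
teams = [(4.0, 80.0), (5.0, 77.0), (6.0, 76.0), (7.0, 73.0), (8.0, 72.0)]
shots = [t[0] for t in teams]
save_pcts = [t[1] for t in teams]

intercept, slope = fit_line(shots, save_pcts)

# Residual = actual minus predicted, i.e. the adjusted save percentage.
residuals = [y - (intercept + slope * x) for x, y in teams]
for (x, y), r in zip(teams, residuals):
    print(f"SF/90 {x:.1f}  SV% {y:.1f}  adjusted {r:+.2f}")
```

By construction the residuals of a least-squares line sum to zero, so the adjusted numbers always describe a keeper relative to the sample used for the fit.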

scaryice
29 Nov 2005, 12:57 AM
I agree that the stats are fairly limited -- and now that I've looked more closely at a bunch of college numbers, I could say more -- however, I disagree about Wells' 14-save game.

That game ended in a tie, so every last one of those saves was meaningful. In no sense was he padding his stats. Therefore, there's no reason to omit those saves from his accomplishments this season. Wells got dinged for letting in 4 goals against the Revs, so he should also get credit for shutting out DC.

He was really good for one game, and when looking at save percentages that skews things for the entire season.

numerista
29 Nov 2005, 05:30 AM
He was really good for one game...

Why are you assuming that he was different in that particular game? In total, Wells ended up making a string of 17 consecutive saves between goals allowed (including his final save before the 14-save game and his first two in the game after). That's good, but it's not inconsistent with his other performances.

Suppose that the shots Wells faced this season had been in completely random order. You'd expect his "hottest" streak to include somewhere between 8 and 21 consecutive saves. It wouldn't be at all unusual for him to make 17 saves in a row (pval 0.28).
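One way to run that kind of check is a shuffle simulation, sketched below. The post doesn't say how the 0.28 was computed, so this is only one plausible approach, and the season totals passed in at the bottom are placeholders rather than Wells' real numbers.

```python
import random


def longest_save_streak(outcomes):
    """Length of the longest run of saves (True) between goals (False)."""
    best = current = 0
    for saved in outcomes:
        current = current + 1 if saved else 0
        best = max(best, current)
    return best


def streak_p_value(saves, goals, observed_streak, trials=10_000, seed=0):
    """Share of random shot orderings whose longest streak >= observed."""
    rng = random.Random(seed)
    outcomes = [True] * saves + [False] * goals
    hits = 0
    for _ in range(trials):
        rng.shuffle(outcomes)
        if longest_save_streak(outcomes) >= observed_streak:
            hits += 1
    return hits / trials


# Placeholder season totals (hypothetical, not Wells' real numbers).
p = streak_p_value(saves=75, goals=25, observed_streak=17)
print(f"estimated p-value: {p:.3f}")
```

Plugging in a keeper's actual saves and goals allowed would give an estimate comparable to the figure quoted above.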

scaryice
29 Nov 2005, 05:35 AM
Why are you assuming that he was different in that particular game?

Because, I watched him play in the other games.

numerista
29 Nov 2005, 07:48 AM
Because, I watched him play in the other games.

I watched him play a fair bit, myself, yet I didn't see any reason to conclude that Wells was dramatically different from one day to the next.

Serie Zed
10 Dec 2005, 08:33 AM
If this is hyper obvious, apologies -- I'm partly thinking out loud here...

If a keeper's save PERCENTAGE drops with shots faced/90 it means that the marginal extra shots surrendered by a defense are higher quality shots. Otherwise the percentage wouldn't drop, the GAA would just rise.

[[I suppose you could make an argument that there's an increasing likelihood that a keeper loses his mental edge and/or makes more blunders if he's under fire, but it seems likelier (to me anyhow) that it's because those marginal shots are good ones.]]

With that in mind, I'm wondering if the change to the Save% is linear or if it's an exponential curve (that stays flatter at the lower end of SF/90 and then rises more quickly as SF/90 go up).

If you had a big database of game data available you could group games together based on the number of shots faced by the keeper. So that you had a bunch of records for every number (5 shots faced, 6 shots faced, etc). Then you could graph the Save% for each group to see what the pattern looked like.

If it's exponential, it might lend support to the idea that good defenses are good not only because they prevent shots, but especially because they prevent GOOD shots as a proportion of those shots. And vice versa.

Additionally, if you could generate the predicted change to Save % for individual games, you could compare a keeper's actual Save % to the predicted Save % for each game across a season (or career) to refine the measure of his/her quality. Which is basically just building on Numerista's idea.

But you could then also do that for a defense. Once you know how many goals for each game would be predicted based on the shots surrendered/game you could measure the worth of the defense independent of the GK.

Or do I just need a 2nd cup of coffee?

numerista
10 Dec 2005, 12:39 PM
If this is hyper obvious, apologies -- I'm partly thinking out loud here...

Even though this forum is far too quiet for my liking, it is a good place to think out loud. :)


With that in mind, I'm wondering if the change to the Save% is linear or if it's an exponential curve (that stays flatter at the lower end of SF/90 and then rises more quickly as SF/90 go up).

When I plotted the data I had, the linear fit looked reasonable; however, any number of other fits would also have looked pretty reasonable. Since the response ranges from 0% to 100%, a further option would be to use a logistic (s-shaped) curve.
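To show why the bounded response matters, here is a small sketch comparing the straight line with a logistic curve. The logistic parameters are not fitted to any data: they are chosen so the curve matches the linear fit's value and slope at an arbitrary anchor of 8 shots per 90, which is purely an assumption for illustration.

```python
import math

INTERCEPT = 85.49  # from the linear fit earlier in the thread (SV% units)
SLOPE = -1.71      # percentage points per extra shot faced per 90


def linear_sv(shots_per_90):
    """Save percentage predicted by the straight-line fit."""
    return INTERCEPT + SLOPE * shots_per_90


def make_logistic(x0):
    """Logistic curve matching the linear fit's value and slope at x0."""
    p0 = linear_sv(x0) / 100.0               # as a proportion in (0, 1)
    b = (SLOPE / 100.0) / (p0 * (1.0 - p0))  # since dp/dx = b * p * (1 - p)
    a = math.log(p0 / (1.0 - p0)) - b * x0   # logit(p0) = a + b * x0

    def logistic_sv(shots_per_90):
        return 100.0 / (1.0 + math.exp(-(a + b * shots_per_90)))

    return logistic_sv


# x0 = 8 shots per 90 is an arbitrary anchor chosen for illustration.
logistic_sv = make_logistic(8.0)
for sf in (2, 8, 20, 60):
    print(f"SF/90 {sf:>2}: linear {linear_sv(sf):7.2f}  logistic {logistic_sv(sf):6.2f}")
```

Near the anchor the two curves agree closely; at extreme shot loads the line runs below 0% while the logistic curve stays between 0% and 100%.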

If you had a big database of game data available you could group games together based on the number of shots faced by the keeper. So that you had a bunch of records for every number (5 shots faced, 6 shots faced, etc). Then you could graph the Save% for each group to see what the pattern looked like.

I think that this is a terrific idea, with a couple of caveats:
1. In MLS (not sure about college), shot selection depends a great deal on the scoreline. Once a team is ahead, it tends to shoot less often, but perhaps with higher quality. This could be important to adjust for.

2. Data like this tends to be hard to inspect visually (too much scatter), but there are plenty of other statistical ways to figure out the right shape of curve to use.


But you could then also do that for a defense. Once you know how many goals for each game would be predicted based on the shots surrendered/game you could measure the worth of the defense independent of the GK.

I think I'm following but not sure. To give an example, if we know that Nick Rimando has been a +3% for his career, but that his team was a -1% in one year (and Nick seemed to be playing normally), then we rate his defense a -4%. Is that the kind of thing you have in mind?

Serie Zed
10 Dec 2005, 02:40 PM
Not sure how "shoots more when behind" affects my idea. Will have to think on it.

But the graph I was suggesting would be very simple. Eg data:

97 games where offense took five shots -- opposing keepers' average Save % = .893
113 games where offense took six shots -- opposing keepers' average Save % = .879

So, one data point for each number of shots per game...Eg.

5 shots taken = .893 Average Save % plot those two numbers
6 shots taken = .879 Average Save % plot those two numbers

And the strength of keeper for a season would be a game-by-game reckoning of his actual save percentage vs the average save % for the number of shots he faced.

Over 40-50 games in a season you'd see a trend, I think.
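A minimal sketch of that bookkeeping, assuming each game is recorded as a (shots faced, goals allowed) pair and that "shots" means shots the keeper had to deal with (saves plus goals). The function names and the tiny league sample are invented for illustration.

```python
from collections import defaultdict


def average_save_pct_by_shots(games):
    """Map shots faced in a game -> average save percentage in such games.

    `games` is an iterable of (shots_faced, goals_allowed) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # shots -> [sum of save pcts, games]
    for shots, goals in games:
        if shots == 0:
            continue  # save percentage is undefined with no shots
        totals[shots][0] += (shots - goals) / shots
        totals[shots][1] += 1
    return {shots: s / n for shots, (s, n) in totals.items()}


def keeper_vs_average(keeper_games, baseline):
    """Mean of (actual - league average) save pct over a keeper's games."""
    diffs = [
        (shots - goals) / shots - baseline[shots]
        for shots, goals in keeper_games
        if shots > 0 and shots in baseline
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Tiny made-up league sample: (shots faced, goals allowed) per game.
league = [(5, 0), (5, 1), (5, 1), (6, 1), (6, 0), (6, 2)]
baseline = average_save_pct_by_shots(league)
print(baseline)
print(keeper_vs_average([(5, 0), (6, 1)], baseline))
```

The `baseline` dictionary is exactly the set of points to plot (shots per game against average save percentage); the second function is the game-by-game reckoning for a single keeper.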

Likewise...if you know the average Save % for any given number of shots faced by a keeper, you could generate an expected goals scored for each game based on the number of shots a defense surrendered.

Do that over a season and you might see how solid a D was, independent of the keeper they had behind. Again, it assumes that good defenses give up not only fewer shots, but also PROPORTIONALLY fewer good chances.
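Here is a rough sketch of the expected-goals idea. The baseline uses only the two example averages quoted above (.893 for five shots, .879 for six); the season of four games and the function names are invented for illustration.

```python
def expected_goals(shots_faced, baseline):
    """Goals an average keeper would concede on this many shots."""
    return shots_faced * (1.0 - baseline[shots_faced])


def season_summary(games, baseline):
    """Return (expected goals, actual goals) summed over a season.

    `games` is a list of (shots_faced, goals_allowed) pairs.
    """
    expected = sum(expected_goals(shots, baseline) for shots, _ in games)
    actual = sum(goals for _, goals in games)
    return expected, actual


# Baseline: the two example averages from the post; season: made up.
baseline = {5: 0.893, 6: 0.879}
season = [(5, 1), (6, 0), (6, 1), (5, 0)]
expected, actual = season_summary(season, baseline)
print(f"expected {expected:.2f} goals, actual {actual}")
```

Under this scheme the expected total depends only on the shots the defense surrendered, so it speaks to the defense, while the gap between actual and expected goals speaks to the keeper.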

This would have to be refined (if it worked in the first place), but it might be a decent place to start?

numerista
10 Dec 2005, 07:30 PM
But the graph I was suggesting would be very simple.

Good point. I got ahead of myself imagining an adjustment for ahead/level/behind, so I missed that. I need better coffee. :)

Steve Holroyd
05 Jan 2006, 04:47 PM
I recall The Hockey Compendium voicing similar thoughts on the "unreliability" of save percentage, and came up with a "durability factor" using shots faced, etc. I don't recall how similar (if at all) it was to the formula you've proposed, but it's obvious that some sort of calculation should be used to take into account poor defenses.

By way of example, here are some of the NASL's all-time great goalkeepers' save percentages:

Shep Messing (Bos/Cosmos/Oak/Roch) .832
Hubert Birkenmeier (Cosmos) .809
Tino Lettieri (Minn/Van) .802
Ken Cooper (Dallas) .801
Volkmar Gross (SD) .789
Mick Poole (Den/Port) .789
Bob Rigby (Phil/Cosmos/LA/Mont/SJ) .789
Arnie Mausser (Den/TB/Van/Col/Ft.L/NE/TB/etc.) .783
Phil Parkes (LA/Van/Chic/Tor) .778
Jan Van Beveren (Ft.L) .762

It is interesting to see the oft-vilified Messing as the NASL's all-time leader in save percentage. It is also interesting to see Van Beveren--who used to routinely top best GK polls at the time--as the lowest among this group.

I'd love to be able to plug in the "durability formula" against some of these numbers: Mausser, who had over 200 saves several seasons because he played for some awful teams, would no doubt rate pretty highly, while Birkenmeier would drop, as would Parkes (who managed to lead the NASL in GAA one year with a .775 percentage--obviously, his defense did the work). However, the NASL did not keep shots totals until about 1982.

Long story short...thanks for coming up with a logical way to add some "depth" to save percentages.

numerista
05 Jan 2006, 11:57 PM
I recall The Hockey Compendium voicing similar thoughts on the "unreliability" of save percentage, and came up with a "durability factor" using shots faced, etc. I don't recall how similar (if at all) it was to the formula you've proposed

Thanks for the pointer. According to the link below, the Hockey Compendium authors actually used the same basic formula I did; however, instead of learning the parameters from the data, they chose their adjustment factor arbitrarily ... as the linked text points out, that's not too reliable an approach.

http://www.puckerings.com/research/persev.html

RichardL
07 Jan 2006, 05:52 AM
Save percentage is just a small part of the story though. Look at Tim Howard and Roy Carroll. Neither of them appeared to suffer through a poor save percentage, but if you make mistakes which lead to shots/goals then that's far more critical. Saving an extra shot for every thirty you face isn't much good if a keeper's mistakes mean he is helping create extra chances for the other team - either by dropped catches, bad positioning, not claiming crosses, not coming off his line etc. A keeper who doesn't command his area will be a liability, regardless of how good his reflexes are. The best thing a keeper can do is stop attacks before they lead to a shot.

numerista
07 Jan 2006, 11:12 AM
Save percentage is just a small part of the story though. Look at Tim Howard and Roy Carroll. Neither of them appeared to suffer through a poor save percentage, but if you make mistakes which lead to shots/goals then that's far more critical.

Most of the time, when a keeper's mistake leads directly to a shot attempt, it leads to a high-percentage shot attempt. As a result, mistakes like the ones you list do get reflected in a keeper's save percentage.

As for whether Carroll and Howard's save percentages looked ok, bear in mind that you're talking about unadjusted numbers. Considering that Carroll and Howard faced fewer shots/game last year than any other keepers in the Premiership (including Cech), their save percentages are likely to have been artificially inflated by facing less attacking pressure.

Now, unfortunately, I don't have enough Premiership data to do a good job of learning what adjustment factor to use (nor do I have game-by-game data like Serie Zed suggested). But when I apply the college soccer adjustment factor to the 2004-05 EPL, Carroll rates 19th out of 20 starters, while Howard rates worse than any EPL starter.

Even if I use a much smaller adjustment factor (which appears appropriate), neither of them looks good. When I reduce the adjustment factor by 75% (from 1.71%/shot to 0.43%/shot), Carroll still only moves up to 15th, and Howard still comes out worse than any Premiership starter.
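The sensitivity check can be sketched as below. The three keepers are invented, not 2004-05 Premiership data, and `rank_keepers` is a hypothetical helper. The intercept of the linear model is left out on purpose: it shifts every keeper by the same amount, so only the per-shot factor can change the order.

```python
def rank_keepers(keepers, points_per_shot):
    """Rank keepers by save pct adjusted for shot load, best first.

    `keepers` maps name -> (save_pct, shots_per_game). Each keeper is
    credited `points_per_shot` percentage points per shot faced per game.
    """
    adjusted = {
        name: save_pct + points_per_shot * shots
        for name, (save_pct, shots) in keepers.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Invented keepers, for illustration only.
keepers = {
    "Keeper A": (76.0, 3.5),  # best raw save pct, very light workload
    "Keeper B": (74.0, 5.5),
    "Keeper C": (73.0, 6.5),  # worst raw save pct, heaviest workload
}

full = rank_keepers(keepers, points_per_shot=1.71)            # college factor
reduced = rank_keepers(keepers, points_per_shot=1.71 * 0.25)  # cut by 75%
print(full)
print(reduced)
```

In this made-up sample the order flips when the factor is cut, which shows how much the choice of factor can matter; with the real data described above, Carroll and Howard stayed near the bottom either way.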

Steve Holroyd
07 Jan 2006, 10:23 PM
To add to the discussion, I'd like to introduce some raw numbers.

Here are the top 10 single season save percentages in the history of American D1 soccer (excepting the 1920s ASL, since no save stats were kept, nor can they be reconstructed from game reports):

Rank  Goalkeeper         Team          Year  SV%   GAA
1.    Mirko Stojanovic   Oakland       1967  .908  1.00
2.    Bob Rigby          Philadelphia  1973  .907  0.62
3.    Ken Cooper         Dallas        1972  .901  0.86
4.    Mike Hewitt        San Jose      1976  .898  0.92
5.    Mirko Stojanovic   Dallas        1971  .897  0.73
      Mike Winter        St. Louis     1972  .897  1.00
7.    Shep Messing       Boston        1975  .894  0.93
8.    Barry Watling      Seattle       1974  .892  0.80
9.    Richard Blackmore  New York      1972  .890  1.14
10.   Gernot Fradyl      Philadelphia  1967  .889  1.43


When I started going through the data, I had expected to see a few examples of the "Gilles Meloche Syndrome" (named for the long-suffering California Golden Seals goaltender who, critics said, would have been the second coming of Glenn Hall if he had a defense in front of him. See also, Herron, Dennis circa Kansas City Scouts)--goalkeepers who had phenomenal save percentages (for purposes of my example, higher than .800) and dismal GAAs. Much to my surprise, that has never happened, with one exception: Arnie Mausser logged a .817 percentage in 1975 with Hartford while also recording a league-worst 2.24 GAA. Of course, he went on to have a Hall of Fame career.

Instead, the top seasons were all by goalkeepers who also had great GAA seasons. Stojanovic (both times), Rigby, Cooper, Watling and Messing all led the NASL in GAA in their seasons, and Hewitt and Winter were second. Perhaps not coincidentally, Stojanovic (both years), Blackmore, and Rigby's teams won the championship in their seasons.

Not a single MLSer made the list. The closest was Brad Friedel's .868 in his abbreviated 1996 season, followed by Joe Cannon's .824 for Colorado in 2004. Tony Meola's monster 2000 season yielded only a .816 percentage. I suggest this could be attributable to either 1) a more offensive style of soccer in the 1970s leading to more shots and, thus, more saves; 2) dubious NASL stat-keeping; or 3) both.

Does the fact that the goalkeepers with the best save percentages also happen to have had the best seasons for GAA render the percentage statistic meaningless? No...the list is populated with goalkeepers who were consistently among the NASL's best.

But it just goes to show the need for a meaningful adjusted stat. Otherwise, it would seem that GAA is a reliable indicator of goalkeeper effectiveness, even though most people accept that it is not.