18 Feb 2004, 02:29 AM   #1
mellon002
BigSoccer Member+
Join Date: Jan 2003
Location: Towson, MD
Soccermetrics

Be forewarned, I can't fall asleep so that's why I'm able to do this.


The tagline for this forum got me interested. I'm a big baseball fan, and for anyone who gets really into the game, stats are a huge part of it. So much of baseball is numbers, so why not start doing the same with soccer? I've been investigating what Sabermetrics means and it's very confusing, but I found one interesting formula used to predict wins and losses.

http://www.baseball1.com/bb-data/gra...manifesto.html

Quote:
There is a clear relationship between a team's runs (goals in our case) scored and allowed and its wins and losses. This relationship isn't perfect, but it is very strong. A good formula, determined empirically from the data by Bill James, is that a team's ratio of wins to losses will be equal to the square of the ratio between its runs scored and allowed. Thus a team which scores and allows the same number of runs will win and lose the same number of games, finishing at .500; a team which scores 800 runs and allows 700 will win 64 games for every 49 it loses, which projects to a 92-70 record over a season. This formula comes very close to the actual records of most teams.
It's an interesting theory so I figured I would try it with the numbers from last year.
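
Written out, the rule in the quote looks roughly like this. The symbols R_S (runs scored) and R_A (runs allowed) are my own labels, not from the article; for soccer, read them as goals for and goals against. The second line just redoes the 800/700 example from the quote over a 162-game season.

```latex
\frac{W}{L} = \left(\frac{R_S}{R_A}\right)^{2}
\quad\Longrightarrow\quad
\text{Win\%} = \frac{W}{W+L} = \frac{R_S^{2}}{R_S^{2} + R_A^{2}}

% Example from the quote: 800 runs scored, 700 allowed
\frac{W}{L} = \left(\frac{800}{700}\right)^{2} = \frac{64}{49},
\qquad
\frac{64}{64+49} \times 162 \approx 91.8 \;\Rightarrow\; \text{about } 92\text{-}70
```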

Chicago Fire: 53 GF and 43 GA - 15W 7L 8T

53:43 (squared) = 2809:1849
2809+1849=4658
2809/4658=.603 win %

Splitting the ties between both wins and losses:
19+11=30
19/30=.633 win %

New England Revolution: 55 GF and 47 GA - 12W 9L 9T

55:47 (squared) = 3025:2209
3025+2209=5234
3025/5234=.578 win %

Splitting the ties between both wins and losses:
16.5+13.5=30
16.5/30=.550 win %

NY/NJ MetroStars: 40 GF and 40 GA - 11W 10L 9T

40:40 (squared) = 1600:1600
1600+1600=3200
1600/3200=.500 win %

Splitting the ties between both wins and losses:
15.5+14.5=30
15.5/30=.517 win %
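
The hand calculations above can be sketched in a few lines of Python. This is only a minimal sketch: the function names `projected_win_pct` and `actual_win_pct` are made up for illustration, and the tie handling assumes each tie counts as half a win and half a loss, as in the worked examples.

```python
def projected_win_pct(goals_for, goals_against, exponent=2.0):
    """Bill James style projection: GF^x / (GF^x + GA^x)."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def actual_win_pct(wins, losses, ties):
    """Win % with each tie split evenly between wins and losses."""
    games = wins + losses + ties
    return (wins + ties / 2) / games


# 2003 numbers as given in the post: (GF, GA, W, L, T)
teams = {
    "Chicago Fire": (53, 43, 15, 7, 8),
    "New England Revolution": (55, 47, 12, 9, 9),
    "NY/NJ MetroStars": (40, 40, 11, 10, 9),
}

for name, (gf, ga, w, l, t) in teams.items():
    proj = projected_win_pct(gf, ga)
    actual = actual_win_pct(w, l, t)
    print(f"{name}: projected {proj:.3f}, actual {actual:.3f}")
```

The `exponent` parameter is left adjustable so other values than the square can be tried without rewriting anything.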

Here's the rest of the percentages for the league:
Team    Proj.   Actual
DC      .527    .483
CLM     .500    .467

SJ      .623    .616
KC      .543    .517
CO      .441    .483
LA      .500    .450
DAL     .230    .283

Some of the percentages are extremely close while others are off a bit. Here's the interesting thing, though: if you can approximately predict the number of goals a team scores versus the goals it allows, you can predict fairly accurately where each team will land. Had you guessed the goals right and played the percentages, you could have gotten 8 out of 10 teams right in the final standings.

Predicting goals scored and allowed might not be as hard as you think. Just look for steady players and units.

Teams with proven scorers such as Ruiz and Twellman, barring injury, should give steady numbers. We can look at their past trends and get a pretty good read on what their production could be in 2004. Example: Carlos Ruiz should be close to 20 goals again this year.

DC's defense hasn't changed and has only gotten better if anything with the addition of Milton Reyes coming back from a knee injury. They should allow few goals again this year.


I'm still reading into Sabermetrics and how it could be applied to soccer. If anyone knows anything about Sabermetrics and thinks something could work for Soccermetrics, please post here.

I feel like a huge nerd.

18 Feb 2004, 12:42 PM   #2
mpruitt
BigSoccer Member+
Join Date: Feb 2002
Location: E. Somerville
Supporter: New England Revolution

Welcome to the fold. I think someone may have already run the numbers that you have there, but you've certainly broken them down in a pretty clear way. Look around some of the other threads in here to see which one it might be. This whole forum basically started from the idea that advanced objective statistical analysis (sabermetrics) could be applied to soccer. The original thread, which was the starter for this forum, is here

It's excellent that you've found this stuff as interesting as some of us on here already have. Every time someone buys into this idea it makes me really excited. If we ever manage to get a moderator on here, then that thread definitely should be a sticky or some kind of FAQ. It'd also be great if we could finally get this forum onto the main page, as it's a bit hidden now.

The term Soccermetrics is something I believe beineke coined; "Stats and Analysis" just always seemed a little more straightforward.

18 Feb 2004, 01:32 PM   #3
ChrisE
BigSoccer Member+
Join Date: Jul 2002
Location: Chicago, IL
Re: Soccermetrics

Quote:
Originally posted by mellon002
Pythagorean stuff
I pretty much agree with all this (more so than Maxim, at least), and have actually done a little bit of work on it, as have a number of other people, if you care to look around. Just now, trying things out and measuring the average error of the predictions, it looks like an exponent of 1.5 gives the lowest average error (around .02).
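
For anyone who wants to try the same kind of exponent search, here is a rough sketch. It is not the actual spreadsheet work described above: the function names and the candidate grid are assumptions, and the data is limited to the three teams whose goals for and against appear in this thread, which is far too small a sample to expect it to land on 1.5. Fed the full league table, the same loop is one way to look for the exponent with the lowest average error.

```python
def projected_win_pct(goals_for, goals_against, exponent):
    """GF^x / (GF^x + GA^x) for a chosen exponent x."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def mean_abs_error(teams, exponent):
    """Average |projected - actual| win % over all teams."""
    errors = [
        abs(projected_win_pct(gf, ga, exponent) - actual)
        for gf, ga, actual in teams
    ]
    return sum(errors) / len(errors)


def best_exponent(teams, candidates):
    """Candidate exponent with the lowest mean absolute error."""
    return min(candidates, key=lambda x: mean_abs_error(teams, x))


# (GF, GA, actual win % with ties split) for the three teams
# whose goal totals were posted in this thread.
teams = [
    (53, 43, 19 / 30),    # Chicago Fire
    (55, 47, 16.5 / 30),  # New England Revolution
    (40, 40, 15.5 / 30),  # NY/NJ MetroStars
]

# Assumed search grid: 1.0, 1.1, ..., 3.0
candidates = [1.0 + 0.1 * i for i in range(21)]

best = best_exponent(teams, candidates)
print(f"best exponent on this sample: {best:.1f}")
print(f"mean abs error there: {mean_abs_error(teams, best):.3f}")
```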

Quote:
Predicting goals scored and allowed might not be as hard as you think. Just look for steady players and units.

Teams with proven scorers such as Ruiz and Twellman, barring injury, should give steady numbers. We can look at their past trends and get a pretty good read on what their production could be in 2004. Example: Carlos Ruiz should be close to 20 goals again this year.

DC's defense hasn't changed and has only gotten better if anything with the addition of Milton Reyes coming back from a knee injury. They should allow few goals again this year.
I've got to totally disagree with this part; it's much much harder to predict future performance than simply looking at a team as the sum of its parts. The Galaxy last year had a massively worse goal differential than they did in 2002, despite the fact that they lost pretty much no one. Meanwhile, despite losing several integral players, the Chicago Fire relied on several unpredictable newcomers (Damani Ralph, Justin Mapp, Andy Williams) to improve on 2002. No one expected them to be better last year.

Furthermore, you've got huge problems arising with a team like D.C. that's under new management. That team won't play anything like they did in 2003, now that Etcheverry and Stoitchkov are gone, and Convey is running the midfield (if he even stays the whole season - who knows); there's simply no way (that I can see) to predict how their offense will perform.

Even your examples reveal problems: Ruiz may have looked about as productive in 2003 as he was in 2002, but he got 7 of his 15 goals off penalties; I don't know if his problem was related to the Galaxy's midfield, or simply a year-long slump, but he clearly wasn't as good this year. Twellman, likewise, missed a lot of time to injuries - how do you predict something like that?

18 Feb 2004, 02:05 PM   #4
mellon002
BigSoccer Member+
Join Date: Jan 2003
Location: Towson, MD

Thanks maxim-1. I've actually been thinking about other stats that we could use. I think Sabermetrics invented the OPS (on-base percentage + slugging percentage). We should start creating some numbers as well.

For a striker we could use a GPM (goals per 90 minutes), a rate stat in the same spirit as ERA in baseball. A GPM could help us predict results in this way: if a player misses extended time due to national team call-ups for the World Cup or Olympics, their raw totals will obviously decrease. But if they have had steady production over the past few years, we can estimate how many minutes they will play and should be able to predict their production level for the season fairly accurately.
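
A quick sketch of how that could be computed. The striker's numbers below are entirely hypothetical, made up to show the arithmetic, and the function names are just illustrative.

```python
def goals_per_90(goals, minutes_played):
    """Scoring rate normalized to a full 90-minute match."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return goals * 90 / minutes_played


def projected_goals(rate_per_90, expected_minutes):
    """Expected goals total if the per-90 rate holds steady."""
    return rate_per_90 * expected_minutes / 90


# Hypothetical striker: 15 goals in 2250 minutes last season.
rate = goals_per_90(goals=15, minutes_played=2250)

# Hypothetical playing time: a full 30-game season versus a season
# where call-ups cost him 10 full matches.
full_season = projected_goals(rate, expected_minutes=30 * 90)
with_callups = projected_goals(rate, expected_minutes=20 * 90)

print(f"rate: {rate:.2f} goals per 90")
print(f"full season: {full_season:.1f} goals")
print(f"with call-ups: {with_callups:.1f} goals")
```

The big assumption is that the rate stays constant from year to year, which is exactly the part that would need checking against real player histories.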

I'm sure somebody has probably thought of this. I didn't read through all of the thread yet that you posted, but I'm working on it. I did read that someone made the point that we must establish a relationship between the numbers and results. Well, the numbers I crunched last night have a pretty good relationship.

The most I was off by was the Burn, at .053, while in contrast the 'Quakes were only off by .007. The average difference between actual and projected numbers was .033, which happens to be the same as the Crew's difference. If you multiply the projected win % of the Crew (.500) by 30 games you get 15, which is a 1 game difference from their total once ties were factored into the win column (14). So I guess this means that on average, we can come within 1 game of correctly guessing the win total after splitting ties. The more important part is that because the win % comes that close to being correct, we don't have to calculate that total. All we do is rank the win % and we can make predictions for the results!
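
Here is that error check as a small script, using the projected and actual percentages posted earlier in the thread. CHI, NE and MET are just labels added here for the three teams worked out by hand; everything else is copied from the table.

```python
GAMES = 30  # length of the 2003 season used in the posts above

# team: (projected win %, actual win % with ties split)
table = {
    "CHI": (0.603, 0.633),
    "NE": (0.578, 0.550),
    "MET": (0.500, 0.517),
    "DC": (0.527, 0.483),
    "CLM": (0.500, 0.467),
    "SJ": (0.623, 0.616),
    "KC": (0.543, 0.517),
    "CO": (0.441, 0.483),
    "LA": (0.500, 0.450),
    "DAL": (0.230, 0.283),
}

# Absolute difference between projection and reality for each team
diffs = {team: abs(proj - actual) for team, (proj, actual) in table.items()}

mean_diff = sum(diffs.values()) / len(diffs)
worst = max(diffs, key=diffs.get)
closest = min(diffs, key=diffs.get)

print(f"mean difference: {mean_diff:.3f}")
print(f"in games over a {GAMES}-game season: {mean_diff * GAMES:.2f}")
print(f"largest miss: {worst} ({diffs[worst]:.3f})")
print(f"smallest miss: {closest} ({diffs[closest]:.3f})")
```

Keep in mind this only measures how well the formula fits a season that has already been played, using the real goal totals. It says nothing yet about how well we can guess those goal totals in advance.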

I'd say there is a definite relationship. We could even get a group going and make predictions for each team, average all the predictions and then crunch the numbers to have an official board prediction for the upcoming season. What do you think?
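
A rough sketch of how the board prediction could be crunched, assuming each member submits predicted goals for and against per team. The member names, team names and numbers are placeholders, not real predictions, and `board_prediction` is a made-up helper.

```python
from statistics import mean


def projected_win_pct(goals_for, goals_against, exponent=2.0):
    """GF^x / (GF^x + GA^x), as in the formula from the first post."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def board_prediction(forecasts):
    """Average members' (GF, GA) guesses per team, then rank by win %.

    forecasts maps member -> {team: (predicted GF, predicted GA)}.
    Returns a list of (team, projected win %) sorted best to worst.
    """
    teams = set()
    for picks in forecasts.values():
        teams.update(picks)

    ranking = []
    for team in teams:
        picks = [f[team] for f in forecasts.values() if team in f]
        avg_gf = mean(gf for gf, _ in picks)
        avg_ga = mean(ga for _, ga in picks)
        ranking.append((team, projected_win_pct(avg_gf, avg_ga)))
    return sorted(ranking, key=lambda row: row[1], reverse=True)


# Placeholder submissions from two hypothetical board members
forecasts = {
    "member_a": {"Team A": (50, 40), "Team B": (40, 50)},
    "member_b": {"Team A": (54, 44), "Team B": (44, 52)},
}

for team, pct in board_prediction(forecasts):
    print(f"{team}: {pct:.3f}")
```

Averaging the goal predictions first and then applying the formula is only one option; averaging each member's projected win % instead would give slightly different numbers.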

18 Feb 2004, 03:23 PM   #5
mpruitt
BigSoccer Member+
Join Date: Feb 2002
Location: E. Somerville
Supporter: New England Revolution

Chris E and Karl Keller have done some good work in regards to goals stuff. Here's a thread about goals per 90. http://www.bigsoccer.com/forum/showt...threadid=87335

18 Feb 2004, 06:15 PM   #6
mellon002
BigSoccer Member+
Join Date: Jan 2003
Location: Towson, MD

Where do they get those numbers?

18 Feb 2004, 06:19 PM   #7
mellon002
BigSoccer Member+
Join Date: Jan 2003
Location: Towson, MD
Re: Re: Soccermetrics

Quote:
Originally posted by ChrisE
I've got to totally disagree with this part; it's much much harder to predict future performance than simply looking at a team as the sum of its parts. The Galaxy last year had a massively worse goal differential than they did in 2002, despite the fact that they lost pretty much no one. Meanwhile, despite losing several integral players, the Chicago Fire relied on several unpredictable newcomers (Damani Ralph, Justin Mapp, Andy Williams) to improve on 2002. No one expected them to be better last year.

Furthermore, you've got huge problems arising with a team like D.C. that's under new management. That team won't play anything like they did in 2003, now that Etcheverry and Stoitchkov are gone, and Convey is running the midfield (if he even stays the whole season - who knows); there's simply no way (that I can see) to predict how their offense will perform.

Even your examples reveal problems: Ruiz may have looked about as productive in 2003 as he was in 2002, but he got 7 of his 15 goals off penalties; I don't know if his problem was related to the Galaxy's midfield, or simply a year-long slump, but he clearly wasn't as good this year. Twellman, likewise, missed a lot of time to injuries - how do you predict something like that?
I completely agree, and to make a prediction you have to do your best to factor all those things in. But how does that make soccer any different from any other sport you try to predict? You just have to do your best. If we did a board prediction using the formula I used, we could have a group of people ranking teams' offenses, defenses, and individual players. The more brains, the closer we can come to making accurate predictions.

18 Feb 2004, 06:36 PM   #8
mpruitt
BigSoccer Member+
Join Date: Feb 2002
Location: E. Somerville
Supporter: New England Revolution

Quote:
Originally posted by mellon002
Where do they get those numbers?
All the statistics are from MLSnet.com, their stats site. A couple of us have turned them into more workable Excel files. I think that Chris has one of the more extensive ones; I can email you mine if you like, though the file I've been working with is all team data, not individual players.

18 Feb 2004, 07:07 PM   #9
ChrisE
BigSoccer Member+
Join Date: Jul 2002
Location: Chicago, IL
Re: Re: Re: Soccermetrics

Quote:
Originally posted by mellon002
I completely agree, and to make a prediction you have to do your best to factor all those things in. But how does that make soccer any different from any other sport you try to predict? You just have to do your best. If we did a board prediction using the formula I used, we could have a group of people ranking teams' offenses, defenses, and individual players. The more brains, the closer we can come to making accurate predictions.
Well, it's very different than baseball because soccer is a team game, whereas baseball is (largely) the product of a lot of discrete events. You may see Rich Aurilia's numbers improve because he's batting behind Barry Bonds, but it's a lot easier to adjust for Bonds's contribution than it is to adjust for Jose Cancela's contribution (it's not like Bonds is in the batter's box with Aurilia). You'll notice that sabermetrics haven't been nearly as successful (for whatever reason) in hockey or basketball as they have in baseball.

I'm not saying that I don't think it's worthwhile to try to predict future performance, of a team or an individual, with the stats we have - hell, that's why I'm here. I just think it's going to be much, much harder, and ultimately less conclusive, than you're giving it credit for.

18 Feb 2004, 07:10 PM   #10
ChrisE
BigSoccer Member+
Join Date: Jul 2002
Location: Chicago, IL

Quote:
Originally posted by mellon002


I'd say there is a definite relationship. We could even get a group going and make predictions for each team, average all the predictions and then crunch the numbers to have an official board prediction for the upcoming season. What do you think?
I had actually been thinking about doing something like this. There was an interesting thread on rec.sport.soccer about predicting the World Cup group results using a probabilistic model instead of just going 1,2,3,4 (I'd explain further, but it would be easier to just go to the link). I think that would be a much more informative and realistic way to approach this kind of prediction. I'd just be worried that we wouldn't get more than 4 or 5 responses.

(Yeah, I just compiled the data from MLS's quite comprehensive statistics section. Kenntomasch has a lot of stuff on player ages and attendances, and I'm sure beineke and voros have some interesting things, I just don't know what.)