Sabermetrics applying to Soccer

leg_breaker
11 Feb 2007, 10:13 PM
Apparently those numbers are just made up.

Schu419
09 Jun 2007, 02:14 PM
Really sad to see this thread has died out so much. I've been reading Baseball Between The Numbers and have been inspired to look into Soccermetrics more and more.

voros
09 Jun 2007, 08:00 PM
really sad to see this thread has died out so much. I've been reading Baseball Between The Numbers and have been inspired to look into Soccermetrics more and more.
The basic issue is that, unlike baseball, there's been no real movement around which folks can congeal, so it's just isolated folks plugging away here and there.

I think it's also important to not equate "Soccermetrics" with "Soccer Stats". Stats are used so heavily in baseball because they're there, but a lot can be done in soccer with biographical and other factual information without having to start counting up passes completed and the like.

mario11
11 Jun 2007, 06:36 PM
Soccer statistics - Norway, Finland, Sweden, Russia

http://www.mfstats.com (http://www.mfstats.com/)

Schu419
12 Jun 2007, 01:09 PM
but a lot can be done in soccer with biographical and other factual information without having to start counting up passes completed and the like.


What do you mean exactly?

And what do you think needs to be done to create a focal point for everyone's disparate progress?

voros
13 Jun 2007, 04:56 PM
What do you mean exactly?
Here's an example:

http://numeridicalcio.wordpress.com/2006/08/16/ages-of-players-on-world-cup-rosters-1998-2006/

The idea is that counting up the years of birth of guys on World Cup rosters, we can get a better idea of what National Team managers think about the way players age and develop.

Another example is the stats on U.S. youth soccer and the month of birth of players selected for the National Team. These teams tend to be overwhelmingly made up of players born in the first third of the calendar year. The fed continues to do this despite an immense amount of statistical evidence that suggests it is a problem.

You can also look at the heights and ages of players for World Cup winners, or Champions League quarterfinalists, to try and gauge what the best teams think on those issues. You can compare and contrast what areas of the country certain types of players come from (for example, do most creative dribblers tend to come from "soccer backwaters" so to speak, at least initially?).

You don't have to keep track of passes completed in the 80th minute onward to extract valuable information. This sort of contrasts with what goes on in baseball, where computer tracking has led to a level of detail that can even frustrate me at times.
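As a rough illustration of the month-of-birth point, here is a minimal sketch that counts how many players in a pool were born in each third of the calendar year and compares that with an even split using a chi-square statistic. The roster is made up purely for illustration, the function names are hypothetical, and the "even split" baseline ignores that births are not perfectly uniform across the year.

```python
from collections import Counter


def birth_third_counts(birth_months):
    """Count players born in each third of the year (Jan-Apr, May-Aug, Sep-Dec)."""
    counts = Counter((month - 1) // 4 for month in birth_months)
    return [counts.get(third, 0) for third in range(3)]


def chi_square_uniform(counts):
    """Chi-square statistic against the assumption that each third is equally likely."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Hypothetical birth months for a 30-player youth pool (made-up numbers).
roster_months = [1] * 8 + [2] * 6 + [3] * 4 + [4] * 3 + [6] * 3 + [7] * 2 + [9] * 2 + [11] * 2

counts = birth_third_counts(roster_months)
stat = chi_square_uniform(counts)
print(counts, round(stat, 2))
```

With two degrees of freedom, the usual 5% cutoff for a chi-square statistic is about 5.99, so a pool as lopsided as this made-up one (a statistic of 18.2) would be very hard to explain by chance alone.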

Schu419
14 Jun 2007, 02:19 PM
I like where you're going with this (and I also like the little medieval icon you have going on your blog).

Wasn't there an article in the NYTimes (or was it elsewhere? I get my reading confused) about that problem with birth months among youth soccer players? (was it based on your research?)

I can understand the problem just from my own experience; playing Little League baseball, the oldest kids of our year, particularly if they were quite developed, were dominant. Of course, these kids were then spotted and placed on travel teams etc. In the end, this doesn't lead to the best talent scouting, as everyone's physical development will, essentially, average out. (One kid I remember in particular at age 12 was highly touted - big guy, powerful, best hitter around. He ended up being crap.)

Looking at your examples, I want to ask a very general question: What is the purpose of SoccerMetrics?

In baseball, I suppose (correct me if I'm wrong), the statistics movement is based on the quest for an objective evaluation of player worth. But in what direction are your provided examples of SoccerMetrics moving? The goal, from what you just explained, seems different: namely, to challenge the status quo. In your examples, team selection and coaching are up for criticism, pointing out illogical biases (i.e., not as logical and objective as possible, not "stupid").

I'm not saying that it's not the correct thing to do, but I just wonder what the ultimate goal is here. Sabermetrics (don't some people hate that term?) seems to aim at finding what is right - what's the correct way to measure hitting, to evaluate pitching performances etc. The problems with age-bias, height-bias in soccer seem to just point out what is wrong - what's clearly NOT the correct way to do things.

Maybe this is how the statistics movement started in baseball research, just going with findings here and there, eliminating what's wrong in order to eventually get at what's right, but it seems to me the most efficient start for any sort of movement for Objective and Logical soccer analysis would need a clarification of its mission.

James8
05 Jul 2007, 06:04 AM
not sure

watanabe2k
27 Jul 2007, 02:51 PM
Well, I'm working on a PhD with a focus on Sports Econ, so maybe I can try and answer or discuss some of these points a little better.

The statistics and econ movement in sports has only really taken off in research in the last 10 years. Currently I am working on competitive balance, player production, and a few other projects.

Let's look at player production. How good is a player, and how do we measure that? In the research up until about 1992, when Zimbalist wrote Baseball and Billions (it might be Billions and Baseball, I always get confused which is first), MLB players in Marginal Revenue Product studies were measured only on strikeout-to-walk ratio (for pitchers) or slugging average (for batters). That approach was first used by Scully in 1974 in his paper on measuring the marginal revenue product of players.

The study is still being imitated with small tweaks. However, much of the problem is that many people still use only one statistic for a player. In fact, the main problem with all of these research articles is that they measure only one small part of a player's production, and don't truly isolate a single player's contribution to a team. Basically, what I am saying is that there is a flaw in all the studies, and while they all say important things, they don't give us a definite answer, just a better picture of what we are dealing with.

Personally, I did a research study, which I might try to develop into a conference presentation or paper, that looks at measuring a player's presence on the field in soccer. It ignores goals, fouls, and assists, and looks at playing time, using a simple model to try and find the intangible benefits of players on a team. Using data I collected from every Arsenal match over the last three years, I looked at who contributes most to a better match outcome. Interestingly, it was Gilberto Silva, and not any other player. This goes to show that many of the "statistics" that are collected and talked about don't necessarily paint the whole picture.

I guess this is getting long, so basically what I want to say is that statistics are a tricky thing, and it's hard to say what is "right."

The Jitty Slitter
06 Aug 2007, 09:51 AM
Are you guys familiar with decision technologies?

http://www.dectech.org/times/Predictor.html

These guys have modelled European football with quite good results using long run probability. They have a column in The Times during the season.

They found the following things, in summary, based on the idea that goals scored and conceded are the best predictor of outcome (obviously):


Shots on goal are a good predictor of goals. Teams who generate more shots on goal on average are more likely to score.
Preventing shots on goal is a predictor of goals conceded. Teams who deny shots on goal are less likely to concede.
Passing accuracy (or completion) is predictive.
Corners are not strongly correlated.
Possession is not as strongly predictive as shots (above).
Shots off target are not predictive.


They have some interesting ideas.

First, that the season is not long enough for the "best team" to definitely win. Some luck comes into it.

You can measure the impact of various players by removing them from the line up and replacing them with an average player... :confused:

http://www.timesonline.co.uk/tol/sport/football/fink_tank/article1851667.ece

We used a multivariate Poisson log-normal model. I hope you find that information helpful.

Dr Ian Graham and Dr Henry Stott used the model to allow us to identify the relationship between goals scored and every kick of the ball made by every player for every club. Once this was done, they simulated the league season over and over again, removing players one by one and replacing them with an average player in his position.
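To give a feel for the "swap in an average player and replay the season" idea, here is a deliberately simplified sketch. It is not the multivariate Poisson log-normal model described above: it assumes goals for and against are independent Poisson counts with fixed per-match rates, and every rate below is a hypothetical number chosen only for illustration.

```python
import math
import random


def sample_poisson(rng, rate):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def average_season_points(scoring_rate, conceding_rate, seasons=500, matches=38, seed=0):
    """Simulate many seasons and return the average league points per season."""
    rng = random.Random(seed)
    total_points = 0
    for _ in range(seasons):
        for _ in range(matches):
            scored = sample_poisson(rng, scoring_rate)
            conceded = sample_poisson(rng, conceding_rate)
            if scored > conceded:
                total_points += 3  # win
            elif scored == conceded:
                total_points += 1  # draw
    return total_points / seasons


# Hypothetical rates: the team scores 1.6 goals a match with its star forward,
# and 1.3 a match when he is replaced by an average player in his position.
with_player = average_season_points(scoring_rate=1.6, conceding_rate=1.2)
without_player = average_season_points(scoring_rate=1.3, conceding_rate=1.2)
print(round(with_player - without_player, 1), "points per season attributed to the player")
```

The hard part in practice is estimating how much a given player moves the scoring and conceding rates; the sketch simply takes those numbers as given.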

watanabe2k
08 Aug 2007, 10:34 AM
Damn, I am not a big fan of Poisson models, especially since I had to calculate those things from data tables for one of my econometrics classes.

I guess, to put it simply, the Poisson model will tell you how likely it is that a certain event occurs a given number of times in a fixed period of time, if the average rate of that event is already known.
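For anyone who would rather not work from data tables, the basic Poisson calculation is short. This is a minimal sketch; the average of 1.4 goals per match is an assumed figure for illustration, not a number from the Fink Tank work.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events in a period when the average per period is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


goals_per_match = 1.4  # assumed average, for illustration only

# Chance of the team scoring exactly 0, 1, 2, 3 or 4 goals in a match.
for k in range(5):
    print(k, round(poisson_pmf(k, goals_per_match), 3))
```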

I find this study they have done quite interesting, but I would like to see a publishable version with all of their data collection sources and models shown, so that others can try to retest what they have done. As with all stat studies, there can be errors that could cause their estimates of who are the best players to be off.

I used an Ordered Probit Model to try and do something somewhat similar to what they have done, but in my case I was measuring a player's productivity when he isn't touching the ball. Think about the player who helps his team win by running into space, drawing away defenders, dummying, or just plain staying out of the way. These players and their movements help affect the outcome, and by using several seasons of data, I try to predict who has the best positive and negative effect on the team winning, without looking at traditional stats such as touches, shots, pass accuracy, fouls, tackles, goals, and the whole lot.

I think the real issue is that these two studies together show the problem with doing stats research. They are both legitimate methods of showing how effective certain players are, but they measure different things. Hopefully one would be able to combine them, so that both the full stats and a numeric measure of off-the-ball movement and other such unrecorded contributions go into a single study.

Of course that's a bit down the road for me at this moment. I will have to show my advisor and research committee this article and see how they think it can affect the research I'm doing. THANKS FOR POSTING IT! :D
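For readers who haven't met an ordered probit before, here is a minimal sketch of how one turns a latent "team strength" index into loss/draw/win probabilities. The coefficient and the two cutpoints are hypothetical placeholders, not estimates from my data, and in a real study they would be fitted by maximum likelihood.

```python
from statistics import NormalDist


def outcome_probabilities(latent_index, cut_draw, cut_win):
    """Ordered probit probabilities of (loss, draw, win) for a given latent index."""
    cdf = NormalDist().cdf  # standard normal CDF
    p_loss = cdf(cut_draw - latent_index)
    p_draw = cdf(cut_win - latent_index) - p_loss
    p_win = 1.0 - cdf(cut_win - latent_index)
    return p_loss, p_draw, p_win


# Hypothetical values: the effect of one player's share of minutes played
# on the latent index, and the two cutpoints separating loss/draw/win.
beta_minutes = 0.5
cut_draw, cut_win = -0.3, 0.4

for minutes_share in (0.0, 0.5, 1.0):
    probs = outcome_probabilities(beta_minutes * minutes_share, cut_draw, cut_win)
    print(minutes_share, [round(p, 3) for p in probs])
```

A positive coefficient shifts probability from losses toward wins as the player's share of minutes goes up, which is the sense in which playing time alone can say something about a player's contribution.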

Sorry if I rant on too long.......

The Jitty Slitter
10 Aug 2007, 05:03 PM
I'm actually thinking of contacting these guys - good idea?

watanabe2k
16 Aug 2007, 11:22 AM
I'm actually thinking of contacting these guys - good idea?

It might be a good idea. A good thing to ask them would be whether they have a technical paper that has been published, or a working paper they are trying to get published. That would give us a lot more detail on the exact measurements and everything else they are doing. It is always good to see how they collected their data, how they weighted and measured things, what kinds of errors may be present in their study, what they may have missed, and how difficult it would be to retest their study, possibly with other conditions added or taken out.

On a side note, I just got an invitation to give a presentation at an Econ Conference on my paper: Competitive Balance and other Variable Effects on Major League Soccer Attendance! :D

revelationx
21 Aug 2007, 10:39 AM
I have not read all this thread as it is too long. It is interesting though.

It should be noted that statistics are generally less useful for predicting results in football than in most other sports. This is because football usually has very low scores: 2-1, 1-0, 1-1, 0-0, etc. Compare that to basketball, where each side could often score more than 50-60 points at the top level.

A football match has fewer decisive moments that result in a score. If the ref makes a bad call or a player makes a bad pass, this can affect the entire result of a match more often than in other sports. In basketball, if one basket is scored or missed it is less significant to the result than if one goal is scored or missed in football. The same goes for rugby and gridiron. This is because in these sports there are simply more scoring opportunities than in football. Tennis and golf are cumulative efforts, so one does not usually suddenly win or lose on one shot or one official's decision.

Hockey is quite similar to football but does have more goal-scoring opportunities than football.

Due to the low-scoring nature of the sport, in football there are more upsets and more shock results than in other sports. Even the very best team can lose to an underdog 1-0. So predicting a result becomes harder.
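The low-scoring point can be put into rough numbers. The sketch below assumes each side's score is an independent Poisson count, which is a crude stand-in for football and cruder still for basketball (where points come in twos and threes), and the scoring rates are made up. It keeps the favourite at 1.5 times the underdog's average in both cases and asks how often the underdog wins outright.

```python
import math


def poisson_pmf(k, rate):
    """Poisson probability of exactly k, computed in log space so large k is safe."""
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def underdog_win_probability(favourite_rate, underdog_rate, max_score=200):
    """P(underdog scores strictly more than favourite) under independent Poisson scores."""
    probability = 0.0
    favourite_below = 0.0  # running P(favourite scores fewer than k)
    for k in range(max_score + 1):
        probability += poisson_pmf(k, underdog_rate) * favourite_below
        favourite_below += poisson_pmf(k, favourite_rate)
    return probability


# Hypothetical averages with the same 3:2 strength ratio in both sports.
low_scoring = underdog_win_probability(favourite_rate=1.5, underdog_rate=1.0)
high_scoring = underdog_win_probability(favourite_rate=60.0, underdog_rate=40.0)
print(round(low_scoring, 3), round(high_scoring, 3))
```

Under these assumptions the underdog wins far more often in the low-scoring case, before even counting draws, which is the effect described above.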

The greatest aid to predicting results in football is that the teams are not equal in terms of ability. In other sports like basketball, gridiron and baseball there is no relegation and there is a wage cap. This is supposed to make teams more equal. In football there are huge inequalities between teams in the same division (although not in MLS). To use an example, Man Utd dwarf the capabilities of Wigan. Man Utd will have a budget for players and players' wages many times greater than that of Wigan. Man Utd have income streams far in excess of Wigan's. Man Utd have a much larger and more intimidating stadium than Wigan. Man Utd have more fans and a higher media profile than Wigan. And yet Wigan have to compete in football matches with Man Utd in the Prem despite all these disadvantages. Football is not egalitarian at all. The playing field is not level. So it is this factor that makes football easier to predict than other, more egalitarian sports.

It is fair to say that the nature of football makes it harder to predict, while simultaneously the setup of (most of) the leagues makes it easier to predict. Note that MLS inherits the egalitarian nature of American leagues and so is a more level playing field for its clubs. What may give certain MLS clubs an advantage in the long run is the location of the club/franchise. Foreign players may choose an MLS club because of where it is located. If there is little distinction in wages, playing staff or prospect of trophies at one club over another, then location becomes more significant in deciding which club to sign for. In this regard MLS clubs in more glamorous locations have a long-term advantage, all else being equal. However, currently MLS results are more difficult to predict than those of other leagues with more unequal competitors.

The most useful stats are probably Prozone stats. They map players' movements during a match. They are very useful in analysing likely patterns of play, devising effective counters and exploiting weaknesses.

Oliver Anderson
04 Sep 2007, 04:23 AM
Hi Guys!!

This stuff is great and, yes, I am new here.

First a little bit about myself.

My name is Oliver Anderson and I've been doing football performance analysis for the last four years in the UK. I've written two books on the subject of applying statistical analysis to football (THE RED REVIEW and THE FOOTBALL REVIEW) but it is very slow going getting other people involved. I have created and developed a number of different statistics since I started and investigated many other things.

I am so excited to have finally found a Forum discussing this subject. I am a huge baseball and American football fan and have basically tried to integrate sabermetric ideas into football: normalising stats, adjusting for strength of opponent faced, etc. Reading this thread has been like reading all the questions I asked myself about four years ago when this started.

I would love to share my findings with you guys. A lot of the stats and interest on this site is about the MLS, but would you be interested in discussions about the English Premier League?

Please take a look at my site (hope the administrators don't mind - www.thefootballreview.co.uk (http://www.thefootballreview.co.uk/)) and tell me what you think. I'd love to know what you guys think and whether you have any other ideas for what could be done to better analyse football. You have by far the most analytical discussions on the web and I'd value your opinion. Please feel free to ask anything.

Many Thanks

Oliver Anderson

The Jitty Slitter
04 Sep 2007, 09:23 AM
Bloody interesting!

Interestingly, your data matches with Fink Tank, who find Robinson to be a dreadful player.

revelationx
04 Sep 2007, 09:59 AM
Hi Oliver,

I am a Liverpool fan and am familiar with Paul Tomkins and his columns about Liverpool. You worked with him on Red Review which I have not yet read although I have heard about it. I have seen excerpts of this and as a piece of analysis it seemed both incredibly detailed and something rare. This sort of statistical data is not really commonly available to the fans and is something that is of great help in dispelling various assumptions and myths about football.

Your contributions to this forum are most welcome. When you record data such as assists where do you get this data from? Are these officially reported? I only ask as assists are sometimes less clearcut than other stats and data on assists are certainly less frequently available.

Oliver Anderson
05 Sep 2007, 03:58 AM
Bloody interesting!

Interestingly your data matches with Fink Tank who find Robinson to be a dreadful player

I've looked closely at a lot of what the Fink Tank does and although the systems we use are completely different we are using the same thing - stats on players' performance. For that reason a lot of our conclusions come out the same, such as Robinson being a below-average goalkeeper.

Oliver Anderson
05 Sep 2007, 04:07 AM
Hi Oliver,

I am a Liverpool fan and am familiar with Paul Tomkins and his columns about Liverpool. You worked with him on Red Review which I have not yet read although I have heard about it. I have seen excerpts of this and as a piece of analysis it seemed both incredibly detailed and something rare. This sort of statistical data is not really commonly available to the fans and is something that is of great help in dispelling various assumptions and myths about football.

Your contributions to this forum are most welcome. When you record data such as assists where do you get this data from? Are these officially reported? I only ask as assists are sometimes less clearcut than other stats and data on assists are certainly less frequently available.

For the reasons you give above the data we use for assists is our own. Assist data is hard to come by and the rules many have didn't sit right with me. We created our own assist stat that includes the last two attacking players to touch the ball in a meaningful attacking sense. This was because from our watching and charting of the game many goals were scored by a move and not just a great last ball.

If I remember correctly, last year the last ball or touch was 'most' important in about 55-60% of goals. That left approximately 40% of goals where the prior touch was more important.

Think of Arsenal's second goal of the weekend just gone. Was Gilberto's header the most important part of Fabregas' goal, or was Rosicky's pinpoint corner the more important delivery? This one is more of a 50-50. Rosicky's ball was very important, as was the run and accurate header by Gilberto - however, there would have been no header from Gilberto without an accurate corner.

As for the assists we collect, they are available in the book (THE FOOTBALL REVIEW) for last season and will be updated and displayed on the website throughout the season for the leaders and put in the book for next year.
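To make the two-touch idea concrete, here is a minimal sketch of how such an assist could be credited from a charted move. It assumes the list already contains only the touches judged meaningful in an attacking sense (that judgement is made by whoever charts the game), and the function name and data layout are hypothetical rather than the format used in the book.

```python
def credit_assists(attacking_touches, scorer):
    """Return (primary, secondary): the last two distinct players before the scorer.

    `attacking_touches` lists players in the order they touched the ball,
    ending with the goal scorer.
    """
    credited = []
    for player in reversed(attacking_touches):
        if player != scorer and player not in credited:
            credited.append(player)
        if len(credited) == 2:
            break
    primary = credited[0] if credited else None
    secondary = credited[1] if len(credited) > 1 else None
    return primary, secondary


# The Arsenal goal described above: corner, header, finish.
print(credit_assists(["Rosicky", "Gilberto", "Fabregas"], scorer="Fabregas"))
```

Which of the two touches mattered more is still a judgement call on each goal; the sketch only records who made them.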

Hope that helps

The Jitty Slitter
05 Sep 2007, 05:31 AM
I've looked closely at a lot of what the Fink Tank does and although the systems we use are completely different we are using the same thing - stats on players' performance. For that reason a lot of our conclusions come out the same, such as Robinson being a below-average goalkeeper.

I find it so fascinating how Robinson can be England number 1, let alone Spurs number 1.

There is a great debate on the Arsenal board about Lehmann.

People are convinced that he has lost it, that he was rubbish last season, or even that he was always rubbish.

Do you have any analysis of the Prem goalkeepers?

Is it in your book?