Sabermetrics applying to Soccer

Discussion in 'Statistics and Analysis' started by mpruitt, Jul 30, 2003.

  1. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
  2. Dr.Marin

    Dr.Marin Member

    May 28, 2009
    USA...
    Club:
    FC Barcelona
    Nat'l Team:
    Colombia
    I'm a huge sabermetrics buff, and consider myself fairly knowledgeable. Soccer is my favorite sport, and the one I really grew up loving, though. It's clear to me that objective analysis is really needed in the sport, though it's also clear that there are many obstacles in the way just by the nature of the sport. I see the resources thread here, and it's fairly disappointing to be honest. I thought we were further along. Are there any primers or books I should read regarding soccer sabermetrics (we also need a cool term!)?

    I do have some ideas of my own, though I'm not sure how stupid or unrealistic they are... :eek:
     
  3. Dr.Marin

    Dr.Marin Member

    May 28, 2009
    USA...
    Club:
    FC Barcelona
    Nat'l Team:
    Colombia
    And I'm going to read through this thread, as I see it has some fascinating stuff.
     
  4. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    There's a lot you can do team by team, but individual player stuff is hard. I've always thought a "win shares" like approach could be much more useful here than it actually is in baseball. If you can figure out how good a team is (possible), and then divide up credit within that team (harder but also approachable) it can be done.
     
  5. blank_frackis

    blank_frackis New Member

    Apr 23, 2009
    Club:
    Aberdeen FC
    This is pretty difficult to do in practice, but sorely needed. One of the greatest myths in the game today is that a goalscorer is necessarily the most important player in a given team. In terms of value this might well be the case, but there are numerous examples where a player with a great goalscoring record is far less important to a team's success than a creative midfielder, or a central defender, or the forward who plays alongside the main striker.

    One of the most glaring examples of this principle is Kris Boyd (striker at Rangers in the Scottish Premier League) who has a phenomenal goalscoring record, but is severely lacking in terms of general play (passing, technique, first touch) to the extent that he undermines much of the good work he does by finishing chances. Indeed, his club side actually has a greater goals per game record without him playing in the last two seasons than they do with him. Try and make this point in the present climate of footballing opinion, however, and you'll be met with a torrent of cliches about "goals winning games" or whatever else.

    Some equivalent to win shares would solve this problem, but despite the difficulty in creating such a statistic there's some hope on that front if you consider that the tendency to quote goalscoring records as an indicator of a player's worth to the team is itself an attempt at statistical analysis - just an exceptionally poor one. There is more of a market for statistics than the knee jerk reaction of many football fans would suggest.
     
  6. przeszczepan

    przeszczepan New Member

    Apr 9, 2009
    Hi Guys,

    You will find an approach addressing this (and other) issues in this article.
     
  7. ark215

    ark215 Member

    Jan 16, 2009
    America
    Club:
    New York Red Bulls
    Nat'l Team:
    United States
    I don't see this working for soccer. I'm surprised they have distance covered and passes attempted and completed. However, stats in soccer are rare and often are not independent of the skill of the players around them or of their opposition, even if we were to measure every single thing each player did out there.
     
  8. RichardHKirkando

    RichardHKirkando New Member

    Aug 9, 2006
    Madison, WI
    Club:
    Fulham FC
    Nat'l Team:
    United States
    I'm actually trying to do exactly that. I have some of what I've done on my site, which is mostly Fulham-specific at this point, but I'm in the process of doing another site covering a bit more, and also including a searchable/sortable database with all of my stats.

    A lot of the calculations are borrowed from the Win Shares formulas. I've calculated marginal goals for each team, and then broken those down into marginal goals scored, prevented by outfield defence, and prevented by goalkeepers. Then, they are assigned to individuals based on lots of things, but four main areas - goalscoring, playmaking, defensive actions, and a possession score (tendency to retain possession vs giving the ball away). The end result is a score that can be read like "a team of player_x would win y games in a season". So really, you could divide by 11 and get the player's value in terms of actual wins, but I feel that it's a little more presentable as-is.

    One of the things I've tried really hard to do is eliminate elements of luck and the more team-oriented things. For example, instead of using assists in my playmaking calculations, I use assist attempts (any pass leading to a shot), since that doesn't rely on another player's shooting ability. I've also adjusted each goalkeeper's save percentage based on the quality of shots faced. There's much more refining to be done, but I like the results so far.
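A rough sketch of the save-percentage adjustment described above, in Python. The poster does not give his formula, so everything here is an assumption: `goal_prob` is a hypothetical per-shot scoring chance from some shot-quality model, and the adjustment simply shifts the raw save percentage by how much harder or easier the shots faced were than a league-average workload.

```python
from dataclasses import dataclass


@dataclass
class ShotFaced:
    goal_prob: float  # hypothetical chance this shot scores (0..1), from a shot-quality model
    saved: bool       # True if the goalkeeper kept it out


def raw_save_pct(shots):
    """Plain saves divided by shots faced."""
    if not shots:
        raise ValueError("need at least one shot")
    return sum(s.saved for s in shots) / len(shots)


def expected_save_pct(shots):
    """Save percentage an average keeper would post against these exact shots."""
    if not shots:
        raise ValueError("need at least one shot")
    return sum(1.0 - s.goal_prob for s in shots) / len(shots)


def adjusted_save_pct(shots, league_avg_save_pct):
    """Raw save % re-centred on the league average (an assumed adjustment)."""
    return raw_save_pct(shots) - expected_save_pct(shots) + league_avg_save_pct


if __name__ == "__main__":
    # Made-up shots purely for illustration.
    faced = [
        ShotFaced(0.5, True),
        ShotFaced(0.5, False),
        ShotFaced(0.1, True),
        ShotFaced(0.1, True),
    ]
    print(raw_save_pct(faced), expected_save_pct(faced))
    print(adjusted_save_pct(faced, league_avg_save_pct=0.70))
```

A keeper who faces unusually difficult shots gets a lower expected save percentage, so the same raw number translates into a higher adjusted one.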

    I do agree with most of what you've said, and so do my findings. Central midfielders are generally the most valuable players, although the very best goalkeepers will out-produce them slightly. The value of forwards is difficult to explain - I think you're right about forwards whose "general play" is lacking being less valuable. For example, Kevin Davies was the league's 8th leading scorer, but was only the 14th most valuable forward according to my system. Dirk Kuyt and Wayne Rooney had the same number of goals, but both were more than a full win better than Davies. As you can see, I've concentrated on the Premier League so far since there is a lot more data available, but I'll eventually start doing other leagues as well.
     
  9. Vandervaart

    Vandervaart Member

    May 21, 2003
    London
    Club:
    AFC Ajax
    Nat'l Team:
    Netherlands
    Surely strikers are the most valuable players, as the object of the game is to win, and to do that you need to score goals, which are scarce. For instance, I reckon Torres and Villa to be more valuable than Xavi and Iniesta.
     
  10. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    In terms of this thread and this board, the question I guess would be then to try and examine the evidence for that.

    Undoubtedly that's possible and certainly the attacking geniuses _seem_ to command higher transfer fees. I also am somewhat sympathetic to that viewpoint, because I think announcer types can really go overboard on the praise for central midfielders, particularly here in the states.

    But the question for the thread is how do we go about finding out whether this is actually true or not.
     
  11. RichardHKirkando

    RichardHKirkando New Member

    Aug 9, 2006
    Madison, WI
    Club:
    Fulham FC
    Nat'l Team:
    United States
    In some ways, yes. Great forwards are certainly more scarce than great midfielders - according to my findings, only 22 forwards in the Premier League accounted for 1 or more wins on their own, while 50 defenders and 56 midfielders did the same. I think it's very likely that having a top 5 striker can give you a bigger competitive advantage over your opponent than having a top 5 midfielder or defender would. Average strikers are more interchangeable, their results depending more on service from the midfield than their own skill.

    Example: Aston Villa is playing Fulham. Their top striker, Gabriel Agbonlahor, is worth 2.5 wins over replacement. Fulham's Andy Johnson is worth just under 1 win over replacement. In the midfield, you have Gareth Barry at 3 wins vs Danny Murphy at 2 wins, which is obviously a smaller difference.

    So when you're talking about the extreme high-end, those game-changing strikers can be more valuable. If one of those players is not obtainable though, it's best to build from the midfield and defence.
     
  12. przeszczepan

    przeszczepan New Member

    Apr 9, 2009
    Your work is impressive, Colin.

    I’ve gone through the recent 7 pages of your blog and there are certainly lots of imaginative concepts and loads of good work done out there.

    What I am missing there is a kind of glossary with definitions of the statistics you are putting to practical use throughout your blog. It took me some time to dig through your earlier posts to find what you mean by all the abbreviations. I am not saying it was unpleasant, though :)

    I particularly liked the 2008/09 Premier League Shot Quality bit. Have you ever looked at how a logistic regression approach values shots on goal depending on the circumstances (distance from goal, angle, proximity of the defender, previous action, …) in which they were taken?

    The paper I have linked to in my previous post uses it as a starting point and takes it a step further to estimate the goal scoring potential in each game situation. Knowing that, each action performed on the field can be assessed based on how much of that goal scoring potential is created or lost by performing it. As a result, players’ performance and various types of team moves can be assessed based on how much they contributed to the chance of scoring a goal by their own team and by the opposition.

    Some of the desirable characteristics of this methodology include:

    - a goal scored by a striker is not exclusively his own contribution if it has been worked out all the way through the midfield; similarly, the blame for a goal conceded is distributed across the defending side for not opposing the other team's build-up to a sufficient extent, rather than falling on the goalkeeper alone;

    - a great pass opening the striker’s way to the goal remains a great pass if the latter fails to convert it to a goal; if he does fail, the passer is rewarded for the opportunity he had created irrespective of the striker’s action and the striker is punished for the missed chance (even though he is rewarded for finding himself in the good position); the goalkeeper shares the merits (positive or negative) with the striker if the latter had tested him;

    - in preventing the opposite side from getting their goals, defending creates results (victories or losses) as much as the offensive actions do;

    - a pass into the penalty area allowing strikers to get into a shooting position is much more valuable than a pass in the middle of the field; similarly a key tackle of the last defender which prevents the one-on-one striker vs. keeper situation is worth more than any other.

    One of the things I find the most appealing in this approach is that the goal potential created by all the players on the teams sums up to the number of goals scored and conceded by those teams by definition.
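To make the idea above concrete, here is a minimal Python sketch of my reading of it, not the paper's actual model. The logistic coefficients are invented for illustration (nothing here is fitted to real data), and `credit_sequence` is a hypothetical helper that credits each player with the change in goal-scoring potential his action produced. It leaves out the goalkeeper's share and the reward for the striker's positioning.

```python
import math

# Hypothetical coefficients, NOT estimated from any real shot data.
INTERCEPT = 0.5
COEF_DISTANCE_M = -0.12     # further from goal -> lower chance
COEF_ANGLE_RAD = 1.0        # wider view of the goal -> higher chance
COEF_DEFENDER_NEAR = -0.8   # defender close by -> lower chance


def goal_probability(distance_m, angle_rad, defender_near):
    """Logistic model: probability that a shot in this situation is scored."""
    z = (INTERCEPT
         + COEF_DISTANCE_M * distance_m
         + COEF_ANGLE_RAD * angle_rad
         + COEF_DEFENDER_NEAR * (1.0 if defender_near else 0.0))
    return 1.0 / (1.0 + math.exp(-z))


def credit_sequence(start_potential, steps):
    """Credit each player with the change in goal potential caused by his action.

    steps is a list of (player, potential_after_action) pairs in order.
    """
    credits = {}
    current = start_potential
    for player, potential_after in steps:
        credits[player] = credits.get(player, 0.0) + (potential_after - current)
        current = potential_after
    return credits


if __name__ == "__main__":
    # A through ball lifts the potential from 0.03 to 0.30, then the striker misses (0.0).
    print(credit_sequence(0.03, [("passer", 0.30), ("striker", 0.0)]))
```

Because each credit is a difference between consecutive potentials, the credits in a move telescope: they add up to the final outcome (1 for a goal, 0 otherwise) minus the potential the move started with, which is the summing-up property mentioned above.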

     
  13. happyforever

    happyforever New Member

    Jun 16, 2004
  14. Vandervaart

    Vandervaart Member

    May 21, 2003
    London
    Club:
    AFC Ajax
    Nat'l Team:
    Netherlands
    Off topic and probably already mentioned dozens of times in the beginning of this thread, read Moneyball last weekend. Great book. Too bad that it was published before Epstein's success at the Red Sox.
     
  15. ElChupacabra67

    ElChupacabra67 New Member

    Sep 30, 2009
    Georgia
    Club:
    FC Barcelona
    Nat'l Team:
    United States
    I've been working on applying sabermetrics, and linear weights in particular, to soccer for the last few months. I've come up with one measure of Player Production: http://nordeckeluchador.blogspot.com/2009/08/introducing-el-luchadors-player.html

    but it needs work. The measures of Team production I'm more happy with but am still doing much needed research. Stumbling on this thread was a godsend.
     
  16. brew42

    brew42 New Member

    Nov 6, 2009
    Club:
    Manchester United FC
    Hey

    I have recently developed a Gaelic Football/Hurling Tracker for the iPhone. Check it out in the iTunes Store @ iTunes.com/apps/gaatracker. It's something we use quite a lot for gathering & viewing stats on GAA games.

    Would people use a similar app for soccer? Check out Footy Tracker @ iTunes.com/apps/footytracker

    I think it could be useful for minor teams that don't have premiership budgets
     
  17. wolf6656

    wolf6656 New Member

    Aug 9, 2004
    Canada
    Club:
    Tottenham Hotspur FC
    Nat'l Team:
    Portugal
    Did your study involve possession in different parts of the pitch? Or perhaps which players had possession for which percentage of the team's totals? Where on the pitch did certain players or certain positions have possession? If your midfielders have 70% of their possession on your side of half then what good does it do you? If your team has 60% of its possession in your own end it doesn't help you, or does it? You are right that simple possession is largely meaningless.

    The same with merely reporting the number of completed passes. Where were the passes completed? Were they forward? Were they short tic-tac crosses behind midfield, or long accurate crosses to players in the area? I mean, what works for Spain doesn't necessarily work for everyone else, but man does it work for Spain!

    The relationship between shots at goal and shots on goal? Is that telling? For individuals, or for teams?

    For sabermetrics to be meaningful in soccer, we first have to identify which quanta we should be counting. Soccer is more fluid than baseball or gridiron football, so measuring it is kind of like trying to nail jello to the wall.

    Yet, there are plenty of stats for hockey that seem to work, and hockey is the closest stat-heavy sport to soccer in terms of fluidity. But at the same time, a great quality defensive defenseman in hockey is harder to find on a scoresheet. Plus-minus helps, but has to be measured against time spent against the opponent's best players vs another defender's time spent against lesser players.

    Actually soccer has an advantage there in that there are limited substitutions, so every player is pretty much on the field against every other player. There isn't line matching as in hockey.

    I think that the very elusiveness of soccer is the greatest part of its appeal. The sport was tailor-made for the Irish. They can argue about it and wax poetic about it at the same time. I learned about soccer from an Irishman, and it was the poetry of his descriptions that sold me. Even when he was angry about "talking points" he was always poetic. He learned to LOVE hockey, while living in northern Manitoba. (You try to find a soccer score up there), but soccer was always number 1.

    Soccer isn't a table of stats, as much as Pat used to carry around the soccer newspapers with page after page of scoresheets and tables. Soccer is a never ending story, with plot twists and drama. It was Pat's descriptions of that drama that sold me on the sport. I have yet to find a more meaningful stat than 2:1 (a.e.t.)..........but I will keep searching. hahaha!
     
  18. ElChupacabra67

    ElChupacabra67 New Member

    Sep 30, 2009
    Georgia
    Club:
    FC Barcelona
    Nat'l Team:
    United States
    "For sabermetrics to be meaningful in soccer, we first have to identify which quanta we should be counting"

    I looked at a number of different quanta, including measuring by points, wins, differential, and goals, and it finally came down to weighting everything in terms of how often it led to a goal scored. One regression I did showed that Goals minus Goals Against had the strongest correlation to Points earned. But I wanted to go deeper than that and the current incarnation of the metric includes Goals, Wins, Losses, Away Wins, Away Goals, Shots, Shots on Goal, Corners Taken, Shots Against, Shots on Goal Against, Corners Against, Draws--all weighted by their relative worth in terms of goals. So, a Shot in the EPL is worth about .10 goals on average. But I've had to tweak the overall metric, and continue to do so. Regardless, the R-squared for the metric in terms of Points earned is around .9. That's pretty damn good.
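For anyone who wants to try the linear-weights idea described above, here is a minimal Python sketch under stated assumptions. The weighting rule (league goals divided by league count of the event) is my guess at how a figure like ".10 goals per shot" could arise, not the poster's actual formula, and the function names are hypothetical. Events counted against a team would take negative weights.

```python
def goal_weight(league_goals, league_events):
    """Average goals per event across the league, e.g. goals per shot."""
    return league_goals / league_events


def production(counts, weights):
    """Team production: each event count times its goal weight, summed."""
    return sum(weights[name] * n for name, n in counts.items())


def r_squared(xs, ys):
    """Squared Pearson correlation between a metric (xs) and points earned (ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Made-up league totals, for illustration only.
    weights = {
        "shots": goal_weight(100, 1000),
        "shots_against": -goal_weight(100, 1000),
    }
    print(production({"shots": 15, "shots_against": 9}, weights))
```

Running `r_squared` on each team's production score against its points total is one way to check how much of the table the metric explains.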

    http://nordeckeluchador.blogspot.com/

    The blog is not up to date, but will be in a day or so. I was away for a week at Thanksgiving and took some time off.
     
  19. ENB Sports

    ENB Sports Member

    Feb 5, 2007
    I don't disagree with you about the type of game soccer is, but being from Canada you must know there is a huge capacity for statistics in hockey. And it's as free-flowing as soccer, if not more.

    The only reason statistics are not a major part of world football is nationalism, or probably more continentalism (if that's a word). If you go to any American or Canadian newspaper archive from the 1920s, there are as many sports statistics in that paper as there are in a paper from today. If you look at a current UK newspaper, the only statistics in it would be scores and standings.

    Now you could argue that stats don't mean crap in hockey as well and I might agree, although I'm a sports statistician by trade so let's keep that our little secret. ;)

    As for sabermetrics, most of it, not in soccer but in baseball, is complete hogwash. If the data set is big enough and you take recurring occurrences, the results will be the same or predictable.

    For example, did you know that if you take the complete history of the English Premier League and Women's Professional Soccer, the percentages of headed, long-range, free-kick and penalty goals are about the same?

    I think the key component to statistics is being aware of a player's history without having a chance of seeing it firsthand, so as a fan, if your team buys a player, you know something about him to brag to your friends, talk about on radio, complain about at work and feel as if you can do a better job than the current manager.
     
  20. ElChupacabra67

    ElChupacabra67 New Member

    Sep 30, 2009
    Georgia
    Club:
    FC Barcelona
    Nat'l Team:
    United States
    Dude, some of the sabermetrics are "hogwash" in that they are rather crude (OBP + SLG comes to mind). But the linear weights system is rigorous and rather useful. The problem with hardcore academic statistical studies is that they often tell us more about a particular theorem than they do about a particular sport.
     
  21. bluelink

    bluelink New Member

    Dec 29, 2009
    Club:
    3 de Febrero
    Average strikers are really more interchangeable, their results depending more on service from the midfield than their own skill.
     
  22. teucer

    teucer Member

    Dec 17, 2009
    Raleigh, NC, USA
    Club:
    Carolina Railhawks
    Nat'l Team:
    United States
    So at the moment, what is our best formula for trying to predict future performance? Is it the one rooted in the Poisson assumption mentioned earlier in the thread? (If so, what's the actual formula for that? I didn't see it posted, and it's been a few years since I took stats so I don't think I could derive it from first principles myself.)

    I'd like to run some numbers on past USL-1 performance, but I'm not exactly sure where to begin.
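For reference, a minimal Python sketch of the standard independent-Poisson match model, which may or may not be exactly what was posted earlier in the thread. It assumes each team's goals follow a Poisson distribution, P(k goals) = e^(-λ) λ^k / k!, with the two teams scoring independently. The scoring rates (`home_rate`, `away_rate`) are inputs you would have to estimate yourself, for example from past goals per game, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_probabilities(home_rate, away_rate, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Sums the joint probability of every scoreline up to max_goals per side,
    treating the two teams' goal counts as independent.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


if __name__ == "__main__":
    # Illustrative rates only: home side expected to score 1.5, away side 1.1.
    print(match_probabilities(1.5, 1.1))
```

Capping at `max_goals` drops a tiny amount of probability for very high scorelines, which is negligible at normal soccer scoring rates.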
     
  23. teucer

    teucer Member

    Dec 17, 2009
    Raleigh, NC, USA
    Club:
    Carolina Railhawks
    Nat'l Team:
    United States
    Never mind, I've got this part down. Do we have any better predictors, or is this the state of the art?
     
  24. ElChupacabra67

    ElChupacabra67 New Member

    Sep 30, 2009
    Georgia
    Club:
    FC Barcelona
    Nat'l Team:
    United States
    Nice! I would kill to get that sort of data for professional clubs. I exchanged a few emails with some guys in the Association of Football Statisticians (http://www.11v11.com/) and they said the basic problem is that STATS Inc, whose English subsidiary (or maybe STATS is the American subsidiary of the English company?) does all the stats for the EPL, doesn't release the data. You'd have to purchase the data from STATS to get a large enough data set to really dig down into the available numbers. We know the big clubs track this stuff (like which zone in the Penalty Area a corner goes into), but the teams don't want the other teams to have easy access to data on them, so it's all hush-hush. This makes it very difficult to come up with comprehensive metrics for soccer. I've been tracking the EPL and MLS week by week on my own for the last year and a half and it's labor intensive to say the least.

    Plus the English papers don't publish stats like American papers do. It's much easier to get numbers on the MLS as the league makes them available on its own site and the major media outlets also publish most of what the league releases. But for the EPL I generally have to use two to three websites to track down all the numbers I'm looking for.
     
  25. bunkmedal

    bunkmedal Member

    Feb 12, 2010
    Club:
    Cardiff City FC
    I think this would be remedied in time by more people producing statistics. The reason businesses don't publish the statistics is, obviously, that they want to make money from them. If there were statistics of a similar quality being published by other sources then the market would collapse and most statistics would be freely published the way they are in U.S. sports.
     
