Sabermetrics applying to Soccer

Discussion in 'Statistics and Analysis' started by mpruitt, Jul 30, 2003.

  1. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    I'll have to refresh my memory on what processes Poisson distributions model best, but...

    That formula probably could be improved upon. Some things I can think of, immediately, are:
    1) somehow collapse runaway scores
    2) take a closer look at goals in overtime wins/losses, as teams behave differently in overtime

    In any case, say a stat like that is useful. How do you find players that maximize the Goals Scored to Goals Allowed ratio?
     
  2. beineke

    beineke New Member

    Sep 13, 2000
    Originally posted by microbrew
    That formula probably could be improved upon. Some things I can think of, immediately, are:
    1) somehow collapse runaway scores
    2) take a closer look at goals in overtime wins/losses, as teams behave differently in overtime


    You could certainly do things with a more sophisticated model, but the beauty of the Pythagorean Formula is that it requires only a minimal amount of time and information to compute ... in that respect, the ratio of Goals Scored to Total Goals is even quicker.

    Incidentally, I ran the numbers for both the Pythagorean and the simple ratio. Mean error:

    Pythagorean: 5.0 %
    Simple Ratio: 3.6 %
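
    As a rough sketch of how such a comparison could be set up (not necessarily how the figures above were computed): the function names and the sample numbers below are hypothetical, the exponent of 2 is the classic squared form, and the post doesn't say how draws or overtime results were counted, so "actual" is simply whatever winning share you decide to measure against.

```python
def pythagorean_pct(goals_for, goals_against, exponent=2.0):
    """Expected winning share from goals scored and allowed (Pythagorean form)."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def simple_ratio_pct(goals_for, goals_against):
    """Goals scored divided by total goals (the Pythagorean form with exponent 1)."""
    return goals_for / (goals_for + goals_against)


def mean_abs_error(predicted, actual):
    """Average absolute gap between predicted and actual winning shares."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical teams: (goals for, goals against, actual winning share).
    teams = [(50, 40, 0.58), (35, 45, 0.41), (42, 42, 0.50)]
    actual = [t[2] for t in teams]
    pyth = [pythagorean_pct(gf, ga) for gf, ga, _ in teams]
    ratio = [simple_ratio_pct(gf, ga) for gf, ga, _ in teams]
    print("Pythagorean mean error:", mean_abs_error(pyth, actual))
    print("Simple ratio mean error:", mean_abs_error(ratio, actual))
```

    Passing a different `exponent` makes it easy to try values other than 2 on the same data.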
     
  3. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    I'd have to check when I get home, but I'm almost positive James relayed in an early Abstract why he squared everything. For some reason, I want to say it was because the error was reduced that way.

    Which, as you showed, is the opposite of MLS. Which makes sense for the reasons you elucidated.

    Speaking of academic papers, you have one about a study that showed that teams don't score significantly more goals or have fewer draws when the league uses a 3-1-0 system versus a 2-1-0 system, don't you?
     
  4. JG

    JG Member+

    Jun 27, 1999
    Probably. IIRC empirical study has shown that the best exponent for baseball is 1.83. Something like 16.1 works well for the NBA.

    There was an interesting article on rec.sport.soccer a few years ago where the author analyzed the value of goals in a Serie A season based on their context (i.e. what's the value of a goal that puts your team ahead by 2 goals in the 35th minute?) and came up with a "value-weighted" top scorers list that had some large differences from the "raw" top scorers list.

    http://www.rsssf.com/miscellaneous/paserman-howmuchgoals.html

    Presumably that study could be expanded to other leagues and seasons to get more accurate point values for goals in each situation, and to see which players consistently score important goals. I can't remember if the author ever did more work on the subject--will check google.
     
  5. NER_MCFC

    NER_MCFC Member

    May 23, 2001
    Cambridge, MA
    Club:
    New England Revolution
    Nat'l Team:
    United States
    I'm both a baseball fan and something of a numbers geek, and I am inclined to agree with everyone who mentioned the issue about the rarity of goals in soccer. Their rarity means that no particular series of events that actually leads to a goal will do so very often. The relative lack of discrete events is also a problem. This is to say nothing about the lack of unanimity on definitions of events (Was that a bad pass or a bad bounce? Was that a cross or a shot?).

    It seems to me that there is plenty of potential for statistics in analyzing individual performance (the Simon Elliot example) or particular situations (like the corner kick analysis that was mentioned), but unless someone finds events or patterns that consistently correlate with scoring or allowing goals, I'm at a loss to see how you could apply the kind of analysis that baseball, especially, allows to overall team and season performances.
     
  6. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    Here's my post, which has a link to a paper analyzing the three-point victory and Golden Goal.

    https://www.bigsoccer.com/forum/showthread.php?s=&postid=414316#post414316

    While rereading the paper, I came across this quote:
    "Most importantly, the following conditions hold for virtually every soccer fan: (i) (s)he has spent a fair
    amount of leisure time thinking about the effects and suitability of rule changes, (ii) (s)he
    has come up with a strong ad-hoc opinion about it, and (iii) (s)he believes that economic
    modelling cannot add anything to this debate."
     
  7. beineke

    beineke New Member

    Sep 13, 2000
    Although this is an interesting idea in principle, it's very muddy statistically.

    Without getting too technical, the ability to score "clutch" goals relies on having some ability to score goals to begin with, so "non-clutch" goals are still a useful indicator of scoring ability. We shouldn't downweight them very much just because they weren't tactically all that important.

    The above is true as long as every opponent is playing respectable defense ... and in Serie A, that's what you expect. A more promising approach would be to do this kind of study for the US national team, adjusting for strength of opponent.

    Is Joe-Max Moore one of our all-time great scorers? He has 24 goals, but he got 4 in a 7-0 friendly against El Salvador, 2 in a 7-0 rout against Barbados, and 2 more in an 8-1 win against the Cayman Islands. A few of his other goals have been scored on penalty kicks. It'd be very interesting to see his adjusted scoring totals.
     
  8. beineke

    beineke New Member

    Sep 13, 2000
    In sports like baseball and American football, measurement is assisted by defining intermediate goals. Getting on base leads to runs, and yardage leads to touchdowns. So we get a foothold by measuring on base percentage and yardage.

    In soccer, we can also define intermediate goals ... winning possession, maintaining possession from defensive third to middle third, the middle third to attacking third, and the attacking third into scoring position.

    Along the way, it's possible to tabulate the tackles that Armas wins, the passes that Reyna receives and completes, and the on-target crosses that Eddie Lewis delivers. These numbers will be imperfect and subjective, but they still capture a chunk of what's happening on the field.
     
  9. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    It's just not as easy to quantify those things as it is in baseball or football.

    I mean, it can be done (OPTA does a bunch of it) but it's labor-intensive.
     
  10. JG

    JG Member+

    Jun 27, 1999
    The idea as I see it wouldn't necessarily be to measure scoring ability, but to see whether certain players have a knack for scoring important goals, and also to construct a scoring table that gives a better indication of how important a player's goals were to their team.

    OTOH it would probably be tricky...the system would probably favor guys whose teams play a lot of close games (presumably teams near the middle of the table) while hurting guys on teams that play more lopsided games (presumably the teams at the top and bottom of the league)

    I think that just the score/time/result probability data would be interesting too...in which situations is the benefit of scoring a goal greater than the cost of allowing a goal?
     
  11. joe2

    joe2 New Member

    Oct 14, 2001
    Sabermetrics applying to Soccer

    I have done a thorough statistical study of basketball, baseball, American football and soccer. I have found a strong correlation in all sports between scoring the most points and winning. I hope all coaches and statisticians take heed of these findings.
     
  12. NGV

    NGV Member+

    Sep 14, 1999


    Even if there's really no such thing as an ability to score important goals, though, some players will still end up at the top of an important goal table based on pure luck. So, for it to be believable, you'd have to show that the same players tend to display this "knack" consistently from season to season.

    In baseball, as far as I know (which admittedly isn't very far), attempts to show a consistent ability to hit "in the clutch" have pretty much come up empty. I'd suspect the same would be true for soccer.
     
  13. skipshady

    skipshady New Member

    Apr 26, 2001
    Orchard St, NYC
    I'm not familiar with the field of statistics at all, but I'm wondering if you could generate a soccer equivalent of hockey's plus/minus stat, as in, how many goals/shots/scoring chances occurs when a certain player is on the field, as opposed to when he's off, or how many goals/shots/scoring chances are allowed.

    This may be a good measure of, for example, a good defensive midfielder who allows his central midfield partner to make forward runs, or a wingback who makes intelligent runs to stretch the defense.
    In either case, the player contributes to the attack without touching the ball.
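
    A bare-bones version of that idea could look like the sketch below. Everything in it is an assumption made for illustration: the `Goal` record, the minute-based on/off window, and the sample match are made up, and shots or scoring chances could be swapped in for goals if someone were logging them.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    minute: int
    scored_by_team: bool  # True if the player's team scored, False if it conceded


def plus_minus(goals, on_from, on_until):
    """Return (on_field_diff, off_field_diff) for one player in one match.

    The player counts as on the field for on_from <= minute <= on_until,
    which is a simplification (a goal in the substitution minute is
    credited to the player leaving).
    """
    on_diff = 0
    off_diff = 0
    for goal in goals:
        delta = 1 if goal.scored_by_team else -1
        if on_from <= goal.minute <= on_until:
            on_diff += delta
        else:
            off_diff += delta
    return on_diff, off_diff


if __name__ == "__main__":
    # Hypothetical match: team scores at 10', 70', 85' and concedes at 55'.
    match = [Goal(10, True), Goal(55, False), Goal(70, True), Goal(85, True)]
    # Hypothetical player who starts and is substituted off at 60'.
    print(plus_minus(match, on_from=0, on_until=60))
```

    Summing the two differences over a season, and dividing by minutes on and off the field, would give per-minute rates that can be compared between players.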
     
  14. beineke

    beineke New Member

    Sep 13, 2000
    Originally posted by JG
    The idea as I see it wouldn't necessarily be to measure scoring ability, but to see whether certain players have a knack for scoring important goals


    That's a nice idea, but their study is too elaborate to be interpretable ... here's a cleaner starting point.

    MLS 2002
    Goals leaders

    Ruiz 24 G, 9 GW -- 7.63 expected
    Twellman 23 G, 5 GW -- 5.63 expected
    Cunningham 16 G, 3 GW -- 4.00 expected
    Graziani 14 G, 6 GW -- 4.35 expected
    Razov 14 G, 3 GW -- 3.25 expected
    Kreis 12 G, 5 GW -- 3.55 expected
    Diallo 12 G, 4 GW -- 3.20 expected
    Faria 12 G, 2 GW -- 3.22 expected
    Carrieri 11 G, 5 GW -- 3.32 expected
    Chung 11 G, 3 GW -- 3.32 expected
    Henderson 11 G, 2 GW -- 3.32 expected

    For each of the top goalscorers, we have his total goals, his game-winners, and his expected number of game-winners under a simple null model (player goals * team wins/team goals).

    There doesn't seem to be much in the way of trends here, though we can also look at other seasons.
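
    For anyone who wants to repeat this for other seasons, the null model above is a one-liner. The sketch below uses hypothetical function names and a made-up team for the first calculation; the three player rows at the bottom are copied from the table above.

```python
def expected_game_winners(player_goals, team_wins, team_goals):
    """Null model: each team goal is equally likely to be the game-winner."""
    return player_goals * team_wins / team_goals


def surplus(actual_gw, expected_gw):
    """Game-winners above (positive) or below (negative) the null expectation."""
    return actual_gw - expected_gw


if __name__ == "__main__":
    # Made-up example: a player with 20 goals on a team with 15 wins and 50 goals.
    print(expected_game_winners(20, 15, 50))

    # (player, actual GW, expected GW) taken from the MLS 2002 table above.
    leaders = [("Ruiz", 9, 7.63), ("Twellman", 5, 5.63), ("Graziani", 6, 4.35)]
    for name, actual, expected in leaders:
        print(name, round(surplus(actual, expected), 2))
```

    A single season's surplus is a small number either way, so any claim about a "knack" would need the same sign to show up across several seasons.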

    I think that just the score/time/result probability data would be interesting too...in which situations is the benefit of scoring a goal greater than the cost of allowing a goal?

    Agreed. I once saw a fascinating hockey paper about when to pull the goalie ... I think it was in Chance, but I don't know how to find it now.
     
  15. beineke

    beineke New Member

    Sep 13, 2000
    This is most useful in hockey because all players spend a lot of time both on and off the ice.

    In soccer, it's less useful because most players are on the field all the time. Only injuries provide a good way to compare team X with or without a certain player.

    Derek Fisher once led the NBA in plus/minus per minute. That's because he was always on the court teaming with Shaq, while facing the opposing teams' back-ups.
     
  16. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Matt Bernhardt from Ohio State did a plus-minus analysis of the Crew one year.

    All I remember is that it tended to indicate that the Crew was better off with Brian McBride off the pitch than on the pitch.

    A little common sense doesn't hurt sometimes. :)
     
  17. skipshady

    skipshady New Member

    Apr 26, 2001
    Orchard St, NYC
    So I see that plus/minus wouldn't be that meaningful for soccer, at least copied directly from hockey. But how about plus/minus for players by positions? For example, can we measure the effectiveness of Claudio Reyna when he is used at the top of a diamond compared to in a box or in a flat four midfield? Or the team's effectiveness when Reyna is a) in the 18 yard box, b) between the center line and the box, c) in the right third of the attacking third, etc etc.

    Of course, a team's formation depends greatly on whether it's ahead or behind and on the other 10 players on the field, and different players switch positions at different frequencies. Obviously, the stats would not be so useful on their own. But again, I'm just throwing out an idea.
     
  18. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    We can count anything you want. If you have access to the games and can keep track of what you want.
     
  19. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Assuming there are some people truly excited about this, and that we could form an Association of Soccer Nerds, then maybe we need some organization or discussion as to what to begin to look at, or how to look at it. Furthermore, I think the whole point of any of this should be to be willing to question everything you think you know about soccer. A poster before said that he didn't think this stuff would be as useful because there are too many events which lead up to each individual goal. Well, maybe that's true. But how do you know that? I'd like to try to assume that as many things as possible in soccer could ideally be quantified objectively.
     
  20. joe2

    joe2 New Member

    Oct 14, 2001
    Sabermetrics applying to Soccer

    The problem with trying to quantify events in soccer is extremely great. What events lead to a goal? The last pass? The three previous passes? The misplayed ball at midfield that led to a 4-pass combination in which an unmarked defender was able to slip through the defense and score a goal on a ball that ricocheted off an outstretched leg of a defender? The fluidity of the game defies quantification. It is easy to make up statistics after the fact by collecting whatever data you want. But what data is meaningful? I contend that none of it is meaningful unless it is predictive of future events. Unlike American football and baseball, with plenty of set plays and stoppages, soccer is a continuous-action sport. Players are constantly adjusting to the changes in the game. An attacker might find himself defending in his own box and a defender might find himself able to score.

    The few discrete events that are quantifiable (goals, shots, corner kicks, fouls, assists, saves, cards given) really do little to express what the game is about or even the quality of play. I would like to see someone come up with a list of discrete quantifiable actions that are meaningful in evaluating performance while being easy to collect and agreed upon by unbiased observers. Unless you can do that, you are just assuming certain events are more important than others. In the end the only important events in soccer are goals. And such events as game-winning goals are especially meaningless. Any goal could be a game-winning goal depending on how many goals a team gives up.
     
  21. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    Most of the literature I've read deals with game theory and soccer, i.e. decision making in games (strategy and tactics) and the running of leagues. So, I guess I would be more of an armchair coach than an armchair GM.

    More articles:

    "Skill, Strategy, and Passion: an Empirical Analysis of Soccer" at
    http://ideas.repec.org/p/ecm/wc2000/1822.html

    This is the one I'm trying to chew through right now- damn you guys. I haven't touched this stuff in four years, and now I can't resist trying to make sense of it.


    "A simulation model for football championships" at
    http://www.ub.rug.nl/eldoc/som/a/01A65/01a65.pdf

    I've only read the abstract so far, but it claims: "[...] a simulation/probability model that identifies the team that is most likely to win a tournament. The model can also be used to answer other questions like ‘which team had a lucky draw?’
    [...]"
    The keywords: Poisson models, football, simulation

    So maybe beineke was onto something.
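
    As a rough illustration of what a Poisson model of match scores looks like (a textbook-style sketch, not the method of the paper linked above, which may differ): assume each team's goals are independent Poisson counts. The scoring rates, the function names, and the cutoff of 15 goals per team are all assumptions made for the example.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals arrive at the given average rate."""
    return exp(-rate) * rate ** k / factorial(k)


def outcome_probs(rate_a, rate_b, max_goals=15):
    """Return (P(A wins), P(draw), P(B wins)) for independent Poisson scores.

    Scorelines with more than max_goals for either team are ignored, which
    drops only a negligible amount of probability at soccer-like rates.
    """
    win = draw = loss = 0.0
    for a in range(max_goals + 1):
        p_a = poisson_pmf(a, rate_a)
        for b in range(max_goals + 1):
            p = p_a * poisson_pmf(b, rate_b)
            if a > b:
                win += p
            elif a == b:
                draw += p
            else:
                loss += p
    return win, draw, loss


if __name__ == "__main__":
    # Hypothetical rates: team A averages 1.6 goals a game, team B 1.1.
    print(outcome_probs(1.6, 1.1))
```

    Chaining these single-match probabilities (or sampling scorelines from them) through a bracket or a league schedule is one way a tournament simulation could be built.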
     
  22. nancyb

    nancyb Member

    Jun 30, 2000
    Falls Church, VA
    Club:
    DC United
    Nat'l Team:
    United States
    I thought this was about some Star Wars-based game.
     
  23. Real Ray

    Real Ray Member

    May 1, 2000
    Cincinnati, OH
    Club:
    Real Madrid
    Nat'l Team:
    United States
    Re: Sabermetrics applying to Soccer

    I'm not sure I really agree with this.

    For starters, the basic "stat" that I think we could all come to agreement on is what we all refer to as a "chance." Breaking down tape and then processing the data to show a correlation between the number of chances vis-a-vis specific players on the pitch, etc., is a basic tool already. In fact, before the BS crash, there was a thread in the Coaches forum that brought up one of Dave Dir's old columns that had an analysis of how the MetroStars did when Villegas was able to get certain numbers of crosses in. It was due to this that Zambrano, despite Villegas' flaws as a player, played him, as he felt that given those opportunities to cross, good things would happen inside the box.

    I would also be interested in seeing the success/failure rate on chances based on where key players get the ball. Space is critical to soccer; you could argue it's the true currency of the sport. Creating space, closing space: sure, it's a painstaking process, but it has promise/worth IMO.

    For instance, a trend that you see in basketball, and that I would love to back up with numbers in soccer, is how the number of touches (passes received) a player gets depends on the quality/success of what that player does with the ball.

    In the NBA an example was Jason Kidd and Keith Van Horn w/the Nets. If you watched the games, you could see the trend very early that if Van Horn did not get off well, Kidd would pretty much freeze him out. This led to problems in the dressing room as well, and Van Horn was later traded, much to the delight of some of the Nets. I suspect that if you broke down the tapes, you would find a certain number of points by a certain time in the game that predicted whether Kidd would freeze him out or go with him the full 48 min.

    Basketball is an easier game to track in this manner, but I think you'd find similar trends in soccer that could be exploited: why and when does that striker or midfielder all of a sudden seem not to get any touches? Is it tough marking, his fitness, or is there a "deadline" in a match based on how he plays early on, after which the other players freeze him out? And how does that player react? Does he still work hard or does he sulk and drift away? Important stuff to know, and better to back up with hard data to show your players.

    Your point about the fluidity of the game I would agree with, but I think you can break down soccer to create a game of percentages, which is really at the heart of baseball stats: getting your matchups right; having the percentages on your side. Which in MLS, considering that the teams can play each other up to four times, should be easier if the teams dedicate the time and resources.
     
  24. beineke

    beineke New Member

    Sep 13, 2000
    Re: Re: Sabermetrics applying to Soccer

    Very interesting post.

    Just to touch on this one point, I can't think of any quick and dirty way to measure space directly. However, simple play-by-play tracking provides valuable information about space. Here is one example from a stretch of the US-Portugal game that I tracked. Portugal had possession in the attacking third of the field, but they were forced to make a series of backward passes that went all the way back to their own keeper.

    Even though the tracking itself says nothing directly about spacing, the implications are clear: either the US was covering space extremely well, or else Portugal's attackers weren't getting into position to make a positive play.
     
  25. joe2

    joe2 New Member

    Oct 14, 2001
    Sabermetrics applying to Soccer

    An interesting point, but you seem to be evading my key criticism: how do you define and measure your discrete event? For starters, what is a chance? When does the chance start? Is it the keeper's pass? Or some other combination of touches? There seems to me to be way too much margin for judgement on the part of the observer. In soccer, unlike baseball or even basketball, any given individual player is more dependent on the skills of his teammates. That is one reason you do not have players actually dominating games, as a pitcher can do in baseball, for example. Back to the idea of "chance": define it in discrete measurable terms. Then, explain to what extent it is important. What is a "chance" and what is its usefulness in the game? Are you talking about any given individual's touches? Are you talking about scoring opportunities? Are you talking about territorial gain? Are you talking about controlling time? What is your operational definition of chance and why is it an important variable? Once you have done that, maybe we can see if there are objective ways to quantify "chance". It could be an interesting statistic if you can define it.
     
