Sabermetrics applying to Soccer

Discussion in 'Statistics and Analysis' started by mpruitt, Jul 30, 2003.

  1. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Sabermatics applying to Soccer

    Not sure this is the right forum to post it in, but after reading Michael Lewis' Moneyball, I've become fascinated by the idea that sabermatic principles could be used to help find undiscovered talent in soccer, thereby helping MLS succeed. It occurred to me that there are pathetically few useful statistics available for soccer players. I happened to stumble upon this site, and the stuff they're doing is phenomenal. I just wish there was more of it.

    http://www.matchanalysis.com/
     
  2. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Sabermetrics, but I echo the sentiment.
     
  3. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Kenn, your PM inbox is full.

    A lot of the problems with applying it to soccer are obvious. My thought, though, was that if it works in hockey and basketball, there has to be a better way of tracking the individual performance of soccer players. So I did a little bit of research and found these guys. http://www.matchanalysis.com/

    I emailed them and one of the guys got back to me really quickly. He said some interesting things, a lot of which were echoed in Moneyball. Foremost, that coaches with a traditional upbringing have been reluctant to incorporate statistical analysis but have been more accepting of some of the video analysis they have done. However, if you look at the way they've broken down individual players, much of that stuff is pretty brilliant, but unfortunately a lot of work. Consequently, he said that they have not done much of the statistical analysis recently.

    Therein lies the biggest problem. There's not this wealth of information available to go back and look at. And while record keeping in baseball has improved with companies like Stats, Inc. and the internet, I'd imagine those types of records simply aren't there for soccer.

    So the question would be how to collect it. Matchtracker, at a quick glance, seems to me to have a pretty good model of what to look at, at least as a start. Then the question becomes one of testing hypotheses about which attributes of an individual soccer player or position matter most, and how they correlate with the team's performance and record.
     
  4. NGV

    NGV Member+

    Sep 14, 1999
    A couple reasons why stats are less useful for measuring player performance in soccer than in baseball:

    First, offensive contribution in baseball can be measured for players independently of the rest of the team's contribution. Not the case in soccer, where a forward's goal scoring may depend on the quality of service from the midfield and a midfielder's assists may depend on the quality of finishing from the forwards.

    Second, in baseball, offensive contributions can be measured identically for every player, regardless of their position or the team's "style of play." It doesn't matter whether a player is a shortstop or a right fielder, their hits are worth just as much. In soccer, even just at a single position like forward, you can have several different types of players - Brian McBride type forwards, Landon Donovan type forwards, Taylor Twellman type forwards - and the contribution each type makes may not be measurable by the same standard. Also, what you want from these players may vary depending on what formation or tactics a team is using. It's not clear how the value of different players could be reliably and consistently compared.

    Third, baseball consists of many discrete events (hundreds per player per season), each measurable as a contribution to run scoring. In soccer, the relevant events are goals, but they are rare. And, for more common events, like "passes completed," the exact contribution to goal scoring or goal prevention is often not obvious. A lot of possession can be a good thing, or it can be not such a good thing - I think the USA had something like 65% possession against Poland in the World Cup. So, there may be a shortage of meaningful data.

    Basically, while resistance to statistical analysis in baseball is usually unfounded, in soccer I think there may be good reasons to be skeptical. That said, I also think that statistics could be used more than they are now, and I was really interested in the reports that Match Analysis used to post on US national team games. I think that trying to measure what happens during a soccer game with precision and objectivity is worthwhile, even if directly using that information to evaluate player performance is difficult.
     
  5. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    That's sort of my point: while it may not lend itself so easily to soccer, there's definitely more that can be done and should be done. However, to simply say that it wouldn't work with different styles of play, and that possession sometimes isn't key, is just an assumption. With numbers there could be a more accurate breakdown, even within a given team, of what went wrong and what went well, and by laying those out over a timeline you could find the crux of what actually causes teams to win soccer games. Obviously luck is a very big factor, but I can't help but think that some of the things Matchanalysis provides could help break down strategies with formulaic precision.
     
  6. TomEaton

    TomEaton Member

    Mar 5, 2000
    Champaign, IL
    This is the best thread I've seen on here for a while.

    Good post, NGV. You made a lot of the same statements I was going to make. I will nitpick a little bit, though, and say that in baseball the offensive contributions of each player are not ENTIRELY independent of the rest of the team. For instance, if you are the leadoff man, most of the time no one will be on base, and batting averages always increase with runners on base (particularly with a runner on first base, which opens up a hole on the right side of the infield and has the middle infielders playing closer to second base). If you're batting behind a great base-stealer, you often have to take pitches so they can try to steal, and consequently get behind in the count, hurting your own chance of hitting safely. And it's pretty difficult to have a high RBI count if no one is getting on base in front of you.

    Getting back to soccer, your main point was correct that each player's performance is VERY dependent upon the team around him, which explains in part why statistical evaluation hasn't really had any emphasis in soccer up to now. On the other hand, you still find a few hard-line baseball people who insist that statistics are useless in evaluating baseball talent, which is absolutely astonishing.

    Whether soccer stat analysis is truly useful will be determined when someone can show definitively that certain statistics correlate highly with winning AND that the relationship is causal. For instance, if someone can show that the higher the percentage of passes your center midfielder completes, the more often you win, and that's the reason, then people will start to pay attention to that information.

    I once ran a study on the MLS possession statistic to determine whether having more possession correlated with winning (the article was in the 2001 American Soccer Analyst). Out of the 170 games I studied, the team with more possession had a record of 75-67-28, slightly better than .500. The interesting thing, though, was that where possession was really lopsided (one team had at least 56% of the possession), the team with more possession was only 18-24-10. I have some theories as to why this might be, but I won't bore you with my opinions. The point is, there is a correlation, but not a particularly strong one. I would have liked to have looked at some numbers for other leagues to see if the results were similar, but I didn't know where to get the numbers. I hope the Matchanalysis people can keep track of stuff like that.
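    For anyone who wants to turn those records into percentages, here is a minimal Python sketch. It rests on assumptions that are not in the post: the records are read as W-L-T, a tie counts as half a win, and the 52 lopsided games are treated as a subset of the 170. The helper name `pct` is made up for illustration.

```python
def pct(wins, losses, ties):
    """Winning percentage with a tie counted as half a win (an assumed convention)."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games

# Records quoted in the post, read as W-L-T for the team with more possession.
all_games = (75, 67, 28)   # all 170 games studied
lopsided = (18, 24, 10)    # games where one team had at least 56% possession

# Assumption: the lopsided games are part of the 170, so the rest is the difference.
closer = tuple(a - b for a, b in zip(all_games, lopsided))

for label, record in [("all", all_games), ("lopsided", lopsided), ("closer", closer)]:
    print(f"{label:9s} {record}  games={sum(record):3d}  pct={pct(*record):.3f}")
```

    If the subset assumption holds, the remaining, more evenly contested games can be read off the last line and compared with the lopsided ones.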
     
  7. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    A baseball player's contributions are not independent of the team, but we have much more information about how individual players' performances relate to each other and to the team's. Or at least we can quantify it better.

    There are more big-picture things we can use analysis for. It tends to break down a bit at the player level. But we're getting closer.

    The biggest obstacle is the backward-thinking "soccer and statistics don't mix" sentiment.
     
  8. NGV

    NGV Member+

    Sep 14, 1999
    Well, I think some of the other obstacles are pretty big too. But there definitely is potential for interesting and useful work - and as long as you still have people saying and believing things like "2-0 is the most dangerous lead in soccer," there's clearly a lot of room for improvement.
     
  9. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    You're right, the other obstacles are big, too.

    Perhaps attitude is the biggest obstacle to getting the work accepted, not getting it done.
     
  10. beineke

    beineke New Member

    Sep 13, 2000
    ... unless, of course, they're associated with the Revs. :p

    By the way, even the most basic tracking can lead to strong conclusions. Since the start of the 2001 season, Simon Elliott has attempted 106 shots, of which only 36 were on goal and only 2 went in. It doesn't take a genius to realize that he shouldn't be winding up on these balls.
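    As a quick sketch of what "basic tracking" buys you, the three counts above are enough to compute the usual shooting rates. The variable names are just illustrative.

```python
# Simon Elliott's shooting numbers as quoted in the post (2001 season onward).
shots, on_goal, goals = 106, 36, 2

on_target_rate = on_goal / shots         # share of shots that were on goal
conversion_rate = goals / shots          # share of all shots that scored
on_target_conversion = goals / on_goal   # share of on-goal shots that scored

print(f"on target:            {on_target_rate:.1%}")
print(f"conversion:           {conversion_rate:.1%}")
print(f"on-target conversion: {on_target_conversion:.1%}")
```

    Roughly one shot in three is on goal and fewer than one in fifty goes in.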
     
  11. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Great, great stuff. You're the kind of person I'm looking for. Is there any way I could get a copy of that article? Or even better, the journal you speak of? I'd love to read this stuff. Unfortunately, I'm not much of a numbers guy, so I fear that I'd be pretty useless when it comes to breaking down numbers. However, I really found that book very inspiring and it definitely set off an idea in my head. While many of the things you guys have said have been very insightful, it shouldn't be a case of "why won't this work?" It should be "how could someone make this stuff work?" There have got to be correlations to be made, there has to be a way to get data, and then conclusions to draw from it.

    I'm just flabbergasted that this stuff doesn't exist already. There are people tracking statistics, but it's pointless, pointless stuff. http://www.rsssf.com/ is purely historical.
     
  12. NGV

    NGV Member+

    Sep 14, 1999
    Kenn makes a good point - gaining acceptance of conclusions drawn from stat analysis could be just as difficult as drawing those conclusions in the first place. I mean, in baseball, a lot of people have resisted the idea that OBP + SLG (OPS) is a better measure of offense than batting average, despite the fact that the superiority of OPS is a. easily provable and b. kind of obvious, when you think about it. It's hard to imagine much at all in soccer that could be proved that definitively; so, gaining acceptance for new ideas will be far tougher (and, like I said, at least some of the skepticism will probably be justified).

    But there are a few things that seem like they should be fairly easy to prove. For example, short corner kicks - how often do they lead to goals compared to corner kicks taken into the box? My subjective impression is that they work a lot less often, yet teams still use them. It would be interesting to see what the actual numbers are.

    Or, here's something I've wondered about - is goal differential a more accurate way of evaluating short term team performance than won-loss record, as is thought to be the case in baseball? In other words, does goals scored vs. allowed at the halfway point of the season give a better indication of where a team will eventually end up than their wins vs. losses vs. ties at the same point? This should be pretty straightforward to figure out, if somewhat tedious, and would have some obvious applications.
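    One way the goal-differential question could be set up, as a rough sketch only: for each team-season, record the goal differential and points at the halfway mark, then see which one correlates better with points earned in the second half. Using second-half points rather than final points is a design choice of this sketch (final points already contain the first-half points). The rows below are HYPOTHETICAL placeholders, not real results, and the Pearson correlation is just one possible yardstick.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL placeholder rows, one per team-season:
# (midseason goal differential, midseason points, second-half points)
team_seasons = [
    (+8, 26, 21),
    (+3, 19, 26),
    (-1, 22, 16),
    (-4, 17, 16),
    (-6, 14, 17),
]

mid_gd = [row[0] for row in team_seasons]
mid_pts = [row[1] for row in team_seasons]
second_half_pts = [row[2] for row in team_seasons]

print("midseason GD  vs second-half points:", round(pearson(mid_gd, second_half_pts), 3))
print("midseason pts vs second-half points:", round(pearson(mid_pts, second_half_pts), 3))
```

    The placeholder numbers prove nothing either way. The tedious part is filling `team_seasons` with real midseason and end-of-season tables for enough seasons to mean something.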
     
  13. Real Ray

    Real Ray Member

    May 1, 2000
    Cincinnati, OH
    Club:
    Real Madrid
    Nat'l Team:
    United States
    http://www.nytimes.com/2003/07/30/sports/30risk.html

    This article is very much related to this topic.
    In soccer I can see this being more valuable in scouting: providing a greater empirical base for a certain action, like showing a player why he really is better playing on the right rather than in the center, or the correlation with the number of touches player X gets in the attacking third, etc. I remember that recently, during the whole Sosa corked-bat drama, Bobby Valentine was on ESPN radio, where he noted how he would sometimes talk to science guys or even stop at libraries to get data to back up his points.

    Of course Bobby lacked that "warm and fuzzy" thing, but I think his thinking is spot on.

    But as others have mentioned, sport culture is so hard to change. Just look at how you still read about English players making the usual cracks about pasta when taking shots at foreign coaches. Attitudes die hard.
     
  14. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Remember when Bill James was looked at as a radical? Now many of his concepts are just accepted as a part of the game (OPS, pitchers' run support, stolen bases against catchers) and every baseball broadcast.

    There are ways to analyze certain things about the game of soccer, and I think we should all endeavor to keep searching for knowledge objectively. I think Peter Hirdt tries to do this in his weekly "Analyze This" columns on MLSnet.com.

    Maybe not everything will work, but that doesn't mean we shouldn't try to increase our knowledge by trying.

    Maybe we should start the Society for Soccer Research. :)
     
  15. SYoshonis

    SYoshonis Member+

    Jun 8, 2000
    Lafayette, Louisiana
    Club:
    Michigan Bucks
    Nat'l Team:
    United States
    I hope you guys don't mind, but I'm taking down your names, so as to compile a list of those who should not object to the term "soccer nerds" EVER AGAIN!

    And yes, I only say that because all this stuff is way over my head....
     
  16. beineke

    beineke New Member

    Sep 13, 2000
    In fact, Tom Eaton would probably be interested in Hirdt's recent piece on corner kicks. It's similar to Eaton's possession study, except it breaks things down by halves.

    It reveals that losing teams get more corner kicks while they're losing. While the score is tied, the eventual winner tends to get more corners. I'd expect that possession shows the same trend. Once you're ahead, it's (believed to be) more important to keep your defensive shape than to keep the ball.

    If Hirdt were willing to share his data, we could probably find some interesting stuff.
     
  17. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    I'd imagine a grad student in statistics at a good West Coast university would come in handy, as well. :)
     
  18. DoctorD

    DoctorD Member+

    Sep 29, 2002
    MidAtlantic
    Club:
    Philadelphia Union
    Nat'l Team:
    United States
    Do you know that if Reyna is playing with a torn ACL, his teams have a 0.27 win percentage? :)
     
  19. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Another thing in Moneyball was James's distaste for the Elias Sports Bureau and its reluctance to share data without people paying for it. The biggest obstacle in this is of course data collection. But there are obviously some things that can be done without any more than what's available right now. Perhaps even a discussion of which things would be most important to collect would be worthwhile. As I mentioned when I first started talking about this, I was thinking a little bit about the hockey/basketball mold, where of course they keep track of things like checks, turnovers, passing, and so on. But why not soccer?
     
  20. Flyer Fan

    Flyer Fan Member+

    Apr 18, 1999
    Columbus, OH
    Because I'm a geek, I tried to see if James' Pythagorean winning percentage could somehow apply to soccer, using goals scored and goals allowed. It passes the time at work sometimes.

    So far it doesn't work as well for soccer, but then again I really have no idea what I'm doing half the time.
     
  21. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    There's a fairly good correlation, I've found, in just about any team sport, between the "points" scored and allowed (whatever type of "points" that sport uses) and W/L percentage.

    Which, if you think of it, makes sense. In a reasonably-sized data sample, if you're giving up more "points" (or runs, or goals, or whatever) than you score, chances are you're going to lose more games than you win. And vice-versa.

    I ran the numbers once and I don't remember what they showed. But draws have a tendency to throw things off, as does the 3-1-0 thing.

    Edited to add the pythagoreans from MLS in 2002:


    Team          GF  GA   Exp    W   L  T   Pct    Diff
    Chicago       43  38  .561   11  13  4  .464   +9.7%
    Colorado      43  48  .445   13  11  4  .536   -9.0%
    Columbus      44  43  .511   11  12  5  .482   +2.9%
    Dallas        44  43  .511   12   9  7  .554   -4.2%
    DC United     31  40  .375    9  14  5  .411   -3.5%
    Kansas City   37  45  .403    9  10  9  .482   -7.9%
    Los Angeles   44  33  .640   16   9  3  .625   +1.5%
    MetroStars    41  47  .432   11  15  2  .429   +0.4%
    New England   49  49  .500   12  14  2  .464   +3.6%
    San Jose      45  35  .623   14  11  3  .554   +7.0%


    As you can see, it comes within 4.2 percentage points of the actual W-L-T percentage for six of the ten teams. It was way off on Chicago and Colorado and, to a lesser extent, on Kansas City and San Jose.

    I don't know if that's because, since goals in soccer are much less plentiful than runs in baseball, the odd goal, or one at an inopportune time, has a greater effect than that in baseball. Maybe the effect is magnified, making the correlation harder to track.

    But I know that if you rank teams by points or W-L-T percentage or whatever, and you put their straight goal differential at the end, usually the positives are at the top and the negatives are at the bottom.
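    For anyone who wants to rerun the table, here is a short Python sketch. It assumes the Exp column is the classic Pythagorean form with exponent 2 and that Pct counts a tie as half a win. Both assumptions reproduce the numbers posted above, but the function names are made up for illustration.

```python
def pythagorean(gf, ga, exponent=2):
    """Expected winning percentage from goals for and goals against."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)

def actual_pct(w, l, t):
    """Actual percentage with a tie counted as half a win, as in the Pct column."""
    return (w + 0.5 * t) / (w + l + t)

# 2002 MLS rows copied from the table: team, GF, GA, W, L, T
table = [
    ("Chicago", 43, 38, 11, 13, 4),
    ("Colorado", 43, 48, 13, 11, 4),
    ("Columbus", 44, 43, 11, 12, 5),
    ("Dallas", 44, 43, 12, 9, 7),
    ("DC United", 31, 40, 9, 14, 5),
    ("Kansas City", 37, 45, 9, 10, 9),
    ("Los Angeles", 44, 33, 16, 9, 3),
    ("MetroStars", 41, 47, 11, 15, 2),
    ("New England", 49, 49, 12, 14, 2),
    ("San Jose", 45, 35, 14, 11, 3),
]

for team, gf, ga, w, l, t in table:
    exp = pythagorean(gf, ga)
    pct = actual_pct(w, l, t)
    # Diff is expected minus actual, in percentage points.
    print(f"{team:12s} exp={exp:.3f}  pct={pct:.3f}  diff={100 * (exp - pct):+.1f}")
```

    Changing the `exponent` argument is an easy way to try other variants of the formula on the same rows.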
     
  22. beineke

    beineke New Member

    Sep 13, 2000
    This is true, but remember also that the baseball season is 162 games. That allows a lot more time for things to even out.

    By the way, if you asked a hypothetical stats grad student about this, he might consider a Poisson model instead of the simpler Pythagorean Formula. Then he might spend a few minutes scratching out formulas on a sheet of paper and conclude that the Pythagorean Formula is sort of like assuming that every game is decided by two runs.

    From there, he might think ... soccer is usually decided by only one goal. Under that assumption, you get a different equation, and it's really simple:

    Goals scored
    --------------------------------------------------
    Goals scored + Goals Allowed

    If this turns out to be an improvement for soccer, a hypothetical stats grad student would be very pleased. :)
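    A rough way to check that on the one season posted in this thread: treat both formulas as the same expression with a different exponent (2 for Pythagorean, 1 for the simple ratio) and compare the average miss against the actual percentage. This is only a sketch. It assumes ties count as half a win, and ten teams over one season is far too small a sample to settle anything.

```python
def expected_pct(gf, ga, exponent):
    """exponent=2 is the Pythagorean formula, exponent=1 is GF / (GF + GA)."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)

# 2002 MLS rows from the table earlier in the thread: GF, GA, W, L, T
rows = [
    (43, 38, 11, 13, 4),  # Chicago
    (43, 48, 13, 11, 4),  # Colorado
    (44, 43, 11, 12, 5),  # Columbus
    (44, 43, 12, 9, 7),   # Dallas
    (31, 40, 9, 14, 5),   # DC United
    (37, 45, 9, 10, 9),   # Kansas City
    (44, 33, 16, 9, 3),   # Los Angeles
    (41, 47, 11, 15, 2),  # MetroStars
    (49, 49, 12, 14, 2),  # New England
    (45, 35, 14, 11, 3),  # San Jose
]

def mean_abs_error(exponent):
    """Average absolute gap between expected and actual percentage (ties as half a win)."""
    errors = []
    for gf, ga, w, l, t in rows:
        actual = (w + 0.5 * t) / (w + l + t)
        errors.append(abs(expected_pct(gf, ga, exponent) - actual))
    return sum(errors) / len(errors)

for exponent in (1, 2):
    print(f"exponent {exponent}: mean absolute error = {mean_abs_error(exponent):.3f}")
```

    On these ten rows the simple ratio misses by a bit less on average than the exponent-2 version, which is at least consistent with the one-goal argument.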
     
  23. Flyer Fan

    Flyer Fan Member+

    Apr 18, 1999
    Columbus, OH
    And that's where I got most "confused." I'm never sure how to account for draws. Does one consider a winning percentage or a non-losing percentage? To me, draws should count as losses in a winning percentage since you didn't win. However, I think it's more common to consider a draw half a win and half a loss, right?
     
  24. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Right.

    But points wise, a win is more than twice as good as a draw under 3-1-0.

    I agree, it does leave it open to interpretation about what percentage to use.
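    To make the difference concrete, here is a small sketch of the three conventions mentioned above, applied to Kansas City's 2002 line from the table (9 wins, 10 losses, 9 ties). The function names are illustrative, not a standard.

```python
def draw_as_half_win(w, l, t):
    """Winning percentage with a draw counted as half a win, half a loss."""
    return (w + 0.5 * t) / (w + l + t)

def draw_as_loss(w, l, t):
    """Strict winning percentage: anything that isn't a win counts against you."""
    return w / (w + l + t)

def points_pct(w, l, t, win_pts=3, draw_pts=1):
    """Share of available points actually earned under a 3-1-0 scheme."""
    games = w + l + t
    return (win_pts * w + draw_pts * t) / (win_pts * games)

w, l, t = 9, 10, 9  # Kansas City, 2002

print(f"draw = half win: {draw_as_half_win(w, l, t):.3f}")
print(f"draw = loss:     {draw_as_loss(w, l, t):.3f}")
print(f"points (3-1-0):  {points_pct(w, l, t):.3f}")
```

    A team with a lot of draws looks noticeably different under each convention, which is why it matters which one a predicted percentage is compared against.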
     
  25. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    There are quite a few academic papers written on sports, including soccer. Here are some links:

    Good intro article, covering players' market and competitive balance:
    http://www.iesbs.com/pdf/sports_economics.pdf
    In particular, section 3.5 mentions MLS and has a brief history of single entity ownership.

    This paper, It's Fourth Down and What Does the Bellman Equation Say? A Dynamic-Programming Analysis of Football Strategy by David Romer,
    was in the news a few months ago.
    http://emlab.berkeley.edu/users/dromer/papers/nber9024.pdf

    The Sport League's Dilemma: Competitive Balance versus Incentives to Win by
    Frederic Palomino and Luca Rigotti, which concludes that "Under demand maximization, a performance-based reward scheme (used by European sport leagues) may be optimal. Under joint profit maximization, full revenue sharing (used by many US leagues) is always optimal."
    http://repositories.cdlib.org/iber/econ/E00-292/

    Maybe I should start a thread on this: I had an earlier post with more links, I'll need to search. It's possible it was wiped. Anyway, you can find out more by looking at bibliographies of these papers.
     
