Sabermetrics applying to Soccer

Discussion in 'Statistics and Analysis' started by mpruitt, Jul 30, 2003.

  1. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Umm, declined?

    It was more of an "I've been through the wars" sort of thing than a volunteer effort. :)
     
  2. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Doesn't matter. You're drafted. :)
     
  3. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Then, to avoid the draft, I guess I'll have to start posting to the Canada forum. :)
     
  4. us#1by2006

    us#1by2006 Member

    Jun 21, 2002
    Thanks

    Voros,

    Thanks for providing your professional insight to us. I appreciate what you were willing to offer us.
     
  5. joe2

    joe2 New Member

    Oct 14, 2001
    Voros...interesting post. I would like to respond to what I see as conceptual weaknesses....
    Point 1 b...The goal of a soccer team may not always be to accumulate as many points per game as possible. Near the end of a season very often getting 1 point for a tie is just as good as 3 for a win. But that is a minor point as in general you are correct.
    Point 1 C, while it seems obvious, is only correct to the extent that the information is reliable. In other words, bad information is in fact worse than no information because it may lead to faulty strategies and faulty conclusions.
    1 D...descriptive data is always better than qualitative data? I don't see any basis for this statement. In fact, I don't see how you can make a distinction. Each piece of data collected always has a descriptive as well as a qualitative component to it.
    1 D...objective data tends to be better than subjective data? Again, this depends on the purpose of the data collection. Certainly subjective and objective data, to the extent they can be separated, are useful, each in their own way. Neither is "better" than the other, or rather the "betterness" of each is dependent on the context in which each is collected and used.
    2. ...If you cannot attempt to define a "chance" in some way then any statistical analysis of soccer would be a nice set of numbers but with very little practical application. "Chances" or opportunities to score are the essence of the game. It is akin to saying we can't know what an At Bat is in baseball so we will ignore it.
    3. and 4. I do agree that these discrete events have some descriptive value. And all of us should always be open to criticism and refinement of our work.
    I also tend to agree with points 5,6,7, and 8.
    Point 9. What you say is basically true. But it is also true of statistical interpretation. Just because information is collected and assigned a numerical value does not in itself make it better than intuitive information. That is the great fallacy we see in the numbers game. (For a good example of this, just look at IQ scores.) The fallibility of the human mind, which I agree with you about, extends to the creation of mathematical models, which are, after all, creations of the human mind. In other words, just assigning numerical value to events does not, in itself, make the information more meaningful, accurate or useful. Statistics can, after the fact, identify trends. But so can an expert human observer (a great coach, for instance).
    A Final Point...personally I find statistical info of all kinds interesting, but not useful as a predictor of individual future events. Kind of like nice, fun toys.
     
  6. SamPierron

    SamPierron BigSoccer Supporter

    Nov 30, 1998
    Kansas City
    Club:
    Sporting Kansas City
    Nat'l Team:
    United States
    I mention this only because he's my favorite athlete ever by a factor of ten, but George Brett's postseason statistics are considerably higher than regular season (where, of course, he was no slouch). OPS 1026 compared to 856. And, if you're going to believe in this sort of thing...his numbers were only subpar in the two series where the Royals were never really in it and were overmatched...the 1981 BS split season playoff and 1984 ALCS against the Tigers (where have you gone, Willie Hernandez?).

    Yeah, the sample size is smaller; yeah, it's probably just hokum. But if you saw George Brett play in the postseason...
     
  7. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    maybe we'll sign you as a free agent before the start of the draft. or to keep this soccer related, you could be a Senior International that gets allocated to us in a 'weighted' lottery.
     
  8. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    My favorite is Martin Brodeur. And then there's the Claude Lemieux.

    What's the question here: Can a player play better ("elevate play", "clutch") in situations where the cost of a loss is more significant than normal?

    Or is that the wrong question? Perhaps the correct question is: Can a player take advantage of the conditions generated by a playoff format, where the cost of a loss is more significant than normal? And, what might these conditions be?
     
  9. beineke

    beineke New Member

    Sep 13, 2000
    Has anyone in this thread suggested that quantitative information is more valuable than intuitive?

    In virtually all fields of human endeavor, numerical reasoning (i.e. statistics) and conceptual understanding each have their place. They tend to complement one another. But in soccer, there are many different intuitions with little numerical analysis to substantiate them.

    Do we know for a fact that Elias won't share its MLS data at all? If it's for sale, do we have any idea what it costs?
     
  10. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    Voros,

    Thanks for your insight. Actual experience beats my armchair quarterbacking.

    What do you call your field of expertise? Sports engineering (a la financial engineering)? And I'd imagine there would be subfields of expertise, such as sports actuary (why does it seem like GMs don't employ a sports actuary before signing an older player? Or is it just my selective memory?).

    Are there universities out there studying this, perhaps with graduate programs?
     
  11. beineke

    beineke New Member

    Sep 13, 2000
    Sounds like a nice question to me ... in the World Cup and many other sports competitions, teams respond to these conditions by playing very conservatively (in many cases, too conservatively). There are ways to take advantage of that.

    In baseball, conservatism might make a pitcher more predictable, in which case a hitter could get better results. Then again, I think it's been thoroughly studied, and it doesn't appear to happen.

    One situation where the "clutch" effect does seem to occur is in soccer penalty shootouts. I'll try to dig up a reference if I get the chance.
     
  12. microbrew

    microbrew New Member

    Jun 29, 2002
    NJ
    Actually, what the heck does "clutch" mean?

    If I understand correctly, each event is iid (independent and identically distributed), hence no such thing as clutch.

    But, some players seem to improve their stats come playoff time. What's the explanation? All I could think of is either the sample size isn't large enough, or the player is actually more effective. He is more effective by taking advantage of a change in the playing style, or by the conditions changing to favor his style of play, or by something else that's completely speculative.


    As for "clutchness" in penalty kicks- I have a feeling that this is sports psychology.
     
  13. joe2

    joe2 New Member

    Oct 14, 2001
     
  14. beineke

    beineke New Member

    Sep 13, 2000
    Without putting words into his mouth, I thought he was saying that while you're gathering and crunching the numbers, you're better off with something more objective and quantitative. That doesn't deny intuition its rightful place in assessing the results.

    And I do agree that we need to be aware of the limitations of every study that gets done. Nobody's ever going to solve soccer, any more than they can solve a game like chess ... and in chess, there's only one piece moving at a time. But we can develop a lot more quantitative insight than we have right now.
     
  15. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    It's a two-pronged process: we use the intuitive process to recognize patterns and make observations. We then use various methods (one of which would be quantitative analysis) to test the extent to which those intuitive observations are provably true or false, so that we don't let our thought processes mislead us with faulty logical leaps.

    The intuitive processes are already in place everywhere. It is human nature to make them when observing anything. However, without a rigorous, logically sound analysis of these observations, we're greatly susceptible to incorrect conclusions (logical fallacies, if you will). That's why things like "Post Hoc Ergo Propter Hoc" and "Multiple Endpoints" can be such common traps. Our mind is conditioned to think in such ways, and without a process to correct things, we'll believe things that aren't true.

    Again, the point is _not_ statistics, it's about using the fundamental building blocks of logic to help increase our understanding of the game. This currently would help weed out lots of issues I see on this board every day: "multiple endpoints, seemingly fulfilled prophecies, missing data fallacies, appeals to authority, post hoc ergo propter hoc, correlation implying causation, wishful thinking, quotes out of context, etc., etc., etc." Every time someone says, "well Bruce Arena does think Frankie Hejduk should be starting and he knows more than you," that's a logical fallacy. Whether Frankie Hejduk belongs in the XI for the national team is a proposition independent of the people arguing each side of the issue. Who argues which side has no bearing on which side is correct. The argument needs to be evaluated, not the arguers.

    The problem with intuitive thinking is that despite being unthinkably valuable, it has its pitfalls. This is why we think more babies are born during full moons, eggs can be stood on end during the equinox, and that more domestic violence occurs on Super Bowl Sunday than any other day of the year. There's no evidence in support of any of these things (actually eggs can be stood on end during the equinox, but it can be done on Arbor Day, Groundhog Day, and during the MLS All-Star game too; the equinox has nothing to do with it), but I'd venture a majority of Americans would believe these things.

    People can go about and say "every nurse and doctor I've talked to swears that the babies-on-the-full-moon thing is true," and if we accepted the appeal to authority, we'd believe something that all subsequent research has shown to be false.

    I have no problem with intuitive decision making. It's the overwhelming supremacy of it that is at issue, particularly with regards to issues that are to an extent testable logically and scientifically.
     
  16. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Exactly. I'm a big believer in keeping objective and subjective data separate, until it comes time to make decisions on the data available. This way we can get a very good grip on how far objective data can get us, allowing us to give the subjective (or intuitive) data the proper weight in our decisions.

    To use an example, with regards to clutch hitting. Intuitively, people see things like Bobby Thomson or Kirk Gibson and _know_ clutch hitting exists. They then make the faulty assumption that therefore clutch _hitters_ exist. The clutch hitting studies done to date all seem to come to the same conclusion: either clutch hitters don't exist in the major leagues, or do to such a small extent that it is mostly irrelevant. The studies involve tracking batting results in a variety of situations defined as "clutch" and observing how far the results deviate from what our expectations would be if said ability did not exist. In each and every case, there is no proof of guys who are clearly clutch hitters. The most "clutch" guy I ever found? Brett? No. Schmidt? No. Reggie Jackson? No. It was Pat Tabler. In other words it was some random guy who wound up on the extreme end of the distribution.
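
The "random guy on the extreme end of the distribution" point can be illustrated with a small simulation. This is only a sketch, not a reconstruction of any actual clutch hitting study: the number of hitters, the at-bat counts, and the .270 batting average are all made-up assumptions. Every simulated hitter has exactly the same true ability in clutch and non-clutch at-bats, yet someone still ends up looking very "clutch".

```python
import random


def simulate_clutch_gaps(n_hitters=300, clutch_abs=400, other_abs=4000,
                         true_avg=0.270, seed=0):
    """Simulate hitters with NO clutch ability (all numbers are illustrative).

    Every hitter has the same true average in every situation.
    Returns each hitter's observed (clutch average - non-clutch average).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_hitters):
        # Each at-bat is an independent trial with the same hit probability.
        clutch_hits = sum(rng.random() < true_avg for _ in range(clutch_abs))
        other_hits = sum(rng.random() < true_avg for _ in range(other_abs))
        gaps.append(clutch_hits / clutch_abs - other_hits / other_abs)
    return gaps


gaps = simulate_clutch_gaps()
print(f"average gap across hitters: {sum(gaps) / len(gaps):+.3f}")
print(f"most 'clutch' hitter:       {max(gaps):+.3f}")
print(f"least 'clutch' hitter:      {min(gaps):+.3f}")
```

The average gap sits near zero, but the best and worst hitters are separated from it by chance alone. A real study would compare the observed spread of gaps against a chance-only spread like this one.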

    Where intuition comes in here is how and why this might be possible. The best explanation is not that clutch hitters don't exist in MLB, it's that ALL MLB hitters are "clutch." The argument goes, if you can't perform in high pressure situations, you get weeded out a long time before you reach the Major Leagues. We have to remember that MLB is the cream of the baseball crop, and only the very best are there. "Chokers," as they are called, should have washed out a long time ago, when the scouts first came to watch them play.

    Another argument is that hitting is not a skill where increased effort necessarily results in increased results, which would mean while a clutch ability might not exist in hitting, it could exist in a sport like soccer.

    It's that sort of interaction between logic, numbers, science and intuition that can greatly further our knowledge of the game. I agree that it appears to be somewhat lacking in soccer.
     
  17. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    this doesn't really have to do with anything significant which we've been discussing but it looks like a pretty decent stats page that i hadn't seen before.

    http://soccer-stats.football365.com/
     
  18. joe2

    joe2 New Member

    Oct 14, 2001
    Beineke...Well, go back and read his exact words because he does make a distinction between the two.
    I think we may be arguing different points. My contention is that there is a certain amount of subjectivity in the actual identification of the data. This tends to be denied by some statisticians because they think it renders their numbers less useful, which it does. I notice no one has yet defined what a "chance" is. Nor has anyone challenged my notion that there are many kinds of "strikes" in baseball. We must remain aware that we are lumping different actions into one category. This must be done in order to be able to count anything. But it lessens the reliability of our ultimate conclusions.
    I agree that chess is infinitely simpler than soccer to analyze. That is because each chess move can be adequately described as P to KB4 or the like. It does not matter HOW the player moved the piece, if it was done quickly or slowly, with style or elegance or shoved across the board. No one is being faked out by the particular move itself. It stands alone as a discrete event. There are few, if any, such discrete events in soccer. And that is the problem with quantification. If you can't agree on what is being counted then how can statistical data have meaning? That is my point.
    I also agree it is possible to quantify more in soccer than we do now. As a practical matter, it would involve taping a match and carefully counting the actions of each player. Very time consuming but interesting, I think. But what would we count that we could agree was significant? I think we could develop categories such as positive ball movement, positive or negative space development, appropriate defensive response and others. But it would probably take a day or more to get accurate data from one match.
     
  19. joe2

    joe2 New Member

    Oct 14, 2001
    VOROS...you seem to be suggesting that "intuitive" thinking is somehow part of human "nature" and is separate from "logical" thinking. I would suggest that intuitive (by which I think you mean "subjective") and "logical" thinking are both part of the human thinking process and in fact go hand in hand. A good coach knows a certain player is better not because he just thinks so. He has an accumulated experience of the game, how players move, take shots, interact under pressure, etc. A good coach has internalized this "objective" data and that is the basis for his "intuitive" decisions. You don't get to be a good coach by just feeling something is right. Those judgements are always based on experience, the experience of the real world. Whether or not those experiences have been counted is largely irrelevant. And new information is always altering those judgements as it is incorporated into the subjective-objective thinking process.
    VOROS...Regarding your paragraph about what "people believe"...full moon babies, etc. That is really setting up straw men now, isn't it? People believe in angels, weapons of mass destruction, etc. Since you and I know better, why bring in the unwashed masses? I could use the same straw man argument and say many people believe an IQ of 104 means the person is smarter than someone with an IQ of 97. Or a player with a fielding percentage of .985 is a better fielder than one with a percentage of .977. Therefore anyone who uses statistical information is silly. But we both know those are false arguments.

    VOROS...Regarding your statement about Bruce Arena...whether or not it is a "logical fallacy", it is accurate. If you are an average fan, Bruce Arena knows a hell of a lot more than you do about whether Frankie Hejduk should or should not be playing. He has much more objective and subjective data to draw on. If he does not know better, then any other person would have the same knowledge about soccer, which just isn't true. Any other person with an opinion would be just as qualified to coach. That is similar to suggesting that anyone's opinion about an illness is of the same caliber. I go to my doctor when I am sick, not my plumber, for a good reason. My doctor's knowledge is superior to my plumber's on that topic.

    VOROS...I agree that testability is important and would be valuable. Still, what is the data that you could collect in soccer that would lend itself to this? That question seems to be one that is being avoided in this discussion. And that is the crux of the matter...
     
  20. joe2

    joe2 New Member

    Oct 14, 2001
    VOROS...I think you put your finger on the problem with trying to collect data in soccer. Except for a few things (goals, corner kicks, fouls) it is pretty hard, maybe impossible, to separate the objective-subjective elements of the data. It may be possible if precise definitions of other important soccer events (possession, creating space, creating chances, etc) could be agreed upon.

    VOROS...Your discussion of clutch hitting is a good one. The problem may be... a. everyone is a clutch hitter, which makes the term meaningless, b. no one is a clutch hitter, c. clutch hitting itself does not exist, d. clutch hitting exists but the data collected does not really describe the important elements of clutch hitting, e. the definition you are using (of clutch hitting) is somehow faulty (maybe you are not counting what should be counted), f. the definition of a clutch situation may be faulty, g. there may be non-quantifiable factors at play, or more or different data needs to be added...How much more difficult it will be to define and quantify a "chance" in soccer! I don't think that should stop us, but it will certainly be a daunting task.
     
  21. TomEaton

    TomEaton Member

    Mar 5, 2000
    Champaign, IL
    Great thread.

    I think everybody with an opinion about these issues agrees (more or less) with the following points:

    1. Soccer is much more difficult to quantify through numbers than baseball and other sports; but

    2. More could be done than is being done now.

    As Joe pointed out, why should we concern ourselves with the difficulty of defining something nebulous like a "chance" when we don't even know comparatively simple things like how often goals are scored off of corner kicks? I'd love to know these things. If data indicated that, leaguewide, 5% of corner kicks ultimately led to a goal, and eight out of ten MLS teams had figures within 0.5% of that number, but, say, the Columbus Crew didn't score on a single corner kick all year while D.C. United scored twice as often, that's worth knowing.

    As with any statistic, the results would still be subject to interpretation. It might mean that Columbus had no player who could strike a decent corner kick, while D.C. had a great one; or it might mean that Columbus had nobody taller than 5'7" who could win headers in front of the net while D.C. had loads of guys over 6 feet tall; or it might mean both of these things, or neither. You'd have to examine these things and apply a little common sense.
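
One way to apply that common sense is to ask how surprising a team's corner-kick record would be if it really converted at the league rate. The sketch below uses the hypothetical 5% figure from the example above; the 85 corners per team is an invented number for illustration, not real MLS data, and the function names are made up here.

```python
from math import comb


def prob_no_goals(corners, rate):
    """Chance of scoring on none of `corners` kicks, each converting at `rate`."""
    return (1.0 - rate) ** corners


def prob_at_least(corners, rate, goals):
    """Binomial chance of scoring on at least `goals` of `corners` kicks."""
    return sum(
        comb(corners, k) * rate**k * (1.0 - rate) ** (corners - k)
        for k in range(goals, corners + 1)
    )


LEAGUE_RATE = 0.05   # hypothetical league-wide rate from the example
CORNERS = 85         # illustrative assumption: corners taken by one team

print(f"P(no goals at all):        {prob_no_goals(CORNERS, LEAGUE_RATE):.3f}")
print(f"P(9+ goals, about double): {prob_at_least(CORNERS, LEAGUE_RATE, 9):.3f}")
```

If both probabilities are small, luck alone is a poor explanation, and explanations like the kicker or the height of the targets become worth examining.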
     
  22. JG

    JG Member+

    Jun 27, 1999
    From a quick scan of the match reports on the MLS website, 23 goals have been scored off corner kicks this year (I'm counting anything where the ball wasn't cleared by the defense--goalmouth scrambles, penalties earned when shots from corners were handled, Onstad's infamous own goal).

    847 total corner kicks have been taken according to the MLS stats.
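
For anyone who wants the rate with some error bars, here is a minimal sketch using the two numbers above (23 goals, 847 corners). The Wilson score interval is a choice made here, not something from the MLS stats, and it assumes each corner is an independent trial with the same chance of producing a goal, which is a simplification.

```python
from math import sqrt


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * sqrt(p * (1.0 - p) / trials
                          + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


goals, corners = 23, 847          # figures from the post above
rate = goals / corners
low, high = wilson_interval(goals, corners)
print(f"conversion rate: {rate:.1%} (plausible range {low:.1%} to {high:.1%})")
```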
     
  23. kenntomasch

    kenntomasch Member+

    Sep 2, 1999
    Out West
    Club:
    FC Tampa Bay Rowdies
    Nat'l Team:
    United States
    Wow, you could tell that in a quick scan, how the goals were scored?

    That's less than 3% (does that 847 include short corners, where they don't really make an attempt to have the ball drop directly on someone's head to get it on net? I'm guessing it does).

    Looks like more on TV. :)
     
  24. beineke

    beineke New Member

    Sep 13, 2000
    You didn't happen to make a list of them, by chance? I'm interested for a couple of reasons ...
    1) I'd be curious to see if there's any apparent heterogeneity on CK goals scored/allowed by each team.
    2) Given a list, I might go into the goals archive and document them more thoroughly.

    Am I in luck?
     
  25. JG

    JG Member+

    Jun 27, 1999
    I kept a list of them, but not terribly detailed.

    7/26 San Jose 1st goal
    7/19 Chicago 3rd goal
    7/19 San Jose 1st goal
    7/16 Dallas 2nd goal
    7/12 New England 1st goal
    7/4 Colorado 1st goal
    7/2 San Jose 1st goal
    7/2 San Jose 3rd goal
    6/21 Kansas City 1st goal
    6/18 Los Angeles 1st goal
    6/14 Colorado 1st goal
    6/7 DC United 1st goal
    5/31 Dallas 2nd goal
    5/31 San Jose 1st goal
    5/24 Columbus 2nd goal
    5/24 Dallas 1st goal
    5/17 Columbus 2nd goal
    5/17 Los Angeles 1st goal
    5/17 Dallas 1st goal
    5/17 New England 2nd goal
    4/26 Kansas City 2nd goal
    4/26 Metrostars 1st goal
    4/19 Kansas City 1st goal
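
A rough first pass at the heterogeneity question, tallying the list above by team. The simulation part is only a sketch under a loud assumption: it treats every goal as equally likely to belong to any of the ten teams in the list, which ignores that teams take different numbers of corners. Per-team corner totals would be needed to do this properly.

```python
import random
from collections import Counter

# One entry per goal, in the order of the list above.
CORNER_GOALS = [
    "San Jose", "Chicago", "San Jose", "Dallas", "New England", "Colorado",
    "San Jose", "San Jose", "Kansas City", "Los Angeles", "Colorado",
    "DC United", "Dallas", "San Jose", "Columbus", "Dallas", "Columbus",
    "Los Angeles", "Dallas", "New England", "Kansas City", "Metrostars",
    "Kansas City",
]

counts = Counter(CORNER_GOALS)
for team, n in counts.most_common():
    print(f"{team:12s} {n}")


def chance_of_max_at_least(threshold, n_goals, n_teams, trials=20000, seed=0):
    """Share of random allocations in which some team reaches `threshold` goals.

    Assumes every goal is equally likely to go to any team (a simplification).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sim = Counter(rng.randrange(n_teams) for _ in range(n_goals))
        if max(sim.values()) >= threshold:
            hits += 1
    return hits / trials


top = max(counts.values())
p = chance_of_max_at_least(top, len(CORNER_GOALS), len(counts))
print(f"chance some team gets {top}+ of {len(CORNER_GOALS)} by luck: {p:.2f}")
```

If that chance is large, the leading team's total is not much evidence of a real difference between teams.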
     
