World Cup Simulation Results [Rs]

Discussion in 'USA Men: News & Analysis' started by voros, Jun 1, 2006.

  1. HoustonSoccer

    HoustonSoccer New Member

    Mar 11, 2006
    Houston, Texas
    Club:
    Houston Dynamo
    Re: World Cup Simulation Results

    Well I did not mean to be too harsh on Argentina's defense. It is quite possible the pre-WC performances may have lulled the team into complacency.

    Brazil had a much more difficult time qualifying than Argentina. Yet, with a significant coaching change and early jitters in group play, they improved their game and advanced. Anyone recall the Brazil vs. Costa Rica game?
     
  2. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    Okay here's the first of the alternate universe scenarios:

    USA as host.

    Changes: Switch the U.S.A. with Mexico. Then switch the U.S.A.'s new group to 'Group A' and put Germany's group into that group's old spot.

    USA Chances to win: 6.59%
    USA Chances to advance: 82.36%
    USA Quarterfinal Chances: 50.13% (originally around 20%)
    USA Semi Final Chances: 29.38% (originally around 9%)
    USA Final Chances: 16.46% (originally a little under 4%)

    It pays, in any number of ways, to host the tournament.
     
  3. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    In the ratings anyway, the U.S.A. is 38th in the World on offense, 7th in the world on defense. Mexico, interestingly enough, is 17th in the World in each which somehow makes them 15th in the World.
     
  4. BigKris

    BigKris Member

    Jan 17, 2005
    Falls Church, VA
    Re: World Cup Simulation Results

    Voros, fascinating thread - thanks so much for sharing this with us.

    I'm curious about how you build and validate your model - what do you use for samples? I know just a tiny bit about modeling, so bear with me if I'm way off base, but in my business we work with historical data to build predictive models and the perpetual problem is gathering enough data. The way I'm thinking of it, you need a big enough sample with the resulting answers known so you can build the model ('big enough' in this case meaning "a long enough historic period", and defining that is a whole topic in and of itself), then a second sample (also 'big enough') so you can validate the model results. My modelers are always at pains to stress that we have to be careful not to overlap the build sample with the validation sample, nor can the build sample be re-used as the historic data when you're ready to start making 'real' predictions with the model.

    Is this, in fact, how the sampling for your model works? If so, I'm curious how you make the trade-off between bigger/richer (but, by definition older) data samples vs more contemporary (but, by definition less deep) samples?
     
  5. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    Which is why in the international system, factors are applied depending on the type of match being played. Friendlies don't count nearly as much as competitive matches for that reason.

    As far as a coaches poll goes, sure we could do that, but I see no reason to expect it to be more accurate than a computer ranking. George Mason, last year's NCAA Cinderella, lost in their conference tournament. Normally that conference wouldn't get too many at-large bids, but the computer rankings showed that George Mason was a stronger team than their reputation. They were given an at-large bid.

    The NCAA selection committee continues to give more and more weight to computer rankings as they have become more and more convinced of their advantages over subjective opinions.

    Certainly using the rankings as a guideline and making mental adjustments such as you suggest based on your opinion is a perfectly acceptable use of the rankings. Doing that, however, is not my goal. My goal is to make the rankings be based strictly on results and the mathematics that evaluates them. I could bump teams up or down based on various factors and maybe be more accurate (maybe), but that's not my interest in the system.

    The biggest mathematical weakness in the system, in my opinion, is dealing with the results of mismatches. Theoretically this could screw stuff up by overrating teams who choose to pound overmatched opponents. Fortunately the bad teams play a lot fewer matches than the good ones, meaning that if the bad team gets blown off the field, the result of that game generally adjusts the bad team's rating a lot and the good team's only a very little. Australia dropping a 31-0 on American Samoa (to make a point) did very little for Australia's ranking, but it was murder on American Samoa's.

    Still I'm looking for ways to deal with that which won't require massive additional computing power.
     
  6. BigKris

    BigKris Member

    Jan 17, 2005
    Falls Church, VA
    Re: World Cup Simulation Results

    Voros, I have an indulgent question, if you happen to have the info on hand: from the 10,000-run simulation what is the number of goals the Americans are predicted to score in the three group games?
     
  7. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    Well for the baseball part, just gathering the data was difficult. The initial analyses showed it was a significant step forward from what was being done, and from there it's just a matter of monitoring, analyzing and fine-tuning as time goes on. There wasn't a whole lot of money spent on the project, and it wasn't like they were drafting straight off the list the computer was suggesting anyway. Beyond those basics, I'm not really at much liberty to discuss it.

    But for soccer, things were much easier since no money or capital were at risk so I was free to evaluate on the fly. So after 2004, I could look at the rankings at the end of 2003, and then test the rankings on 2004 results. Then rinse and repeat after 2005 and so on. In a business environment that sort of patience is not really feasible.
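
    The year-by-year check described above is usually called walk-forward testing. Here's a rough Python sketch of the idea only, not the actual rating system: fit_ratings is a deliberately crude stand-in (plain per-team scoring and conceding averages), predict_goals is equally crude, and the match list is made-up toy data with invented teams A, B and C.

```python
from collections import defaultdict

def fit_ratings(matches):
    """Stand-in rating: average goals scored and conceded per team."""
    scored, conceded, played = defaultdict(int), defaultdict(int), defaultdict(int)
    for team_a, team_b, goals_a, goals_b in matches:
        scored[team_a] += goals_a
        conceded[team_a] += goals_b
        scored[team_b] += goals_b
        conceded[team_b] += goals_a
        played[team_a] += 1
        played[team_b] += 1
    return {t: (scored[t] / played[t], conceded[t] / played[t]) for t in played}

def predict_goals(ratings, team, opponent):
    """Crude guess: mean of team's scoring rate and opponent's conceding rate."""
    return (ratings[team][0] + ratings[opponent][1]) / 2.0

def walk_forward(matches_by_year):
    """For each year after the first, fit on earlier years only, then score that year."""
    years = sorted(matches_by_year)
    errors = {}
    for i in range(1, len(years)):
        train = [m for y in years[:i] for m in matches_by_year[y]]
        ratings = fit_ratings(train)
        squared = []
        for team_a, team_b, goals_a, goals_b in matches_by_year[years[i]]:
            if team_a in ratings and team_b in ratings:
                squared.append((predict_goals(ratings, team_a, team_b) - goals_a) ** 2)
                squared.append((predict_goals(ratings, team_b, team_a) - goals_b) ** 2)
        if squared:
            errors[years[i]] = sum(squared) / len(squared)
    return errors

# Made-up toy results: (team_a, team_b, goals_a, goals_b)
toy_matches = {
    2003: [("A", "B", 2, 0), ("B", "C", 1, 1), ("C", "A", 0, 1)],
    2004: [("A", "C", 3, 1), ("B", "A", 0, 2)],
    2005: [("C", "B", 2, 2)],
}

errors = walk_forward(toy_matches)
for year, mse in errors.items():
    print(year, round(mse, 3))
```

    The point of the structure is that the ratings used to predict a given year never see that year's results, which is the build/validation separation asked about earlier in the thread.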

    That plus the original test data has allowed me to make minor modifications and also allowed me to eventually decide fully on this goals-based system. I was running simultaneous systems based on different methods: wins, margin of victory with declining value as the margin grows, etc. The two-rating goals-based system has outperformed the systems of other types. ELO outperformed FIFA in those tests, by the way.
     
  8. voros

    voros Member

    Jun 7, 2002
    Parts Unknown
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    I don't keep records of the individual scores in the sims, just the order of finish.

    But this is an easy question to answer since I don't have to do sims:

    vs. Czech = 0.84 goals
    vs. Italy = 0.68 goals
    vs. Ghana = 1.30 goals

    If you want to know how that breaks down, just use poisson:

    g = expected goals
    n = number of goals predicted
    x = chance of 'n' goals being scored

    x = ((g^n)*(e^(-g)))/n!

    So the chances of the US being shut out against the Czechs would be:

    x = ((0.84^0)*(e^(-0.84)))/0! = .432 = 43.2%

    That's a whole lot more math than most of you need, I just wanted to demonstrate what was meant by a poisson distribution.
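
    For anyone who'd rather let a computer do the arithmetic, here's a minimal Python sketch of the formula above. The function name poisson_pmf and the dictionary layout are just labels picked for illustration; the expected-goals figures are the three quoted above.

```python
import math

def poisson_pmf(n, g):
    """Chance of exactly n goals when g goals are expected."""
    return (g ** n) * math.exp(-g) / math.factorial(n)

# Expected US goals per opponent, as quoted in the post
expected = {"Czech": 0.84, "Italy": 0.68, "Ghana": 1.30}

for opponent, g in expected.items():
    exact = [poisson_pmf(n, g) for n in range(3)]   # 0, 1 and 2 goals
    three_plus = 1.0 - sum(exact)                   # everything else
    cells = [f"{p:.1%}" for p in exact] + [f"{three_plus:.1%}"]
    print(f"{opponent:6s}", "  ".join(cells))
```

    Each printed row gives the chance of 0, 1, 2 and 3-or-more goals against that opponent.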
     
  9. sidefootsitter

    sidefootsitter Member+

    Oct 14, 2004
    Re: World Cup Simulation Results

    So, 3 goals in 3 games?

    I doubt that'll do the job.
     
  10. wcharriscpa

    wcharriscpa Member

    Arsenal FC
    Dec 26, 2000
    Austin
    Club:
    Arsenal FC
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    Sitting here right now, I think I'd take 3 goals in 3 games -- could certainly be worse.

    Not knowing our opponents' scores, I'd take one goal versus all three.

    I'd take one or more in our opening match (leaving one or fewer for Italy and Ghana).

    I'd certainly take one or more versus Italy (leaving one or none for CR and Ghana).

    3 goals versus Ghana would likely be great if our first two games are scoreless draws. Or, it could mean that we're already out of it. On the whole, though, I think we could face much worse prospects than a guaranteed 3 goals in the opening round.
     
  11. sidefootsitter

    sidefootsitter Member+

    Oct 14, 2004
    Re: World Cup Simulation Results

    The US scored 5 in 2002 and was lucky to qualify.

    Hypothetically, of course, you can have three 1:0 wins and max points.

    Now, the US has improved defensively vs. 2002 (no Agoos is one reason) but ... IMO, it'll be hard to keep the total goals allowed under 4.

    And, IMO, it'll be very hard to match the last Cup's goals scored due to the quality of opposition.
     
  12. KenC

    KenC Member+

    Jun 11, 2003
    Re: World Cup Simulation Results

    Wow, thanks. Hosting would raise the US's chances by about 4x! That's a lot. And, advancing is just about a lock, as the historical record shows for host countries.
     
  13. tbgh

    tbgh New Member

    Jan 16, 2006
    Re: World Cup Simulation Results

    For those who have some trouble visualizing the math, I broke down his Poisson distribution for each match.


                              Czech    Italy    Ghana
    Odds of being shutout     43.2%    50.7%    27.2%
    Odds for one goal         36.3%    34.5%    35.4%
    Odds for two goals        15.2%    11.7%    23.0%
    Odds for three or more     5.5%     3.1%    14.4%

    Some other interesting numbers we can take from that table:
    Odds of being shutout at least once - 79.6%
    Odds of being shutout twice - 29.6%

    Odds of matching 2002 (5 goals) - 8.8%
    Odds of exceeding 2002 - 4.7%

    We've heard it before, but this is going to be tough.

    Note for obsessive people: I did this very quickly, so these are approximations and will be subject to rounding error as well as a slight error from lumping the small possibilities of four or more goals in with three goals. For Ghana the odds of four or more goals are 4.4%. For both of the others they would be well under 1%.
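
    The summary numbers can be reproduced roughly with a short script. This is only a sketch under assumptions the table doesn't spell out: goals in each match follow a Poisson distribution, the three matches are independent of each other (so total goals are Poisson with mean 0.84 + 0.68 + 1.30 = 2.82), and "shutout twice" means exactly twice. The variable names are illustrative.

```python
import math
from itertools import product

def poisson_pmf(n, g):
    """Chance of exactly n goals when g goals are expected."""
    return (g ** n) * math.exp(-g) / math.factorial(n)

expected = [0.84, 0.68, 1.30]                  # Czech, Italy, Ghana
p_zero = [poisson_pmf(0, g) for g in expected]  # shutout chance per match

# At least one shutout = 1 - (no shutout in any of the three matches)
p_any_shutout = 1.0 - math.prod(1.0 - p for p in p_zero)

# Exactly two shutouts: add up every shutout/no-shutout pattern with two shutouts
p_two_shutouts = 0.0
for pattern in product([True, False], repeat=3):
    if sum(pattern) == 2:
        p_two_shutouts += math.prod(
            p if shut else 1.0 - p for p, shut in zip(p_zero, pattern)
        )

# Assuming independent matches, total goals are Poisson with the summed mean
total_g = sum(expected)
p_match_2002 = poisson_pmf(5, total_g)
p_exceed_2002 = 1.0 - sum(poisson_pmf(n, total_g) for n in range(6))

print(f"Shut out at least once: {p_any_shutout:.1%}")
print(f"Shut out exactly twice: {p_two_shutouts:.1%}")
print(f"Exactly 5 goals:        {p_match_2002:.1%}")
print(f"6 or more goals:        {p_exceed_2002:.1%}")
```

    Because this version keeps the four-or-more outcomes separate instead of lumping them in with three, its figure for exceeding 2002 should come out somewhat higher than the quick estimate above.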
     
  14. superdave

    superdave Member+

    Jul 14, 1999
    VB, VA
    Club:
    DC United
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    In what way were we lucky? We had 4 points, which gets you through more often than not.
     
  15. TimB4Last

    TimB4Last Member+

    May 5, 2006
    Dystopia
    Re: World Cup Simulation Results

    I assume he meant we left our fate in someone else's hands, when we could have taken control of our own destiny in Game 3 (v. Poland).

    You're right, however, in the sense that our advancement was certainly not pure luck - plus we made a lot of our own good luck.
     
  16. Maximum Optimal

    Maximum Optimal Member+

    Jul 10, 2001
    Re: World Cup Simulation Results

    We also gave up 6 in the first round. From my own observations and from Voros' defensive rating, I'm pretty sure defense is the strength of this team. We ain't gonna score as many pretty goals as we did in 2002, but our defense will be a lot better.

    The change is a good thing too. When you're playing teams a bit better than you, being strong on defense gives you the best chance of getting a result.
     
  17. sidefootsitter

    sidefootsitter Member+

    Oct 14, 2004
    Re: World Cup Simulation Results

    4 points is a cut-off usually.

    Group B had Paraguay and South Africa tied on 4 and it came down to goals scored (6 vs. 5).

    Group C had Turkey and Costa Rica tied with 4 and the winner was decided on goal differential.

    Group E had Cameroon out on 4 pts vs. Germany's 9 and Ireland's 5.

    Group F had Argentina out with 4, with Sweden and England finishing with 5 each.

    So, out of six 2002 4-point teams, two went through and four didn't.


    Which were my thoughts.... except low scoring teams fare well only if they earn the required points. Otherwise, this is a losing strategy on a tiebreaker.
     
  18. Trackman20

    Trackman20 Member

    May 14, 2003
    New York City
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    I've got a headache looking at all of these numbers......
     
  19. TimB4Last

    TimB4Last Member+

    May 5, 2006
    Dystopia
    Re: World Cup Simulation Results

     
  20. JohnR

    JohnR Member+

    Jun 23, 2000
    Chicago, IL
    Re: World Cup Simulation Results

    Voros's defensive rating would have been excellent entering '02, also. I believe that particular squad conceded fewer WCQ goals than did the current one, and yep, both of 'em shipped 4 goals in the road warm-up in Germany.

    Pretty hard to make the case on the evidence that this team is better defensively than the last version, except for the argument that we gotta be better without Goose. Maybe, maybe not. By that logic, we gotta be worse without Sanneh.
     
  21. superdave

    superdave Member+

    Jul 14, 1999
    VB, VA
    Club:
    DC United
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    If Ireland had 5 points...that's a win and two draws. How did Germany have 9?

    Anyway, go back on this thread and you'll see that 4 points puts you through more often than not. Then go to dictionary.com and look up anecdotal.
     
  22. Quaker

    Quaker Member+

    FC Dallas
    Apr 19, 2000
    Nat'l Team:
    United States
    Re: World Cup Simulation Results

    Not sure if this has been posted elsewhere, but EA Sports did a simulation using 2006 FIFA World Cup in which the Czechs won the title. The U.S. advanced out of the group with seven points (!) but then lost to Brazil.

    http://www.gamasutra.com/php-bin/news_index.php?story=9592
     
  23. numerista

    numerista New Member

    Mar 21, 2004
    Re: World Cup Simulation Results

    I suspected this, as well, so I ran a few Poisson regressions in order to see how we looked.

    6 1/2-Year Time Window
    start of 1996 through start of WC02:
    USA #32 in world offense, #13 in world defense

    start of 2000 through May 31, '06:
    USA #30 in world offense, #9 in world defense

    ~2 1/2-Year Time Window
    start of (global) WC02 qualifying through start of WC02:
    USA #21 in world offense, #17 in world defense

    start of (global) WC06 qualifying through May 31, '06:
    USA #34 in world offense, #11 in world defense

    Note that my approach is almost certainly inferior to Voros', which came up with current rankings of #38 on offense, #7 on defense. I'm doing nothing fancy at all, treating every game identically (including friendlies), and only adjusting for home/neutral/away.

    All the same, there is a broad similarity that suggests my "system" (actually a one-line software command) isn't too far off. It seems to indicate that the US has managed to improve an already strong defense, but that when it comes to scoring goals, we're still only around the borderline of the World Cup standard.

    As you guys have pointed out, it's interesting that during the 2002 World Cup Finals, our offense wasn't the problem.
     
  24. TimB4Last

    TimB4Last Member+

    May 5, 2006
    Dystopia
    Re: World Cup Simulation Results

    It'll pass.

    This is the sort of thing BA (or one of his assistants) should be mastering.

    How do we get to the promised land?

    Draw-Draw-Win = 5 should get the job done.
    Draw-Loss-Win = 4 probably won't, in my view.

    I've offered my group strategy elsewhere, but to recapitulate:

    Go for the win against CZ, because

    (1) A win in Game 1 is HUGE, allowing all sorts of game-planning, lineup-tweaking for results in Games 2 & 3. For example, 'reversing' the accepted 5-point formula, W-D-D. Better yet, why not win the group, say with W-D-W, and avoid Brazil?

    (2) If we need a win v. CZ or IT, CZ seems like a better bet. We'd like to put a big dent in the chances of one of our obvious rivals.

    (3) We need to score goals, to improve our tie-breaking chances and to improve our goal-scoring confidence for Games 2 & 3 (and beyond).

    (4) I'd rather plan for success than for failure.

    OK, sorry, we didn't win v. CZ, we tied.

    The good news is, we play Italy after CZ plays Ghana, with the Ghana/Italy result (from Game 1) already known, of course.

    Let's look at some basic scenarios, going into our second game:

    CZ 4 (D-W)
    IT 3 (W- )
    US 1 (D- )
    GH 0 (L-L)

    This is a pretty 'normal' projection. IT and CZ have both beaten GH, and we tied CZ. Everyone assumes we'll play to tie Italy, then go for the win against GH. Slow down, not so fast. We need to look at goal difference first.

    If Italy and CZ are merely +1, then D-D-W doesn't look like such a bad strategy. Maybe we can go +2 v. Ghana. But if IT and CZ are +2 or better, I think our strategy has to change. Otherwise, we're setting ourselves up for a big disappointment, allowing IT and CZ to tie each other and go through.

    If either of the 'other' games has been tied, we have:

    IT 3
    CZ 2
    US 1
    GH 1

    or

    CZ 4
    IT 1
    US 1
    GH 1

    Now D-D-W looks much more promising.

    Ditto if all three games have been ties. Now D-D-W may put us first in the group!

    CZ 2
    GH 2
    IT 1
    US 1

    If we lose Game 1, however, I think you have to restrategize the whole thing. If I can make the time, I'll break things down, but I suspect that we'll need L-W-W = 6 points, not just L-D-W = 4 points to advance. We'll have to look at the results of the other two games first, though.
     
  25. Maximum Optimal

    Maximum Optimal Member+

    Jul 10, 2001
    Re: World Cup Simulation Results

    This is the point to keep in mind. And it jibes with looking at how the personnel has changed, mainly Gooch and Pope in central defense rather than Goose and Pope. It is also my impression that our midfield defense is better now. We pressure the ball more aggressively and consistently. Mastroeni and Beasley, who were already good defenders in 2002, have improved. I think we also do a better job of maintaining possession, which improves the performance of the defense.
     
