Statistical Rankings/Gold Cup Predictions

Discussion in 'CONCACAF' started by NoSix, Jul 5, 2013.

  1. slaminsams

    slaminsams Member+

    Mar 22, 2010
    So how many predictions did you get right in the group phase?
     
  2. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    11 out of 18. See post #19 for details.
     
    slaminsams repped this.
  3. EvanJ

    EvanJ Member+

    Manchester United
    United States
    Mar 30, 2004
    Club:
    Manchester United FC
    Nat'l Team:
    United States
The nine people in my prediction contest averaged 10 2/3 correct results, with six of them having at least 11 correct results, so a person who knows how good the teams are has a much better than 1.4% chance of having 11 correct results. I'm not calling your model bad, but I don't think the random-chance percentage is relevant.
     
    tab5g repped this.
  4. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    OK, so how does a person "know how good the teams are"?
     
  5. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
No one is claiming that the players in your contest have only a 1.4% chance of having 11 or more correct results. Imagine that prior to the start of the Gold Cup, I sat down with a list of the 18 group stage fixtures and rolled a die for each one: I predicted the first team to win if the die came up 1 or 2, the second team to win if it came up 3 or 4, and a draw if it came up 5 or 6. Choosing results randomly in that manner, there is a 100-1.4=98.6% probability that I would pick 10 or fewer match results correctly. The fact that six of your nine contestants did better than that indicates that they are skillful at picking match results, not just lucky. By the same token, the fact that my algorithm did better than that also indicates that it is "skillful" at picking match results. (Note that had my algorithm entered your contest, it would be kicking your butt right now - while you have one more correct result (12 vs 11), my algorithm has picked 5 exact scores correctly compared to your 0.)
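The 1.4% and 98.6% figures above are just the tail of a binomial distribution with n = 18 matches and p = 1/3 per match. A minimal sketch of that calculation (the function name is mine, not NoSix's):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    correct picks out of n when each pick is right with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability of 11+ correct out of 18 when guessing uniformly (p = 1/3)
p_lucky = binom_tail(18, 11, 1/3)   # ~0.014, i.e. about 1.4%
```

So the die-roller picks 10 or fewer correctly about 98.6% of the time, matching the number quoted in the post.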
     
  6. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Further breaking down the probability of draws after regulation:

    #4 PAN vs #16 CUB
    Probability of PAN winning in overtime: 8%
    Probability of CUB winning in overtime: 2%
    Probability of draw after overtime: 11%
    Overall probability of PAN winning: 82.5%
    Overall probability of CUB winning: 17.5%

    #1 MEX vs #8 TRI
    Probability of MEX winning in overtime: 8%
    Probability of TRI winning in overtime: 2%
    Probability of draw after overtime: 11%
    Overall probability of MEX winning: 81.5%
    Overall probability of TRI winning: 18.5%
     
  7. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Further breaking down the probability of draws after regulation:

    #2 USA vs #11 SLV
    Probability of USA winning in overtime: 8%
    Probability of SLV winning in overtime: 1%
    Probability of draw after overtime: 9%
    Overall probability of USA winning: 87.5%
    Overall probability of SLV winning: 12.5%

    #5 HON vs #3 CRC
    Probability of HON winning in overtime: 5%
    Probability of CRC winning in overtime: 8%
    Probability of draw after overtime: 21%
    Overall probability of HON winning: 38.5%
    Overall probability of CRC winning: 61.5%
     
  8. EvanJ

    EvanJ Member+

    Manchester United
    United States
    Mar 30, 2004
    Club:
    Manchester United FC
    Nat'l Team:
    United States
    I know what you mean with rolling a die. I just think that when you brag about your model you should use the latter part of the paragraph (comparing it to people) and not the 1.4% part.
     
  9. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Before looking forward to the semifinals, a quick summary of the predictions to date:

    Out of 22 matches, 14 results were predicted correctly (64%), including 6 exact scores predicted correctly (27%).

    The probability of predicting 14 or more out of 22 matches correctly by chance alone is only 0.3%.
     
  10. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Updated rankings, based on match results through 2013/7/21:

    rank team w d l pf pp pct gd
    1 MEX 13.4 3.7 1.9 43.8 57. 0.769 1.55
    2 USA 13.6 3.0 2.4 43.8 57. 0.768 1.76
    3 PAN 11.4 4.4 3.2 38.6 57. 0.677 1.04
    4 CRC 11.2 4.8 3.1 38.2 57. 0.671 0.95
    5 HON 9.7 5.1 4.2 34.3 57. 0.602 0.64
    6 JAM 8.2 5.8 5.0 30.5 57. 0.534 0.33
    7 GLP 8.2 4.6 6.2 29.3 57. 0.514 0.27
    8 TRI 7.4 5.1 6.5 27.2 57. 0.477 0.11
    9 MQE 6.8 5.4 6.9 25.7 57. 0.450 0.00
    10 DOM 6.9 3.8 8.3 24.5 57. 0.430 -0.23
    11 GUA 6.4 4.9 7.6 24.2 57. 0.425 -0.16
    12 SLV 6.1 4.9 8.0 23.2 57. 0.406 -0.23
    13 CAN 5.5 6.1 7.4 22.6 57. 0.396 -0.24
    14 HAI 4.8 6.0 8.2 20.5 57. 0.359 -0.36
    15 GYF 5.3 3.7 10.0 19.7 57. 0.345 -0.69
    16 ATG 4.6 4.4 10.0 18.3 57. 0.321 -0.75
    17 CUB 4.2 4.5 10.2 17.2 57. 0.302 -0.75
    18 NCA 4.1 3.9 11.1 16.1 57. 0.282 -1.00
    19 GUY 3.2 3.8 12.0 13.4 57. 0.236 -1.21
    20 BLZ 2.6 4.9 11.5 12.6 57. 0.222 -1.02

USA's blowout win over SLV combined with MEX's squeaker over TRI leaves the two virtually tied at the top. The four Gold Cup semifinalists rank 1st-3rd and 5th.

    Biggest gainers are USA, up 23 pct points and 0.21 gd, and PAN, up 22 pct points and 0.14 gd.

    Biggest losers are CUB, down 25 pct points and 0.16 gd, and SLV, down 20 pct points and 0.11 gd.
     
  11. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Semifinal Preview:

    #2 USA vs #5 HON
    Probability of USA win: 69%
    Probability of HON win: 8%
    Probability of draw after regulation: 23%
    Prediction: USA 1 HON 0 (21% probability)
    In the event of a draw after regulation:
    Probability of USA win after overtime: 9%
    Probability of HON win after overtime: 2%
    Probability of draw after overtime: 12%
    Overall probability of USA win: 84%
    Overall probability of HON win: 16%

    #1 MEX vs #3 PAN
    Probability of MEX win: 45%
    Probability of PAN win: 24%
    Probability of draw after regulation: 31%
    Prediction: MEX 1 PAN 0 (18% probability)
    In the event of a draw after regulation:
    Probability of MEX win after overtime: 8%
    Probability of PAN win after overtime: 5%
    Probability of draw after overtime: 18%
    Overall probability of MEX win: 62%
    Overall probability of PAN win: 38%​
     
  12. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Through the semifinals, out of 24 matches, 15 results were predicted correctly (63%), including 6 exact scores predicted correctly (25%).

    The probability of predicting 15 or more out of 24 matches correctly by chance alone is only 0.3%.
     
  13. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Updated rankings, based on match results through 2013/7/24:

    rank team w d l pf pp pct gd
    1 USA 13.8 2.9 2.3 44.3 57. 0.778 1.85
    2 MEX 13.0 3.8 2.1 42.9 57. 0.753 1.46
    3 PAN 11.8 4.2 3.0 39.5 57. 0.693 1.13
    4 CRC 11.1 4.7 3.1 38.2 57. 0.670 0.94
    5 HON 9.5 5.1 4.4 33.7 57. 0.591 0.60
    6 JAM 8.2 5.8 4.9 30.5 57. 0.536 0.33
    7 GLP 8.3 4.6 6.2 29.4 57. 0.515 0.27
    8 TRI 7.4 5.1 6.5 27.2 57. 0.478 0.11
    9 MQE 6.6 5.4 7.0 25.1 57. 0.441 -0.04
    10 DOM 6.9 3.8 8.2 24.6 57. 0.432 -0.23
    11 GUA 6.4 4.9 7.6 24.3 57. 0.426 -0.16
    12 SLV 6.1 4.9 8.0 23.2 57. 0.407 -0.23
    13 CAN 5.5 6.1 7.4 22.6 57. 0.397 -0.24
    14 HAI 4.8 6.0 8.2 20.5 57. 0.360 -0.37
    15 GYF 5.4 3.7 10.0 19.7 57. 0.346 -0.69
    16 ATG 4.7 4.4 9.9 18.4 57. 0.322 -0.75
    17 CUB 4.3 4.5 10.2 17.3 57. 0.304 -0.74
    18 NCA 4.0 3.9 11.1 15.9 57. 0.280 -1.01
    19 GUY 3.2 3.8 12.0 13.4 57. 0.236 -1.21
    20 BLZ 2.6 4.9 11.5 12.7 57. 0.222 -1.03

With their win over HON, USA take over the top spot from MEX for the first time in Jurgen Klinsmann's tenure. The Gold Cup final will be contested between #1 USA and #3 PAN.

Biggest gainer is PAN, up 16 pct points and 0.09 gd.

Biggest loser is MEX, down 16 pct points and 0.09 gd.
     
  14. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Final Preview:

    #1 USA vs #3 PAN
    Probability of USA win after regulation: 63%
    Probability of PAN win after regulation: 14%
    Probability of draw after regulation: 23%
    Prediction: USA 1 PAN 0 (15% probability)
    In the event of a draw after regulation:
    Probability of USA win after overtime: 9%
    Probability of PAN win after overtime: 3%
    Probability of draw after overtime: 11%
    Overall probability of USA win: 77.5%
    Overall probability of PAN win: 22.5%
     
  15. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    More on Sunday's Gold Cup Final:

    Breakdown by goal difference after regulation:
    Probability of USA win by 4 goals: 4.4%
    Probability of USA win by 3 goals: 10.6%
    Probability of USA win by 2 goals: 19.6%
    Probability of USA win by 1 goal: 26.4%
    Probability of draw: 22.7%
    Probability of PAN win by 1 goal: 10.4%
    Probability of PAN win by 2 goals: 3.1%
    Probability of PAN win by 3 goals: 0.7%
    Probability of PAN win by 4 goals: 0.1%

    Top 10 most likely scores after regulation:
    USA PAN prob
    1 0 15.0%
    2 0 13.3%
    1 1 10.5%
    2 1 9.3%
    0 0 8.5%
    3 0 7.8%
    0 1 5.9%
    3 1 5.5%
    1 2 3.7%
    4 0 3.5%
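The thread never spells out how these score probabilities are generated; one common approach that produces tables like the one above is to model each side's goals as independent Poisson variables. A sketch under that assumption, where the goal expectations `lam_usa` and `lam_pan` are purely illustrative and are not NoSix's actual parameters:

```python
from math import exp, factorial

def poisson(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Illustrative goal expectations -- NOT the model's real parameters
lam_usa, lam_pan = 1.6, 0.8

# Joint probability of each scoreline, assuming independent goal counts
grid = {(usa, pan): poisson(lam_usa, usa) * poisson(lam_pan, pan)
        for usa in range(7) for pan in range(7)}

top10 = sorted(grid.items(), key=lambda kv: kv[1], reverse=True)[:10]
for (usa, pan), p in top10:
    print(f"USA {usa} PAN {pan}: {p:.1%}")
```

With these made-up means, 1-0 comes out as the single most likely scoreline, in the same ballpark as the 15.0% quoted above; the actual model behind the thread's numbers may of course differ.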
     
  16. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Out of 25 total Gold Cup matches, 16 results were predicted correctly (64%), including 7 exact scores predicted correctly (28%).

    The probability of predicting 16 or more out of 25 matches correctly by chance alone is only 0.2%.
     
  17. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Updated rankings, based on match results through 2013/7/28:

    rank team w d l pf pp pct gd
    1 USA 13.8 2.9 2.3 44.2 57. 0.776 1.82
    2 MEX 13.1 3.8 2.1 43.0 57. 0.754 1.47
    3 PAN 11.5 4.4 3.1 38.9 57. 0.683 1.06
    4 CRC 11.2 4.8 3.1 38.2 57. 0.670 0.94
    5 HON 9.5 5.1 4.4 33.7 57. 0.591 0.59
    6 JAM 8.2 5.8 5.0 30.5 57. 0.536 0.33
    7 GLP 8.3 4.6 6.1 29.4 57. 0.517 0.28
    8 TRI 7.4 5.1 6.5 27.3 57. 0.478 0.11
    9 MQE 6.6 5.4 7.0 25.2 57. 0.442 -0.03
    10 DOM 6.9 3.8 8.2 24.6 57. 0.432 -0.22
    11 GUA 6.5 4.9 7.6 24.3 57. 0.427 -0.15
    12 SLV 6.1 4.9 8.0 23.3 57. 0.409 -0.22
    13 CAN 5.5 6.1 7.4 22.7 57. 0.397 -0.23
    14 HAI 4.9 6.0 8.2 20.6 57. 0.361 -0.36
    15 GYF 5.3 3.7 10.0 19.6 57. 0.344 -0.70
    16 ATG 4.7 4.4 9.9 18.5 57. 0.324 -0.73
    17 CUB 4.2 4.5 10.2 17.3 57. 0.303 -0.74
    18 NCA 4.0 3.9 11.1 16.0 57. 0.280 -1.00
    19 GUY 3.2 3.8 11.9 13.5 57. 0.237 -1.20
    20 BLZ 2.6 5.0 11.5 12.7 57. 0.223 -1.02
     
  18. tab5g

    tab5g Member+

    May 17, 2002
    #43 tab5g, Aug 20, 2013
    Last edited: Aug 20, 2013
    Do 1/3 of all soccer (or Gold Cup) matches end in draws?

    Wouldn't you need a die with many more sides than 6 to try to reasonably outline (project) the "results" of some series of matches based solely on chance?

"Chance" would be better defined if the expectation weren't that a draw is just as likely as either team winning a single game/event.

Not that I don't see the utility of your algorithm, but I do think there is real weight to the argument that the "baseline" comparison scale you have offered in this thread as "chance" is somewhat flawed (though perhaps not significantly).
     
  19. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    I already addressed this question (see post #11). I'm not assuming the probability of a draw is 1/3.
     
  20. tab5g

    tab5g Member+

    May 17, 2002
    You certainly look to be assuming that 1/3 probability for a draw when you write the following:

    I was just suggesting that a 10 or 12 (for examples) sided die (on which exactly 2 sides still represented "draw") would be more accurate for randomly selecting soccer games than would a six sided die.
     
  21. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
The point is, there are 3 outcomes (home win, away win, and draw), the probabilities for which always add to 1. Therefore, if you randomly select amongst the three, you will predict the correct outcome 1/3 of the time, irrespective of the distribution of probabilities. Assume, for example, that the probability of a home win is always 64%, the probability of an away win is 24%, and the probability of a draw is 12%. If you select randomly from those possible outcomes, the probability of correctly predicting the outcome is 1/3*64%+1/3*24%+1/3*12%=1/3*100%=33.33%.
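NoSix's claim is easy to check by simulation: if the *picks* are uniform over the three outcomes, accuracy converges to 1/3 no matter how the true outcomes are distributed. A quick sketch using the 64/24/12 example distribution from the post:

```python
import random
random.seed(1)

N = 200_000
# True outcomes follow the example distribution: 64% W / 24% L / 12% D
outcomes = random.choices(["W", "L", "D"], weights=[0.64, 0.24, 0.12], k=N)
# The predictor picks uniformly at random among the three outcomes
picks = random.choices(["W", "L", "D"], k=N)

accuracy = sum(o == g for o, g in zip(outcomes, picks)) / N
# accuracy comes out close to 1/3 regardless of the outcome weights
```

Changing the `weights` of the true outcomes leaves the accuracy at roughly 1/3, which is exactly the point of the post: uniform random picking hits 1/3 whatever the underlying distribution is.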
     
  22. tab5g

    tab5g Member+

    May 17, 2002
    #47 tab5g, Aug 21, 2013
    Last edited: Aug 21, 2013
    Is that equation accurate?

Why use the 1/3 multiplier on the left hand side? (Selecting randomly from the possible outcomes need not mean a consistent 1/3 rate for each of W, L, and D.)

    Should not the equation be (.64*.64)+(.24*.24)+(.12*.12)=48.16% to account for the random selection of more of the higher probability outcomes (and not just a 1/3 chance of each outcome W, L or D)?

    In the Gold Cup scenario -- which features a lot of neutral site matches especially -- if the baseline assumptions were using the 64% "stronger/high-ranked" team wins, 24% "weaker/lower-seeded" team wins and 12% draw probabilities, would not the odds of picking 11 or more games correctly out of 18 improve above the 1/3, 1/3 and 1/3 assumptions you referenced earlier in this thread?

    The assumptions (or the data put into the model) are important.

Your baseline (for comparison) was very limited and faulty, and could easily be improved (with respect to predicting soccer match outcomes).

Using a six-sided die and assigning 2 sides (1/3 probability) to a draw is not a great approach, when a 16-sided die (with 1-10 as Team1 Win, 11-14 as Team1 Lose, and 15-16 as Draw) would likely be the better assumption and approach (to more accurately account for the likely/historical number of draws in a competition like the Gold Cup). (And yes, "Team1" would have to be assigned/selected on some basis -- HFA for USA, and ranking, FIFA, Elo, or otherwise for neutral venue matches.)

Your algorithm and model are very solid, in large part because you put a huge set of data -- 5 years of past results -- into the model. If you only entered 1 or 2 years, would your model be as valid? Would using 8 or 10 years of past data make it any more or less valid?

    Your baseline comparison is very weak, in that it relies on a 6-sided die and the assumption that 1/3 of matches will be a draw.
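tab5g's 48.16% figure from earlier in this post corresponds to a guesser whose pick frequencies match the outcome distribution: the chance of a correct pick is then the sum of the squared probabilities, versus exactly 1/3 for uniform picking. A sketch with the 64/24/12 example numbers from the thread:

```python
probs = [0.64, 0.24, 0.12]  # example W / L / D outcome probabilities

# Uniform picking: each outcome is guessed 1/3 of the time
uniform_acc = sum(p / 3 for p in probs)   # exactly 1/3

# Distribution-matched picking: guess each outcome with its own probability
matched_acc = sum(p * p for p in probs)   # 0.4816, tab5g's figure

# Always picking the single most likely outcome does better still
best_acc = max(probs)                     # 0.64
```

This is the crux of the disagreement: NoSix's 1/3 baseline describes uniform picking, while tab5g is arguing for a baseline whose pick distribution already reflects how often each outcome actually occurs.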
     
  23. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    #48 NoSix, Aug 21, 2013
    Last edited: Aug 21, 2013
Again, for each match by definition (pW+pD+pL)=1, so if you choose one of the 3 outcomes randomly, in the long run you will choose 1/3 W's, 1/3 D's, and 1/3 L's, and your percentage of correct choices will be 1/3*(pW+pD+pL)=1/3*1=1/3. That is a fact.

Imagine you had a loaded coin, with pH=0.75 and pT=0.25. If you flip the coin 10000 times and choose one of the two outcomes randomly each time, then half the time you will choose heads, half the time you will choose tails, and your probability of choosing correctly is 1/2*0.75+1/2*0.25=1/2*1=1/2.
     
  24. tab5g

    tab5g Member+

    May 17, 2002
    "If you choose one of 3 outcomes randomly" is the point of contention.

    How randomly are you choosing those outcomes?

    Randomly (via 16-sided die for example) choosing 10/16 for a W, 4/16 for a L and 2/16 for a Draw is a better method of random selection (for soccer match outcomes), than is the pure 1/3 for each possible outcome (W, L, D).
     
  25. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    It is not better, it is exactly the same, because 1/3*10/16+1/3*4/16+1/3*2/16=1/3*16/16=1/3!!

    Take a look back at my edit to post 48.
     