30 Dec 2003, 06:29 PM
|
#1
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: Phoenix
|
Question about use of Poisson distribution
Let us say that San Jose were going to play DC in DC next week, and I would like to predict the result of that match.
If for simplicity's sake I am willing to assume that: 1) goals are Poisson distributed, and 2) any strength of schedule differences are small enough to be ignored, then it seems to me that if I know that San Jose averaged 1.53 GF and 1.07 GA on the road last season, while DC averaged 1.33 GF and 0.93 GA at home, then I ought to have enough information to make a statistical prediction of the match result, but I'll be damned if I can figure out how to do so.
Beineke, Voros, anyone?
|
|
|
01 Jan 2004, 10:22 AM
|
#2
|
|
BigSoccer Member
|
Re: Question about use of Poisson distribution
Quote:
Originally posted by NoSix
Let us say that San Jose were going to play DC in DC next week, and I would like to predict the result of that match.
If for simplicity's sake I am willing to assume that: 1) goals are Poisson distributed, and 2) any strength of schedule differences are small enough to be ignored, then it seems to me that if I know that San Jose averaged 1.53 GF and 1.07 GA on the road last season, while DC averaged 1.33 GF and 0.93 GA at home, then I ought to have enough information to make a statistical prediction of the match result, but I'll be damned if I can figure out how to do so.
Beineke, Voros, anyone?
|
Before you can plug any numbers into the Poisson distribution, you still have one more decision to make -- you need to decide how the two teams interact. To estimate a distribution for San Jose's goalscoring in DC, you need to combine your info about San Jose's offense (1.53 gpg) with your info about DC's defense (0.93 gpg). There are many options for doing this, but the simplest is just to take the arithmetic mean (1.23 gpg).
Then you can plug 1.23 into your Poisson distribution and get:
Pr(SJ scores 0) = 29%
Pr(SJ scores 1) = 36%
Pr(SJ scores 2) = 22%
Pr(SJ scores 3) = 9%
Pr(SJ scores 4) = 3%
Pr(SJ scores 5 or more) = 1%
Then you can do the same thing for DC's offense paired with SJ's defense.
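For anyone who wants to reproduce the table above, here is a minimal Python sketch. The function name `poisson_pmf` and the variable names are made up for illustration; the only inputs are the 1.53 and 0.93 averages from the thread, combined by the simple arithmetic mean described above.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Arithmetic mean of SJ's road scoring (1.53 gpg) and DC's home goals against (0.93 gpg)
sj_rate = (1.53 + 0.93) / 2

# Probabilities of 0 through 4 goals; whatever is left over is "5 or more"
probs = [poisson_pmf(k, sj_rate) for k in range(5)]
five_plus = 1 - sum(probs)

for k, p in enumerate(probs):
    print(f"Pr(SJ scores {k}) = {p:.0%}")
print(f"Pr(SJ scores 5 or more) = {five_plus:.0%}")
```

Swapping in (1.33 + 1.07) / 2 for the rate gives the matching table for DC's offense against SJ's defense.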
|
|
|
01 Jan 2004, 01:36 PM
|
#3
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: Phoenix
|
Re: Re: Question about use of Poisson distribution
Quote:
Originally posted by beineke
Before you can plug any numbers into the Poisson distribution, you still have one more decision to make -- you need to decide how the two teams interact. To estimate a distribution for San Jose's goalscoring in DC, you need to combine your info about San Jose's offense (1.53 gpg) with your info about DC's defense (0.93 gpg). There are many options for doing this, but the simplest is just to take the arithmetic mean (1.23 gpg).
Then you can plug 1.23 into your Poisson distribution and get:
Pr(SJ scores 0) = 29%
Pr(SJ scores 1) = 36%
Pr(SJ scores 2) = 22%
Pr(SJ scores 3) = 9%
Pr(SJ scores 4) = 3%
Pr(SJ scores 5 or more) = 1%
Then you can do the same thing for DC's offense paired with SJ's defense.
|
Yes, thanks, I actually dug out my old college prob and stats text yesterday and hit on the same idea.
Adding up the probability of all outcomes, I got a prediction of 37.0% SJ win/27.5% Draw/35.5% DC win. Even though a draw is the least likely outcome, the most likely single result is 1-1, with a probability of 13%. Interesting!
Thanks again for taking time to respond.
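A rough sketch of the "adding up all outcomes" step, assuming the two teams' goal counts are independent Poisson variables with rates 1.23 (SJ) and 1.20 (DC). The independence assumption, the cutoff `MAX_GOALS`, and all names here are illustrative choices, not anything stated in the thread.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

sj_rate = (1.53 + 0.93) / 2  # SJ road GF averaged with DC home GA
dc_rate = (1.33 + 1.07) / 2  # DC home GF averaged with SJ road GA

MAX_GOALS = 15  # assumed cutoff; scores beyond this contribute a negligible amount

sj_win = draw = dc_win = 0.0
best_score, best_prob = None, 0.0
for sj in range(MAX_GOALS + 1):
    for dc in range(MAX_GOALS + 1):
        # Assumed independence: joint probability is the product of the two pmfs
        p = poisson_pmf(sj, sj_rate) * poisson_pmf(dc, dc_rate)
        if sj > dc:
            sj_win += p
        elif sj == dc:
            draw += p
        else:
            dc_win += p
        # Track the single most likely scoreline
        if p > best_prob:
            best_score, best_prob = (sj, dc), p

print(f"SJ win {sj_win:.1%} / Draw {draw:.1%} / DC win {dc_win:.1%}")
print(f"Most likely score (SJ-DC): {best_score[0]}-{best_score[1]} at {best_prob:.0%}")
```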
|
|
|
01 Jan 2004, 10:42 PM
|
#4
|
|
BigSoccer Member+
Join Date: Jul 2000
Location: Arlington
|
I can't believe I missed this forum for this long.
Your method is probably pretty close to how the betting houses post odds, with the exception that they've got a lot more information about how two teams might interact.
|
|
|
02 Jan 2004, 11:11 AM
|
#5
|
|
BigSoccer Member
|
Re: Re: Re: Question about use of Poisson distribution
Quote:
Originally posted by NoSix
Even though a draw is the least likely outcome, the most likely single result is 1-1, with a probability of 13%. Interesting!
|
Last season, 24 of 161 games ended in 1-1 draws -- that's 15%, definitely in the right ballpark. In many cases, the Poisson approximation is pretty good. Then again, we should probably note that the 2003 Quakes were not a very Poisson-like team. They played a total of 34 games, in which 100 goals were scored.
Using the Poisson model, we'd conclude that they only have a 7.8% chance of having 5 or more goals in a game. Over the course of the season, we would expect 2.65 games where that many goals were scored. Instead, it happened to the Quakes 8 times.
Another way to put this is that 50% of the goals (50 of 100) were scored in only 23.5% of their games (8 of 34). When goals came, they came in bunches.
|
|
|
02 Jan 2004, 12:18 PM
|
#6
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: E. Somerville
Supporter: New England Revolution
|
Quote:
Originally posted by Serie Zed
I can't believe I missed this forum for this long.
|
Yeah, I try to pimp this forum as much as possible, but if you want to go into the New Forums thread of Suggestions and ask them to put us on the front page, that'd be great.
Would someone mind giving us an idiot's guide definition of what a Poisson distribution is?
Last edited by mpruitt; 02 Jan 2004 at 12:36 PM.
|
|
|
02 Jan 2004, 12:49 PM
|
#7
|
|
BigSoccer Member+
Join Date: Jul 2000
Location: Arlington
|
Geek that I am, I thought about this a bit more and think a better approach would look something like...
Generate a mean and standard deviation for goals scored and surrendered by both the home and visiting team.
Here you might just average DC's goals surrendered with San Jose's goals scored (and vice versa). But you could probably find a few simple tweaks that give you a bit more insight into how the two teams might interact.
For example, if DC has the worst defense and San Jose the best offense, you might find (using past data) that you actually move the "average" towards San Jose slightly. Or increase the standard deviation. Or something.
Then you just plug the estimated average goals and estimated standard deviations into Crystal Ball and run trials to see what the range of results is.
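The "run trials" idea can be sketched without Crystal Ball. The version below is only an illustration under assumptions: it draws each team's goals from a Poisson distribution (so it takes a mean only, not a separate standard deviation), it uses the 1.23 and 1.20 averaged rates from earlier in the thread, and the sampler is Knuth's multiplication method. All function names, the trial count, and the seed are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    goals, product = 0, rng.random()
    # Keep multiplying uniform draws until the product falls to the threshold
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals

def simulate_match(sj_rate, dc_rate, trials=50_000, seed=2003):
    """Simulate many matches and return the share of each outcome."""
    rng = random.Random(seed)  # fixed seed so reruns give the same shares
    counts = {"SJ win": 0, "Draw": 0, "DC win": 0}
    for _ in range(trials):
        sj = sample_poisson(sj_rate, rng)
        dc = sample_poisson(dc_rate, rng)
        if sj > dc:
            counts["SJ win"] += 1
        elif sj == dc:
            counts["Draw"] += 1
        else:
            counts["DC win"] += 1
    return {outcome: n / trials for outcome, n in counts.items()}

results = simulate_match(1.23, 1.20)
for outcome, share in results.items():
    print(f"{outcome}: {share:.1%}")
```

To try the "tweaks" idea, you would change the two rates passed to `simulate_match`, or replace `sample_poisson` with a sampler for whatever two-parameter distribution you prefer.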
|
|
|
02 Jan 2004, 01:15 PM
|
#8
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: Phoenix
|
Quote:
Originally posted by Serie Zed
Geek that I am, I thought about this a bit more and think a better approach would look something like...
Generate a mean and standard deviation for goals scored and surrendered by both the home and visiting team.
Here you might just average DC's goals surrendered with San Jose's goals scored (and vice versa). But you could probably find a few simple tweaks that give you a bit more insight into how the two teams might interact.
For example, if DC has the worst defense and San Jose the best offense, you might find (using past data) that you actually move the "average" towards San Jose slightly. Or increase the standard deviation. Or something.
Then you just plug the estimated average goals and estimated standard deviations into Crystal Ball and run trials to see what the range of results is.
|
My understanding is that the Poisson distribution is a one-parameter distribution, with the variance equal to the mean. As a practical matter, this means it is already "built in to the distribution" that teams that score more goals will also have more variability in the number of goals scored. Of course, I'm not a statistician, but maybe one of the statisticians on here can give you an expert opinion on your idea.
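A quick numerical check of the "variance equals the mean" property, as a sketch only. The helper names and the cutoff `max_k` are assumptions for illustration; the rates are the 1.23 and 1.53 figures already mentioned in the thread.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def mean_and_variance(lam, max_k=60):
    """Compute mean and variance directly from the pmf, truncated at max_k goals."""
    ks = range(max_k + 1)
    mean = sum(k * poisson_pmf(k, lam) for k in ks)
    variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
    return mean, variance

for lam in (1.23, 1.53):
    mean, variance = mean_and_variance(lam)
    print(f"rate {lam}: mean {mean:.4f}, variance {variance:.4f}")
```

Both numbers come back equal to the rate, so the only way to widen the spread in a pure Poisson model is to raise the scoring rate itself.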
|
|
|
02 Jan 2004, 02:05 PM
|
#9
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: Phoenix
|
Re: Re: Re: Re: Question about use of Poisson distribution
Quote:
Originally posted by beineke
Last season, 24 of 161 games ended in 1-1 draws -- that's 15%, definitely in the right ballpark. In many cases, the Poisson approximation is pretty good. Then again, we should probably note that the 2003 Quakes were not a very Poisson-like team. They played a total of 34 games, in which 100 goals were scored.
Using the Poisson model, we'd conclude that they only have a 7.8% chance of having 5 or more goals in a game. Over the course of the season, we would expect 2.65 games where that many goals were scored. Instead, it happened to the Quakes 8 times.
Another way to put this is that 50% of the goals (50 of 100) were scored in only 23.5% of their games (8 of 34). When goals came, they came in bunches.
|
If you use MLS regular season average home and away goals (rather than just SJ and DC) then the predicted probability of a 1-1 draw is 11.4%, though your point remains valid.
I wonder what the probability is of seeing 8 vs. the expected number of 3 games with 5 or more goals. If you go back to JG's post at the end of the season, the Poisson distribution still predicted SJ's points pretty accurately:
Team      GF  GA  Pts  PrPts  Diff
San Jose  45  35   51  48.08  +2.92
As unlikely as they may seem, perhaps the 5 goal outbursts are still just random variation?
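One way to put a number on that question is a binomial tail, sketched below. It assumes the 34 games are independent and that each has the same chance of producing 5 or more goals; the per-game chance used is the 7.8% figure quoted above. The function name `binom_tail` is made up for illustration.

```python
import math

def binom_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

games = 34
p_game = 0.078  # per-game chance of 5+ goals, as quoted in the thread

expected_games = games * p_game
p_eight_or_more = binom_tail(games, p_game, 8)

print(f"Expected games with 5+ goals: {expected_games:.2f}")
print(f"Chance of 8 or more such games: {p_eight_or_more:.2%}")
```

The answer is very sensitive to `p_game`, so it is worth rerunning with any other per-game estimate you trust.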
|
|
|
02 Jan 2004, 02:40 PM
|
#10
|
|
BigSoccer Member+
Join Date: Feb 2002
Location: Phoenix
|
Re: Re: Re: Re: Question about use of Poisson distribution
Quote:
Originally posted by beineke
Then again, we should probably note that the 2003 Quakes were not a very Poisson-like team. They played a total of 34 games, in which 100 goals were scored.
Using the Poisson model, we'd conclude that they only have a 7.8% chance of having 5 or more goals in a game. Over the course of the season, we would expect 2.65 games where that many goals were scored. Instead, it happened to the Quakes 8 times.
Another way to put this is that 50% of the goals (50 of 100) were scored in only 23.5% of their games (8 of 34). When goals came, they came in bunches.
|
By my calculation, if 100 goals were scored in 34 Quakes games, then the probability of 5 or more goals being scored in any one game is 17.48%, and the expected number of games with 5 or more goals is 6. That's not so different from 8. Or am I screwing something up?
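Here is a short sketch of that calculation, assuming total goals per game are Poisson with rate 100/34. The helper names are illustrative. With the unrounded rate the tail for 5 or more comes out at about 17.5%, and the tail for 6 or more comes out at about 7.8%, so the two figures in this thread may simply differ by whether exactly 5 goals is counted.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_tail(k_min, lam):
    """P(X >= k_min) for X ~ Poisson(lam)."""
    return 1 - sum(poisson_pmf(k, lam) for k in range(k_min))

games = 34
rate = 100 / games  # total goals per Quakes game, both teams combined

p_five_plus = poisson_tail(5, rate)
p_six_plus = poisson_tail(6, rate)

print(f"P(5 or more goals) = {p_five_plus:.1%}, expected games = {games * p_five_plus:.2f}")
print(f"P(6 or more goals) = {p_six_plus:.1%}, expected games = {games * p_six_plus:.2f}")
```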
|
|