Dallas are a bunch of hacks! FC/FS
mpruitt
09 Feb 2004, 12:08 AM
This perhaps isn't the best way to answer your question, but I happen to have all-time* points per game and all-time cards per game.
Team             PPG            Team             C/GP
Los Angeles      2.115384615    N.E. Revolution  2
D.C. United      1.868131868    Los Angeles      1.950549451
Dallas Burn      1.785714286    Chicago Fire     1.901869159
Chicago Fire     1.728476821    MetroStars       1.846511628
Kansas City      1.655737705    Columbus Crew    1.841059603
Tampa Bay        1.64516129     Colorado Rapids  1.807692308
San Jose         1.626373626    Miami Fusion     1.729032258
Columbus Crew    1.462616822    D.C. United      1.697802198
Miami Fusion     1.459016393    Kansas City      1.677570093
Colorado Rapids  1.364485981    Dallas Burn      1.675824176
MetroStars       1.313084112    Tampa Bay        1.508196721
N.E. Revolution  1.251162791    San Jose         1.495327103
So over the course of MLS history it doesn't appear that there's much of a surface connection. Part of the problem, however, is that I'm not much of a statistician. This stuff is all pretty basic math, and I fear that my simple rankings are just flat-out missing something. I can go back and put in the stats for cards per season and run the numbers. I also have some more all-time stats related to cards, but none of it is all that interesting.
*Note again: "all-time" is 1996-2002.
**Could one of you smart guys maybe suggest a better statistical manner for trying to map some of these trends?
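For what it's worth, one simple way to put a number on that "surface connection" is a correlation coefficient across the twelve teams. Below is a minimal Python sketch, assuming the table values above (rounded to three places here). The function names are made up for illustration, and with only twelve all-time averages the result should be read loosely.

```python
from math import sqrt

# team: (points per game, cards per game), copied from the all-time table
# in this post and rounded to three decimal places.
ALL_TIME = {
    "Los Angeles": (2.115, 1.951),
    "D.C. United": (1.868, 1.698),
    "Dallas Burn": (1.786, 1.676),
    "Chicago Fire": (1.728, 1.902),
    "Kansas City": (1.656, 1.678),
    "Tampa Bay": (1.645, 1.508),
    "San Jose": (1.626, 1.495),
    "Columbus Crew": (1.463, 1.841),
    "Miami Fusion": (1.459, 1.729),
    "Colorado Rapids": (1.364, 1.808),
    "MetroStars": (1.313, 1.847),
    "N.E. Revolution": (1.251, 2.000),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties, which holds here."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


ppg = [p for p, _ in ALL_TIME.values()]
cpg = [c for _, c in ALL_TIME.values()]

r_pearson = pearson(ppg, cpg)
# Spearman's rank correlation is just Pearson's r computed on the ranks.
r_spearman = pearson(ranks(ppg), ranks(cpg))

print(f"Pearson r:  {r_pearson:.3f}")
print(f"Spearman r: {r_spearman:.3f}")
```

A value near 0 would back up the "not much of a connection" reading; a value toward -1 or +1 would suggest the rankings are hiding a trend.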
beineke
09 Feb 2004, 10:36 AM
Originally posted by maxim-1
**Could one of you smart guys maybe suggest a better statistical manner for trying to map some of these trends?
1) It's worth the labor of producing year-by-year totals. By averaging across seasons, you're probably hurting yourself.
2) Once you have all the individual season data, make a scatterplot (x-axis: win pct, y-axis: cards per game) and see if a trend is visible.
gotyourback
09 Feb 2004, 02:57 PM
Originally posted by beineke
1) It's worth the labor of producing year-by-year totals. By averaging across seasons, you're probably hurting yourself.
2) Once you have all the individual season data, make a scatterplot (x-axis: win pct, y-axis: cards per game) and see if a trend is visible.
This sounds like it would do the job very nicely. Thanks.
ur_land
09 Feb 2004, 03:40 PM
One thing you might want to consider (if you go to the trouble of doing a full-on statistical analysis) is that the data for a team across years have some dependency.
Because teams have many of the same players (and coaches) from year to year, their fouls suffered and caused from year to year will not be independent observations.
And since independent observations are a major requirement for most inferential stats, this can cause odd results (even for things like correlations).
A within-subjects regression (taking team as the unit of analysis) would take care of this.
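To make the idea concrete, here is a rough Python sketch of the simplest "within" approach: subtract each team's own averages before fitting a single slope, so that stable team-to-team differences drop out. The rows are hypothetical numbers invented for illustration (not real MLS data), the helper names are my own, and a proper within-subjects analysis would also need standard errors and degrees of freedom, which this leaves out.

```python
from collections import defaultdict

# Hypothetical (team, season, FC/FS, points per game) rows, invented purely
# for illustration; they are NOT real MLS figures.
ROWS = [
    ("Team A", 1996, 1.10, 1.2),
    ("Team A", 1997, 1.00, 1.5),
    ("Team A", 1998, 0.90, 1.8),
    ("Team B", 1996, 1.30, 1.6),
    ("Team B", 1997, 1.20, 1.9),
    ("Team B", 1998, 1.10, 2.2),
]


def center(pairs):
    """Subtract the mean of x and the mean of y from every (x, y) pair."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    return [(x - mx, y - my) for x, y in pairs]


def slope(centered_pairs):
    """Least-squares slope of y on x for data that is already mean-centered."""
    sxy = sum(x * y for x, y in centered_pairs)
    sxx = sum(x * x for x, _ in centered_pairs)
    return sxy / sxx


by_team = defaultdict(list)
for team, _season, fc_fs, ppg in ROWS:
    by_team[team].append((fc_fs, ppg))

# Pooled: ignore team identity and center everything on the grand mean.
pooled_slope = slope(center([p for pairs in by_team.values() for p in pairs]))

# Within-team: center each team on its own means, then fit one common slope.
within_pairs = [p for pairs in by_team.values() for p in center(pairs)]
within_slope = slope(within_pairs)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"within slope: {within_slope:.2f}")
```

The made-up rows are chosen so that the pooled slope is essentially zero while the within-team slope is clearly negative, which is the kind of odd result that lumping all team-seasons together can produce.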
Heck, I'd be happy to do it quickly. Can someone post the GF/GA or winning % data (or a link to it) and the card data?
Thanks!
mpruitt
09 Feb 2004, 03:56 PM
I'm still formulating it right now. A lot of this has been a learning process for me, so my original data was a bit sloppy in formatting. I'm going through right now and formulating it in a concise way. It'd be done by now, but I'm having trouble taking Shoot Out Wins into account... Also, some of the formatting is a bit tedious because of expansion and contraction. As soon as I'm done with it I'll PM you. I'd much appreciate your help. Also, if you wouldn't mind walking me through some of the analytical steps, I'd much appreciate it, even if it's just what the most proper function is and how to do it in Excel.
Edit: Link for the file I'm now using.
http://f2.pg.briefcase.yahoo.com/bc/maxim0ne/lst?.dir=/My+Documents&.view=l
ChrisE
09 Feb 2004, 11:07 PM
Originally posted by maxim-1
I'm still formulating it right now. A lot of this has been a learning process for me, so my original data was a bit sloppy in formatting. I'm going through right now and formulating it in a concise way. It'd be done by now, but I'm having trouble taking Shoot Out Wins into account... Also, some of the formatting is a bit tedious because of expansion and contraction. As soon as I'm done with it I'll PM you. I'd much appreciate your help. Also, if you wouldn't mind walking me through some of the analytical steps, I'd much appreciate it, even if it's just what the most proper function is and how to do it in Excel.
Edit: Link for the file I'm now using.
http://f2.pg.briefcase.yahoo.com/bc/maxim0ne/lst?.dir=/My+Documents&.view=l
I don't know anything about linear regressions, Maxim, but if I could make a polite suggestion: in Excel, if you go to Format > Cells > Number > Decimal places, you can easily round off those numbers. I find I get distracted when they're that long, and I don't really think it's that important how many hundred-thousandths of fouls committed Dallas had. :)
(also, the briefcase thing still doesn't work)
mpruitt
09 Feb 2004, 11:16 PM
Heh. Thanks for that Excel tip; believe me, I was looking for it before. Yeah, I think it only works if you have a Yahoo ID. I'm working on reorganizing my data by teams now, so when I'm done with that I'll send you the file if you like. I'll say something though: the formatting of this kind of stuff is made so much more difficult by MLS's freaking inconsistency. The guys over at SABR don't have to work around contracted teams and shoot out wins, or varying point systems.
ChrisE
11 Feb 2004, 02:29 AM
Maxim, it might be (a lot) easier to just use goal differential instead of trying to work out some kind of points system.
mpruitt
11 Feb 2004, 11:36 AM
Well, the reason I was using PTS/GP was that the ultimate object of soccer is obviously to get points. GF/GA would be interesting to do, but I think PTS/GP is a little more accurate in terms of teams' success.
It's easy enough to go back and recalibrate the pre-2000 point numbers: just look at how many SOW they had to figure out the number of total ties and adjust from there. The better question, however, is whether rearranging the data in that way does it a disservice. Before 2000 that's not the way that MLS operated. I'm trying to look at all-time numbers and get consistent results from the data by taking away SOW, but in the process it may affect the true relative success of a team pre-2000. I wonder what you think about that? When doing any kind of historical analysis, should you just let the points-per-game data stay inconsistent by respecting the infamous Shoot Out Wins?
Also, another way to measure this which I didn't include is team winning percentage. Honestly, I've got to look at how they calculate that in MLS. It's a league statistic that has always struck me as odd, looking at winning percentages when you're incorporating ties. I imagine the way to do it would be (3W+T)/(3GP), so you have the number of points gained in a season divided by the number of possible ones. If anyone knows otherwise, please do tell.
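The two calculations described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: every pre-2000 shootout game (won or lost) is treated as a tie worth one point, which means per-team shootout losses are needed as well as SOW, and the function names and the example season are hypothetical.

```python
def recalibrated_points(regulation_wins, shootout_wins, shootout_losses):
    """Points under a 3-1-0 system, counting every shootout game as a tie.

    Assumption: a shootout only happened when the game was level, so both
    shootout wins and shootout losses become ties worth one point each.
    """
    ties = shootout_wins + shootout_losses
    return 3 * regulation_wins + ties


def points_pct(wins, ties, games):
    """(3W + T) / (3GP): points gained as a share of points available."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (3 * wins + ties) / (3 * games)


# Hypothetical 32-game season: 14 regulation wins, 4 SOW, 3 shootout losses.
points = recalibrated_points(14, 4, 3)
pct = points_pct(wins=14, ties=4 + 3, games=32)

print(points)          # 3 * 14 + 7 = 49
print(f"{pct:.3f}")    # 49 / 96
```

Whether to apply the recalibration at all is the judgment call raised above; the code only makes the arithmetic explicit so both versions can be compared side by side.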
Right now I'm waiting to hear back from Ur_Land to see what he got by running some correlations and other stuff. I've tried to do it myself, but either I can't figure out how to put in the data correctly or I just have no understanding of what the result means. I will say, though, that when I looked at individual team histories, throwing pts/gp and fc/fs up on a scatterplot, with just about every team there was a distinct correlation in trendlines between the two. I just don't have any idea how to tell you what it is.
ChrisE
11 Feb 2004, 05:15 PM
Maxim, to avoid hijacking your thread, I tried to address your questions about winning percentage etc. in this (http://www.bigsoccer.com/forum/showthread.php?s=&postid=2132662#post2132662) thread.
ChrisE
17 Feb 2004, 10:45 PM
Did this ever get completed, Maxim?
mpruitt
18 Feb 2004, 01:16 AM
Nah, I've hit a wall on the learning curve for the statistical analysis. With some teams there's obviously a stronger correlation than with others, but it still doesn't seem to make a lot of sense. Also, when you put all the numbers together there's virtually no correlation. At the same time, when you plot things like pts/gp and fc/fs on a scatterplot for each team, there seems to be a direct connection between their two trends. So I dunno. Ur_land said he was going to crunch the numbers but never got back to me. I'll see if I can make more sense of it, or try to find a way to come up with numerical justification for what I'm seeing graphically.
beineke
18 Feb 2004, 01:43 PM
Maxim, maybe you could post the correlations (ideally, between goal% and foul%) for each team, as well as the correlation across all teams.
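In case it helps, here is a small Python sketch of that tabulation: one correlation per team plus one across all team-seasons. The season rows are hypothetical placeholders (not real MLS numbers), "goal%" is assumed to mean GF/(GF+GA) and "foul%" FC/(FC+FS), and the helper names are invented for the example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical (team, goal%, foul%) rows, one per season; placeholders only.
SEASONS = [
    ("Team A", 0.45, 0.52),
    ("Team A", 0.50, 0.50),
    ("Team A", 0.55, 0.48),
    ("Team B", 0.40, 0.47),
    ("Team B", 0.50, 0.49),
    ("Team B", 0.60, 0.48),
]


def pearson(pairs):
    """Pearson r for (x, y) pairs; None if it cannot be computed."""
    n = len(pairs)
    if n < 3:
        return None  # too few seasons to say anything
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None  # no variation in one of the columns
    return sxy / sqrt(sxx * syy)


by_team = defaultdict(list)
for team, goal_pct, foul_pct in SEASONS:
    by_team[team].append((goal_pct, foul_pct))

per_team = {team: pearson(pairs) for team, pairs in by_team.items()}
overall = pearson([(g, f) for _team, g, f in SEASONS])

for team, r in per_team.items():
    print(team, "n/a" if r is None else f"{r:+.2f}")
print("All teams", "n/a" if overall is None else f"{overall:+.2f}")
```

With only a handful of seasons per team, individual correlations will bounce around a lot, so the per-team numbers are best read next to the all-teams figure and not on their own.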
mpruitt
24 Feb 2004, 02:40 PM
Just FYI, I've been preoccupied with some other things recently, mainly job searching as well as being away from my home computer. I'll have the team correlation numbers posted in the next couple of days. Btw, any suggestions as to whether FC/FS or FC/(FC+FS) might be more valuable, and why?
beineke
24 Feb 2004, 10:02 PM
Originally posted by maxim-1
Btw, any suggestions as to whether FC/FS or FC/(FC+FS) might be more valuable, and why?
Personally, I prefer the latter. When you look at FC/FS, numbers like 0.67 and 1.50 are "opposites" of one another. That kind of thing is easier to keep track of (both mathematically and mentally) with numbers like 0.4 and 0.6.
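A quick Python check of that symmetry, as a sketch: dividing FC/(FC+FS) through by FS gives ratio / (1 + ratio), so the share can be computed straight from the FC/FS ratio. The function name is made up for illustration.

```python
def foul_share(fc_fs_ratio):
    """Convert FC/FS into FC/(FC+FS), using FC/(FC+FS) = r / (1 + r)."""
    if fc_fs_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return fc_fs_ratio / (1 + fc_fs_ratio)


low = foul_share(2 / 3)   # a team committing 2 fouls for every 3 suffered
high = foul_share(3 / 2)  # the mirror image: 3 committed for every 2 suffered

print(f"{low:.2f} {high:.2f}")               # 0.40 0.60
print(f"{0.5 - low:.2f} {high - 0.5:.2f}")   # both 0.10 away from 0.5
```

Reciprocal ratios always land the same distance either side of 0.5 on the share scale, whereas on the FC/FS scale 0.67 and 1.50 sit at unequal distances from 1.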
Good luck with the hunt ...