Comments and forum for a non-linear regression on the Crew (SAS)

Discussion in 'Statistics and Analysis' started by taylor, Mar 1, 2004.

  1. ChrisE

    ChrisE Member

    Jul 1, 2002
    Brooklyn
    Club:
    --other--
    Nat'l Team:
    American Samoa
    Taylor, can you explain to me what your population variable means?

    edit: neat results, by the way.
     
  2. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany

    It's the population of the Columbus metro area per year. Once finals die down I will explain some more.
     
  3. numerista

    numerista New Member

    Mar 21, 2004
    A few thoughts ...
    1) It's hard to interpret "won last game" as a predictor without also including a measure of the team's overall strength. Is the real conclusion that good Crew teams draw more than bad ones, or do winning streaks induce attendance (or both)?
    2) Rather than (inaccurately) claiming to be "working with a 7% significance level," you should really just state that the Spanish-language TV coefficient came out to be borderline significant.
    3) Kenn (kenn.com) has shown a huge weekday/weeknight effect. IIRC, he has also shown that holidays and opening day tend to have a large impact on attendance. These are easy predictors to collect, and because they aren't in the model, it's hard to place much faith in the results.
    4) It would be interesting to see the same analysis carried out on logged attendance. The model would then be interpretable in terms of % change, and you would probably come closer to satisfying your underlying normality assumptions.
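
    To illustrate point 4: in a model fit to logged attendance, a coefficient b on a predictor corresponds to a change of about 100*(exp(b) - 1) percent in attendance per unit of that predictor. A minimal Python sketch follows; the function name and the coefficient value 0.10 are made up for illustration and do not come from Taylor's results.

```python
import math

def pct_change_from_log_coef(b: float) -> float:
    """Percent change in the outcome per one-unit increase in a predictor,
    given coefficient b from a regression on log(outcome)."""
    return (math.exp(b) - 1.0) * 100.0

# Hypothetical coefficient, NOT taken from the Crew model.
example_coef = 0.10
example_pct = pct_change_from_log_coef(example_coef)
print(f"coef {example_coef:+.2f} -> {example_pct:+.1f}% attendance")
```

    For small coefficients the percent change is close to 100*b, which is why logged models are often read directly in % terms.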
     
  4. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    First of all I made a LABELING MISTAKE in my rush. THE "Wonlastgame" VARIABLE IS ACTUALLY A BINARY WEEKEND VARIABLE. To be clear, the "wonlastgame" variable is actually whether the Crew played on the weekend or not.
    I was excited about finally getting rid of the autocorrelation and made a silly mistake. Forgive me, econometrics gods. I apologize for my hasty labeling. When I copied the results over onto the BS page, I renamed the variables in a way I thought people could easily understand. I obviously overestimated my own abilities in remembering my labels.

    Also, IMHO, there is no accurate or inaccurate level of significance. There are standards, but one level over another is not "inaccurate". Particularly when dealing with a one tailed test it is quite plausible to consider spantv significant. But then again, there is no correct level of significance, so interpret as you will.

    Second, as I clearly stated, the model is underfit, so take your salt.

    Third, I want to correct my R square value statement. That was an R square using a transformed matrix value for attendance. I've learned that I can't do that. Under a linear reg, the r square is around 25%.

    Fourth, once I get some time, I will try to log everything. The problem is that SAS is esoteric and labor intensive. Both of which I don't have a lot of right now.
     
  5. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany

    No the Crew were not helpful at all. I think they thought I was working for the players union or something. That or they really didn't know what the hell I was trying to do.

    ps, sorry for not responding sooner. I had to do a couple of papers over the past week.
     
  6. numerista

    numerista New Member

    Mar 21, 2004
    For the record, Taylor ran a statistical test and got back a p-value of 0.0694. Then he/she changed the significance cut-off from 0.05 to 0.07, just so that he/she could claim that the result was below the cutoff.

    Here's the problem:
    If you're willing to change your cutoff after the fact, you invalidate the probabilistic formulation that gives a significance level its meaning. For that reason, I use the word "inaccurate."

    In any case, the good news is that the weekend/weekday predictor is in the model. It'd really be nice to get seasonal effects (which Kenn has documented) in there, as well as opening day and holidays.
     
  7. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    I agree with numerista. The "industry standard" for significance is .05--saying you're working with a significance level of .07, while technically not incorrect, is odd, and takes away from the impact of your other significant results. Just call the spanish-tv result "marginally significant" and move on.

    Another question--are these all one-tailed tests? If so, why?
     
  8. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
     
  9. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    Ok, Num, after reading your message again, I may have come across a bit too strong. Sorry about that.
     
  10. numerista

    numerista New Member

    Mar 21, 2004
    It is incorrect -- a significance level is defined as the probability one would reject in the case where there is no signal. By claiming to be working with a significance level of .07, Taylor is claiming that he/she would not have rejected if a p-value had come out to be .0704. Since Taylor admits having chosen a cutoff after seeing the results, it doesn't make sense to talk about a significance level at all.

    This is a technical point, but it's also the reason why it's important to have an industry standard, regardless of what particular standard is used.
     
  11. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    PEOPLE! PLEASE? We're all here for the love of numbers, and horribly esoteric analysis of them! Don't fight! We're all in this together!!! Lol you guys are having a pretty heated disagreement and believe me when I say, I have absolutely no idea what you're talking about. This thread has turned extremely weird.... Kudos to the both of you though for mixing it up on a topic like this.
     
  12. numerista

    numerista New Member

    Mar 21, 2004
    :)

    (For the sake of clarity, let me state that I'm not here to fight with anybody, and that I try to avoid making posts with that intent. But as an expert in statistics, I am prepared to be finicky about studies being done reasonably well. There's no point in breaking out heavy statistical machinery if it isn't used proficiently.)
     
  13. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    It's just cracking me up. As I just pmed you, when we started this forum it was with the idea of taking sabermetric principles and statistical analysis and using it to learn more about American soccer. The stuff you guys are intellectually sparring about is just so beyond anything I'd envisioned. It's cracking me up but is oddly validating. I'm wondering though, for those of us with a slightly, shall we say, 'less than proficient grasp of stats and economics', would the two of you mind explaining some of this stuff further in English? If you don't feel the want or need to break it down for my 5th grade grasp of this stuff then that's totally cool. It's just far far beyond me :)
     
  14. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    Maxim, I am in the middle of my finals period. After the 8th... well after 10th (I will be drinking heavily for my sister's graduation party)
    I/we/whoever can spend some time extrapolating this stuff. Part of the problem is that this stuff is interpretative. There are different camps when it comes to this stuff. Some are more risk averse than others and that's where the debate begins.

    Finally, I learned a long time ago not to fully trust statisticians'/econometricians' presentations because of how much manipulation can occur.

    Any person claiming their analysis to be a perfect representation of data should raise eyebrows.

    Hence, my attempted humble "salt comment".

    Now, to Num.


    First, the t-test tests the null hypothesis that the estimate equals zero. I have no idea what you mean when you say "signal".

    I tried to be nice, but are you trolling this? If so, this is an unprecedented bigsoccer moment. Seriously.

    I have a year of SAS under my belt and would never want to call myself an "expert", hell even after five years.

    I still feel you haven't substantively made a point. As I said before, the .1 level is also an "industry standard" level, is it not? Check yes or no. If no, please cite why the .1 level is an unacceptable standard (can you include several citations), because I can cite several too at the .1 level, albeit after the 12th.

    To be clear one more time, I do understand and agree that a majority of academia uses a .05 level. That, however, does not make a substantive point. Academia still applies a .1 and less level test. If I were to publish something, I would probably use a .05, but it really depends on how risk averse one is (i.e. type 1 or type 2 error), hence a .1 or even greater could be applicable, depending on the study.

    Just because the .05 level is used a majority of the time does not, however, mean one cannot use other levels; to imply otherwise is specious.

    You clearly feel some people really should only work with a .05 level, but the degree to which you are harping on this, if indeed you aren't trolling, is what I really don't understand. People use other levels; as an "expert", you undoubtedly know this.

    I'm sure as an "expert" you recognize this stuff to be far more a matter of interpretation than a binary answer, so add some salt to make it more digestible for you.

    Even following your "expert" opinion, is it not reasonable to apply a one tailed test and have a 3.49 value, therefore satisfying your .05 level?

    Just so we can try to settle this, can you please answer "yes" or "no" to a couple of questions.

    Does the "industry" publish .1 levels of sig, therefore implying a .07 is also an acceptable level? Remember the cyber soccer Gods still hold honesty to be a virtue.

    Will you accept a one tailed test? If yes, does it not therefore make the .07 result significant under a .05 test? If not, please explain.

    The only real substantive critique SO FAR (because, again, I am not finished with the project) is the spantv standard error.

    Num, as the "expert" would you like to comment on why the SE would cause it to be "marginal"?

    The reason I have put "expert" in quotes is because you haven't, imho, demonstrated anything substantively, other than saying because you are an industry "expert" I am misinforming people by using a .07 level.

    To be clear, if I am wrong and you are right on anything, I have no problem stating it. I would prefer a collective effort over a competitive one any day.
     
  15. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    I forgot one more point. To me, and implicitly the world, using any particular level is discretionary. We make these discretionary decisions every day.

    One can use whatever level they want; it merely depends on the risk aversion. E.g., in terms of industry, the EPA uses a different level than the OMB.

    For example on a personal level Maxim, if it comes to risk aversion to the life of your child, you would be as risk averse as possible. Therefore, using a .000000000001 level would be the only way you would risk your child's life. You would want this level to ensure yourself that there was as little chance as possible that your child could die. A .05 level would be far too great a risk level for you, because that would mean 1 out of 20 chances your child would die.
    On the other hand, if the test was to offer a life saving vaccine to your terminally ill child (assuming no cost), you would not care what the failure rate was, hence using .5 level or less, in the hope of saving your child (i.e. 1 out of 2 failure).

    I hope this has illustrated why it is really an interpretational issue when it comes to the t-test.
     
  16. ChrisE

    ChrisE Member

    Jul 1, 2002
    Brooklyn
    Club:
    --other--
    Nat'l Team:
    American Samoa
    In order to try to break the tension here, as finals have clearly gotten to numerista and taylor (yet I remain blissfully unconcerned with my paper due in two hours), let me try a comically bad explanation of what they're discussing. Probably my huge errors will make these two quit caring about the pretty insignificant problem they're having.

    The idea behind a regression, basically, is to use some variables to get an equation with which you can predict some other variable, in this case attendance. So, in a non-soccer context, you might want to be able to guess a person's weight from his height. So you do several statistical computations (mostly involving expected values, I think, but it's not important), and you get an equation. I did the height/weight thing for MLS players, and the equation I get is y = -177.2 + (4.89x). So, if you've got a guy who's 60 inches tall, you get his expected weight as -177.2 + 4.89(60) = 116.2 lbs. This relationship is really strong (r=.796), so 63% (r^2) of the variation in weight is accounted for by height. If you've got a guy of average (for soccer) height, 71 inches, you'd expect him to weigh (4.89*71) - 177.2 = 169.99 lbs. MLS players who are 5-11 have actually averaged 169.1 lbs.
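
    A rough sketch of that arithmetic in Python, for anyone who wants to plug in other heights. The intercept, slope, and r are the ones quoted in this post; the function and variable names are only illustrative.

```python
# Coefficients quoted in the post for MLS players
# (height in inches, weight in pounds).
INTERCEPT = -177.2
SLOPE = 4.89
R = 0.796  # correlation between height and weight

def predicted_weight(height_in: float) -> float:
    """Expected weight in lbs from the fitted line y = INTERCEPT + SLOPE * x."""
    return INTERCEPT + SLOPE * height_in

# Share of the variation in weight accounted for by height.
r_squared = R ** 2

for h in (60, 71):
    print(f"{h} in -> {predicted_weight(h):.2f} lbs")
print(f"r^2 = {r_squared:.3f}")
```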


    What taylor is doing is trying to predict Columbus Crew attendance using way more variables. In the last case, -177.2 lbs was the intercept, it's sort of a base, and is going to be the same regardless of height. In the Crew's case, the intercept is -38411 people, but that's not really important. And I may be misunderstanding.

    What is important are all the other variables; so, for price, taylor's model says that a $1 increase in price reduces attendance by 2296 fans. You can do this in whichever direction and by whichever amount you want, so we can interpret it also as saying that dropping prices a dollar would increase attendance by 2296 fans, and increasing prices by $.50 would lose the Crew 1148 fans/game. Likewise, he gets that televising a game on Spanish tv causes a mean drop of 1696 fans/game. However, numerista's problem with this is that the variance (so some games may have a loss of 3000, and some a loss of 300) is too high, so it's possible (though pretty unlikely) that this can be accounted for just by chance fluctuations in attendance; the industry standard for significance is a little bit below what taylor got for Spanish television, and I think it really bothers numerista that he decided to lower his standard for significance after he saw his numbers.

    In taylor's defense, if you expect that Spanish television is going to decrease attendance (as opposed to saying it will change it, but you don't know in which direction), spanish television becomes significant even by numerista's strict standards.
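
    A small Python sketch of that point, assuming the 0.0694 p-value quoted earlier in the thread is a two-sided value from a symmetric test such as a t-test (that is an assumption here, as are the function and variable names). For a directional hypothesis, the one-sided p-value is half the two-sided one when the estimate lands on the expected side.

```python
def one_sided_p(two_sided_p: float, estimate: float, expect_negative: bool = True) -> float:
    """Convert a two-sided p-value from a symmetric test to a one-sided
    p-value for a directional hypothesis."""
    in_expected_direction = (estimate < 0) if expect_negative else (estimate > 0)
    half = two_sided_p / 2.0
    # If the estimate points the wrong way, the one-sided p-value is large.
    return half if in_expected_direction else 1.0 - half

SPANTV_COEF = -1696          # fans per game, from the thread
SPANTV_P_TWO_SIDED = 0.0694  # from the thread; assumed to be two-sided

p_one = one_sided_p(SPANTV_P_TWO_SIDED, SPANTV_COEF)
print(f"one-sided p = {p_one:.4f}, below 0.05: {p_one < 0.05}")
```

    This only works as a defense if the direction (Spanish tv lowers attendance) was hypothesized before looking at the results.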

    Just explaining taylor's results a little bit more:

    Crew Stadium caused an average increase of 6459 fans.
    Televising a game on Spanish tv causes (maybe) a drop of 1696 fans.
    Televising a game in English doesn't affect attendance.
    Your opponent has no effect on attendance.
    The population of Columbus does not affect attendance.
    Increasing prices by $1 reduces attendance by 2296, and vice versa.
    Games played on weekends average 3146 more fans than games on weekdays.
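
    The list above can be turned into a quick "what if" calculator. This is only a sketch in Python: the coefficients are the ones listed here, the names are made up, and it assumes the stadium, Spanish tv, and weekend variables are coded 0/1, which the thread does not confirm.

```python
# Coefficients as summarized in the post (fans per game).
EFFECTS = {
    "crew_stadium": 6459,  # game played at Crew Stadium (assumed 0/1 dummy)
    "spanish_tv": -1696,   # televised on Spanish tv (assumed 0/1 dummy)
    "price": -2296,        # per $1 increase in ticket price
    "weekend": 3146,       # weekend game (assumed 0/1 dummy)
}

def attendance_change(**changes: float) -> float:
    """Predicted change in attendance for the given changes in predictors,
    holding everything else in the model fixed."""
    return sum(EFFECTS[name] * delta for name, delta in changes.items())

# A 50-cent price increase.
print(attendance_change(price=0.50))
# A $1 price cut combined with moving a weekday game to the weekend.
print(attendance_change(price=-1.0, weekend=1))
```

    It leaves out the intercept and the variables reported as having no effect, so it gives changes in attendance, not attendance itself.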

    This model accounts for 27% of attendance variance, so you're not going to get hugely precise results from it. Other factors - weather, Andrulis still being around, stuff nobody's thought of, simple random variation - have a huge effect on what the attendance for any individual game is. However, the more games you use this for, the closer the model's predicted mean should come to the actual attendance mean.

    I hope I was clear and didn't make any huge mistakes. Additionally, I hope the statisticians don't mind a rank amateur moving in on their territory, and I definitely hope they'll correct everything I got wrong.
     
  17. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    Three quick things: it's assuming a linear relationship; I never decided on a "standard", nor did I think a range between .05 and .1 to be problematic; and I am no expert.

    Thanks a lot for clearing some stuff up.
     
  18. numerista

    numerista New Member

    Mar 21, 2004
    Taylor, this isn't such a difficult point to understand, but you're throwing around so much attitude that you've missed it entirely. I've said it, and ur_land has said it, too -- regardless of what threshold you choose, you need to choose it before you run your analysis.

    Note that the steps in this link have an order to them ...
    http://rimarcik.com/navigator-en/hypotezy.htm
     
  19. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany

    OMG.
    Umm no disrespect, but I was expecting more than some Slovakian (I think) website when I said citation, e.g. D. Gujarati. Second, even using your own Slovakian website, the site itself says one can "conventionally" use a .05. I don't know how you define conventionally or what the word is in Slovakian, but the word implicitly means you CAN use other "expert standards", so again why such grief?

    Also, for clarity, I was hoping you would respond to the yes/no questions, just so we can be perfectly clear. If you imply that I am misleading people or am operating incorrectly, you need to back it up. As I said before, I am no expert, and it is quite possible that I am missing something, but I still haven't heard anything substantive yet.

    I am no longer clear if your problem is methodology or initial findings. If you feel I should have set a level before I started, that's entirely fine. I respectfully disagree. I am in a camp that believes in interpretation. I don't say this out of self-interest, merely because these are estimations and require a degree of tolerance when reading results, particularly underfit ones.

    Perhaps it is due to my insufficiencies in this stuff, but I frankly don't understand your case. The findings are acceptable under your own website.
     
  20. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    ChrisE,

    This was a really good description of a regression. You described numerista's objection pretty well, but I'll go into it a little more.

    I think Taylor and Numerista are arguing past each other about different points. I think Numerista knows and acknowledges that .05 is an arbitrary cutoff point that is the typical threshold for statistical significance in academia. Taylor, I don't think that Numerista cares if you use .05, .01, .07, or .54321. He just wants you to pick a significance level beforehand, because this is how the logic of statistical testing is supposed to work.

    I think Taylor, as he says in his previous post, is operating more from the perspective of how people actually do stats. You run your test and see what you get. Most of the time people implicitly are using the .05 criterion when they do this. For this reason, I really don't think that you should say they are significant at the .07 level; just use .05 and call the spanish TV result marginally significant. Substantively, it takes nothing away from the impact of your results and it prevents red warning lights from going off in the heads of people like numerista.

    For those that don't know much about statistics, this is a contentious issue because you can never prove something with 100% certainty when you use inferential statistical tests. You can be 95%, 99%, 99.9999999.....999% certain that your result isn't what's called a type-one error (or false positive), but there is still some doubt. Setting the significance level tells you what level your chance of having a false positive is set at. When the level is set at .05 and there is no meaningful relationship between the variables, there is a 5% (or 1 in 20) chance of getting a significant result purely through chance. However, for reasons that I won't go into here, this logic is only true if you set your level before you do your test. Which is why numerista is annoyed.
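
    For anyone who wants to see the false-positive logic in action, here is a minimal simulation sketch in Python. It assumes a simple two-sided z-test on pure noise (not taylor's actual SAS model), and the names, trial count, and seed are arbitrary choices for illustration. With a cutoff fixed in advance, the share of "significant" results on no-effect data should land close to the cutoff itself.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))

def false_positive_rate(alpha: float, trials: int = 20000, seed: int = 0) -> float:
    """Share of simulated no-effect studies that a cutoff fixed in advance
    at alpha would call significant."""
    rng = random.Random(seed)
    # Under the null hypothesis the test statistic is pure noise.
    hits = sum(two_sided_p(rng.gauss(0.0, 1.0)) < alpha for _ in range(trials))
    return hits / trials

for alpha in (0.05, 0.07, 0.10):
    print(f"alpha={alpha:.2f}: false positive rate ~ {false_positive_rate(alpha):.3f}")
```

    A cutoff that gets moved after the fact to sit just above whatever p-value came out behaves like the loosest cutoff the analyst would have been willing to accept, not like the one that ends up being reported.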


    The intercept is what the value of the predicted variable would be when the predictors equal zero (it's just like the y-intercept when you graph a line: y=mx+b). So the predicted weight of an MLS player that was 0 inches tall would be -177.2 pounds. Likewise, in Taylor's data, the predicted attendance is -38411 when price is $0, when the population of Columbus is 0, and when the values of newstadium, spanish TV, english TV, opponent, and the weekend variable are all zero (are these dummy coded? what are the codes?). So, yeah, not very meaningful.

    Actually, if I understand Taylor correctly, these all already are one-tailed tests, which I think is a bigger sticking point than the p-value dispute. When you test a hypothesis, the test can either be two tailed (or non-directional, i.e., teams in white nike jerseys have a different average ability than teams in green nike jerseys) or the test can be one tailed (or directional, i.e., teams with white nike jerseys have a higher average ability than teams in green nike jerseys). If the test is one-tailed, your criteria are looser-- the p-value is half what it would be for a two-tailed test, so a smaller test statistic reaches the same level of significance.

    The caveat to doing one-tailed tests is that you need to have some really really good a priori (before the fact) hypothesis to justify doing a one-tailed test. I'm assuming Taylor has those, even though he didn't express them, and if he was actually going to publish this, reviewers would expect him to justify the use of a one-tailed test with those hypotheses. YMMV, but in some academic fields (like social psychology and neuroscience, which are my fields) one-tailed tests are generally frowned upon. Not to say that a one-tailed test was wrong here--it's just that I think Taylor should have said up front that these tests were one tailed and then given his justifications for doing so.

    Hope this helps clear things up a bit.....
     
  21. mpruitt

    mpruitt Member

    Feb 11, 2002
    E. Somerville
    Club:
    New England Revolution
    Thanks for the clarification guys. Good luck with your stuff taylor. Seems like you've taken on a pretty big task here and done a commendable job. We all should be open to aggressive peer review with whatever we do on here; I'm just getting a kick out of this because it's so relatively obscure. Finally we see the type of debate on this forum that's normally reserved for promotion/relegation, Chivas USA, and whether Chris Armas sucks on other parts of bigsoccer!
     
  22. numerista

    numerista New Member

    Mar 21, 2004
    Very articulate post, ur_land ... I've singled this out because it's pretty darn close to my initial comment, which didn't come with the elaborate explanation.

    My other primary reason for being concerned about Taylor's results is that there are known important factors (e.g. seasonality, special events) that are easily measured and are known to impact attendance but have not been taken into account. As such, some of the "effects" may be due to bogus correlations. (Ironically, this is similar to your objection to ChrisE's offside study, although that one didn't come accompanied by such grandiose claims.)
     
  23. ur_land

    ur_land New Member

    Aug 1, 2002
    Boulder, CO
    Thanks for the props--it's always nice to be called articulate.

    The spurious correlations issue is an important one, and it would be nice for taylor to go back and try to code for special events and other issues (by seasonality, do you mean spring/summer/fall or do you mean changes from season 1 to season 10?). Then again, his model explains 27% of the variance--perhaps the rest of the variance is eaten up by $1 brat night and USMNT doubleheaders.

    However, despite all of the gripes that we have, I do want to say thank you to Taylor for doing this and posting it publicly. None of my criticisms are made meanspiritedly, and I think it's great that we're starting to get a group of people interested in applying statistical analyses to soccer. Now if I can just figure out a way to get paid for doing this (anyone in MLS need an ABD grad student that will have his sheepskin by the end of the summer?).
     
  24. taylor

    taylor Member+

    Jun 9, 2000
    Fav team: FC CARL ZEISS JENA
    Club:
    --other--
    Nat'l Team:
    Germany
    thanks Ur_land and co.

    First, I was only using a one tailed test on spantv because of all the informal info on the negative impact of tv on fan attendance. The problem with spantv, imo, is the se. But if you want to use a two tailed test at .05, then sure it is marginal. I again am completely open to interpretations. Also, assuming the spantv to be significant, (at whatever level you want) the se is so large, relative to the mean, that it becomes "marginal".

    Second, if I have not clearly stated that the model was underfit and that some variables were obviously missing, I apologize. Although I do take issue with people misrepresenting information and statements, I will blissfully accept my own imperfections to say I don't know a lot. I only have a year's worth.

    Third, if you don't like the initial methodology that is perfectly fine. I am open to your interpretation, but respect that I am in a different camp. The EPA, OMB, Moody, and academia use different levels depending on the study. They define the industry so I feel quite comfortable using a different level than .05 (and INDEED, they use several levels in their analysis). I believe risk should be assessed looking at the entire model, not necessarily before one starts the model.

    Finally, you need to realize that I have a time and information constraint. I tried contacting the Crew several times, they never called back. I've told people that if they have the desire to get more data, I will gladly cite them and incorporate it into the model. I'm not getting paid for this and I have no intention to publish anything in stats (I'm a political scientist at heart). I do however hope to pass this semester.

    So thanks again Chris and Ur and Maxim.
     
  25. numerista

    numerista New Member

    Mar 21, 2004
    I meant spring/summer/fall (from Kenn's data, it looks like conceivably a quadratic time-of-year variable) ... I expect that the true underlying cause for this is that people are away on vacation in the summer. But you're also right to imply that the age of the league is something else that Kenn has shown to be very important ... in particular, the year-one novelty effect.
     
