Edward, do you code venue in your tables? It would be interesting - especially for the older teams with different eras - to do breakouts by venue. It would possibly make some of the min/max/median deviation data more meaningful. It would also pull out some outlier games that pollute the data - Kansas City getting 26,000 for a game in a season where their capacity was 10k, for example. It makes more sense for some teams than others. KC (Arrowhead/CAB/SP) or Dallas (Cotton Bowl/Southlake/Cotton Bowl/Frisco) would be easier. Maybe even Chicago (Middle SF/Cardinal/New SF/Bridgeview).
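For anyone who wants to try the venue breakout, here is a minimal sketch in Python of what it might look like. The row layout, the venue labels, and every attendance figure except the 26,000 outlier are placeholders I made up for illustration; they are not taken from anyone's actual tables.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (team, season, venue, attendance).
# Placeholder numbers only, loosely modeled on the "26,000 in a 10k venue" case.
games = [
    ("KC", "S1", "Small venue", 9500),
    ("KC", "S1", "Small venue", 10000),
    ("KC", "S1", "Small venue", 8700),
    ("KC", "S1", "Large venue", 26000),  # one-off game played elsewhere
]

def breakout_by_venue(rows):
    """Return {(team, venue): (min, median, max, game_count)}."""
    groups = defaultdict(list)
    for team, _season, venue, attendance in rows:
        groups[(team, venue)].append(attendance)
    return {
        key: (min(vals), median(vals), max(vals), len(vals))
        for key, vals in groups.items()
    }

summary = breakout_by_venue(games)
for (team, venue), stats in sorted(summary.items()):
    print(team, venue, stats)
```

Grouped this way, the one-off game lands in its own bucket instead of stretching the max for the small venue.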
I do not code by stadium, as I started keeping my records long after the stadium/capacity data was reliably available for the earlier seasons. Adding it would not be difficult if I had it, though.
It's largely obvious, but there HAVE been stadium exceptions (little odd one-offs that you'd have to track down), almost all of them in the pre-SSS era. We have a responsibility to be accurate only insofar as it reasonably affects the conclusions we can draw. And there are always going to be caveats ("Yeah, but, that was a doubleheader...." or "Yeah, but, that was Beckham's first year" or any one of a number of other things that skew data). I don't know exactly what ONE number people are looking for that will tell them everything they need to know. Kansas City had trouble drawing, and then Lamar got really involved and they had a big boost, and then they dropped off again, and for a while they played in a very small venue and then they built their own venue and now they sell lots of tickets. The End. Whether this Atomic Level Number some are seeking is 11,547 or 9,239 or 16,431...what does that mean? If you've been paying attention long enough, you know the history of teams that haven't drawn well and those that have....those that have played to higher percentages of capacity and those who haven't...those who have played in big stadiums, small stadiums, and in-between stadiums. And, of course, you know that it always looked like more on TV.
I wonder if they argued at the barbershops in the 1930s that the attendance sounded like less on the radio. I am not sure how I would deal with event matches such as doubleheaders or fireworks. I would normally be inclined to take whatever the nominal base value was and just use that, noting the specific event as an overage exception. But then I am not sure how I would handle Seattle and its planned full-capacity matches. I would probably treat those as two distinct stadium configurations. These variations are one of the reasons I have not tried very hard to get this data. If it falls into my lap I would definitely update for it, but it is a lot of work to gather.
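A rough sketch of how that coding could work, assuming each game points at a stadium configuration and carries an optional event tag. The class names, configuration labels, capacities, and attendance figures below are all hypothetical; nothing here reflects the real tables or real Seattle numbers.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Config:
    venue: str
    label: str      # e.g. "standard" or "full" (hypothetical labels)
    capacity: int   # nominal base capacity for this configuration

@dataclass
class Game:
    config: Config
    attendance: int
    event: Optional[str] = None  # e.g. "doubleheader", "fireworks"

def pct_of_capacity(game):
    """Attendance as a percentage of the configuration's nominal capacity."""
    return round(100 * game.attendance / game.config.capacity, 1)

def summarize(games):
    """Average percent of capacity per configuration for regular games,
    with event games listed separately as overage exceptions."""
    regular, exceptions = {}, []
    for g in games:
        if g.event is not None:
            exceptions.append((g.event, pct_of_capacity(g)))
        else:
            regular.setdefault(g.config.label, []).append(pct_of_capacity(g))
    averages = {label: sum(v) / len(v) for label, v in regular.items()}
    return averages, exceptions

# Made-up capacities and crowds, two configurations of the same venue.
standard = Config("Example Stadium", "standard", 10000)
full = Config("Example Stadium", "full", 20000)
games = [
    Game(standard, 9000),
    Game(standard, 10000),
    Game(standard, 15000, event="doubleheader"),
    Game(full, 18000),
]
averages, exceptions = summarize(games)
print(averages, exceptions)
```

The event game stays in the records against the base capacity but is kept out of the per-configuration average, and the planned full-capacity matches are measured against their own capacity.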
Yes, I had an error in my Colorado data pointed out to me. I apparently did not correct it in all places herein. I will do so.
No worries - I appreciate your work. I happened to be using that point for a post and first thought I had made a mistake, as I'm prone to do. If you look at the data, there is a clear inflection from the 5 years before DPs to the 5 years after. New England, for example, was on a clear downward path their first 3 years in the league to 2006. Then they had a huge jump.
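If anyone wants to check the before/after comparison against their own numbers, something like this would do it. It is only a sketch: the season averages are invented placeholders on numbered seasons, and counting the pivot season as the first "after" season is my assumption.

```python
from statistics import mean

def before_after(avg_by_season, pivot, window=5):
    """Mean of the `window` seasons before `pivot` and of the `window`
    seasons starting at `pivot` (pivot counted as 'after' by assumption)."""
    before = [avg_by_season[s] for s in range(pivot - window, pivot)
              if s in avg_by_season]
    after = [avg_by_season[s] for s in range(pivot, pivot + window)
             if s in avg_by_season]
    return mean(before), mean(after)

# Placeholder season averages, not real attendance data.
avg_by_season = {
    1: 15000, 2: 14000, 3: 13000, 4: 12000, 5: 11000,
    6: 14000, 7: 15000, 8: 16000, 9: 17000, 10: 18000,
}
before, after = before_after(avg_by_season, pivot=6)
print(before, after, after - before)
```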