Statistics recovery project

Discussion in 'NWSL' started by Ben James Ben, Feb 11, 2012.

  1. Ben James Ben Member

    Member Since:
    Apr 28, 2009
    This is a topic to discuss recovering/archiving WPS statistical data. Although I have faith that the WPS will be back in 2013, there is also the possibility that one day the WPS web site will simply disappear and all the stats hosted there will be lost to posterity. The idea is to copy the data and store it somewhere else.

    I need to think about this more. Just a few random thoughts:

    Box scores: I have all box score web pages (raw HTML) from 2009-2010, and I have the 2011 regular season. I'm missing the 2011 playoffs. I actually should grab a fresh copy of all the 2011 pages, since some of them might have been corrected since then. I need to take a bit of time to do this because I'm afraid of losing the data I already have. I have a script that parses box score pages and extracts the statistical data.
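
    For what it's worth, the worry about losing existing copies when re-grabbing pages can be handled by never overwriting a file. Below is a minimal sketch in Python, not the actual script mentioned above: the function name `save_snapshot`, the directory layout, and the file-naming scheme are all assumptions made for illustration. Each fetch is stored under a new timestamped name, and a fetch identical to an already stored copy is skipped.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(archive_dir, page_name, html, now=None):
    """Store html under a new timestamped name; never overwrite an existing file.

    Hypothetical helper: returns the path written, or None if an identical
    copy of this page is already stored.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    data = html.encode("utf-8")
    digest = hashlib.sha256(data).hexdigest()

    # Skip the write if an earlier snapshot of this page has the same content.
    for old in archive_dir.glob(f"{page_name}.*.html"):
        if hashlib.sha256(old.read_bytes()).hexdigest() == digest:
            return None

    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = archive_dir / f"{page_name}.{stamp}.html"
    # Mode "xb" raises an error instead of overwriting if the file exists.
    with open(target, "xb") as fh:
        fh.write(data)
    return target
```

    Corrected box scores then sit next to the originals as separate files, so the two versions can be compared later.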

    Team stats: Need to work on a script to grab team stats. I think that I already have one half-written that parses the data. Should just concentrate on grabbing copies of the web pages. Processing data can wait. Unfortunately, historical data is already lost since WPS wipes out team data web pages when a team leaves the league. Missing teams include LA, GP, CHI, STL, WSH.

    Team rosters: same.

    Player stats: same. Many player stats are also already lost. The WPS web site does an extremely bad job of keeping track of former players. (It wasn't designed to do so, from what I can tell.) Many WSH stats got messed up during 2011 during the conversion to MJ. Many MJ player stats were also messed up in 2011. The MJ stats have disappeared anyway.

    I'm wondering whether the WPS subscribed to one of the sports reporting bureaus. If so, then at least the stats are safe, somewhere. Though, I get the impression that ordinary people aren't able to access the data.

    A good (but relatively huge) project would be to collect media guides and scan in the data from them.
          
  2. SiberianThunderT Member

    Member Since:
    Sep 21, 2008
    Location:
    DC
    Club:
    Saint Louis Athletica
    Country:
    Spain
    I know Wikipedia is a relatively poor place to get information when you want to be 100% sure it's right, but most (all?) of the 2009 teams have a page that lists their "all-time roster" with stats for each player. I think none of those "all-time rosters" have been updated since early 2010 at the latest, but they're still pretty thorough for the purposes of 2009 data.
    e.g. http://en.wikipedia.org/wiki/All-time_Saint_Louis_Athletica_roster
  3. Mister Crossbar New Member

    Member Since:
    Aug 21, 2011
    Location:
    USA
  4. Bonnie Lass Super Moderator

    Member Since:
    Oct 20, 2000
    Location:
    Up top
    Club:
    Olympique Lyonnais
    Country:
    Norway
    With women's soccer, don't EVER assume that the stats are safe. Sure, they may be sitting on a server somewhere, or in a notebook, but there's a good chance they'll never see the light of day. Much less be available to someone who'd like to actually access them at some point.

    There's a guy who started archiving box scores for Norway's Toppserien (and Women's National Team) back in the late 90s. And I thank god he did every day. His site was bare-bones, black text on a white background, and was basically a big text file. It wasn't pretty, you couldn't really search for a particular player or anything like that. But at least it existed. The same thing goes for the RSSSF stats. (Although sadly, some of those have disappeared over the years.)

    I'd also say, if they're available, to start saving player mugshots and/or bios, if you can. If the league doesn't come back, there's probably going to be several players who will vanish into obscurity -- get a real job, stop playing, etc.

    And really, if anyone (not just you) wants to save anything from the various sites, they'd better start now.
  5. Cville K C Member

    Member Since:
    Nov 3, 2008
    Location:
    Collinsville, IL
    Club:
    Saint Louis Athletica
    Country:
    United States
    At one time, I had a dream of putting together a Women's soccer encyclopedia, even if I was the only one that would ever see it. Unfortunately, it takes a lot of time and effort. Every time I got started, something else would come up (like my real job) and I would have to put it to the side.

    The goal was to start with WPS and the major international competitions like the World Cup and Olympics, then branch out into the other top leagues around the world and later go back and hit the W-League and WPSL, plus the old WUSA. One problem I always had was that websites are constantly changing and archived stats were often handled poorly. I would find a good source of information and the next time I would go to the site, it had changed and I couldn't find how to get to that info anymore. It became very frustrating (a good example was when the US Soccer site was revamped and I could no longer go back several years to get stats from each game and I also noticed the Canadian site had changed quite significantly in the past year or so).

    Another problem with WPS is that often I was able to print the original box score, but there were later scoring changes that were never updated in the original box score. I kept a player by player stat sheet for every single player in 2009, but my single game stats never added up to the team stats on the WPS site because scoring changes were made, especially in the area of assists.
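
    One way to pin down where scoring changes crept in is to sum the per-game lines and compare them with the season totals. Here is a rough Python sketch under stated assumptions: the data layout (a list of `(player, stats)` pairs per game line and a dict of season totals) and the function names are hypothetical, not how the WPS site or anyone's stat sheet is actually organized.

```python
from collections import Counter


def sum_game_stats(game_lines):
    """Add up per-game stat lines into one Counter per player."""
    totals = {}
    for player, stats in game_lines:
        totals.setdefault(player, Counter()).update(stats)
    return totals


def find_discrepancies(game_lines, season_totals):
    """Return (player, stat, summed, official) for every mismatch."""
    summed = sum_game_stats(game_lines)
    diffs = []
    for player in sorted(set(summed) | set(season_totals)):
        mine = summed.get(player, Counter())
        official = season_totals.get(player, {})
        for stat in sorted(set(mine) | set(official)):
            a, b = mine.get(stat, 0), official.get(stat, 0)
            if a != b:
                diffs.append((player, stat, a, b))
    return diffs


# Made-up numbers for illustration only.
games = [
    ("Player A", {"goals": 1, "assists": 1}),
    ("Player A", {"goals": 0, "assists": 1}),
    ("Player B", {"goals": 2, "assists": 0}),
]
official = {
    "Player A": {"goals": 1, "assists": 1},
    "Player B": {"goals": 2, "assists": 0},
}
print(find_discrepancies(games, official))
```

    The output only shows which player and stat disagree. Working out which game was changed would still mean checking a corrected copy of each box score.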

    I certainly admire anyone who has the time and perseverance to delve into this. It isn't easy.
  6. ceezmad Member+

    Member Since:
    Mar 4, 2010
    Location:
    Chicago
    Club:
    Chicago Red Stars
    Country:
    United States
    Well, the WPS webpage is down, so I assume all the stats are lost forever.
  7. Cliveworshipper Member+

    Member Since:
    Dec 3, 2006

    The wayback machine is your friend.
    ceezmad repped this.
  8. ceezmad Member+

    Member Since:
    Mar 4, 2010
    Location:
    Chicago
    Club:
    Chicago Red Stars
    Country:
    United States
    wikipedia?

    I was trying to update the Red Stars page, but I cannot find the game info for their 2 WPS seasons, and can't find much from their WPSL and WPSL Elite seasons.

    http://en.wikipedia.org/wiki/Chicago_Red_Stars
  9. necron99 Member

    Member Since:
    Oct 17, 2011
    Club:
    Washington Freedom

    He means the Internet Wayback Machine at http://archive.org/index.php

    That site is connected to a giant archive that collects snapshots of websites on the internet. They will have multiple copies of the various webpages that you found useful. Of course it isn't perfect, sometimes links don't work, or they might have long time gaps between their versions of a webpage. You can go there and look at the old WPS website.
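
    If you would rather look up snapshots with a script than by clicking around, the archive has a simple availability lookup that returns the closest snapshot to a given date as JSON. A minimal Python sketch follows. The page address `example.com/stats` is a placeholder, not the real WPS address, and the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_API = "https://archive.org/wayback/available"


def build_query(page_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Pull (snapshot_url, timestamp) out of the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(page_url, timestamp=None):
    """Ask the archive for the snapshot closest to timestamp (needs network)."""
    with urlopen(build_query(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))


# Placeholder address; substitute the page you actually want.
print(build_query("example.com/stats", "20110901"))
```

    A result of `None` means no snapshot was reported for that address, which fits the gaps mentioned above.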
    ceezmad repped this.
