A statistical analysis of the USA player pool in MLS

Discussion in 'USA Men: News & Analysis' started by NoSix, Nov 27, 2015.

  1. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    OK, you’re done stuffing yourself on turkey and all the trimmings, but Camp Cupcake 2016 is still over a month away – what to do with ourselves in the meanwhile? Let’s see what we can learn from some basic analysis of publicly available statistical data on USA and MLS players. Later, we’ll delve into more “modern” data, but we’ll begin our investigation with the tried and true old “chestnuts”: goal and assist data.

    Here’s USA 2015 goal and assist data broken down by actual minutes played at each position:

    For forwards (F):
    [​IMG]

    Outside Mids:

    [​IMG]

    Central Mids:

    [​IMG]

    Outside Defenders:

    [​IMG]

    Central Defenders:

    [​IMG]

    And Goalkeepers:

    [​IMG]

    Where min is minutes played, g is goals, and a is assists. The data source is mlssoccer.com. (Haven’t seen this guy Total play yet, but we should definitely cap him now.)

    The “wisdom” of the prominent positions held by the likes of Chandler, Brooks, and Alvarado on these charts has been beaten to death on these boards already. For our current purposes, let’s look at the totals by position:

    [​IMG]

    Where g90 is goals per 90 minutes, a90 is assists per 90 minutes, and ga90 is the sum of g90 and a90. From this data, it is not hard to diagnose why the USA have struggled to create and finish chances in 2015: the black hole that represents the offensive contributions of our outside midfielders is a gaping chasm too large to ignore. Note that despite all of the hand-wringing on here about our forward pool, thanks to proven veterans Dempsey and Altidore, and with contributions from youngsters like Wood and Morris, the USA received healthy production from its forwards in 2015, and that position would seem to be the least of the team’s offensive worries. Note also that most of the offensive contribution from the CM position is due to one individual, Michael Bradley, who perhaps unsurprisingly has struggled at times with the role of simultaneously being the #10, #7, AND #8 for this team.
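    For anyone who wants to reproduce the per-90 columns, here is a minimal Python sketch of the definitions above. The function names and the example totals are illustrative assumptions, not values taken from the tables.

```python
def per90(count, minutes):
    """Scale a raw count (goals or assists) to a per-90-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return 90.0 * count / minutes


def ga90(goals, assists, minutes):
    """ga90 is simply g90 + a90, as defined in the post."""
    return per90(goals, minutes) + per90(assists, minutes)


# Hypothetical totals for illustration only (not from the tables).
example_g90 = per90(4, 900)       # 4 goals in 900 minutes
example_a90 = per90(2, 900)       # 2 assists in 900 minutes
example_ga90 = ga90(4, 2, 900)    # sum of the two rates
```

    With these made-up totals, g90 comes out at 0.4, a90 at 0.2, and ga90 at about 0.6.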

    Be that as it may, the above table showing average performance by position sets the standard for USA in 2015. Surely, there must be some outside mids in MLS who could exceed the extremely modest 0.29 ga90 standard, and the CM position seems another likely place to look for candidates who can contribute offensive production in 2016.

    Much more to come…
     
  2. swedust

    swedust Member+

    Aug 30, 2004
    I always enjoy your well-constructed posts. At the risk of sounding ungrateful, does your source data track how many of these goals are penalties and/or free kicks? Just curious if we can isolate goals from the run of play (for instance, I'm sure all the CD goals are corners or free kicks).

    Anyway, thanks and looking forward to more.
     
  3. AutoPenalti

    AutoPenalti Am I famous yet?

    Sep 26, 2011
    Coconut Creek
    Nat'l Team:
    United States
    The outside mids seem just a tad bit off.
     
    Marko72 repped this.
  4. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    We begin our analysis of the MLS player pool by looking for the top performers in 2015.


    First, goals per 90 min:
    [​IMG]

    And we immediately see the benefit of starting with the basic g90 stat: it seems to be the metric Klinsmann’s crack analytics staff is using to select forwards, given that the three current national team forwards with the most USA minutes top this list. (On this and subsequent tables, players with 2015 national team caps are shown in bold.) Further down the list are Agudelo in 11th, and Nguyen in 19th. Note that Dwyer is not yet USA eligible. Thus far, Lletget and Finlay would appear to be leading candidates to provide some goal-scoring capability from the outside mid position.

    Next, assists per 90 min:
    [​IMG]

    Note that for assists I am counting only primary assists, consistent with the scoring standard used for international matches. The source for all MLS (regular season) data is whoscored.com. Out of the top 10 assisters in MLS in 2015, only 2 received a USA cap in 2015, and as we saw in the previous post, Nguyen received only 76 min of playing time. Perhaps the problem isn’t that there is no talent in MLS, but rather that what is there isn’t being utilized. Further down the list, Zusi is 20th, Nagbe 22nd, Shea 27th, Bradley 28th, and Zardes 34th. Davis is 34 years old, but otherwise there seems to be no lack of potential candidates to help set up goals from both outside and central midfield positions. In terms of attacking fullbacks, the first outside fullback on the list is Taylor Kemp in 13th place.

    Obviously, soccer is a team sport, and places in the line-up are limited, so ideally we would like to identify players who can help to both score and create goals.

    Goals and assists per 90 mins:
    [​IMG]

    Again, we see that from the top 10 performers in goals and assists in MLS in 2015, only 3 received USA caps in 2015. Thus far, Finlay, Rowe, and Grella would appear to be outside mid candidates who could help score and create goals. Feilhaber and Kljestan would appear to be central mid candidates who could do the same. Further down the list, Nguyen is 13th, Agudelo 14th, Zardes 22nd, Bradley 24th, and Nagbe 29th. The top outside defender on the list is Kemp at 34th.

    Note that Dempsey was the top American performer both for USA and in MLS in 2015. I get that he played a stinker against MEX, but should Klinsmann heed the calls on BS to drop Dempsey from the national team, that would be the 2nd stupidest decision of his tenure.

    OK, so far this analysis is all well and good, if a bit simplistic, but how to project MLS performance onto USA performance? The conventional wisdom, propagated especially by certain USA team managers, is that there is a large performance gap – a chasm, actually – between MLS and “the international level”. How big is that gap, really?

    Let’s find out:
    [​IMG]

    This table looks to be a bit of an eye chart at first, but it’s fairly simple. It compares the ga90 values over the USA and MLS careers for all American players I could find with a large enough sample of USA and MLS minutes to make such a comparison meaningful. In the last line of the table, you can see that overall these players averaged 0.47 goals and assists per 90 minutes in their MLS careers and 0.41 goals and assists per 90 minutes in their USA careers. In the last column, the ratio of those two values is 0.88, indicating that, on average, the level of MLS is 12% below international level, or equivalently, international level is 14% above MLS level. Looking at this ratio by player, it is interesting that Benny Feilhaber has the lowest USA performance compared to his MLS performance. Clearly, there is a fair amount of random variation between players. In fact, based on this data, the median ratio is 80%, and the interquartile range goes from 62% to 94%. For the stats aficionados among us, the fact that the median is less than the mean is consistent with goals being Poisson-distributed, since the median of a Poisson distribution is less than the mean value about 2/3rds of the time. For the rest of us, what this tells us is that the performance of a USA player will typically be about 80+/-15% of that player’s MLS performance, and the range will tend to decrease in size as a player accumulates more USA minutes. That Grand Canyon of a chasm between MLS and the international level looks to be more like a culvert.
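    The "12% below" and "14% above" figures both follow from the single 0.88 ratio. A small Python sketch of that arithmetic is below; the function name is an assumption for illustration. Note that the rounded averages quoted above (0.41 / 0.47) give roughly 0.87, so the 0.88 in the table presumably comes from unrounded values, which is an assumption on my part.

```python
def gap_percentages(usa_to_mls_ratio):
    """Return (percent MLS is below USA level, percent USA is above MLS level).

    'below' reads the ratio directly; 'above' inverts it.
    """
    below = (1.0 - usa_to_mls_ratio) * 100.0
    above = (1.0 / usa_to_mls_ratio - 1.0) * 100.0
    return below, above


# Ratio of career USA ga90 to career MLS ga90, as quoted in the post.
below, above = gap_percentages(0.88)

# Half-width of the quoted interquartile range (62% to 94%), in percentage points.
iqr_half_width = (94 - 62) / 2.0
```

    The half-width of the interquartile range works out to 16 points, which is where the rough "80 plus or minus 15%" shorthand comes from.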

    Now we can stroll home from here. Dividing the USA 2015 standard we identified from the 2015 data by 0.8, we get the standard to be applied to MLS 2015 data:
    [​IMG]

    The last column of this table suggests there may not be any forward or outside defender candidates in MLS who are likely to surpass the average standard set by USA players in 2015. However, there are no fewer than 9 candidates above the 0.36 and 0.42 levels for outside mids and central mids:

    [​IMG]

    Note that this implicitly assumes that younger players will continue to perform at their 2015 MLS levels in the future. For players significantly exceeding their career average performances in 2015, it might be wise to average their 2015 ga90 number with their career average value.
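    As a rough sketch of the two steps just described (converting a USA standard into an MLS standard, and tempering a hot season with a career average), something like the following Python would do. The function names are hypothetical, and the season/career numbers in the blending example are made up; only the 0.29 outside-mid standard and the 0.8 median ratio come from the posts above.

```python
MEDIAN_USA_TO_MLS = 0.8  # median ratio of USA ga90 to MLS ga90 quoted above


def mls_standard(usa_standard, ratio=MEDIAN_USA_TO_MLS):
    """MLS ga90 a player would need in order to project to the USA standard."""
    return usa_standard / ratio


def blended_ga90(season_ga90, career_ga90):
    """Simple average of a single-season value and the career value."""
    return (season_ga90 + career_ga90) / 2.0


# Outside mids: USA 2015 standard of 0.29 ga90.
om_standard = mls_standard(0.29)

# Hypothetical player well above his career average in 2015.
tempered = blended_ga90(season_ga90=0.60, career_ga90=0.40)
```

    The outside-mid standard comes out at about 0.36, matching the level quoted above.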

    For me, the bottom line of this analysis is clear: there is plenty of attacking talent in MLS capable of meeting or exceeding USA 2015 standards. Now it is up to Klinsmann to invite some of these players into the January camp and integrate them into the team to improve USA’s offensive prospects in 2016.

    Now, this being BS, there will be critics. Surely, you will say, there is more to soccer than goals and assists. The critics will have a point. The advantage of using goals and assists is that you still win soccer matches by scoring more goals than your opponent. Goals are the single most direct measure of results in soccer, but goals are also quite rare. If you look at Lletget’s 0.43 ga90 value from the above table more closely, it is based on only 7 goals (and 0 primary assists), effectively a sample size of 7. Basing any analysis on such a small sample size can be a risky business. Are there more “modern” statistics available that, while being less direct, might perhaps be more robust? There are.
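    To put a rough number on how risky a sample of 7 is, here is a short Python sketch. It assumes goals are Poisson-distributed (the same assumption mentioned earlier in this post), in which case the relative standard error of a count of n events is about 1/sqrt(n). The function names are illustrative.

```python
import math


def poisson_relative_error(events):
    """Approximate relative standard error of a Poisson count."""
    return 1.0 / math.sqrt(events)


def implied_minutes(events, rate_per90):
    """Minutes played implied by an event count and a per-90 rate."""
    return 90.0 * events / rate_per90


# Lletget: 7 goals, 0 primary assists, ga90 of 0.43 (from the table above).
rel_err = poisson_relative_error(7)
minutes = implied_minutes(7, 0.43)
abs_err = 0.43 * rel_err
```

    Under that assumption the relative error is close to 38%, so the 0.43 figure is really something like 0.43 plus or minus 0.16, over roughly 1,465 minutes.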

    Stay tuned…
     
  5. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Thanks for the kind words. I apologize for the delay in getting more posts up; getting the tables into a form that can be posted on BS is a pain in the neck.

    Philosophically, I think a goal is a goal, they all count the same. Some goals from the run of play come from shots with higher probability of scoring than free kicks or penalty kicks, so to me it seems rather arbitrary to separate those out. That said, the data come from whoscored, so anyone interested can redo the analysis based on any goal subset they find there. The rest of my posts will use stats that don't involve goals directly, so the issue becomes moot.
     
  6. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Oh, how so?
     
  7. AutoPenalti

    AutoPenalti Am I famous yet?

    Sep 26, 2011
    Coconut Creek
    Nat'l Team:
    United States
    Check your OP.
     
  8. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    Now we depart the safe surroundings of ussoccer.com and mlssoccer.com, and strike out for whoscored.com: “revolutionizing soccer statistics”, or so they tell us. Personally I’m a little skeptical of any proprietary performance rating that purports to rate both strikers and defenders on a single scale, but that’s just me. That said, if we don’t mind sorting through the general clutter on the site, with a bit of effort we can collect some useful data and then define our own rating which we can understand. So, let’s try that approach.

    The basic idea is that soccer players create goals by applying basic skills: passing, shooting, etc. While a less direct measure of results than goals, such skills also occur more frequently than goals, so it may be possible to construct a more robust metric of offensive performance by using an appropriate combination of skill-related statistics.

    The single soccer skill most highly correlated to scoring goals is shots on target, so let’s start there. The following table shows the top 10 American MLS players in 2015 in shots on target per 90 minutes:

    [​IMG]

    We see many of the usual suspects we’re used to seeing from the g90 table, but there are also new names like Manneh and Agudelo moving into the top 10.

    Getting shots on target is certainly an important component of goal-scoring, but we also need players who can deliver the killer pass that leads to shots on goal. The next table shows the top 10 American MLS players in 2015 in key passes per 90 min:

    [​IMG]

    Now things are starting to get interesting: players like Zusi, Nagbe, and Bradley appear in our top 10 list for the first time, and lo and behold, there’s even an outside defender at #6 in Chris Tierney.

    Key passes are clearly an important component of goal-scoring, but more generally successful passing is also necessary in order to advance the ball to a position from which the killer pass can be played. The next table shows the top 10 American MLS players in 2015 in passes completed per 90 mins:

    [​IMG]

    Now we’ve clearly left goals and assists behind and struck out for new frontiers: Bradley and Kljestan remain, but otherwise we’ve identified a whole new batch of goal contributors.

    Passing is one way to advance the ball, but there is an available alternative: dribbling. The following table shows the top 10 American MLS players in successful dribbles per 90 minutes:

    [​IMG]

    Once again, we see some familiar names, but also a few new entrants into the discussion, such as Amarikwa and Duka.

    We’d like to combine the various components of goal-scoring into a single all-encompassing offensive rating, but how to weight the different components? I suppose we could do a regression analysis on goals as a function of our predictors, but as this is just for fun I propose instead to deduce approximate weights using a bit of logic. My logic is as follows: shots on goal is clearly the most important component, so it should get the highest weighting. Key passes is clearly the next most important component, so it should get the next highest weighting. You can advance the ball further and faster through passing than dribbling, so passing is the next most important component: it gets the third highest weighting, followed by dribbling. Absent any further information, I’ll choose to space my weights equally: 40% for shots on goal, 30% for key passes, 20% for successful passes, and 10% for successful dribbles. Combining all of our metrics into a single offensive rating, the top 35 most skilled American players in MLS in 2015 are as follows:

    [​IMG]

    Where ss is shots, successful (on target), kp is key passes, ps is passes, successful, ds is dribbles, successful, and orat is our new offensive rating. The left-most column is the rank according to orat, the right-most column is the rank according to ga90, so for example, Nagbe went from being rated the 29th best player based on ga90, to the 6th best player based on orat. Some big movers up the table include Bradley, Nagbe, and Diskerud. Have we rediscovered the magic Klinsmann rating formula? Interestingly, Kljestan moves up, but Feilhaber moves down in our rankings.
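    For readers who want to experiment with their own weights, here is one possible Python sketch of a weighted rating like orat. The 40/30/20/10 weights are the ones stated above. How the four per-90 stats are put on a common scale is not spelled out in the post, so dividing each stat by the best value in the pool is purely an assumption here, and the two players and their numbers are made up.

```python
# Weights from the post: shots on target, key passes, passes, dribbles.
WEIGHTS = {"ss": 0.4, "kp": 0.3, "ps": 0.2, "ds": 0.1}


def offensive_ratings(players):
    """Weighted sum of per-90 stats, each scaled by the pool maximum.

    players: dict mapping name -> dict with keys ss, kp, ps, ds.
    The max-scaling step is an assumption, not the poster's stated method.
    """
    maxima = {k: max(p[k] for p in players.values()) for k in WEIGHTS}
    ratings = {}
    for name, stats in players.items():
        ratings[name] = sum(
            w * (stats[k] / maxima[k] if maxima[k] > 0 else 0.0)
            for k, w in WEIGHTS.items()
        )
    return ratings


# Hypothetical pool for illustration only.
pool = {
    "Player A": {"ss": 1.5, "kp": 1.0, "ps": 20.0, "ds": 2.0},
    "Player B": {"ss": 0.5, "kp": 2.0, "ps": 50.0, "ds": 1.0},
}
ratings = offensive_ratings(pool)
ranked = sorted(ratings, key=ratings.get, reverse=True)
```

    In this toy pool the shooter (Player A) edges the passer (Player B), which is what you would expect with shots on target carrying the heaviest weight. Changing the WEIGHTS dict is the easy way to test other opinions about what matters.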

    That’s all well and good for offense, but what can we say about defense? As it turns out, not much - but a little bit.

    More to come, but not until tomorrow…
     
  9. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    I checked it. It looks fine to me.
     
  10. Mahtzo1

    Mahtzo1 Member+

    Jan 15, 2007
    So Cal
    He may be saying that Bradley, Jones, Shea, Wood aren't outside mids?
     
  11. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    True, but this is Klinsmann we're talking about here: those are the actual minutes those players spent at those positions in 2015. For example, against NED the formation was a 4-3-3 with Beckerman in the middle of outside mids Bradley and Morales, and with JohnsonF pushed up front as a third forward.
     
  12. Elninho

    Elninho Member+

    Sacramento Republic FC
    United States
    Oct 30, 2000
    Sacramento, CA
    Club:
    Los Angeles Galaxy
    Nat'l Team:
    United States
    It's a little depressing that the player who got more minutes than any other at outside mid for the USMNT was DeAndre Yedlin.
     
  13. swedust

    swedust Member+

    Aug 30, 2004
    Cool. I'm no whiz at statistics (or, perhaps, that's exactly what my analysis is worth) but I will take a look. Thanks.
     
  14. chad

    chad Member+

    Jun 24, 1999
    Manhattan Beach
    Nat'l Team:
    United States
    Thank you for all of this info.

    It is very depressing.
     
    russ repped this.
  15. Suyuntuy

    Suyuntuy Member+

    Jul 16, 2007
    Vancouver, Canada
    Focusing just on production for the USMNT, the ga90 per player (not included in the original tables):

    Player         g90    a90    ga90
    Wood, F        1.17   0.29   1.46
    Dempsey, F     1.20   0      1.20
    Altidore       0.59   0.20   0.79
    Dempsey, CM    0.61   0      0.61
    Zardes, OM     0.22   0.32   0.54
    Bradley, CM    0.17   0.34   0.51
    Johannsson     0.29   0.15   0.44
    Zardes, F      0.16   0.16   0.32
    Yedlin, OM     0      0.30   0.30


    Rushed Conclusions:

    - Wood is on fire.
    - Dempsey still by far the best of the guys with many minutes, producing as F and CM.
    - Altidore not half as bad as some think.
    - Zardes's numbers not bad at all.


    PS: I'm not convinced club production and NT production are necessarily related.
     
    juveeer and JJV1994 repped this.
  16. TheHoustonHoyaFan

    Oct 14, 2011
    Houston
    Club:
    FC Schalke 04
    Nat'l Team:
    United States
    In the 4-3-XX sets that the USMNT plays the 2 midfielders on each side of the DM are not "outside" mids, they are in fact center mids.
     
  17. adam tash

    adam tash Member+

    Jul 12, 2013
    Barcelona, Spain
    Nat'l Team:
    United States
    So Amarikwa is a top 2 us forward? makes sense to me, I've always wanted him to get a chance. plus, I think he's a good obafemi foil for Dempsey....the two of them in front of Nagbe, Nguyen, and Bradley....

    also, I think it is important to note that the gap between international vs mls was much more uniform and wider the farther back in MLS you go....in other words, "MLS now" is much more comparable to usmnt level than "MLS then". there are likely now players who would play better internationally than in MLS...whereas, before, that was extremely unlikely. Bradley's stats are weird...he is waaay lower in MLS. I think maybe that signals that his role and privilege on usmnt is bigger, so he produces more. role on the team and "spine status" matters a lot more on usmnt. however, funneling everything through him doesnt really make sense when looking at his MLS stats. personally, I don't think it is a coincidence that the team has struggled so much with Bradley taking such a huge leadership role and these stats back that up somewhat.

    other things these stats don't reveal/consider: the role/instructions players are given, opponent strength, teammate strength and "game situation". a sub made in a game where their team is getting bossed doesn't have much chance to shine, etc.

    also, I think PK's should be weighed less, especially if the taker didn't earn the kick. PK's are random and easier to score...if we want to predict how the team will play in the run of play, PK's should not be the same as run-of-play goals. I would eliminate them completely.

    lastly, the weights for different aspects of play should maybe differ by position? dribbling sucks for the usmnt as a whole and really lets opposing teams pin their ears back on d. it's funny to me to see a lot of names I personally rate: manneh, lletget, amarikwa very high on the list. I guess in my internal stats spreadsheet, I give dribbling a high weight.
    also, Bradley is actually pretty high on the dribbling ranks...maybe he could play wing and open up a cm slot for someone better suited to that role creatively?
     
  18. thedukeofsoccer

    thedukeofsoccer Member+

    Jul 11, 2004
    Wussconsin
    Club:
    AFC Ajax
    Nat'l Team:
    United States
    #18 thedukeofsoccer, Nov 27, 2015
    Last edited: Nov 27, 2015
    We've discussed at length about how Wondo's goal totals are greatly inflated by how many pk's he took. He's well down the list per 90 otherwise.

    I don't have much doubt Klinsmann weights raw goal totals higher than most given his disinterest in MLS and how he himself played. That said, Altidore, Dempsey, and Wondo were grandfathered in; so them still being top 3 in raw gs per 90 doesn't say a lot. Wood is scoring now, but he wasn't initially when Klinsmann was calling them in. More so he has an affinity for vets and the German leagues. There just happens to be plenty of overlap in the interests too.

    Lletget was 4th in gs per 90, the ones above him all scored some pk's, and they played forward while he was a winger. Half the time he was a straight up left wing, not even attacker. Speaks volumes as to his ability to get shots off and find the net. Put him as a CAM or second striker, what does he yield then? Something to think about. We shouldn't use him at left wing primarily because Arena had to in order to appease his DP's. I think he will be an impact player at CAM/SS in the prime Dempsey mold, if used at his proper positions.

    4-10 in key passes have all been shouted out for call-ups for a while now. Klinsmann has shown some interest in integrating only 1-2 of that group. Same thing with 2-10 on assists. Apparently passing is an ability that doesn't interest Klinsmann much. Therefore we struggle to link between defense and attack and come up with final balls.

    Bradley grades well in dribbling and key passes, and you know he's defensively responsible, albeit not an aggressive tackler. Giovinco was the man in Toronto, but I think he overshadowed Bradley, and Bradley got unfairly blamed in part for the team's overall struggles. As Montreal scouts suggested, the fault lay with everybody but Giovinco and Bradley.

    Salinas shows up in key passes, dribbling, and I wouldn't be surprised if he were up there in tackles. He's a pretty mindful defender and starts plenty of breaks. I think he's underrated as a USNT prospect in part because of his age (29), but his attributes and ability to play anywhere on the left make him potentially of significant value to us for a few years at least.
     
  19. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    If I recall correctly, in his first cap Eddie Johnson scored 3 goals in something like 30 minutes. After that match, mechanically we could have calculated his ga90 to be 9.00, but obviously that is not a meaningful number. The reason why I didn't calculate ga90 values for the USA players in the opening post is that the sample sizes available are not sufficient to make the ga90 values meaningful in this case either. A useful rule of thumb is that a sample size of at least 16 is necessary in order to estimate a mean. In this case, 16*90=1440 minutes, and Michael Bradley is the only player on the USMNT who exceeded that total this year. We can calculate a probably meaningful number for Bradley, but we don't have anyone else to compare him to, based on the available data.
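    A minimal Python sketch of the rule of thumb described here: compute ga90 only once a player has at least 16 matches' worth of minutes, and otherwise report nothing. The function name and the choice to return None below the threshold are illustrative assumptions.

```python
MIN_SAMPLE_MATCHES = 16                 # rule-of-thumb sample size from the post
MIN_MINUTES = MIN_SAMPLE_MATCHES * 90   # 16 * 90 = 1440 minutes


def ga90_if_meaningful(goals, assists, minutes, min_minutes=MIN_MINUTES):
    """Return ga90, or None when the minutes fall short of the threshold."""
    if minutes < min_minutes:
        return None
    return 90.0 * (goals + assists) / minutes


# The Eddie Johnson example: 3 goals in roughly 30 minutes.
naive = 90.0 * 3 / 30                         # the mechanical calculation
filtered = ga90_if_meaningful(3, 0, 30)       # below the 1440-minute threshold
```

    The mechanical calculation gives 9.0, while the filtered version declines to report a number at all.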
     
    blacksun repped this.
  20. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    I think it likely that the extreme ratios of players like Bradley may be in part an artifact of my methodology. To save time and also to get the largest possible sample size, I used career USA and MLS numbers. Players like Bradley and Beasley compiled significant portions of their USA numbers while playing in leagues abroad. If you really wanted to nail the MLS-USA gap precisely, it would probably be better to go through and pick out only the data from years in which the player played for USA and in MLS simultaneously. That would be a heck of a lot more work, and I'm convinced that the primary conclusions would not be significantly different. Probably the width of the inter-quartile range in usa/mls values is even less than what my analysis would suggest.

    This is a good question. The weights represent the relative importance of the various skills to goal scoring, so as such I would argue that they should be independent of position. A successful dribble at the mid-field stripe is never going to increase your probability of scoring as much as a shot on goal would. It is tempting to think, maybe I should weight, say, shots on goal less for outside backs than for forwards, but if you do so then you are failing to give credit to outside backs like Tierney who do get forward and have a go occasionally: the whole purpose of the analysis is to identify those types of players. It is better I think to keep the weights the same for all positions, but apply different standards for success to different positions. Clearly, outside backs shouldn't be expected to contribute as much to goals scored as forwards should.


    I agree with you. That said, I do think dribbling is relatively the least important of the skills considered here, but still an important one.

    It is as clear to me as the sky is blue that Bradley is ideally suited for a #7 or #8 role in a skinny diamond, behind a true #10 and ahead of a true #6, maybe with Nagbe as the other #7 or #8. Against Concacaf opposition I would think that would typically give you 65-70+% of the possession, and JK could use his beloved burners like Yedlin at the outside D positions to provide width in attack.
     
  21. Suyuntuy

    Suyuntuy Member+

    Jul 16, 2007
    Vancouver, Canada
    A national team doesn't play that many matches. 16 works for a league, with 38 games per season.

    For a NT, 16 is far too high: the great majority of NT players don't reach 60 caps in their whole career. You won't find many who manage 1440' in a cycle. IMO, half of that (8 games) should be enough for NT duty.
     
  22. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    I think you may want to tweak the settings on your sarcasm detector. :D
    What can I say, my sardonic sense of humor is a bit of an acquired taste.

    Salinas ranks 4th in dribbles and 10th in key passes, but 42nd in shots on target and 75th in completed passes. He's a quality player, but perhaps doesn't make enough of an impact to merit a USA call-up.

    As for tackles, what a beautiful segue - more on that coming soon...
     
  23. NoSix

    NoSix Member+

    Feb 18, 2002
    Phoenix
    A minimum necessary sample size is a statistical requirement, not something you can negotiate if you disagree with it. You can calculate numbers with small sample sizes, but they are just noise. It's like tuning in to the white noise between stations on the FM dial and trying to convince yourself that you are listening to news radio.
     
    ChrisSSBB repped this.
  24. Suyuntuy

    Suyuntuy Member+

    Jul 16, 2007
    Vancouver, Canada
    A NT plays 18 to 20 games in a full year. That's half the number of games played in a regular league, in that same year. You're using the standards employed when analyzing club play, that is, 38 games in a year. It makes perfect sense to reduce the requirement when your total sample size is also reduced.
     
  25. Mr Martin

    Mr Martin Member+

    Jun 12, 2002
    Club:
    Philadelphia Union
    Nat'l Team:
    United States
    Firstly, thanks for the excellent thread and terrific analysis.

    I'd like to see that very much.

    I don't really have a problem with JK having given MB a shot at the #10 role, and I actually think MB slightly exceeded my expectations by upping his assist numbers above what I thought he could produce (he rarely assisted under his dad in the 4-2-2-2). But in the end, I still view MB as perhaps the best #8 the US has ever had and I'd like to see someone else given the more creative role. Find a #6 to screen the CBs and win balls in the center. Find a #10-like player for the creative role. Play MB as a #8 and either focus on a 3-man midfield with 3 true goal-scorers up top, or add a #7 in a 4-man midfield.

    And I think the Yedlin as Odonkor 2.0 experiment needs to end at RM. Develop him as an international RB, or use him ONLY as a speed SUB in midfield.
     
