It is still not the same. A high percentage of good, valuable key passes worth noting don't end up as assists, while it is very unlikely for a great shot on target not to end up as a goal. Eliminating key passes altogether and only using assists eliminates a big chunk of what a creator does that doesn't end up as an assist, while eliminating shots on target and only using goals possibly eliminates only the rare instances in which a good effort might not be accounted for. The sheer number of shots on target does not say anything about the quality of those shots. On the other hand, if a player is generating a huge amount of key passes, it usually means the player is doing at least something well regardless of assists. A player can create a lot of chances, and if his teammates miss them, he will not be valued whatsoever. By contrast, over a longer period of time, if a player makes great shots on target, he will inevitably score a lot of goals. Maybe in a sample of 40 shots on target a player might be very unlucky because goalkeepers are saving all of his great shots, but over a longer period that will not be the case: goals scored will accurately represent the quality of his shots. Assists are not like that. You need to create a bunch of chances to get one assist. What about the chances that are not scored? The performance of teammates is much more variable than the quality of shots on target.
To use another player who might aid the conversation: I think Robert Lewandowski is actually not that fantastic at the actual ball-striking part of goal-scoring. Harry Kane, for me, is the superior raw striker of the ball. However, Lewandowski is amazing at getting into the necessary body shape and angle to shoot, via his first touch, anticipation, stamina, and manipulation of the ball in tight spaces. So it doesn't really bother me all that much if Lewandowski slightly underperforms his expected goals, because the way he maximizes raw expected goal value through individual qualities such as off-the-ball movement, stamina, flexibility and range across goal-scoring scenarios, first touch, and manipulation of the ball in tight spaces to get himself into shooting positions is all top notch. So over-performance of expected goals alone is not enough for me. Efficiency matters. Raw volume matters too. This is not a turn-based game like chess, where each player makes one move at a time. Some players are just better at finding shooting opportunities. I don't need a one-shot-one-kill striker if another player can create or find 1000 shooting opportunities for himself within the same 90 minutes (obviously, such an absurd, drastic comparison will never occur in real life; I'm speaking hypothetically to get my point across).
Let's take as an example Kevin De Bruyne, who is by far the best playmaker in the world. Last season he had 98 key passes but only 14.1 xA. That's 0.14 xA per key pass. And again, he's by far the best playmaker in the world.
On the other hand, Haaland had 27.1 xGOT from 59 shots on target. That's about 0.46 xGOT per shot on target. So Haaland's shots on target are more likely to turn into goals than De Bruyne's key passes are to turn into assists.
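For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two ratios quoted above. The only inputs are the four numbers from the post; the function and variable names are made up for illustration and do not come from any stats provider.

```python
def rate_per_event(expected_total: float, events: int) -> float:
    """Average expected value carried by a single event (key pass or shot)."""
    if events <= 0:
        raise ValueError("events must be positive")
    return expected_total / events


# Figures quoted in the post (last season, as stated by the poster).
xa_per_key_pass = rate_per_event(14.1, 98)          # De Bruyne: xA / key passes
xgot_per_shot_on_target = rate_per_event(27.1, 59)  # Haaland: xGOT / shots on target

print(f"xA per key pass:          {xa_per_key_pass:.3f}")
print(f"xGOT per shot on target:  {xgot_per_shot_on_target:.3f}")
print(f"ratio between the two:    {xgot_per_shot_on_target / xa_per_key_pass:.2f}")
```

The last line just expresses how many times more expected value the average shot on target carries than the average key pass in these two samples. It says nothing about whether xA or xGOT captures the full value of the action.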
The Plus-Minus model was invented for basketball, a sport with many points, and for a very homogeneous league like the NBA. Transferring something like this to football makes no sense to me, first because there is a big difference in level between teams, which leads to a lot of randomness. Generally, in easily winnable matches more substitutes are used, and in tougher matches the best players play. If you analyze the number of goals that teams score, you will see that it is very low in very competitive or close matches. This does not happen in basketball, where the number of points tends to be very stable. Basically, what happens in football is that teams that play high-level matches will have a low goal difference, and the best players are usually the ones who play in those higher-level matches. For example, if a great player misses a Copa del Rey match against a third-division team that his team wins 6-0, would you penalize that great player because his team plays better without him? It makes no sense. Another point: someone once asked Goalimpact on Twitter, who was sharing player data from these models, about Maradona, and he said that 'in his time in the 80s, he was one of the best, but today there are many players with a higher score.' Do you know what that means? It means that the model also favors teams with a high goal average and does not take into account the differences between eras of football. How can a player who was one of the best in one era be surpassed by hundreds of players in a different era? This means that the model is highly imperfect, because it would make sense for the best players from, say, the 1980s to be just as good as the best players from the 2010s. Since this Plus-Minus system works with absolute values, the goal difference will obviously have larger numbers in eras where more goals are scored.
This is also the reason why Thomas Müller has such high numbers; Bayern Munich has had the highest goal average of the top five leagues for many years. I see this model as impossible to apply in football. The only way it would make a bit more sense is if football were played with a single league that is very homogeneous, maybe something like the European Super League, and if the goal difference were not thought of in absolute terms but in %, so it wouldn’t be so biased in favor of high scoring teams.
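To make the Copa del Rey objection concrete, here is a rough Python sketch of the most naive possible on/off comparison: goal difference per 90 minutes with the player on the pitch minus goal difference per 90 without him. This is an assumption-heavy toy, not the Goalimpact formula or any published Plus-Minus model, and the three matches are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stint:
    """A stretch of play with a fixed lineup (hypothetical data structure)."""
    minutes: int
    goals_for: int
    goals_against: int
    player_on: bool


def goal_diff_per_90(stints: list[Stint]) -> float:
    """Team goal difference per 90 minutes over the given stints."""
    minutes = sum(s.minutes for s in stints)
    if minutes == 0:
        return 0.0
    diff = sum(s.goals_for - s.goals_against for s in stints)
    return 90 * diff / minutes


def naive_on_off(stints: list[Stint]) -> float:
    """Goal difference per 90 with the player minus without him.

    No adjustment for opponent strength, teammates or game state.
    """
    on = [s for s in stints if s.player_on]
    off = [s for s in stints if not s.player_on]
    return goal_diff_per_90(on) - goal_diff_per_90(off)


# Invented example: two tight matches with the star, one 6-0 cup tie without him.
stints = [
    Stint(minutes=90, goals_for=1, goals_against=1, player_on=True),
    Stint(minutes=90, goals_for=2, goals_against=1, player_on=True),
    Stint(minutes=90, goals_for=6, goals_against=0, player_on=False),
]
naive = naive_on_off(stints)
print(f"naive on/off per 90: {naive:+.2f}")
```

With these made-up matches the naive number comes out heavily negative for the star, which is exactly the complaint in the post. Whether a regression-based version with opponent adjustment avoids the problem is a separate question that this sketch does not answer.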
Thanks for the insight, and here are my unpolished thoughts on the issues you've raised. They just reflect my stance on the matter. 1) The problem of judging via overall team goals, which causes more variability in football than in basketball. I would say it is more of a difficulty issue than a deal-breaker for me. There are too many upsides for me to give up immediately. I always hated the somewhat cult-like insistence on on-the-ball centric judgement, since it basically boils the conversation down to who played the best in a certain stylistic manner. Players like Trent Alexander-Arnold will never be able to provide the qualities that Kyle Walker provides just by existing. Kyle Walker's value may show up more in how he is missed when absent than in a count of the actions he performs with the ball. This seems like a really good way to positively measure those who impact the shape and status of the game via their off-the-ball actions, like Thomas Muller, or those you appreciate more by their absence than their presence, like Kyle Walker. His coverage speed and physicality alone change the entire dynamics of the game, especially versus speedy wingers, even if Kyle Walker isn't necessarily the best distributor of the ball or the tidiest tackler. At the very least, this is a counter-measure that addresses some of the limitations of more on-the-ball centric algorithms. We can at least see who is appreciated less by the more popular metrics. Even if the nature of footballing competition is less than perfect for this model, this is still a metric I wish was more readily available. It isn't as if the other algorithms are perfect either. 2) Problems of favouring those with superior availability. Rewarding those who are present in every large victory only stops making sense if you assume all football players have zero injury issues, or zero problems performing across the yearly schedule set for them in advance.
Mohamed Salah, for example, has been amazing in terms of his availability and reliability. I don't need to wait for his fitness to recover, or hypothesize based on his best big-time matches to imagine how he would perform on a weekly basis. He just does it. If I just wanted to extrapolate a player's peak capabilities, with zero regard to their ability to perform to the yearly schedule given, I would just rely on my memory, not look at the entire data. An injury-prone, unavailable player is still fondly remembered from a spectator point of view (as long as they create great moments), who can just switch the channels and watch somebody else. The team is stuck with that player. I think the current analysis is too driven from a very spectator point of view. 3) Problems of favouring those who compete in less competitive leagues. This is a problem that might favour Bundesliga players, but it still is some sort of representation of what actually happened, not what we selectively measure based on what we wish or think should happen. I think we can adjust after the data has been taken, instead of giving up on taking the data-set. All statistical analysis improves over time. The premise itself isn't that problematic for me personally, just the execution. Maybe it is impossible, but we attempt our own impossible missions all the time here. This is no different for me. 4) How such models measure past legends. This is a statistical model, not the gospel. It is okay to fail to recognize every amazing aspect of a legendary player like Diego Maradona. I really don't like the idea that all statistical analysis should come from a place of selective adulation for somebody, and fear that it might not reflect that player properly. When I notice statistical trends, I learn more trying to analyze the process, the fact that somebody is wrong or right with the model, is just merely a matter of internet ego, not the core of what we attempt to do. 
We can dissect and analyze the data, and try to figure out the confounding factors. If there are any outlier data-sets within the schedule that massively warped the formula. That's the fun of data, set parameters, measure everything, and explain the conclusions (wrong or right) afterwards. The process is what creates insight. Aren't we at our most creative and productive trying to prove each other wrong? The fun of data is not setting up an absolute pre-requisite, that banishes all forms of measurements that could potentially be harmful against legendary players who should never have anything questioned about them, ever.
Let’s see, there’s something I don’t understand. You are talking about analyzing what a player does 'off the ball,' but you’re referring to a model that aims to measure the impact of a player on their team when they are playing versus when they are not, assuming that the team performs better when the player is on the field. How is one supposed to connect with the other? The reasoning is something like: Bayern has a better goal difference when Müller plays; therefore, there are 'off the ball' actions that Müller performs that we can’t measure, but they are there (?). And that is beyond the fact that this Plus-Minus thing seems very imprecise to me, as I explained in the previous comment: there is a lot of randomness, and it has so many flaws that it’s like trying to paint the Mona Lisa by throwing paint balloons at a wall. The impact that a player has on the team also involves their actions with the ball.
While I cannot comment on the technical aspects in relative terms between Lewa and Kane, you have made very good points. The idea that xG over-performance is the be-all and end-all is erroneous.
I think this is where the argument might get circular in nature, I am not done fully-processing the nature of this model. Most of my statements are rehashed from my original post. What are the other alternatives available? Isn't that the key question? Why are you asking if the metric served apt homage to Diego Maradona? 1) How to best evaluate an excellent off-the-ball player who isn't necessarily that amazing with the ball like Thomas Muller. If a player consistently causes his team to score a lot more, and is sorely missed when absent, across a significant enough sample size, despite not being that highly rated as an on-the-ball talent, maybe that alone suggests something. Thomas Muller is not the highest rated player on WhoScored by any measure for the given time-frame and sample size, yet he tops this particular model given the calculations. What does that suggest for you? Maybe the measurement is pure trash, with nothing valuable to extract from the data. Maybe it suggests something. I lean towards the latter. I did mention the precision seemed way off. However, that's where the difference in the perspective lies. I see much promise here. It is indeed a measurement-attempt by proxy, not a direct measurement attempt. It is indeed a very crude measurement attempt, but it is better than giving up entirely on quantifying off-the-ball actions. Maybe you know other models, or perhaps you wish to wait until there is a precise model. 2) The precision of the model. This is just a tool, and in the absence of a better model, I prefer having this measurement over not having any. If the best off-the-ball action measurement says something bad about players I personally like, I am okay with it, because it is not the final verdict on anything. More analysis. More data-sets. Mistakes will be made in the process, but ultimately we know more. 
What I am not okay with is just saying don't bother evaluating anything until it is perfect and has zero chance of saying something inaccurate or bad about players I like. The attempt was good. The premise seems fine. The execution and development of the ideas were not fully evolved for my tastes. I certainly wish there were more openly available data-sets of this nature. Spectator bias is inevitable with on-the-ball talents because that is the vantage point from which we perceive the game. We mostly follow what happens around the ball. That is certainly the case for me at least. If there is a good statistical attempt at rectifying that spectator bias, I'm all for it. Maybe you have already thought through the entire model, and see no possible methods to improve it or work on its precision. However, that's your opinion. How many hours have you spent sifting through all the possible ways to maybe measure off-the-ball contributions via this pathway? Given the choice, I'd love to have the data-sets for this particular algorithm calculated for every single player in history, with the ability to change the time-frame as I see fit. I will most likely not agree with the conclusions at all, but I will learn a lot about repeating patterns and what sort of players I might have under-appreciated in the past or present.
The reason I used the example of Maradona is that I remember it being mentioned that, with this metric, the best players from the 80s were ranked much lower than the best current players. This clearly suggests a fundamental error, and the reason it happens is primarily a mathematical issue: the metric benefits teams that score more goals. This was something I suggested that a user who was working with these models review. I’ll give you an example: in the 1986/87 season, Napoli had 1.585 goals for per match and 0.634 against; their goal difference was 0.951. In the 1976/77 season, Bayern had 3.162 goals for and 1.703 goals against; their goal difference was 1.459. Basically, Bayern's goal difference was higher. The Plus-Minus model works on this logic: you have more chances of having a big impact if you play on a team that scores a lot of goals. Now, if we express the goal difference as a percentage, that is, (goals for) / (goals for + goals against), the result is that Napoli has 71.43% and Bayern 65%. My suggestion is that this way the metric can be adjusted for different football eras, and this can help the model. [How the number of goals changes in different eras] Even so, even if that problem is solved, you still have the bigger issue, which is that the model assumes that all opponents are the same. If a player plays in a Champions League semifinal and his team draws 1-1, and the following week he doesn’t play in a Copa del Rey match where his team wins 6-0 against a third-division team, the Plus-Minus model will tell you that this player is making his team play worse because, without him, they win and score many goals. Does that make any sense to you? Now, yes, I do believe there are players who can truly make their teams play better. My problem is that I don't see a possible way to measure that fairly. I think making the criticism is worthwhile; that's what we're here for, to debate.
What catches my attention is how you directly relate this concept of "the player's impact on the team" to "off-the-ball actions," as for me they don't have much to do with each other. If Müller has a great impact on how Bayern plays, this could easily be explained by actions he performs with the ball. In any case, the stats from sites like WhoScored or SofaScore are a minimal representation of reality (like all stats): they do not measure the complexity and creativity of certain actions, actions that Müller could be performing with the ball.
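The goal-share idea from the post above is simple enough to sketch in a few lines of Python. The four per-match figures are the ones quoted in the post; the function names are hypothetical, and treating a team with zero goals for and against as a 50% share is my own assumption for the edge case.

```python
def goal_difference(goals_for: float, goals_against: float) -> float:
    """Absolute goal difference per match, the quantity a raw Plus-Minus rewards."""
    return goals_for - goals_against


def goal_share(goals_for: float, goals_against: float) -> float:
    """Goals for as a fraction of all goals in the team's matches."""
    total = goals_for + goals_against
    if total == 0:
        return 0.5  # assumption: no goals at all counts as an even split
    return goals_for / total


# Per-match averages as quoted in the post.
teams = {
    "Napoli 1986/87": (1.585, 0.634),
    "Bayern 1976/77": (3.162, 1.703),
}

for name, (gf, ga) in teams.items():
    print(f"{name}: difference {goal_difference(gf, ga):.3f}, "
          f"share {100 * goal_share(gf, ga):.2f}%")
```

The ordering flips between the two measures: Bayern lead on absolute difference, Napoli lead on share. That is the whole point of the suggestion, although it does nothing about the separate objection that opponents are not all the same.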
1) Comparing players belonging to various circumstances. Let's assume the Plus Minus model was applied to all the players within the same league and the same season, the distribution of that sample was calculated, and each player's deviation from the norm was used as the metric. Would that be a possible solution? If Thomas Muller was a beneficiary of the circumstances set up by Bayern Munich competing within the Bundesliga, how would you explain the numbers represented solely within the Bayern Munich squad for that specific time-frame? Is the data still useless? 2) Schedule and availability. Availability is heavily factored into this particular formula. I don't mind that, since availability is never rewarded by the standard measures. If anything, being rested as opposed to being available for every fixture actually helps your per-90-minutes statistics, so availability is actually a minus in most metrics. Why does availability have to be punished so much? If a player can play every single match in a season, as opposed to being available only for the important matches, how is that not a plus? 3) Thomas Muller. I do not rate Thomas Muller with the ball at his feet as highly as you do. I think he is a much better player than the on-the-ball algorithms or the eye-test suggest. So when I see the disparity in rankings between WhoScored (a formula biased towards on-the-ball actions) and the Plus Minus model, and I see Thomas Muller ranked highly, I think his awesome off-the-ball play probably played a role in him skyrocketing up the rankings for the same sample. Let's just limit things to Bayern Munich. That should remove many of the issues you've raised. Thomas Muller has never topped the charts within his own team for his seasonal Bundesliga performances if you judge him by WhoScored. Not even once. Maybe that is fair. Maybe he was underrated. I personally think so.
So if this Plus Minus model rates him higher, I am guessing this model takes his off-the-ball attributes into account better than WhoScored does. If you ask me whether it is a direct one-for-one measurement of off-the-ball actions, of course not, but it's better than what we had before, no? Weren't we just rescrambling WhoScored numbers with different ratios a couple of pages ago? How is this not a positive direction? 4) Difficulty of capturing the full scope of football in numbers. It is better now than 10 years ago. It will be better 10 years from now. It will always be impossible. This is a step in the right direction. I am too stupid to recommend instant fixes on the fly, and if I were that competent, I wouldn't be sharing the ideas on a random internet forum. 5) Numerous pitfalls of the formula. They didn't stop you from gathering information about anything else. Why should we draw the line here? None of the discussions we have here solve football forever till the end of time. That was never an option.
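The "deviation from the norm within the same league and season" idea from point 1 can be sketched as a plain z-score per group. This is only one possible reading of the suggestion, and everything below (the player labels, the raw values, the grouping key) is invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_group(rows):
    """Convert raw plus-minus values into z-scores inside each league-season.

    rows: iterable of (player, group, raw_value) tuples.
    Returns a dict mapping (player, group) to the z-score.
    """
    rows = list(rows)
    values_by_group = defaultdict(list)
    for _player, group, value in rows:
        values_by_group[group].append(value)

    z_scores = {}
    for player, group, value in rows:
        values = values_by_group[group]
        spread = pstdev(values)
        # A group with no spread carries no ranking information.
        z_scores[(player, group)] = 0.0 if spread == 0 else (value - mean(values)) / spread
    return z_scores


# Hypothetical raw values: same shape, but one league has twice the goal scale.
rows = [
    ("A1", "high-scoring league", 1.0),
    ("A2", "high-scoring league", 2.0),
    ("A3", "high-scoring league", 3.0),
    ("B1", "low-scoring league", 0.5),
    ("B2", "low-scoring league", 1.0),
    ("B3", "low-scoring league", 1.5),
]
z = standardize_within_group(rows)
for key, score in z.items():
    print(key, f"{score:+.2f}")
```

In this toy data the top player of each league ends up with the same standardized score even though the raw numbers differ by a factor of two. It removes the scale of the league from the comparison, but it also assumes the two leagues are equally strong, which is its own debatable premise.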
Yeah, but you are assuming from the start that the Plus-Minus model is correct, or at least a very good method, a model that has Thomas Müller as the best in its entire database, and from that premise you draw your conclusions. Since WhoScored, which is a rating based on on-the-ball actions, does not have Müller at the top and the Plus-Minus model has him as the best, the logical conclusion is supposedly that Müller performs many off-the-ball actions that positively impact his team. Well, my point is this: the Plus-Minus model has many flaws, and Müller is definitely being massively overrated for the reasons I explained above. And the on-the-ball action ratings are much closer to reality. I mean, the best players in the WhoScored rating are exactly those you would expect to be the best, literally Messi, Neymar, Cristiano. There is no absolute truth, but I think WhoScored is closer to it. On the other hand, with the Plus-Minus model having Müller as the best, do you think that is a step in the right direction? I would say the opposite.
I truly do. The premise sounds much more ambitious than trying to count actions done with the ball and assigning points to them, which is basically a glorified Fantasy Premier League. I truly don't have that much respect for the tallying of on-the-ball actions. It gives too much weight to people like Adel Taarabt, who are better at proving that they have potency in attacking actions with the ball than they are at helping the team win. Is the Plus Minus model currently accurate, or even anywhere close to having been toyed around with as much as the tallying of actions with the ball? No, people have added up actions with the ball for decades on end. The sheer amount of practice and the number of people attempting it must be massively disproportionate. I hope you understand that the WhoScored formula, in my opinion, was probably first created not with an ambitious, unbiased objective in mind, but as a random formula with its ratios tweaked until all the biggest names, usually of the attacking kind, filled up the front page. If this is true, there was no aim to measure footballing effectiveness and see what pops out; rather, there was a need to capture all the biggest names first, and then the formula was reverse-engineered to mimic that big star list. Do you see the difference? The fact that the WhoScored formula captured all the big names like Neymar and Hazard is not a sheer fluke. It was a set goal, with hours of zero-insight mechanical tweaking and mind-numbingly painstaking work to guarantee it happened no matter what. Spectators enjoy on-the-ball magicians who contribute to the attack. Awards do also. It also helps your social media presence, your individual legacy, your YouTube highlights hit count, your ad revenue and probably your salary to be a good player via this pathway. It doesn't mean it is the best possible way to measure pragmatic effectiveness in terms of helping the team win.
I'm saying there is a clear limit to how much tweaking the various ratios of WhoScored-esque tallies will help evaluate players in their actual full scope, in purely pragmatic terms of helping the team win. The Plus-Minus model is one I see very high potential in, compared to the twiddling around with ratios for on-the-ball actions that ranges from WhoScored to Sofascore to FotMob. The work is nowhere near done, but if I were to assign a supercomputer of monumental capacity to the task of evaluating footballers for their effectiveness, not their Instagram follower count or number of mentions in YouTube comments, I would set the artificial intelligence to reading some of the work done in this particular manner, not WhoScored algorithms. I hope you get the point. I am not trying to argue Thomas Muller is number one. I like the premise behind the formula, think it can work with improvements, and wish more companies followed this path rather than seeing which dribbler plays the most like Lionel Messi.
The system of measurement that on-the-ball algorithms like WhoScored, SofaScore, etc. have put in place is purposely designed so that Messi can literally never fail. It is as if the Messi way of playing football is the only way, and it rewards the players who mostly play like Messi. Ronaldo's strengths are given no weight whatsoever. When Ronaldo makes a run, he is using his vision to read the game, his timing to create the opportunity and his skill to control the pass. Why is the passer automatically doing more? It's bullshit. Without creative and skillful off-the-ball movement there would be no pass to begin with. In the analysis of a game, Ronaldo's movement, placement and timing aren't valued or statistically tracked, even though the chance is a shared effort. This is just one example of how judging a player is skewed towards Messi, and the list goes on.
Look at it this way, for simplicity's sake. Player A has 100 shots on target and scores 50 goals. Player B has 100 key passes, 50 of which become assists. By taking into consideration only goals and assists, we ignore 50 shots on target that weren't scored and 50 key passes that didn't end up as assists. Now let's look at those shots on target and key passes that were ignored. What did we exclude from consideration? In the case of shots on target, we've excluded some great shots that were saved, some normal shots that were saved and ALSO some big-chance misses that end up as shots on target. For example, if someone takes a penalty in a horrible way but hits the target, it will be registered as a shot on target. In this case, the player was supposed to score, and managing only to register a shot on target is actually a bad thing: it has a negative value and impact. It is bad to miss a big chance even if it was a shot on target. In the case of key passes, there are no circumstances in which a key pass is a negative thing with a negative value and impact. The worst-case scenario for a key pass is that player B makes a 2-meter pass to a teammate 50 meters from goal and the teammate shoots. That will be registered as a key pass even though player B hasn't done anything worthy of recognition. In the worst case, a key pass has minimal positive value. There is no such thing as a negative key pass. In the worst-case scenario, a shot on target has a big negative impact. That is not the same. More key passes always add value to a player's contribution, while more shots on target don't necessarily add value, because we don't know what kind of shots on target we are adding (positive or negative impact). A player could register a lot of shots on target by missing big chances, or by making decent efforts and being unlucky with the goalkeeper's performance. This is why key passes are used in this kind of evaluation and not shots on target.
The sheer volume of shots on target doesn't say anything about whether they had a positive or negative impact.
Again, you're giving an extreme example. I get your point, but what about the actual data I've shown, that Haaland's shots on target are more likely to turn into goals than KDB's key passes are to turn into assists, according to the xA and xGOT models? In that case I would say that Haaland's shots on target are more valuable to Man City than KDB's key passes. And again, we're talking about the best playmaker in the world. An average player with a lot of key passes will likely have an even worse ratio of xA per key pass.
There’s a real football logic behind why algorithms like those of WhoScored and SofaScore value a certain number of actions in the way they do, and it makes all the sense in the world, because in football the team that generates more actions wins. Football is a sport of many chances and few goals. The more chances you generate, the higher your chances of scoring a goal and, therefore, winning. The best players in history coincidentally generated a massive amount of positive actions. Doesn't that tell you something? Pelé, Maradona, Messi, Cruyff, Puskás, Di Stéfano, the list goes on. These are the players who had the most presence in their teams' play, the ones who generated the most dribbles, took the most shots, made the most key passes (and also lost the ball the most). In general, it’s very clear that what makes you great in football has to do with that. That’s how you gain an advantage over your opponent. Obviously, there are many other factors that are impossible to measure statistically, like intelligence, creativity, technique, etc. On the other hand, you're talking to me about 'off-the-ball actions' and mentioning a model that doesn’t even have anything to do with off-the-ball actions. Besides, I think this Plus-Minus model has so many obvious flaws that it’s really useless. You also assume it has something to do with the 'off-the-ball actions' that enhance players' performances. I honestly don’t see the logic in that.
Of course, but not only that, assuming that xG represents the entirety of the real value of a play is nonsense. If a key pass has a low xA value, that still tells you almost nothing about the real value of that pass. A player could have made a very creative and intelligent pass at the perfect moment, and the xA model tells you nothing about that. Furthermore, before making the pass, the player could have, for example, dribbled past opponents while carrying the ball across the field. A key pass can have much more value than what the xA model can even measure. On the other hand, saying that if a shot has a high xG value, it means that the player who took the shot was the one who created the chance is also nonsense. Literally just tapping the ball into the goal would be considered a high-value action with that logic, and it’s the complete opposite. What has 'individual' value would be receiving the ball far from the opponent’s box and having that same player finish the play with a goal.
I don't understand exactly where you are coming from. Are you upset with the current unpolished conclusions of the attempts, or are you absolutely, verifiably sure that this manner of statistical analysis is an absolute dead-end? The slightest tweaks in variables and changes in formula give way to different lists of names. Is it the final list you seek? Or do you want me to agree that no mathematical formula can ever be attempted for football, and that we should let the people who have pre-determined lists just assign numerical values and pretend to measure football mathematically? If it is the latter, I think we have a severe clash in philosophy. I do not wish to continue the argument if this is the case. The cause of the conclusions made in the journal does not have to be exclusively off-the-ball actions; it can be any actions, as you pointed out. I just assumed that players like Thomas Muller and Kyle Walker, who provide much value beyond their on-the-ball action tally and are therefore somewhat under-appreciated by algorithms like WhoScored, have good reason to be appreciated by another metric that attempts to take into account things beyond a mere on-the-ball action count with ratios. Didn't I already say this is not a direct quantification of off-the-ball actions the way you count key passes? If this is an impossible step in logic, or an impossible conclusion, maybe we have differing ways of thinking. If you are talking about the immediate application of the conclusions, of course the WhoScored algorithms align much more closely with the general public consensus. The formula was most likely, in my opinion, created with the agenda of capturing all the famous names, and constantly tweaked until it hit the target list. How is that somehow an objective premise set with the goal of measuring pragmatic effectiveness? That is just raw man-hours wasted on making a glorified famous-name list with trivia and numbers.
If a formula with zero prior agenda, and zero post-conclusion modifications immediately was able to recognize all the cases of successful footballing individuals, that is the sign of us mathematically solving football. You can literally write a code off-the-formula. That goal will never be reached. Doesn't mean that the mathematical formulas can never be attempted and reformulated. It seems you are more concerned about the current hierarchy of the lists, and wish to banish all attempts as useless and ultimately futile. I am too stupid to come up with a better mathematical formula on the spot. As are you. Are you so definitely sure that no superior intellectual being, artificial intelligence or otherwise, with more resources and time, can ever hope to come up with a formula with a similar mathematical premise but with modifications to align with reality better? Is that really your conclusion? If so, yes, this is a step in the wrong direction. I'll stop here.
I think I got too pissed off, and didn't really give much of a productive outlet for the conversation. The possibilities are endless with the equations; it is never the finished product. Here's a rough suggestion that the paper doesn't seem to address in full (not sure, but I think this is the case). Let's say a player being subbed on with 15 minutes of playtime has differing expectations in the following three scenarios: 1) Both teams are relatively stable, and somewhat content with the status quo in terms of the scoreline. 2) The team the player is being substituted on for is desperately chasing a goal. 3) The team the player is being substituted on for is desperately defending a lead. We can just use the data of thousands and thousands of prior games that had such conditions to calculate and analyze how much those pre-existing circumstances affect the odds of the substitution having a positive impact, and try to incorporate that into the equation. So on and so forth. After thousands of hours put into the equations by competent and intellectually sound people, I think the conclusions will be really appetizing to consume. Now, the workload is immense, and it is way harder than tallying actions around the ball and assigning ratios to them. It is why I am not suited to the task in any shape or form, and why I think artificial intelligence or the right number of competent people might take on the insanely complex mathematical work in the future. However, it is something I really want to see.
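A very rough Python sketch of that game-state idea, purely to show the shape of the calculation: estimate a baseline outcome for each of the three scenarios from past substitutions, then judge a new substitution against the baseline for its own scenario rather than against zero. The state labels, the tiny "history" and the outcomes are all made up; a real version would need thousands of matches and far more context.

```python
from collections import defaultdict


def baseline_by_state(history):
    """Average goal difference after a substitution, per game state.

    history: iterable of (state, goal_diff_after_sub) pairs from past matches.
    """
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum, count]
    for state, goal_diff in history:
        totals[state][0] += goal_diff
        totals[state][1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


def adjusted_impact(state, observed_goal_diff, baseline):
    """Observed goal difference minus what that game state usually produces."""
    return observed_goal_diff - baseline[state]


# Invented history covering the three scenarios from the post.
history = [
    ("level", 0), ("level", 0),
    ("chasing", 1), ("chasing", 0),
    ("protecting", -1), ("protecting", 0),
]
baseline = baseline_by_state(history)

# Two hypothetical substitutes with different raw outcomes.
chaser = adjusted_impact("chasing", 1, baseline)        # team scored once after he came on
protector = adjusted_impact("protecting", 0, baseline)  # team held the lead
print(baseline)
print(f"chasing sub: {chaser:+.2f}, protecting sub: {protector:+.2f}")
```

In this toy example the substitute who merely held a lead gets the same adjusted credit as the one whose team scored, because holding a lead is better than what that situation produced on average in the invented history. Whether real data would show anything like that is an open question.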
Don't you see that football is a sport where teams take 20-30 shots on goal and barely score one? If you know in advance that a lot of chances will be missed, the best way to maximize your probabilities of scoring is by creating as many chances as possible. Obviously, you have to give importance to the quality of those chances, but the priority is the quantity of chances. The best players in history are characterized by being beasts in terms of the quantity of actions. Pelé, Messi, Maradona were always the ones who generated the most shots, key passes, and dribbles because that's exactly how they won matches. In fact, they were also the ones who lost the ball the most, precisely because they were the ones who took the most risks. The concept would be one of 'brute force': you try, try, and try until at some point it works.
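The 'brute force' argument can be put into one line of probability, under a strong simplifying assumption that is mine and not the poster's: every shot is independent and has the same chance of going in. Under that assumption the chance of scoring at least once from n shots is 1 - (1 - p)^n. A small Python sketch, with an invented conversion rate:

```python
def prob_at_least_one_goal(shots: int, conversion: float) -> float:
    """Chance of at least one goal from `shots` independent, equally likely attempts."""
    if shots < 0:
        raise ValueError("shots must be non-negative")
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must be between 0 and 1")
    return 1.0 - (1.0 - conversion) ** shots


# Hypothetical conversion rate of 5% per shot, at two different volumes.
p_ten = prob_at_least_one_goal(10, 0.05)
p_twenty_five = prob_at_least_one_goal(25, 0.05)
print(f"10 shots: {p_ten:.1%}")
print(f"25 shots: {p_twenty_five:.1%}")
```

At a fixed quality per shot, more volume always raises the chance of scoring, which is the point of the post. The sketch deliberately ignores the trade-off where extra volume comes at the cost of lower quality per shot.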