The point is that if someone takes a low 0.02 xG shot that hits the post, that is pretty much the expected outcome - a near-zero expectation of scoring. You want to introduce an element of luck - which is a fine and human thing to do - but it has nothing to do with probability and is not something that can be measured.
This was the biggest insight of xG IMO - virtually everything that pundits, coaches, etc. thought mattered didn't matter that much. Take RvP. He looks amazing with his tekkers - but in fact his great skill was his shooting locations. We got to witness Eddard Moyes tossing in 70 crosses per game because he was a dinosaur who didn't understand it, while the man he replaced, Rene Meulensteen, had optimised RvP's play around this single idea.
That's not the job of stats though. The point you are addressing is more to do with the coach's or the player's decision about whether the shot is worth hitting regardless of its low xG. Good coaching would be using xG models to say this is where your run needs to be taking you. The technical stuff is all player dependent. Martinelli is the perfect example. His shot volume seems to be improving, and a lot of this is just having left footers who can pass onto a run that lets him shoot with his right foot from a high xG position. A right footer making the same pass wouldn't set him up the same way, because of body positions and the path the ball needs to take.
Huh? Why is factoring the outcome into the algorithm/assessment (how close the shot actually was to the goal, whether it hit the woodwork, etc) introducing an element of luck? To me it's a fact and arguably worthy of inclusion when grading the shot. And I think it was you who said that some xG models do factor in outcomes like this, no?
YES! And his partnership/interchange with Kai has become quite good and valuable. Which reminds me of a thought I had mid-match: I wonder what the comparative heatmaps or average positions would be for Kai vs Leo? Anyone got a quick way of finding those? Also, the other player that I'm quickly becoming a big fan of is Calafiori. So many things to like, not just him popping up in the #10 slot Total Football style. I just need a few more games of watching his defending ability to be totally sold on him as our permanent starting LB (or as our #3 CB??)
No, you're still not thinking about it correctly. If say Duran hits a laser from 45 yards out, it's a very low xG regardless of whether it goes in or hits the post or sails over. If Hleb has a tap in from inside the 6, it's high xG regardless of whether he actually scores or not.
Actually I've understood that about xG all along, for quite some time now. But what I was just questioning (and you just replied to) was Jitty deeming the outcome of a shot to be luck rather than factual data. Especially when he (and others?) said that there are actually alternate/newer xG models that do factor in shot outcome. If it's true that such alternate models exist, then wouldn't that imply that outcome can be just another factual data input to an algorithm? And how far a shot ends up away from the goal could be measured/estimated in a similar way to how far away a shot is taken from? I get that that would make such an alternate xG model less purely predictive. And maybe less pure probability theory perhaps. But is that such a fundamental problem? My interest in this discussion hasn't been about math purity but on weighing the value of a shot, or a series of shots, which actually happened. Some of which were quite nearly goals and others which were far from it. All I've really been talking about this whole time is essentially this: two shots with the same "origin data", one hits the bar and one sails into Row Z. Wasn't the former closer to being an actual goal than the latter? That's surely how it feels to this human.
Can’t we just accept that Northbank is bad at math? I feel like we’ve been trying to explain xG to him for ten years now.
I guess you didn’t read my reply to Yos before posting that. If that guess was wrong then perhaps we should just accept that you are bad at reading comprehension?
Yes, but… The best xG models do incorporate things outside of shot location, including type of shot (foot strike or header, etc.), height data (where the ball is physically in the vertical axis at the time of the strike), type of action that preceded the strike (dribble, through ball, cross, cutback, etc.) and location data of the defenders.
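To make the list of inputs above concrete, here is a minimal sketch of a pre-shot model, assuming a logistic form (a common choice for public xG models, though the post doesn't say which form the "best" models use). Every coefficient and the function name `toy_xg` are made up for illustration; nothing here is fitted to real shot data or taken from any actual provider.

```python
import math

# HYPOTHETICAL coefficients, invented purely for illustration.
# Real xG models are fitted to large shot datasets.
INTERCEPT = 0.5
COEF_DISTANCE_YARDS = -0.15
COEF_HEADER = -0.8
COEF_DEFENDERS_IN_PATH = -0.4
COEF_CUTBACK = 0.5


def toy_xg(distance_yards, is_header=False, defenders_in_path=0, from_cutback=False):
    """Return a toy pre-shot scoring probability between 0 and 1.

    Only information available at the moment of the strike is used.
    Where the ball ends up (goal, post, Row Z) is not an input.
    """
    z = (
        INTERCEPT
        + COEF_DISTANCE_YARDS * distance_yards
        + COEF_HEADER * (1 if is_header else 0)
        + COEF_DEFENDERS_IN_PATH * defenders_in_path
        + COEF_CUTBACK * (1 if from_cutback else 0)
    )
    # Logistic function squashes the score into a probability.
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    print("45-yard laser:", round(toy_xg(45), 4))
    print("tap-in from 4 yards:", round(toy_xg(4), 4))
```

The design point is the function signature: the shot's outcome never appears in it, so the same shot gets the same value whether it goes in, hits the post, or sails over.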
A shot that hits the post and doesn’t go into the goal is definitionally off-target. I’m so sorry that I didn’t add the italicized bit to my previous post. Jesus.
You’re looking for a different metric - xClosenessToBeingAGoal. Serious question, if you start measuring how far away the shot went vs hitting the bar — what are you trying to ascertain? Do you just want xG to always give you the exact score? Because it’s not possible — players miss sitters all the time ….
I was the one who referenced post-shot xG models, which include where the shot actually ended up going rather than just the data I referenced in a post above. I’ve not been a huge fan of post-shot xG in general because I feel like it’s just an effort to get closer to the actual outcomes than traditional xG which is inherently and intentionally at a remove from outcomes. At which point, we can just watch the match and see what actually happened, so there’s no need for xG anymore. The point of xG is to try to evaluate shot actions without knowledge of the outcome so that the desirable outcomes become more predictable and thus coachable. However, post-shot models do better describe what actually just happened in a given match, which can lead to additional insight about individual players’ quality over time.
Not exactly. I’m not actually “looking” for a new metric, but arguing the merits of one which incorporates outcome into its algorithm more. So I guess if I had to name it, it would be more like: HowCloseWasThatShotToBeingAGoal. Note that I didn’t use the x-prefix because it wouldn’t be purely expected/predictive as traditional xG is. It would blend prediction and actual outcome. So a sitter from 3 yds in front of an open goal would be high, like 0.9 maybe? And a shot from 20+ yds that strikes the inside of the post, caroms across the goalmouth and hits the other post would also be high, like 0.7? The end result, especially in the aggregate, would be a measurement of how many goals you were close to scoring or conceding. I’ve never had this thought before tonight, but from what someone said earlier, others in the stats field may have done. So maybe I’m not as stupid crazy as some here would make you believe?
Aha, it was you not Jitty? Or you and Jitty? Anyway thank you for your reasonable, civil, not hostile reply. It was refreshing!!!
Again, if you just want a stat that matches the outcomes exactly, what’s the point? You can see the outcomes by watching. And even if that shot goes in, the expectation that it would go in 2% of the time is still correct. It’s the whole point.
Yeah. This whole idea of luckiness, or the other team being 'close to scoring' is a human vibe that isn't a probabilistic thing. Like you say, before the outcome, it was a 2% chance. And after the outcome it was zero. There never was a time it was close from a mathematical perspective. It's like if you drop your phone next to your full coffee cup and think "that was a close one". As a human heuristic that is useful - but as a question of probability, it isn't a thing.
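For anyone who wants the arithmetic behind the 2% point, here is a small sketch. It assumes each shot is independent of the others (a simplification, real shots in a match are not fully independent), and the function names are just illustrative.

```python
def expected_goals(shot_probs):
    """Total xG is simply the sum of the per-shot probabilities."""
    return sum(shot_probs)


def prob_at_least_one_goal(shot_probs):
    """Chance that at least one shot goes in, ASSUMING independent shots."""
    prob_none = 1.0
    for p in shot_probs:
        prob_none *= 1.0 - p  # this shot misses
    return 1.0 - prob_none


if __name__ == "__main__":
    fifty_long_shots = [0.02] * 50
    print("xG from fifty 2% shots:", expected_goals(fifty_long_shots))
    print("P(at least one goes in):", prob_at_least_one_goal(fifty_long_shots))
```

A single 0.02 shot going in doesn't make the 0.02 wrong. Over fifty such shots the model expects about one goal, and under the independence assumption there is a bit under a two-in-three chance that at least one of them scores.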
So is the metric for xG constantly in flux? If this season there's a spike in goals from 20 yards out, will that shot go from a 0.2 rating to a 0.3, for example?
I mean, yes, new data is constantly added to the models which will naturally move values over time. And the proprietors of these models even update them as they learn more. But we’re getting to the point where we have enough info and experience that big changes in any individual shot type probably won’t ever happen again. (And of course there are lots of different xG models out there which value things differently based on the datasets they have access to, which is why you get things like the huge variance in our xG versus Leicester, where Understat had us around 6 xG and xG Philosophy had us around 4.5 iirc. Both of those are crazy high for any individual game.) As an aside, I know you were just throwing out random numbers for illustration, but just so nobody gets the wrong idea, a long range shot like the one from Nuno Mendes that hit the outside of the post is like a 0.02 value. 0.2 is like a super high value chance, about double the average shot.
I think it's pretty well settled that the location is the most powerful variable. Like any multi-variable equation, you can get optimized results by recreating the power of shot location in aggregate. But some of the power of shot location can't be recreated, like where the keeper needs to be, or the fact that you can't just get clattered. I had the conversation about Palmer's almost-assist last weekend and why it was a crap pass. It had location, but all the other factors were not quite right for generating the outcome the shot location should have represented. A better pass would have led to a first-time shot from a prime location. Jackson will get the blame, and yeah he could have done better, but it wasn't as easy a chance as people are making out.
On the xCondescension metric my post was actually a 0.0. Honest true fact. Sorry you didn't see that.