The Rise and Fall of Expected Goal Totals (ExpG) as a Metric in Football Data Analysis
When Manchester United drew 2-2 at home to Fulham in 2014, Twitter went into meltdown because the football “data analysts” considered that United should have won the game, as their ExpG was 4+.
What is ExpG?
ExpG is expected goals in a game: how many goals a team should have scored given their “chance creation”. A chance does not need to be a shot. For example, a player could do an air shot from close range, and that needs to be recorded in the same way as if he had connected with the ball.
How do you key ExpG?
The first media exposure given to ExpG that I am aware of was an article in the Guardian (http://www.theguardian.com/football/blog/2013/feb/24/football-numbers-game-gary-neville). In the article we were advised that Opta had used a database of thousands of matches to develop a model that quantifies the chance of a shot going in depending on its location.
“When Newcastle lost 2-1 at home to Reading last month they had 56% possession and 16 shots to seven. But, as Green points out: ‘Reading created two excellent chances – Pavel Pogrebnyak’s miss in the 27th minute (goal probability 49%) and Adam Le Fondre’s opener (from point-blank range, 69%), as well as his second (17%) – while Newcastle only had one very good chance: Papiss Cissé’s shot in the 30th minute (from just outside the six-yard box, 34%).’”
“Using the model, Newcastle had a goal expectancy of 1.4, with Reading slightly better at 1.6. The bald stats told one story, the more detailed analysis another.”
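As a rough illustration of how a goal expectancy like the 1.6 quoted above could be built up, the sketch below simply adds the per-shot goal probabilities. The additive model and the names used (reading_chances, expected_goals) are assumptions for illustration only; the article does not describe how Opta actually combines shots. Only Reading's three quoted chances are used, so the remainder is what their other shots would have to contribute under this assumption.

```python
# Goal probabilities quoted in the Guardian article for Reading's three best chances.
reading_chances = {
    "Pogrebnyak miss, 27th minute": 0.49,
    "Le Fondre opener": 0.69,
    "Le Fondre second goal": 0.17,
}


def expected_goals(shot_probabilities):
    """Sum per-shot goal probabilities (assumed additive model, not Opta's confirmed method)."""
    return sum(shot_probabilities)


quoted_total = expected_goals(reading_chances.values())
reported_total = 1.6  # Reading's goal expectancy as reported in the article
remainder = reported_total - quoted_total  # left for Reading's other shots

print(f"Quoted chances: {quoted_total:.2f}")   # Quoted chances: 1.35
print(f"Other shots:    {remainder:.2f}")      # Other shots:    0.25
```

On this reading, three chances account for most of Reading's total, which is why the shot count alone (16 to seven) told a different story.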
When Ibrahimović missed a late opportunity to score for United, I was expecting a discussion from the current crop of analysts on the expectation of the chance ending in a goal, but there was a muted response to the miss.
The Limitations of ExpG
Is ExpG repeatable?
Liverpool are a great example of why we should not rely on expectation of goals, as the betting syndicates have discovered in recent seasons. Liverpool's “chance creation” in games was so high that they were expected not only to win games but to win them by a wide margin, which did not happen.
Individual players are not machines: they can suffer from fatigue and make errors. Although shots-on-target production can be consistent, it is known that accuracy is inconsistent.
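To see why the same chances need not repeat the same scoreline, here is a minimal sketch that turns a set of shot probabilities into a distribution over actual goals. It assumes each shot is an independent event with the quoted probability, which is a simplification of my own and not something the Guardian article or Opta states. The function name goal_distribution is hypothetical, and the inputs are Reading's three chances from the article quoted earlier.

```python
def goal_distribution(shot_probabilities):
    """Return [P(0 goals), P(1 goal), ...] assuming independent shots."""
    dist = [1.0]  # before any shot, 0 goals with certainty
    for p in shot_probabilities:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # shot missed
            new[goals + 1] += prob * p     # shot scored
        dist = new
    return dist


# Reading's three quoted chances: 49%, 69% and 17%.
reading_chances = [0.49, 0.69, 0.17]
dist = goal_distribution(reading_chances)

for goals, prob in enumerate(dist):
    print(f"{goals} goals: {prob:.3f}")
```

Even under this simple assumption, chances adding up to 1.35 expected goals leave a real probability of every outcome from zero to three goals, so a single game is a poor test of the metric.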
We need to consider that there will be more expectation of Peter Crouch scoring with a header from 10 yards than of Messi doing so, so if you are relying on data that does not include defensive pressure etc., it will be very limited.
The effect of game state on accuracy.
If you are calculating ExpG as time decays in a game, it is important to consider that game state (the current score), together with the time of the opening goal, will affect the accuracy of teams during the game, and this needs to be factored into the calculations.
ExpG can be very subjective, and it is very dangerous to apply as a metric for predicting a lack of goal production, as a game could have low “chance creation” and then be ignited by a goal that triggers further goal(s).
Keying “chance creation” data has been around for a number of years, and it appears that the current crop of football data analysts who saw the expected goal metric as the “latest love child” are discovering that the metric is not as sophisticated as it looks, for the reasons outlined above.
Manchester United at home in the Premier League since 2008-2009 when 0-0 on 79 minutes (final scores):
0-0 1-0 0-1 0-0 0-0 0-0 0-0 0-1 0-0 0-1 1-0 1-0 0-0 1-0
50% over 0.5 goals
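The 50% figure can be checked directly from the list of results. The short sketch below assumes the scores listed are the final scores of games that stood at 0-0 on 79 minutes, so any goal in them was scored late; the variable and function names are illustrative only.

```python
# Results as listed above, assumed to be final scores of games that were 0-0 on 79 minutes.
final_scores = "0-0 1-0 0-1 0-0 0-0 0-0 0-0 0-1 0-0 0-1 1-0 1-0 0-0 1-0".split()


def total_goals(score):
    """Total goals in a 'home-away' score string such as '1-0'."""
    home, away = score.split("-")
    return int(home) + int(away)


games_over = sum(1 for score in final_scores if total_goals(score) > 0.5)
share_over = games_over / len(final_scores)

print(f"{games_over} of {len(final_scores)} games over 0.5 goals: {share_over:.0%}")
# 7 of 14 games over 0.5 goals: 50%
```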