WITH THE winter break now upon us, it is a good time to look at where the title race stands.

As has been typical, and in line with the concepts laid out in my first column for The Celtic Way, the amount Celtic and Rangers spend on player wages relative to the rest of the league means it is effectively a two-horse race.

Here is the league table at the break:

[Figure: SPFL Premiership league table at the winter break. Image: The Celtic Way]

I did not share this concept in that initial column, but this graph is a log/log plot of the approximate wage structure of the SPFL Premiership versus league table position:

[Figure: log/log plot of approximate wage structure versus league table position. Image: The Celtic Way]

The straight line indicates a power-law distribution, and the graph resembles what is known as Zipf’s Law, a fascinating statistical phenomenon seen widely in the natural sciences and in human behaviour.

I have approximated the relative wage bill structure of clubs using the data I could obtain, estimated for last season, and generated the x-axis values by scaling Celtic’s league-highest wage bill to 100.
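As a rough illustration of why a straight line on a log/log chart points to a power law, the sketch below fits value = c * rank^k by ordinary least squares on the logged values. The wage figures are synthetic, generated from an exact Zipf-style curve with the top value scaled to 100; they are not the estimates used for the graph, and the function name is purely illustrative.

```python
import math


def fit_power_law(ranks, values):
    """Fit value = c * rank**k by least squares in log/log space.

    Returns (c, k). A straight line on a log/log chart corresponds to
    log(value) = log(c) + k * log(rank).
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                          # slope of the straight line
    c = math.exp(mean_y - k * mean_x)      # intercept, back on the original scale
    return c, k


# SYNTHETIC data (assumption): 12 table positions, top wage bill scaled to 100,
# each following position paid 100 / position, i.e. a pure Zipf-style curve.
ranks = list(range(1, 13))
wages = [100 / r for r in ranks]

c, k = fit_power_law(ranks, wages)
print(f"scale c = {c:.2f}, exponent k = {k:.2f}")
```

With real wage estimates the points would scatter around the line rather than sit on it, and the fitted exponent would describe how steeply spending falls away down the table.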

There are some within the analytics community who have taken heart from data such as the following, which is a table I’ve created using WyScout’s expected points model:

[Figure: table of actual points versus expected points, using WyScout’s model. Image: The Celtic Way]

As we can see from the table, Celtic have underperformed relative to what is known as ‘expected points’ (xPts), a model that tabulates the probability of points earned using expected goals (xG).

Obviously, the only table that actually matters is the one with the actual results but, like xG, the idea is that xPts can offer another way to measure underlying performances as the variance in events can be so significant in such a low-scoring sport.
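To make the idea of xPts concrete, here is a minimal sketch of one way match xG could be turned into expected points. It assumes each side's goals follow independent Poisson distributions with means equal to their xG. That is an illustrative simplification, not a description of WyScout's model, and the xG values in the example are hypothetical.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of scoring exactly `goals` when the average is `mean`."""
    return math.exp(-mean) * mean ** goals / math.factorial(goals)


def expected_points(xg_for, xg_against, max_goals=15):
    """Expected points for one match: 3 * P(win) + 1 * P(draw).

    ASSUMPTION: goals for each side are independent Poisson variables whose
    means are the two xG totals. Scorelines above `max_goals` are ignored,
    which is negligible for realistic xG values.
    """
    p_win = 0.0
    p_draw = 0.0
    for scored in range(max_goals + 1):
        for conceded in range(max_goals + 1):
            p = poisson_pmf(scored, xg_for) * poisson_pmf(conceded, xg_against)
            if scored > conceded:
                p_win += p
            elif scored == conceded:
                p_draw += p
    return 3 * p_win + p_draw


# Hypothetical match: 2.0 xG created, 0.5 xG conceded.
print(round(expected_points(2.0, 0.5), 2))
```

Summing this figure over every match played gives a season xPts total that can be set against the actual points column.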

Generally speaking, there is a school of thought that xG is a mean-reverting metric and that, given a large enough sample size, players and teams will revert to it. For example, a player who has scored at a high rate relative to xG is likely due for a stretch where they ‘underperform’ their xG, so that over a longer period, goals scored reverts close to xG.

To illustrate this, here is the data for Ryan Christie’s last three seasons, including so far this season at Celtic and Bournemouth:

[Figure: Ryan Christie’s goals versus xG over the last three seasons. Image: The Celtic Way]

We can see how dramatically Christie outperformed his xG in the 2019-20 season, and how he has subsequently underperformed. In fact, you may remember his four long-distance goals early that season: one versus Nomme Kalju and a hat-trick versus St. Johnstone in the league-opening 7-0 victory. He scored those four goals on an aggregate of 0.20 xG per WyScout’s model.

I have no specific issue with the mean-reverting concept, but believe it should be viewed as a rule of thumb rather than applied with a maximalist interpretation. Firstly, xG models vary depending upon assumptions, so an obvious question is to which model should the mean reversion occur. Also, the current generation of data capture for football is missing a vital component – measuring the ball.


Whether it is Steph Curry raining three-point shots from 30 feet, or David Turnbull taking a shot from 20 yards, I believe shooting or, in the case of football, striking the ball is a skill of its own. Capturing the velocity, spin and movement of the ball is likely to further advance how the game can be analysed, including the various advanced goalkeeper metrics.

Now back to the real issue at hand – is the fact that Celtic have underperformed xPts and Rangers have overperformed so far this season indicative of a pending mean reversion? Here are the average differences by table place each season, regardless of team, for the league dating back to the 2015-2016 campaign, which is the first for which WyScout offers data:

[Figure: xPts Difference by table position for each season since 2015-2016. Image: The Celtic Way]

So even if one were to assume mean reversion was inherent to performance via xG/xPts, the track record for variance over a full season has been quite significant. But is the variance in xPts to points mean reverting?

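[Figure: scatter plot of average xPts Difference against table position, with fitted cubic curve. Image: The Celtic Way]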

This scatter plot compares the average xPts Difference for each table position over the six seasons with table positions, this time as a linear graph. I’ve added what is known as a cubic curve, which seems to fit the data quite well. Cubic curves can be an indication of a non-linear relationship between two variables. Remember, the first graph above also displayed non-linearity via the power-law distribution.
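For readers curious how a cubic curve is fitted to a scatter plot like this, below is a minimal least-squares sketch using only the standard library. The data points are synthetic, generated from a known cubic purely to show that the fit recovers it; they are not the xPts Difference figures from the chart, and the function name is illustrative.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [a0, a1, ..., a_degree] for
    y = a0 + a1*x + ... + a_degree * x**degree.
    """
    n = degree + 1
    # Normal equations: a[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


# SYNTHETIC data (assumption): 12 table positions on an exact cubic.
positions = list(range(1, 13))
values = [0.01 * x ** 3 - 0.2 * x ** 2 + x - 1 for x in positions]

coeffs = fit_polynomial(positions, values, degree=3)
print([round(c, 4) for c in coeffs])
```

With only a dozen points, a cubic has enough freedom to fit noise as well as signal, so a good-looking curve should be read as suggestive rather than conclusive.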

Next, let’s take a look at plotting the wage bill structure to xPts Difference via a linear graph:

[Figure: wage bill structure versus xPts Difference, linear axes. Image: The Celtic Way]

Looks like it may be another cubic curve? If the statistical relationship suggests a non-linear relationship, what happens when we plot them on a log/log?

[Figure: wage bill structure versus xPts Difference, log/log axes. Image: The Celtic Way]

Quick note: I shifted the xPts Difference on the y-axis so that it is centred on 100 instead of zero, in order to adjust for the negative values, as log charts can only display positive values.
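The adjustment described in the note amounts to adding a constant before taking logs. A minimal sketch, with hypothetical xPts Difference values and an illustrative function name:

```python
import math


def shift_for_log(values, offset=100.0):
    """Add a constant so every value is positive and can go on a log axis."""
    shifted = [v + offset for v in values]
    if min(shifted) <= 0:
        raise ValueError("offset too small: some values are still not positive")
    return shifted


# HYPOTHETICAL xPts Difference values, some negative.
differences = [-8.0, -0.2, 0.0, 5.5]
shifted = shift_for_log(differences)
log_values = [math.log10(v) for v in shifted]
print(shifted)
```

Note that the choice of offset affects how the curve looks on the log scale, so the fit on the shifted chart depends partly on that choice.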

This does not compare values against a ranking, so something like Zipf’s law would not strictly apply, but we can see a pretty decent fit. Also, please remember the limited volume of data points in this exercise.

If I had the time and resources to test, my theory is that these statistical relationships may exist across leagues when using accurate data for things like wages and the most advanced xG/xPts models available.

This review has me sceptical that xPts is a mean-reverting model, at least for the SPFL Premiership given the wage bill structure. Both performance metrics like xG/xPts and the variation of xPts from actual points display potential evidence of non-linear statistical relationships.

For example, Celtic have outperformed their xPts in five of the six seasons in the sample, and the one exception was an underperformance of 0.2 last season, despite it being what many supporters would deem a calamitous campaign in which seemingly all that could go wrong did go wrong.

So while the future is obviously unknowable, it may be relatively normal for Rangers to maintain a healthy disparity between their points earned relative to xPts.

With the two clubs’ relative wage bills likely fairly close this season, I think the primary question for Celtic is whether they will ‘catch up’ with, and also begin to eclipse, their xPts.

The answer is more likely than not to be ‘yes,’ so in the next column we will take a closer look at underlying performance trends and review what opportunities and risks may lie ahead. Chief among them: is catching up likely to be enough?