These tables show the Ranked Probability Scores (RPS) for the following public prediction models since GW 9 (330 games) of the Premier League:
For links to the models and how to read this table, scroll down.
accastats, @alfa_data, @bananapredict, betbot, betbrain (betting market), betegy, betstatz, betstudy, clubelo, desportz, euroclubindex, five38, football1x2tips, forebet, fupro, @fussbALEXperte, gamcast, @goalprojection, iambettor, @jallan99, kicker, kickoff, mybet.tips, @Nivol3000, @petermckeever, @thepredictaball, SciSports, scometix, scorepredictor, @oh_that_crab, sofascore, sport-insight, statarea, @teouchanalytics, vitibet
The betting market is represented with the Betbrain model, which uses the highest available odds offered on Betbrain. To make this model more transparent, I am using the odds @12Xpert offers on his website for everyone to download here.
Also added is a combination of all models (the mean average of all predictions).
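Averaging the models works on the probability level: for each match, the combined model takes the mean of every model's [home, draw, away] vector. A minimal sketch (the function name `combine` is mine, not from the table):

```python
def combine(predictions):
    """Average several models' probability vectors for one match.

    predictions: list of [p_home, p_draw, p_away] lists, one per model.
    Returns the element-wise mean, which is again a valid
    probability vector (the components still sum to 1).
    """
    n = len(predictions)
    return [sum(p[i] for p in predictions) / n for i in range(3)]
```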
What is the RPS?
The Ranked Probability Score is a statistical method to evaluate the quality of predictions with outcomes that have an order to them. This makes it a good tool to measure the quality of football predictions, as football matches have ranked outcomes (home win > draw > away win).
The RPS measures how far the predicted probabilities were from what actually happened. A low RPS is therefore a good RPS, as it more or less represents the ‘error’ in the predictions.
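Concretely, the RPS sums the squared differences between the cumulative predicted probabilities and the cumulative observed outcome, then normalises by the number of outcomes minus one. A minimal sketch for a single three-outcome match (the helper name `rps` and the outcome encoding are mine):

```python
def rps(probs, outcome):
    """Ranked Probability Score for one match.

    probs: predicted probabilities in ranked order [home, draw, away]
    outcome: index of the observed result (0=home win, 1=draw, 2=away win)
    """
    observed = [1.0 if i == outcome else 0.0 for i in range(len(probs))]
    cum_p = cum_o = 0.0
    score = 0.0
    for p, o in zip(probs, observed):
        cum_p += p
        cum_o += o
        score += (cum_p - cum_o) ** 2
    return score / (len(probs) - 1)

# A perfect, certain prediction scores 0:
# rps([1.0, 0.0, 0.0], 0) → 0.0
```

Note that the cumulative sums are what make the score "ranked": predicting an away win when the home side wins is punished more than predicting a draw, because the miss is two steps along the ordering instead of one.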
The RPS is one of many scoring rules that can measure the accuracy of predictions. If you want to read more on why it is important to keep score of predictions and how scoring rules work, you can read my blogpost on Hindsight bias and how to measure the accuracy of football predictions.
Perfect and z-score column
Those two columns show a bit more about how to interpret the RPS for each model. I’ve simulated all matches a few thousand times based on each model’s probabilities to calculate the average RPS every model would achieve if its probabilities were perfect.
So, if a model perfectly captured reality, we would expect an RPS close to the value in the ‘perfect’ column.
The z-score shows how many standard deviations the actual RPS strayed from the mean (average) of the simulated predictions.
This can help us identify overconfident and lucky prediction models, which we could not identify with their actual RPS alone.
A very low (negative) z-score indicates that the model was overconfident. A high (positive) z-score indicates that even if the predictions were perfect, the model still got very lucky with the actual outcomes of the games.
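The simulation behind those two columns can be sketched as follows: for each simulated season, draw an outcome for every match from the model's own probabilities, score it with the RPS, and average. The mean of those simulated averages is the ‘perfect’ column; the z-score compares the model's actual RPS against that distribution. This is my reconstruction of the idea, not the author's exact code; `simulate_perfect_rps` and its parameters are illustrative names:

```python
import random

def rps(probs, outcome):
    """Ranked Probability Score for one match (see above)."""
    observed = [1.0 if i == outcome else 0.0 for i in range(len(probs))]
    cum_p = cum_o = score = 0.0
    for p, o in zip(probs, observed):
        cum_p += p
        cum_o += o
        score += (cum_p - cum_o) ** 2
    return score / (len(probs) - 1)

def simulate_perfect_rps(model_probs, n_sims=2000, seed=42):
    """Estimate the RPS a model would score if its probabilities
    perfectly described reality.

    model_probs: one [p_home, p_draw, p_away] list per match.
    Returns (mean, std) of the average RPS across n_sims simulated seasons.
    """
    rng = random.Random(seed)
    sim_means = []
    for _ in range(n_sims):
        total = 0.0
        for probs in model_probs:
            # draw the match outcome from the model's own probabilities
            outcome = rng.choices([0, 1, 2], weights=probs)[0]
            total += rps(probs, outcome)
        sim_means.append(total / len(model_probs))
    mean = sum(sim_means) / n_sims
    var = sum((x - mean) ** 2 for x in sim_means) / (n_sims - 1)
    return mean, var ** 0.5

# z-score, with the sign convention used in the table:
#   z = (simulated_mean - actual_rps) / simulated_std
# negative z → actual RPS worse than even a perfect model expects (overconfident)
# positive z → actual RPS better than a perfect model expects (lucky)
```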
If you have any further questions concerning the table or the explanations, just let me know.
Scometix, Sofascore and Kicker are, strictly speaking, not prediction models but voting percentages. Scometix and Sofascore let people vote on what they think the result of each match will be (home win, draw or away win). Kicker lets people predict the actual score of a match (2:1, 3:0, 1:2 etc.) and then displays how many people predicted a home win, draw or away win.
Note that some modellers missed a few deadlines: @thepredictaball (19 games), @jallan99 (13 games)