Cheltenham 2018 – Interesting Stats and Potential Angles

With the first race of the 2018 Festival only six days away, I decided to crunch some numbers to see if I could come up with any interesting stats or angles which might help our punting next week. This year most of the data relates to a horse’s prep run and which course it was at. Does the form from some tracks work out stronger than others, and just how important is Cheltenham form? Those are just two of the questions I looked at. I also look at how the number of previous runs a horse has had affects its performance in the handicaps.

I can’t stress enough how important sample size is when doing data analysis, and when dealing with just four days’ racing, run once a year, the sample size is pretty small. To combat that, it’s imperative to decrease the effect of a few lucky wins by using either place data or percentage of rivals beaten. Place data has the added bonus of letting us gauge not only the effect a certain variable has, but also whether the market under- or overestimates that variable. So for this piece I have exported some data from Proform Software into Excel, and then imported Betfair Place SPs to go along with the Betfair Win SPs Proform provides. In the tables below the Place Strike Rates and Place AE are easily the most important figures: even though we are trying to predict winners, with smaller samples place data is a far better predictor of future winners than win data on its own. The stats articles I did for the 2016 and 2017 festivals contain different data and will still be relevant next week, so you should check them out as well.
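Percentage of rivals beaten isn't used in the tables below, but as a rough illustration of the idea, here is a minimal sketch. It assumes the common convention of (runners - finishing position) / (runners - 1); the function name and the 10-runner race are made up for the example.

```python
def percent_rivals_beaten(position: int, runners: int) -> float:
    """Share of rivals finishing behind a horse: 1.0 for the winner, 0.0 for last."""
    if runners < 2 or not 1 <= position <= runners:
        raise ValueError("need at least two runners and a valid finishing position")
    return (runners - position) / (runners - 1)


# A winner, a fourth and a last-place finisher in a hypothetical 10-runner race.
for pos in (1, 4, 10):
    print(pos, round(percent_rivals_beaten(pos, 10), 3))
```

Unlike a plain win/lose flag, every runner contributes some information, which is why a measure like this is less swayed by a few lucky wins.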

Track for horse’s last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Cheltenham       431    6.5%    19.5%     0.98    0.91
Leopardstown     366    12.3%   34.2%     1.42    1.31
Newbury          308    4.2%    17.2%     0.63    0.82
Kempton          275    7.6%    19.3%     1.22    0.95
Ascot            258    4.7%    17.8%     0.82    0.9
Doncaster        196    2.6%    17.3%     0.57    1.0
Sandown          177    3.4%    14.7%     0.62    0.72
Haydock          162    6.2%    16.7%     1.46    1.03
Punchestown      134    9.7%    25.4%     0.97    0.95
Warwick          110    6.4%    16.7%     1.31    1.25
Fairyhouse       91     9.9%    25.3%     1.56    1.22
All Tracks       3847   5.6%    19.1%     1.0     1.0


The first table above simply records the figures for the track at which a horse had its most recent run prior to Cheltenham, and includes all races at the festival. The data for all the tables is from the last eight years, which is long enough to give us a decent sample, but not so long as to make it irrelevant. Count is the number of observations, Win SR is win strike rate and Place SR is place strike rate. Win AE is actual wins divided by expected wins based on the Betfair Win SP, while Place AE is actual places divided by expected places based on the Betfair Place SP. For the Place AE figures I’ve highlighted significant returns in bold, which basically means there is a less than 5% chance such a return could have happened by chance. For both sets of AE figures, returns over 1.0 mean you would have made a profit blindly backing the horses to return a set amount, while figures under 1.0 would mean a loss. For example, a figure of 1.2 would mean you would have made a 20% ROI blindly backing the qualifiers.
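To make the AE definition concrete, here is a small sketch of how strike rate and AE could be calculated from raw results. It assumes expected wins are the sum of 1 / Betfair SP for each runner (the market's implied chance, ignoring commission); the function names and the five sample runners are invented for illustration and are not the actual Proform export.

```python
def strike_rate_and_ae(runs):
    """runs: list of (betfair_sp, hit) pairs, hit is True for a win (or place).

    Expected hits are assumed to be the sum of 1 / SP, i.e. the market's
    implied probability for each runner, ignoring commission.
    """
    count = len(runs)
    actual = sum(1 for _, hit in runs if hit)
    expected = sum(1.0 / sp for sp, _ in runs)
    return actual / count, actual / expected


def roi_backing_to_return(runs, target=100.0):
    """ROI when each stake is sized so that a hit returns `target`."""
    staked = sum(target / sp for sp, _ in runs)
    returned = sum(target for _, hit in runs if hit)
    return (returned - staked) / staked


# Made-up runners for illustration only: (Betfair SP, won?)
sample = [(4.0, True), (5.0, False), (10.0, False), (20.0, False), (2.0, True)]
sr, ae = strike_rate_and_ae(sample)
print(round(sr, 2), round(ae, 2), round(roi_backing_to_return(sample), 2))
```

When staking to return a set amount, the ROI works out as AE minus 1, which is why an AE of 1.2 corresponds to a 20% ROI.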

The first thing to note is that horses who prepped for the festival at Cheltenham itself don’t really do any better than average on win strike rate or, in particular, on the more robust place strike rate, which is surprising given how many people talk about how important track form is here. The standout piece of data from this table, though, is the performance of horses who had their last run at Leopardstown prior to coming to the festival. The win and place strike rates of 12.3% and 34.2% are very impressive and, perhaps even more so from a betting perspective, the Win AE and Place AE figures show we would have made a very decent profit blindly backing all such runners over the past eight years, with the place figure being highly significant. Of the others, the most notable finding is how poorly Newbury and in particular Sandown do, with poor strike rates, and the AE figures suggest the market also overestimates the form coming from those venues.

Running in Cheltenham Hcp – Track for last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Cheltenham       234    6.0%    17.9%     1.24    0.95
Newbury          186    2.2%    14.0%     0.45    0.77
Kempton          137    5.1%    14.6%     1.32    0.91
Leopardstown     119    7.6%    29.4%     1.46    1.47
Ascot            118    0.8%    13.6%     0.21    0.81
Sandown          113    2.7%    15%       0.58    0.82
Fairyhouse       44     11.4%   31.8%     2.17    1.62
All Tracks       1959   4.4%    17.6%     1.0     1.0


I broke the data down by whether the horse is running in a handicap or non-handicap at Cheltenham, and again Leopardstown performs really well on all metrics, with another significant Place AE. If you had backed each horse in a Cheltenham handicap over the last eight years that had its prep run at Leopardstown, you would have an ROI of +46% in the win market and +47% in the place market. Fairyhouse is another worthwhile finding, with good figures on each measurement, and while the sample of 44 is small, 14 actual places when only 8.6 would be expected given the prices is still a significant finding. Newbury, Ascot and Sandown do poorly, Ascot with only 1 winner from 118 runners. While horses having their prep run at Cheltenham have shown a profit in handicaps in the win market, the place figure, which is more solid, shows a loss with a Place AE of 0.95.
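As a rough check on the Fairyhouse figure, here is a sketch of a one-sided binomial test. It is not necessarily the significance test used for the tables: it makes the simplifying assumption that all 44 runners share the same average place probability (8.6 / 44), whereas in reality each runner has its own Betfair Place SP.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


runners = 44           # Fairyhouse-prepped handicap runners
actual_places = 14     # places actually recorded
expected_places = 8.6  # places implied by the Betfair Place SPs

# Assumption: every runner gets the same average place probability.
p_avg = expected_places / runners
p_value = binomial_upper_tail(runners, actual_places, p_avg)
print(f"P(14 or more places by chance) = {p_value:.3f}")
```

Under that assumption the probability comes out below the 5% threshold, in line with the finding being flagged as significant despite the small sample.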

Running in Cheltenham Non Hcp – Track for last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Leopardstown     247    14.6%   36.4%     1.41    1.26
Cheltenham       197    7.1%    21.3%     0.8     0.88
Ascot            140    7.9%    21.4%     1.13    0.96
Kempton          138    10.1%   23.9%     1.17    0.97
Newbury          122    7.4%    22.1%     0.77    0.87
Doncaster        98     2.0%    17.3%     0.42    0.97
Sandown          64     4.7%    14.1%     0.67    0.58
All Tracks       1888   6.9%    20.6%     1.0     1.0


For the graded races at Cheltenham, horses who had their prep at Leopardstown do miles better than the average, and again show a nice profit backing them blindly, with the Place AE figure being significant too. Once again a recent Cheltenham run doesn’t help, and the market seems to overrate such form too. While Doncaster does notably poorly in win strike rate terms, the place figure is not far below average, so it was probably just variance in the distribution of wins to places. Sandown, though, is again a good bit below the averages.

Running in Cheltenham Chase – Track for last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Cheltenham       237    6.8%    21.9%     1.02    1.0
Leopardstown     165    13.9%   38.2%     1.47    1.37
Ascot            138    7.2%    18.8%     1.16    0.88
Newbury          132    5.3%    18.2%     0.67    0.77
Kempton          130    8.5%    20.8%     1.11    0.89
Doncaster        111    3.6%    19.8%     0.77    1.08
Sandown          93     1.1%    18.3%     0.18    0.83
Fairyhouse       28     7.1%    39.3%     1.24    1.81
All Tracks       1828   6.1%    20.5%     1.0     1.0


I then split the data into horses running in chases and hurdles at the festival, including both handicaps and non-handicaps, as delving down any further would leave us with sample size issues. Leopardstown performs well again, as does Fairyhouse, and while the Fairyhouse sample is small the result is still significant. Only 1 horse who had its previous run at Sandown won a Cheltenham Festival chase on its next start, and while the place figure suggests variance could be at play, it is still interesting.

Running in Cheltenham Hurdle – Track for last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Cheltenham       182    6.0%    17.0%     0.88    0.81
Leopardstown     177    11.9%   32.2%     1.43    1.27
Newbury          158    3.2%    17.1%     0.54    0.86
Kempton          136    6.6%    18.4%     1.26    1.01
Ascot            106    1.9%    17.0%     0.37    0.9
Haydock          101    5.9%    16.8%     1.49    1.06
Sandown          81     6.2%    11.1%     1.24    0.58
Fairyhouse       51     9.8%    17.6%     1.51    0.87
All Tracks       1838   5.3%    18.2%     1.0     1.0


Leopardstown, as has been the case in all tables, again does well, and while Fairyhouse has a strong win strike rate in festival hurdles, the place strike rate is fractionally below average. Ascot form has a very low win strike rate, but the place strike rate is okay. Of more significance is the very low Sandown place strike rate of 11.1% and Place AE of just 0.58, with the better win strike rate likely due to a fortuitous distribution of wins to places.

Finished in first 3 last time out – Track for last run

Track            Count  Win SR  Place SR  Win AE  Place AE
Leopardstown     233    15.0%   37.8%     1.39    1.23
Cheltenham       196    8.2%    24.0%     0.82    0.85
Newbury          157    5.7%    22.3%     0.59    0.81
Kempton          156    8.3%    23.7%     0.99    0.95
Ascot            144    7.6%    22.9%     1.0     0.94
Doncaster        133    3.8%    19.5%     0.72    1.01
Sandown          115    3.5%    12.2%     0.52    0.51
Fairyhouse       73     9.6%    27.4%     1.35    1.22
Wincanton        47     0%      6.4%      0       0.34
All Tracks       511    6.5%    21.4%     0.92    0.96


I then filtered to only include horses who finished in the first three last time out. As expected, those who ran well at Leopardstown last time have a very good record at the festival, while Sandown and Wincanton perform very poorly on all metrics. Why this might be is hard to tell. It could be that form from those tracks is overrated by the handicapper or, for the AE figures, by the market. Maybe differences in course configuration mean horses who run well at those tracks don’t like Cheltenham as much. It could just be noise in the data as, after all, the samples aren’t massive. It also seems that while a good run at Cheltenham last time means a slightly higher than average strike rate, the market overestimates its importance, judging by the AE figures.

Cheltenham previous Course Winners

Race Type        Count  Win SR  Place SR  Win AE  Place AE
Handicap Chase   233    3.0%    22.3%     0.56    1.06
Handicap Hurdle  144    3.5%    13.9%     0.91    0.88
Non Hcp Chase    233    8.6%    22.7%     0.8     0.79
Non Hcp Hurdle   174    11.5%   29.9%     0.88    0.91
All Races        790    6.8%    22.8%     0.82    0.91


Given the previous tables showed that a horse running at Cheltenham on its previous start does little better than average at the festival, I thought I’d look at how previous course winners do. The above table seems to show that while the win and place strike rates are slightly over the averages for all horses, which from the first table are 5.6% and 19.1% respectively, the market again seems to overestimate the importance of a previous Cheltenham win. That is hardly surprising given how many pundits claim it to be of the utmost importance, apparently without any data to back up the claim.
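One way to see the overestimation is to back out the place rate the market was expecting. Since AE is actual divided by expected, dividing the actual strike rate by the AE gives the average market-implied rate. This is a minimal sketch using the "All Races" row; the function name is made up for illustration.

```python
def market_implied_rate(strike_rate_pct: float, ae: float) -> float:
    """Average rate the market expected, given the actual rate and its A/E."""
    return strike_rate_pct / ae


# "All Races" row for previous course winners: Place SR 22.8%, Place AE 0.91.
implied = market_implied_rate(22.8, 0.91)
baseline = 19.1  # Place SR for all festival runners, from the first table
print(f"actual 22.8%, market-implied {implied:.1f}%, all runners {baseline}%")
```

So course winners do place more often than the average runner, but by less than their prices suggested they would.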

The main points to take from this course analysis seem to be that Leopardstown form is strong and holds up very well at the festival, and that the market overestimates previous Cheltenham form. I’m not suggesting that if you want to win at Cheltenham you should run your horse at Leopardstown for its prep run. I very much doubt running at Leopardstown improves your chances. The reason horses who ran at Leopardstown do so well is far more likely to be the strength of the races at the track, and it’s likely the handicapper and the betting public underestimate just how strong it is. Running there will also better inform trainers which horses should be going to Cheltenham and which ones have no business going, something that is harder to ascertain in weaker races elsewhere.

Handicap Chase – Previous Chase runs

Chase Runs       Count  Win SR  Place SR  Win AE  Place AE
<=5              283    5.7%    22.3%     0.9     0.99
6-10             289    5.2%    21.5%     1.12    1.14
11-15            189    5.3%    15.3%     1.39    0.91
16+              217    2.3%    13.8%     0.61    0.84


The above table is for handicap chases at the festival, split by the number of previous chase runs each horse had prior to the festival. While the win strike rate holds up for horses in the 11 to 15 previous chase starts bracket, the place strike rates, which are more robust, fall off once we hit 11 previous chase runs, and the figures for 16+ runs are poor all round. This is no surprise really, as it’s hard for an exposed horse to win such competitive handicaps, although the market does still seem to underestimate just how hard.

Handicap Hurdle – Previous Hurdle runs

Hurdle Runs      Count  Win SR  Place SR  Win AE  Place AE
<=5              353    5.1%    19.0%     0.96    0.97
6-10             358    4.5%    17.9%     1.1     1.08
11-15            162    1.2%    11.1%     0.44    0.87
16+              108    3.7%    10.2%     1.7     0.92
11+              270    2.2%    10.7%     0.87    0.89
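The 11+ row pools the 11 to 15 and 16+ rows. As a sketch of how that pooling works, the snippet below rebuilds the win and place counts from the published percentages and recombines them. It assumes the percentages were rounded from whole-number counts; the function name is made up for illustration.

```python
def pool(buckets):
    """buckets: list of (count, win_sr_pct, place_sr_pct) rows to combine.

    Wins and places are recovered by rounding count * rate, which assumes the
    published percentages were rounded from whole-number counts.
    """
    runs = sum(count for count, _, _ in buckets)
    wins = sum(round(count * win / 100) for count, win, _ in buckets)
    places = sum(round(count * place / 100) for count, _, place in buckets)
    return runs, 100 * wins / runs, 100 * places / runs


# Handicap hurdle rows: 11-15 previous hurdle runs and 16+ previous hurdle runs.
runs, win_sr, place_sr = pool([(162, 1.2, 11.1), (108, 3.7, 10.2)])
print(runs, round(win_sr, 1), round(place_sr, 1))
```

The pooled figures agree with the 11+ row. Note that the rates have to be combined through the underlying counts rather than by averaging the two percentages, because the buckets are different sizes.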


It’s pretty much the same story for the handicap hurdles, with a fall off from 11+ runs. The decent win strike rate and Win AE for 16+ runs are probably just variance, so I’ve added together the figures for 11+ runs to give a larger sample, and again it indicates the market might not fully account for how hard it is for exposed horses to do well in these races.

It took a good bit of time to get all the data together and then analyze it, so if you found it useful, or hopefully at the least interesting, then please share it via social media.

