It has been widely appreciated just how much credit is due to the human team behind the G1 Lexus Melbourne Cup winner Cross Counter. TRC Global Rankings reflected their likely influence before the race:
- Godolphin stands second only to Coolmore Partners among owners;
- Charlie Appleby is second only to Aidan O’Brien among trainers;
- Kerrin McEvoy is second only to Hugh Bowman among Australian jockeys.
TRC Global Rankings are entirely mathematical: given a set of results to learn from, what is the ranking that most likely predicts future meetings between competitors? Appleby had a higher rank than fellow British trainer John Gosden not because we think he is better than Gosden, not because his standing in the sport is greater than Gosden’s, nor because his best horses are better than Gosden’s (they are not) nor even because we think he has had a better season than Gosden.
Instead, as we stated on day one and page one of our in-depth mission statement, it is because our numbers suggest that Appleby will win more head-to-head match-ups with Gosden in future, as he did here in the Melbourne Cup (Gosden’s Muntahaa ran really well in ninth).
This, after all, should be the ethos of ranking any set of objects in a predictive setting.
TRC Global Rankings are designed to be predictive because this is the only way to be authentic. A retrodictive (descriptive only of the past) set of rankings is one that is an exercise in flower-arranging: designed to look attractive. Such a set of rankings cannot test itself; it cannot learn from past mistakes; it is an exercise in playing with numbers. We didn’t want to do this because it is precisely this shortcoming that can make rankings in other sports so arbitrary and frustrating.
People tend to answer the question Who’s Number One? in sport merely by the aggregation of past success. That’s fine, but recent success counts for more and is more predictive when weighted suitably. But remember, in team sports and leagues, competitors play the same number of games against the same opponents, crucially equalising both the difficulty of winning and the opportunity to win. In racing, that is a long way from being the case.
If you want to draw conclusions by adding up wins, wherever they happened, however many tries it took to achieve them, we can accommodate that: the aggregate of G1, G2 and G3 wins for each competitor is included in the rankings tables.
However, if you want to know how the maths weights each and every performance – the strength of the opposition, the efficiency with which it was gained – then TRC Global Rankings do that with our unique index of a competitor’s position within the milieu of Jockeys, Owners, Trainers and Sires – TRC Global Ranking Points.
How the machine learns
The power of computation is so awesome nowadays that we can get computers to answer questions requiring not just billions of calculations, but also, thanks to machine learning techniques, new, more efficient ways of calculation. TRC Global Rankings sit in the cloud and wait merely to be given a results file each week. When that happens, the machine goes to work, iterating over billions of potential future outcomes to determine the ones that are most likely, given the results in the file and its idea of how to learn.
Learning for a machine is not as complicated as it sounds. Think how you might compute the square root of 1681, given that the definition of a square root is a number which, multiplied by itself, gives the answer required:
- The answer must lie between 1 and the number itself, 1681
- Pick the value at halfway (841) and square that: you get 707,281
- That’s too high, so now we know the answer must lie between 1 and 841
- Pick the value at halfway (421) again and square that: you get 177,241
- That’s still too high. Rinse and repeat until...
- Pick the value at halfway (41) again and square that: you get 1681
- BINGO !!
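For readers who like to see the steps run, here is a minimal sketch of the halving search in Python. The function name `sqrt_by_halving` is ours, purely for illustration, and it assumes whole-number guesses, rounding each halfway point down. It is a toy version of the idea, not the code behind the rankings.

```python
def sqrt_by_halving(target):
    """Search for a whole-number square root by repeatedly halving the range.

    Returns (root, guesses): root is None if target is not a perfect square,
    and guesses lists every halfway value that was tried, in order.
    """
    low, high = 1, target
    guesses = []
    while high - low > 1:
        mid = (low + high) // 2      # the value at halfway, rounded down
        guesses.append(mid)
        square = mid * mid
        if square == target:
            return mid, guesses      # BINGO
        if square > target:
            high = mid               # too high: answer lies between low and mid
        else:
            low = mid                # too low: answer lies between mid and high
    # The range has shrunk to at most two neighbours; check them directly.
    for candidate in (low, high):
        if candidate * candidate == target:
            return candidate, guesses
    return None, guesses


root, guesses = sqrt_by_halving(1681)
print(root)     # 41
print(guesses)  # [841, 421, 211, 106, 53, 27, 40, 46, 43, 41]
```

Ten guesses are enough to pin down the answer from a starting range of 1,681 candidates, because every guess discards roughly half of what remains.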
This same process lies at the heart of solving most really complex mathematical problems, of which the problem we face here – called Rank Aggregation – is definitely one. But, instead of the laborious process above, mathematicians have invented complex optimisation methods using calculus and matrix algebra that are just perfect for modern, fast computation.
Unfortunately, there isn’t a perfect solution to the rankings problem, as there is for the square root of 1681, but each week the results file comes in, TRC Global Rankings have more evidence to get closer to the solution, by learning how horse racing ‘works’:
What are the trajectories of new competitors in each category? Let’s build an ageing curve!
What does a slump mean when a competitor’s past record is strong? Let’s try using the average fate of all similar competitors who had the same rankings history!
We only have full data back to 2011. So, learning goes on at an improved rate with each week’s results added. And, there are many data points the system does not know: how many horses does a trainer have? Which new job has a jockey just acquired? What has happened to the business interests of an owner?
Humans can thus improve on the rankings by the intelligent understanding of exogenous factors – things the machine does not know or understand. But doing so over thousands of competitors and races tends to explode the limits of our brain.
Over the next few months, Appleby may well usurp Aidan O’Brien as world #1 trainer. The system knows and understands that 2018 was a down year for O’Brien, but it does not know what the consequences might be in terms of backward 2-year-olds who did not make it to the races this year, older horses who may discover unrealised potential, changes in the way O’Brien himself operates …
It is a dynamic world, constantly in flux, but what you want from a rankings system is an objective expression of competitive strength that doesn’t depend on subjective factors, that doesn’t listen to the whirl of social media, that just keeps computing and learning and throwing up really insightful predictions.
For instance, Appleby is supposed to have had an annus mirabilis in 2018, according to the experts. Fair enough: the man is potentially an O’Brienesque trainer. But, we were already writing about him last year, and in March he was already the world #4. The numbers demanded it, not anyone’s personal view.
Similarly, in our sires ranking, the maths elevated Dubawi to #1 over Galileo on September 2. It had to. We are trying to minimise rankings violations in future races, instances where a lower-ranked competitor beats a higher-ranked one; Dubawi was dominating the greatest sire of the last 20 years.
Since then, Dubawi has produced five G1 wins, five G2 wins and one G3 win at a 23 percent strike rate. And there have been 49 meetings between the progeny of the two sires, of which the Dubawi runner has finished in front no fewer than 36 times. Over the course of 2018, Dubawi leads Galileo 74-28!
By our criteria, if your personal ranking of the two stallions was reversed you would have been wrong 73 percent of the time.
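The arithmetic behind that figure can be sketched in a few lines of Python. The function `violation_rate` and the two ranking dictionaries are hypothetical names of ours, and the results list simply replays the 36 wins from 49 meetings quoted above. It illustrates how a rankings violation is counted, not the rankings code itself.

```python
def violation_rate(results, ranking):
    """Share of meetings in which the lower-ranked competitor finished in front.

    results: list of (finished_in_front, finished_behind) name pairs.
    ranking: dict mapping name to rank, where 1 is the best.
    """
    violations = sum(
        1 for front, behind in results if ranking[front] > ranking[behind]
    )
    return violations / len(results)


# 49 meetings since September 2: the Dubawi runner finished in front 36 times.
results = [("Dubawi", "Galileo")] * 36 + [("Galileo", "Dubawi")] * 13

machine_view = {"Dubawi": 1, "Galileo": 2}    # the ranking the maths produced
reversed_view = {"Dubawi": 2, "Galileo": 1}   # the opposite personal ranking

print(round(violation_rate(results, machine_view) * 100))   # 27
print(round(violation_rate(results, reversed_view) * 100))  # 73
```

The 2018 tally of 74-28 gives much the same answer: 74 from 102 also rounds to 73 percent.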
Similarly with Appleby and Gosden. We promoted Appleby over Gosden on March 11 this year. Since then: Appleby leads Gosden 21-14 in head-to-head match-ups. When Dubawi or Appleby doesn’t have a runner, that’s fine. Galileo will be our highest-rated sire in the race or Gosden may be our highest-rated trainer, if O’Brien isn’t around that is.
Guess what the machine does if results don’t go the way it predicts? It changes the numbers, of course: not straight away, because that would be just overfitting to the past, but suitably to the circumstances and to the weight of evidence.
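One common way to change numbers gradually is an Elo-style update, in which each rating moves by only a fraction of the surprise in a result. To be clear, this is a stand-in technique chosen for illustration and not a description of how TRC Global Rankings are computed. The function names and the `scale` and `step` values are assumptions of ours.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Chance that A finishes in front of B under a logistic (Elo-style) curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def damped_update(rating_a, rating_b, a_won, step=16.0):
    """Move both ratings by a fraction of the surprise in one result.

    A small step means a single upset shifts the numbers only slightly, so the
    ratings respond to the weight of evidence rather than to the latest race.
    """
    surprise = (1.0 if a_won else 0.0) - expected_score(rating_a, rating_b)
    return rating_a + step * surprise, rating_b - step * surprise


# Two equally rated competitors meet and A finishes in front.
new_a, new_b = damped_update(1500.0, 1500.0, a_won=True)
print(new_a, new_b)  # 1508.0 1492.0
```

When the favourite wins as expected, the surprise is small and the ratings barely move. Repeated upsets accumulate, which is how the evidence gradually outweighs the prior view.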
Finally, we join disparate countries and barely co-mingled racing circuits together by the principle that the rarer a competitor’s results, the better he, she or it likely is. It’s often a first guess, just like that initial stab at the square root of 1681. If it doesn’t work as a prediction when competitors finally come together – such as at the Breeders’ Cup in the case of European and U.S. trainers and jockeys, for instance – again the machine adjusts intelligently.
The machine can learn the strength of countries, the influence of a stallion base on that country (sires tend to be less localised than humans) or even the trend in that country’s influence. We don’t use prize money and the names of the races are only employed to group them together abstractly for learning, like a child conceives of things with four wheels belonging to the class initially marked ‘car’.
TRC Global Rankings have no inherent bias towards Dubawi or Godolphin or Appleby. If we did, we would have done a poor job of it, with Galileo, Coolmore Partners and O’Brien ranked ahead of them for most of the lifetime of the rankings. But, right now, something interesting is going on at the top of world racing, which the numbers picked up early.
There is only one Galileo, one Aidan O’Brien, one Ryan Moore and one Coolmore Partners. These are the kings of their profession, the undoubted historical number ones, the standard-bearers whose level all other competitors have to surpass.
But the times they are a-changin’, as a certain Nobel Laureate once wrote, in words the committee finally realised were an example of authentic expression in their own right.
Will they continue to change? Or will the previous order be restored? You will have your own strong, intelligent human views. But why not check each week what our numbers are predicting? There would be no point to them if you always agreed.
The latest TRC Global Rankings take into account all Group and Graded races over the last three years up to and including Sunday. The result of Tuesday’s Melbourne Cup will be included in next week’s update.