A concise data-driven approach to predicting the 2020 Australian Football Hall of Fame inductees

This morning I read on Twitter that the AFL has decided to spread its Hall of Fame event over four nights, beginning tonight (1 June).

This move was presumably made both to get around the current restrictions on large events and to help fill the footy void before the AFL season resumes on 11 June.

I had been keeping one eye on new plans for the Hall of Fame night but this news had somehow escaped me. I had planned to do a deep-dive and expand on the analysis I first put together ahead of last year’s ceremony.

If interested, you can read more about the context, the data, the statistical approach and some of the outcomes in that article.

I have thrown together an article in fits and spurts today to put some ideas on the table prior to the first inductions tonight.

Updating for 2019 inductions

Last year I used a combination of the data-driven player likelihood of induction, plus recent trends in types of selections in the previous few years, to make some ‘predictions’ (and I say that loosely given the very small sample size). I wrote: “I expect this year that the selection committee will once again focus on South Australia and induct two Croweaters, alongside two-three modern-era AFL players and possibly another dark horse.” 

My ‘predictions’ were:

  • Tom Leahy
  • Jim Deane
  • Tony McGuiness
  • Kelvin Templeton
  • Simon Black
  • Alastair Lynch

Pleasingly, I landed one Croweater, as Jim Deane (a name even most die-hard footy fans have probably never heard of) was inducted. Deane played 11 seasons for South Adelaide and two at Richmond, winning the Magarey Medal twice and South Adelaide’s best and fairest on six occasions. On the data-driven ratings (which are based on historical selector preferences, not any sense of objectivity), he was the top choice from all historical SANFL players.

On the face of it, Deane was my only direct ‘hit’ in a shallow year for inducted players (only four). The only other players inducted were Trevor Barker, Brad Hardie and Ken Hunter. The data had Hardie as the fourth ‘most likely’ from VFL players in the 1960s-80s era to be inducted, but I did not expect that the selectors would –  once again – go back to the 1970s and 1980s to induct more players out of the VFL/AFL. Since 2011, there had been a promising trend away from that cohort. If you have read ‘Footballistics’ you will understand that there exists already a vast overrepresentation of players from that demographic. 

Complementing their selections were Ron Evans (as an administrator) and Michael Malthouse (as a coach).

At the time, I couldn’t understand how Simon Black was not inducted. Black had become eligible that season, marking five years after his retirement. The selectors had shown an ongoing preference to induct ‘gun’ players the first year they became eligible, and the data (as well as recent memories) were very hot on him.

Turns out Black “had been voted in by the Hall of Fame committee last year” but “was unable to attend the function due to his filming of Australian Survivor overseas”. Perhaps that was not public knowledge at the time as the reality show had not been aired, but I remember his omission (and only four players inducted) left me confused.

For statistical purposes, I am going to consider Black ‘inducted’ in 2019. It also makes sense to update the data to take him out of the prospective pool for analysis for this year’s selection.

There was a stronger lean back towards the VFL/AFL (pre-1990) last year than I had anticipated

I’ll give myself two ticks out of six for last year’s predictions.

  • Tom Leahy
  • Jim Deane (correct)
  • Tony McGuiness
  • Kelvin Templeton
  • Simon Black (considered correct)
  • Alastair Lynch

Updating the data for 2020

Updating the data for analysis required a few tweaks:

  1. Update the inducted players from 2019 with a new status
  2. Remove the inducted coaches and administrators from the potential pool (as they are assumed ineligible for induction as players, even though some have great playing records)
  3. Re-run the model to include the ‘class of 2013’ retirees and the inducted cohort of 2019
  4. Re-run the predictions on the new pool, including the ‘class of 2014’ retirees who are eligible for the first time 

I updated the chart I ran last year to show the relationship between various standard achievements across the leagues and how they contribute to the chances of induction.

Each dot represents an individual player – how many of that achievement they recorded (a count on the horizontal axis) and whether or not they were inducted (bottom means not inducted, top means inducted).

On top of the aggregate set of points sits a smoothed line (logistic curve) which best fits the points for every player in each particular league. As all of these achievements indicate success, we would expect to see the average line of all players sloping up from bottom left to top right – which is exactly what we see. The slopes and the points at which the lines tilt up differ, and here is where we see that the selection committee has historically not deemed achievements in some leagues as worthy as those in others.

The likelihood of induction of Australian Football Hall of Fame candidate players, by league and playing era

Marked by the blue curves rising almost exclusively faster than the red and yellow curves, the VFL/AFL pre- and post-1990 honours appear to be considered far more worthy in the eye of selectors than those in the SANFL or WAFL. In other words, it takes a much more glittery CV in those latter leagues to have the same chances of induction as in the VFL/AFL, even during the state-based eras.

Eyeing off the chart, it appears that a player with one Brownlow medal is more likely to be inducted than a player with two Magarey or Sandover medals. Or that players with five club best and fairest awards in South Australia and Western Australia are deemed to have similar chances to those with just two in Victoria or in the national era.

I interpret these results with the hypothesis that the Hall of Fame selection committee naively assesses SANFL and WAFL players, with the shortlist only coming from those players with absolutely stand-out CVs in some of the most prestigious award categories. It seems that some statistics, such as games played or premierships, are not considered whatsoever. This explains why there are a whole host of Western and South Australian footballers with large tallies of games played and premierships won who have not been inducted. 

Predictions for 2020

There have been few changes to the below visual from 2019, save for the inducted players dropping off and Jonathan Brown appearing right at the top of the list of recent players.

An article on the AFL website notes that “the likes of Luke Ball, Jonathan Brown, Dean Cox, Darren Glass, Lenny Hayes, Ryan O’Keefe and Ben Rutten are in contention for the first time” after retiring in 2014. This list of players, with Brown and perhaps Cox as exceptions, do not have CVs that tend to be acknowledged by the committee. Ben Cousins has been eligible for five years now, but given his current circumstances I expect him to be overlooked again. 

The modelled likelihood of future Australian Football Hall of Fame induction of candidate players, by league and playing era (chart updated 3 June to fix a data error)

A Fox Footy article last week notes “each inductee [will take] part in a long-form interview to air alongside a career highlights package”. Given there is far more air time to fill in this made-for-television ‘event’ over four nights, and the expectation of in-person interviews and lots of footage, sadly once again I don’t expect the long list of “neglected heroes” to be honoured this week. 

Therefore my predictions have an ultra-modern focus, with some names from last year popping in again to round out the selections:

  • Simon Black (confirmed and as predicted in 2019)
  • Jonathan Brown (first year eligible)
  • Alastair Lynch (making it a real Brisbane premiership flavour)
  • Paul Couch (posthumously)
  • Kelvin Templeton
  • Tony McGuiness
  • Don Lindner/Steve Malaxos (one token ‘interstate’ player)

Using data to predict the 2019 Australian Football Hall of Fame inductees

Mel Whinnen giving his acceptance speech at the 2018 ceremony

Tonight (Tuesday 4 June), the Australian Football Hall of Fame will honour its next batch of players, coaches, administrators and media performers with induction. Each year in the lead up to the ceremony there is often a range of speculation from various media types on which ex-players will be in the latest batch to be inducted. The Victorian writers tend to focus on the newly eligible candidates from the recent pool of AFL retirees, while the focus interstate is often to wonder when the ledger will be tipped a little back in their favour to honour some of their long-overlooked football legends.

The Australian Football Hall of Fame was established in 1996 and “seeks to recognise and enshrine players, coaches, umpires, administrators and media representatives who have made significant contributions to Australian Football – at any level – since the game’s inception in 1858”. In total, 136 individuals were inducted into the Hall of Fame in the initial intake of 1996 and a further 121 have been added in the 22 years since. Players make up the bulk of the intake (coaches, administrators, umpires and media personalities also have a presence), with 202 inducted off the back of their playing career and another 28 who have since been elevated to Legend status. This post will focus on players, as the cohort with both the largest sample and the most easily accessible records and playing honours.

It is important to note that the Australian Football committee “considers candidates from all parts of Australia and from all competitions within Australia” rather than merely the sole elite-level competition today, the AFL. Newer or younger enthusiasts of the sport may not realise that indeed the current pinnacle of the game is a somewhat recent change within the history of the structure of the sport at all levels, with state leagues dominating the landscape for over a century until the late 1980s. The three states which have always played the most prominent role founded formal football associations well back into the 19th century, namely Victoria, South Australia and Western Australia.

For a number of years, football historians have bemoaned the seemingly inequitable induction rates into the Hall, specifically favouring both playing careers in Victoria (over the other state leagues) through the VFL/AFL, and those that tended to remain in the memories of selectors (from the 1960s onwards) at the expense of those prior to the television era.

The challenge

Woven through a lovely narrative of the careers and legacy of South Australian champion footballers Sampson ‘Shine’ Hosking and Tom Leahy, one chapter of the 2018 book ‘Footballistics’ analysed the skew across states and seasons of the inductees into the Hall of Fame. It investigated the disproportionate induction rates in player groups by different eras and in different competitions. It also looked at how some of these trends have evolved over time since the inaugural Hall of Fame induction in 1996. Neither Hosking nor Leahy has been inducted, despite both objective and anecdotal evidence in their favour, whilst vast numbers of players, particularly VFL players from the 1960s-1980s, have been inducted ahead of them.

What I also wanted to do was understand which career achievements may be meaningful in the eyes of the committee in order to gain induction. This was not an easy task, as there are no minimum achievements required to reach eligibility and the committee “considers candidates on the basis of record, ability, integrity, sportsmanship and character”. How could one even begin to objectively or quantitatively measure many of these attributes, such as “ability” or “character”? It is hard enough to even hypothesise a relevant metric for these, never mind go about finding an available data set where such metrics might exist for elite Australian Football! The one attribute that, of course, can be at least partly assessed is the playing record of individuals. The short biographical sketches of each inductee provided on the Hall of Fame website give us a good starting point. They typically include statistics such as career span, total games and goals, league and club best and fairest awards, league and club leading goal kicker awards, league and club teams of the century, All-Australian selections, state appearances, grand final best on ground awards and years captained.

Given (to my knowledge) the lack of a comprehensive and complete database across the elite levels of Australian football, while researching for ‘Footballistics’ I spent a lot of time collating (through best efforts) an aggregate data set which summarised player-level data spanning leagues, including attributes such as seasons played, games and goals, and various honours and achievements. This process was fairly labour intensive and detailed, so I’ll spare the details until the appendix at the bottom of this post. It is fair to say that some of the numbers in the data set used in this analysis range greatly from the precise to the approximate, such is the time scale of the players’ careers assessed and the evolving nature of competitions over more than a century – as well as a large number of ‘fuzzy’ joins on my end which also require a leap of faith. Accordingly, all values should be read as estimates only, providing a good overview and understanding of patterns but not as ‘ground truth’.

I ended up with a relatively broad data set of Australian footballers over time, including their career games and goals at top-league clubs, the seasons they played, competition-wide honours such as league best and fairests, league goal kicking awards and number of premierships, and team-based achievements such as club best and fairests, club goal kicking awards, and club captaincies. All of these achievements were summarised per player and split out over four major league categories, namely the VFL/AFL (until 1989), the SANFL (until 1990 prior to Adelaide joining the AFL), the WAFL (until 1986 before West Coast joined the AFL) and the VFL/AFL (the national era since 1990). I was also able to include representative selection, specifically All-Australian selection in both the Carnival and national eras. One piece I would have loved to include (and I think would have been particularly explanatory) was number of intra-state league and state-of-origin matches, however I couldn’t find a comprehensive data set with this information.

For simplicity, I chose to concentrate my efforts on the playing careers of inductees only. All inductees referred to in this section have been inducted as ‘Players’ and/or ‘Legends’ (unless otherwise specified) and all games referred to are as players (rather than coaches), as this by far comprises the biggest sample size of inductees for analysis purposes. The Hall of Fame still inducts individuals who participated in more than one role in a single discrete category. For example, John Cahill is listed as a ‘Coach’, even with a very respectable playing career, while five-time VFL premiership coach Jack Worrall is listed as a ‘Player’! Perhaps erroneously (a theme…), the website has Western Australia’s Jack Sheedy as the only individual listed on both the ‘Player’ and ‘Coach’ pages.

The approach

With this data set, I was able to fit a number of classical and machine learning models to ‘reverse engineer’ data-focused criteria for induction into the Hall of Fame. In the end I settled on the best performing logistic regression model, as it was philosophically suitable for modelling the binary induction status of players based on the aggregation of their achievements, performed well with many of the input variables statistically significant, and was more explainable to a pleb like myself.

Hall of Fame induction status and its relationship to a range of various playing honours and achievements, by league

The only set of attributes I dropped were the club leading goal kicking awards for players across all four league categories, as they showed no explanatory power. I would assume this is because leading a club’s goal kicking can be done by some relatively ordinary players in ordinary sides, and total career goals is a much better indicator of strong forwards who are deemed worthy. Some of the other attributes exhibited statistical significance across two or three of the league categories, and in that case I left the entire set in the model for consistency.

I also only considered players that commenced their careers after 1897 (to standardise the pool across the leagues), and stripped out the inducted Hall of Fame ‘Coaches’ entirely from the training data set. I didn’t want to muddy the data set by having their playing careers not matched to a Hall of Fame ‘Player’ induction, nor include them as inducted players either.

With the combination of the sum of all these playing achievements and attributes, I was able to generate a propensity for all yet-to-be inducted footballers. A rating towards 1 suggests the player is very likely to be inducted, while a rating closer to 0 suggests the player has little chance under the current criteria. For example, highly-decorated and newly-eligible champion Simon Black topped the board with a modelled rating of 0.97.
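
To make the idea of a rating between 0 and 1 concrete, here is a minimal sketch of a logistic regression fitted by plain gradient descent and then used to score two uninducted players. It is an illustration only and not the model used for this post: the three features, the toy training rows, the learning rate and the helper names (`fit_logistic`, `propensity`) are all assumptions made up for the example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function, mapping any score into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.3, epochs=5000):
    """Fit weights and a bias by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def propensity(x, weights, bias):
    """Modelled induction rating for one player's achievement counts."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up toy rows, NOT real players:
# [league best and fairests, club best and fairests, premierships]
train_x = [[0, 0, 0], [0, 1, 0], [0, 0, 2], [1, 3, 1],
           [2, 4, 0], [0, 5, 2], [0, 1, 1], [1, 2, 3]]
train_y = [0, 0, 0, 1, 1, 1, 0, 1]  # 1 = inducted, 0 = not inducted

weights, bias = fit_logistic(train_x, train_y)
decorated = propensity([1, 3, 2], weights, bias)   # hypothetical candidate
journeyman = propensity([0, 0, 1], weights, bias)  # hypothetical candidate
print(f"decorated: {decorated:.2f}, journeyman: {journeyman:.2f}")
```

In practice a statistics package would do the fitting and also report which inputs are statistically significant; the hand-rolled loop is only there to keep the sketch self-contained.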

It is also important to note that what this model does not do is suggest who ‘should’ and ‘should not’ be inducted. It merely tells us what factors may have been significant contributors to the induction of players in the eyes of Hall of Fame committees of the past, and gives us a rating or propensity for similar-type players to be inducted given the committee’s selection history. For this reason, the model does not try to explicitly account for any skews towards the VFL or the 1950s to 1970s – in fact, by applying it to the careers of all players not yet inducted, it will tend to favour the same type of players. You could say that you only get out what you put in. 

As such, it’s hard to compare like-for-like as the players with any history in the VFL/AFL and modern era tend to outshine all others numerically. Instead, I have forced its hand and broken out the top five results for a combination of league and era (two-decade increments). Players were allocated to both the league and the era in which they played the highest proportion of matches in their careers. This way, we can compare ‘like with like’ and at least understand which yet-to-be-inducted players our data suggests should be closer to Hall of Fame worthiness.
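
The allocation and top-five breakdown described above could be sketched roughly as follows. The record layout (`games_by_league`, `games_by_era`, `rating`), the era labels and the placeholder players are all hypothetical; the only logic taken from the text is that each player is assigned to the league and era holding the largest share of their games, and that the highest-rated players are then listed within each league and era combination.

```python
from collections import defaultdict


def dominant(games_by_key):
    """Return the key (league or era) holding the most of a player's games."""
    return max(games_by_key, key=games_by_key.get)


def top_candidates(players, n=5):
    """Group players by (league, era) and keep the n highest ratings in each."""
    groups = defaultdict(list)
    for p in players:
        key = (dominant(p["games_by_league"]), dominant(p["games_by_era"]))
        groups[key].append((p["rating"], p["name"]))
    return {
        key: [name for _, name in sorted(rated, reverse=True)[:n]]
        for key, rated in groups.items()
    }


# Placeholder records with invented numbers, purely for illustration.
players = [
    {"name": "Player A", "rating": 0.40,
     "games_by_league": {"SANFL": 150, "VFL": 30},
     "games_by_era": {"1940s-50s": 180}},
    {"name": "Player B", "rating": 0.65,
     "games_by_league": {"SANFL": 200},
     "games_by_era": {"1940s-50s": 120, "1960s-70s": 80}},
    {"name": "Player C", "rating": 0.90,
     "games_by_league": {"VFL": 90, "AFL": 210},
     "games_by_era": {"1980s-90s": 60, "2000s-10s": 240}},
]

for key, names in top_candidates(players).items():
    print(key, names)
```

Ranking within each group rather than across the whole pool is what stops the VFL/AFL and modern-era players from crowding out everyone else in the output.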

The modelled induction chances of prospective Australian Football Hall of Fame candidate players, by league and playing era

The skew towards the VFL/AFL and the more modern decades is pretty stark, with much higher top-five ratings in those selections.

Our South Australian pioneers ‘Shine’ Hosking and Tom Leahy pleasingly sit top two in the first two decades of the 20th century, while Jim Deane (two-time Magarey Medal winner and six-time South Adelaide best and fairest) and Don Lindner (Magarey Medallist and three-time North Adelaide best and fairest) lead the way for the Croweaters in other eras.

Over to the west, and Hugh ‘Bonny’ Campbell (four-time WAFL premiership player who once kicked 23 goals in a state game), Ted Flemming (Sandover Medallist, two-time West Perth best and fairest and three-time WAFL premiership player) and Steve Malaxos (Sandover Medallist and inaugural West Coast best and fairest) seem to be the most likely candidates in their respective eras from the WAFL. Prior to the 2018 induction ceremony, there were two more names at the top of this list who are now Hall of Famers. An earlier iteration of my model placed Bernie Naylor and Mel Whinnen within the top three selections for WAFL players at this time last year, which spurred me on to consider this analysis for 2019.

A 2018 tweet showing the then-top-ranked WAFL candidates heading into that year’s Hall of Fame ceremony

The VFL/AFL has had a stack of players inducted each year, and a case could be made that some ‘very good’ players have been a little lucky (if that is possible) to receive the honour. One name missing that has jumped out at me for years is that of Kelvin Templeton. He is one of just five players to have won both the Brownlow and Coleman Medals (two), and remains the only such player to not be honoured with an induction. Combine that record with two Footscray best and fairest awards and five leading goal kicker awards and surely his name must be floating near the top. Tony McGuiness doesn’t quite fit properly into the ‘1960s-1980s’ VFL era as he only played 87 of his career 335 matches in that dimension. I’ve decided to place him there as he played 60% of his career games in the 1980s, and played 66% of his games in the VFL/AFL – his career wasn’t largely in the SANFL in the 1980s, nor largely in the 1990s AFL either.

In previous generations, Bill Cubbins (one of the premier full-backs of his era and four-time St Kilda best and fairest) and Alby Morrison (five-time Footscray leading goal kicker and two-time club best and fairest winner) are rated highly in comparison to their peers.

In the national era of the AFL, Simon Black is eligible for the first time in 2019 and outshines all with a hat-trick of premierships at Brisbane, Brownlow and Norm Smith medals and three All-Australian guernseys.

The model is calling out some features it sees as important to have on the CV of a Hall of Fame footballer, but we must remember that it is kept in the dark from so many other features that football fans and historians could call out in an instant. How can it factor in the individual ability of Gary Ablett Senior, the courage of Francis Bourke or the defensive resolve of Vic Thorp, when they only shared four club best and fairests between the three of them? As such, perhaps it’s not surprising given its limitations – and the limited samples of inductees from South Australia and Western Australia – that there may be some notable omissions from the highly-rated candidate list. It may only take some further ‘squaring up’ by the committee in future years to help recalibrate the model and smooth the results.

It is important to understand in this analysis that our model – and indeed, all analytical models, to various degrees – is useful for understanding but must be considered flawed. And in this case, it is heavily flawed on multiple levels, due to the question we are trying to answer and the availability of structured information at our disposal with which to answer it. Before we even begin we have recognised we are unable to even measure most qualities considered by the committee, and once we consider playing records only, we miss anything qualitative or anecdotal and are constrained to the objective list of achievements. But even then, few honours can easily be compared over many decades. For example, the All-Australian selection system today recognises the best players by position across a season, however prior to the modern era the team was selected from an inter-league pool based on performances at interstate carnivals. And then finally, due to the structure of most awards and statistics, certain types of players (particularly defenders) are likely to be statistically under-represented in any analysis, due to the lack of quantitative metrics that tend to relate to those who shut down, rather than create.

Predicting the 2019 inductees

Although not reflected on the official Hall of Fame website (the stale criteria outlined is now a number of iterations old), news articles in recent months have pointed to an expansion (to eight) of the number of possible inductees within a given year. I am foreseeing the first female to be inducted, along with perhaps another non-player (administrator, coach or media type) this year if the option is chosen.

That will leave five or six male players on the brink of induction into the Hall of Fame for 2019.

We have modelled our most likely player candidates for induction, however in recent years the selectors pleasingly are starting to lean a little towards a more representative mix of induction candidates, across both states and eras.

All inducted players following the inaugural intake, split by the amount of games played in major leagues

As discussed, last year there was a strong Sandgroper flavour to the ceremony with both Bernie Naylor and Mel Whinnen inducted. The year prior, South Australia’s John Halbert was honoured alongside ex-Collingwood but also VFA-legend Ron Todd. In 2016, there was again a decent lean outside Victoria with Paul Bagshaw representing South Australia, with Ray Sorrell and Maurice Rioli spending considerable chunks in the WAFL.

The selection committee have in recent years been pretty consistent with inducting two or three AFL-era players, including the immediately eligible Matthew Scarlett last season. The selectors also leant into the 1970s and 1980s with Terry Wallace and Wayne Johnston last season, however there have been few VFL-types in the seasons prior to 2018.

I expect this year that the selection committee will once again focus on South Australia and induct two Croweaters, alongside two-three modern-era AFL players and possibly another dark horse.

To put my money where my mouth is, bringing together everything the data has told us about the Australian Football Hall of Fame, my tips for 2019 induction are:

  • Tom Leahy
  • Jim Deane
  • Tony McGuiness
  • Kelvin Templeton
  • Simon Black
  • Alastair Lynch


This post, and analysis, is dedicated to SANFL statistician and historian Mark Beswick, who passed away in April 2018. Mark was one of the first people I contacted when trying to hunt down footy data sets outside the VFL/AFL in 2017. 

Appendix: The data

As far as I am aware, there exists no comprehensive database or structured data sets of top flight Australian footballers, including clubs, tenure, games, goals, and league and club achievements and honours.

This meant that I needed to do my best to stitch one together myself. For the best ‘single view’ of all footballers over time and states, I took the view provided by the wonderful website AustralianFootball.com. Building on the work of footy history doyen (John Devaney / Full Points Footy), this provided the most comprehensive and consistent data set covering the VFL/AFL, SANFL, WAFL, VFA/VFL and indeed some other major leagues. Although I am aware of some missing players across the three major leagues (from Western Australia and South Australia), certainly this view contained a broad enough view that the majority of noteworthy players across the country (both inducted and otherwise) had been captured.

The human delineation of fluid history is always somewhat arbitrary, but effectively I wanted to pull out the three major state leagues prior to the national era, and then separate out the modern national competition from its state-based past. Therefore I summarised the league and club data (games and goals) into aggregates across various state- (or league-) based buckets:

  • VFL/AFL pre-1990 (“VFL” – the state era)
  • SAFA/SAFL/SANFL pre-1991 (“SANFL” – the state era)
  • WAFA/WANFL/WASFL/Westar Rules/WAFL pre-1987 (“WAFL” – the state era)
  • VFL/AFL post-1990 (“AFL” – the national era)
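
As a rough sketch of how season-level rows might be sorted into the four buckets above and then totalled: the function names, the `(competition, season, games, goals)` row layout and the sample numbers are assumptions for illustration. It also assumes the earlier league names (SAFA, WANFL and so on) have already been normalised to "SANFL" and "WAFL", and that "post-1990" means the 1990 season onwards, in line with the national era being described as running since 1990.

```python
from collections import defaultdict


def league_bucket(competition, season):
    """Map one season in one competition to a bucket, or None if excluded."""
    if competition == "VFL/AFL":
        # State era up to 1989, national era from 1990 (assumed boundary).
        return "VFL" if season < 1990 else "AFL"
    if competition == "SANFL" and season < 1991:
        return "SANFL"
    if competition == "WAFL" and season < 1987:
        return "WAFL"
    return None  # state-league seasons after the cut-off are left out


def summarise(season_rows):
    """Total games and goals per bucket for one player's season rows."""
    totals = defaultdict(lambda: {"games": 0, "goals": 0})
    for competition, season, games, goals in season_rows:
        bucket = league_bucket(competition, season)
        if bucket is None:
            continue
        totals[bucket]["games"] += games
        totals[bucket]["goals"] += goals
    return dict(totals)


# Invented season rows for a single hypothetical player.
rows = [
    ("WAFL", 1985, 21, 40),
    ("VFL/AFL", 1988, 20, 15),
    ("VFL/AFL", 1990, 22, 30),
    ("SANFL", 1991, 18, 5),  # after the SANFL cut-off, so excluded
]
print(summarise(rows))
```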

For consistency, I took players only from 1897 onwards, as this allows like-for-like comparison between the three major leagues (and most of the league and club honours are recorded after this point anyway). I also used this data set to derive the career start (earliest season listed) and career end (latest season listed) for each player.

Next I had to join a varied selection of league and club honours and achievements. For availability and consistency purposes, I narrowed the chase down to the following:

  • League best and fairests
  • League leading goal kicker awards
  • League premierships
  • Club best and fairests
  • Club leading goal kicker awards
  • Club captains
  • Representative selections (i.e. All-Australian carnival and AFL All-Australian teams)

Most of the club and league honours were pulled league-by-league and team-by-team from various official and unofficial websites. The premiership counts were generously provided by Greg Wardell-Johnson, Ric Gauci and Steve Davies for the WAFL and Kyle Smith for the SANFL. The WAFL Footy Facts website was also very useful and clearly provides the best availability and accessibility of any league’s data outside of the VFL/AFL.

I wanted to chase down state league/state of origin games, however I was unable to find a comprehensive enough data set. This would be a good inclusion to the model going forward, as it would provide a deeper indication of the better players playing in and/or from each state at a given point in time.

Next required the arduous and tedious task of joining the achievements onto the player name details, which I did via ‘fuzzy’ joins with some supervision. There is no doubt that some of these joins will be incorrect, however for the most part (and definitely for all inductees), I can confirm the matches were sufficient for ‘good enough’ analysis purposes.
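
For readers unfamiliar with the idea, a supervised ‘fuzzy’ join can be sketched with Python’s standard-library `difflib`. This is not the tooling used for this analysis, just an illustration: the `normalise` and `fuzzy_join` helpers, the 0.85 similarity cutoff and the names below are all made up. Honour-list names that find a close enough match are joined automatically, and anything else is set aside for a manual check.

```python
import difflib


def normalise(name):
    """Lower-case, drop full stops and collapse whitespace before comparing."""
    return " ".join(name.lower().replace(".", " ").split())


def fuzzy_join(honour_names, player_names, cutoff=0.85):
    """Match honour-list names to player names; return (matched, needs_review)."""
    lookup = {normalise(p): p for p in player_names}
    matched, needs_review = {}, []
    for raw in honour_names:
        hits = difflib.get_close_matches(
            normalise(raw), list(lookup), n=1, cutoff=cutoff
        )
        if hits:
            matched[raw] = lookup[hits[0]]
        else:
            needs_review.append(raw)  # left for a human to resolve
    return matched, needs_review


# Invented names, purely for illustration.
player_names = ["John Smith", "Alan Jones"]
honour_names = ["Jon Smith", "A. Jones", "Zed Quux"]

matched, needs_review = fuzzy_join(honour_names, player_names)
print(matched)
print(needs_review)
```

A stricter cutoff sends more names to manual review, while a looser one risks joining an honour to the wrong player, which is the trade-off behind the ‘some supervision’ mentioned above.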

I also built out the data set of Hall of Fame inductees from the official website, which again was a little tedious as different induction years have their information structured in slightly different formats. With the same process, I created flags for the inducted players, including their induction year (and year of induction as a Legend, if applicable).

For the purposes of our analysis, I set the eligibility to include players who commenced their careers from 1897, to create a common baseline to compare the new VFL against the SANFL and WAFL. The data includes all VFL/AFL players, all players and achievements in the SANFL up until 1990 (before Adelaide’s induction into the VFL/AFL in 1991) and all players in the WAFL up until 1986 (before West Coast’s induction into the VFL/AFL in 1987). As all inductees debuting after 1990 have played the vast majority (if not all) of their careers in the AFL, these exclusions attempt to capture all achievements from the three major leagues before the national modern era and only those records and honours in the AFL since that point. Further, the current eligibility criteria require a player to be five years retired from the sport, so all players who were active from 2014 onwards were likewise excluded.

I also stripped out the inducted Hall of Fame ‘Coaches’ entirely from the model training data set. I didn’t want to muddy the data set by having their playing careers not matched to a Hall of Fame ‘Player’ induction, nor include them as inducted players either.