The Hall of Fame voting period has started, and as always this will spawn discussions about which players are good enough to warrant voting for them. At least in part, this comes down to the question of which players are best at this game, and going one step further, how to measure greatness at Magic. Being a bit of a statistics nerd, that is a question that has always interested me. After giving it some thought, I think that the metrics Wizards uses and gives us on their Hall of Fame website are suboptimal.
There are really only three metrics in use today: Top 8s, Pro Points, and Median. Top 8s are great, because they showcase the most valued achievement in the game, but obviously they are also a very discrete, imprecise measure. Pro Points mostly measure lifetime achievement, which is also fine in that the Hall of Fame should be for players that have dedicated a significant part of their life to the Pro Tour, but they cannot distinguish between a grinder without much high-level success and a hotshot that played only a few events.
Finally, there is the median, and I think this is actually one of the worst measures to use. The shortcomings of Top 8s and Pro Points as metrics are very obvious, so everybody is aware of them and uses them accordingly. Median, however, is a pretender. It seems like a very precise and fair way of measuring Pro Tour performance. Unfortunately, it measures something that we don’t really care about at all: the average. Let’s face it, in a game with as much variance as Magic, most tournaments don’t go our way. That doesn’t mean they go awfully, but most of the best players are not going to reach the goal they set for the tournament. This is a fact that everybody accepts, and so we don’t care about all the mediocre results that Guillaume Wafo-Tapa or Matej Zatlkaj have had over their careers. Instead we focus on when they do well, and see that they Top 8’d or Top 16’d yet again. These are the results that count, and for them we are in awe of the prowess of the player. Nobody cares if that player wins a match more or less when they are having a bad day, but that is what is measured by the median.
So what should we use instead? Well, we should measure what we care about, and that is good finishes. How often do they occur for a player and what does “good” mean for that player? It turns out there is an easy way to get much closer to that goal. You probably know that the median finish is the result that a player has exceeded exactly 50% of the time. This is also called the 50% Quantile. But there is no reason why we should use 50% as a yardstick here. Sure, the median is by far the most used quantile-based measure, but as it doesn’t really measure what we are interested in, we might just use another quantile. How about the 20% Quantile? Which one we use is up to us, but 20% seems to give us a good grasp on what having a very good tournament means for a player. Conveniently the very best players are known to have Top 8’d about one in every five Pro Tours, so that gives this measure a bit of tangibility right away.
Before trying the 20% Quantile I would like to apply a fix to the results, though. The way Wizards calculates the median doesn’t care if you finished fourth in a 450-person Pro Tour, or fourth with your team in a 110-team Pro Tour. This of course means that they compute a median over apples and oranges. To avoid this problem I have normalized all results to an event size of 400 competitors. Thus, when I calculate the 20% Quantile, a 100th-place finish in a 200-person Pro Tour will be counted as roughly 200th place out of 400. First stays first of course, last will be counted as 400th, and the rest will be scaled in linear fashion.
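The normalization can be sketched in a few lines of Python. This is a minimal sketch, assuming the scaling is a straight linear map that pins first place to 1 and last place to 400. The function name `normalize_finish` and the exact formula are assumptions for illustration, and the formula behind the numbers in this article may differ in detail.

```python
def normalize_finish(place: int, field_size: int, target_size: int = 400) -> float:
    """Scale a finishing place to a field of `target_size` competitors.

    Assumed formula: a linear map with 1 -> 1 and field_size -> target_size.
    """
    if field_size < 2:
        raise ValueError("field_size must be at least 2")
    if not 1 <= place <= field_size:
        raise ValueError("place must lie between 1 and field_size")
    return 1 + (place - 1) * (target_size - 1) / (field_size - 1)


print(normalize_finish(1, 200))               # first stays first: 1.0
print(normalize_finish(200, 200))             # last becomes 400.0
print(round(normalize_finish(100, 200), 1))   # 199.5, i.e. roughly 200th of 400
```

Under this assumption, fractional places such as 5.8 or 7.1 arise naturally whenever the event did not have exactly 400 competitors.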
I have applied that measure, and the result was way better than what you get when you rank a player on their median finish. This is what the Top 25 looks like:
The ranking seems kind of good. The Top 25 on 20% Quantile is pretty much the who’s who of Magic. If you take a look at the median scores of the same players you will notice that they are all over the place, with some of the undisputed greats having absolutely miserable scores. However, the 20% Quantile is not without its flaws either. If we turn our eyes to Kai Budde and Terry Soh, they don’t seem to be properly placed in a ranking of the best players ever. Eight players being better or even way better than Kai doesn’t feel right. The reason for his “bad” place in the ranking is that his results are distributed in a way that makes his 20% Quantile look comparatively bad. Kai has played in 51 Pro Tours, so only the 11th best finish is actually used to calculate the 20% Quantile. Let’s take a look at his best normalized results:
1, 1, 1, 1, 1, 1, 1, 5.8, 7.1, 10.1, [20%Q] 13.9, 14.2, 16.0, 17.2
Kai has a couple of very good results, mostly wins; then his results drop off sharply, and after that they worsen only very slightly. The 20% Quantile, however, is measured exactly at the end of the sharp drop-off. This makes his 20% Quantile look way worse than his actual results. The main reason for this problem is that Quantile-based measures judge performance on only one or two data points. In this case Kai suffers from that approach, but if we take other cutoffs other players will suffer. We can see that for Wafo-Tapa and Josh Utter-Leyton when we use the median.
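To make the "11th best finish" concrete, here is a minimal Python sketch of how the relevant finish can be located. It assumes the quantile is taken at the position you get by rounding appearances × percentage up to the next whole finish, which matches the 11th-of-51 figure above. The helper name `quantile_rank` is hypothetical, and the exact rule used for the article's numbers is not spelled out.

```python
def quantile_rank(appearances: int, percent: int) -> int:
    """1-based position (best finish = 1) of the finish used as the quantile.

    Assumed rule: round appearances * percent / 100 up to a whole finish.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if appearances < 1 or not 0 < percent <= 100:
        raise ValueError("need appearances >= 1 and 0 < percent <= 100")
    return -(-appearances * percent // 100)  # ceiling division


# Kai's best normalized results as listed in the article (best first).
kai_best = [1, 1, 1, 1, 1, 1, 1, 5.8, 7.1, 10.1, 13.9, 14.2, 16.0, 17.2]

rank = quantile_rank(51, 20)
print(rank)                # 11
print(kai_best[rank - 1])  # 13.9
```

Had the cutoff fallen one finish earlier, the value would have been 10.1 instead of 13.9, which is exactly the sensitivity to a single data point described above.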
To avoid this problem I decided to use an average of the 10%, 20%, and 30% Quantile scores of players. Unfortunately this is not as elegant as using just one Quantile, and makes the resulting numbers less tangible, but it turns out this approach yields very good results, in fact some of the best results I have ever seen derived in a purely statistical way. These are the stats for all players that can be voted for this year, and all players with at least ten Pro Tour appearances that I ever interviewed for my Pro Tour Specials:
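The averaging step can be sketched as follows. This is only an illustration under assumptions: each quantile is taken as a single finish using a round-up rule, the function names `quantile_finish` and `quantile_123` are hypothetical, and the input list is made up rather than any real player's record. It is not guaranteed to reproduce the values in the table.

```python
def quantile_finish(finishes, percent):
    """Finish at the given quantile, best finishes first (assumed round-up rule)."""
    if not finishes:
        raise ValueError("at least one finish is required")
    ordered = sorted(finishes)                 # lower place = better finish
    rank = -(-len(ordered) * percent // 100)   # ceiling of n * percent / 100
    return ordered[rank - 1]


def quantile_123(finishes):
    """Average of the 10%, 20%, and 30% quantile finishes."""
    return sum(quantile_finish(finishes, p) for p in (10, 20, 30)) / 3


# Made-up normalized finishes for illustration only, not a real player.
example = [100, 1, 64, 9, 25, 4, 81, 16, 49, 36]
print(round(quantile_123(example), 2))  # (1 + 4 + 9) / 3 = 4.67
```

Because three neighboring cutoffs are averaged, a single sharp step in a player's results moves the score much less than it would move any one quantile on its own.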
Player 123 Quantile
Kai Budde 5.3
PVDDR 5.6
Jon Finkel 6.0
Josh Utter-Leyton 6.0
Gabriel Nassif 7.4
Guillaume Wafo-Tapa 7.5
LSV 8.4
Kenji Tsumura 8.4
Terry Soh* 8.6
Mark Justice** 9.4
Tom Martell 10.0
Stanislav Cifka 10.5
Kamiel Cornelissen 10.7
Makihito Mihara* 11.2
Matej Zatlkaj 12.1
Eric Froehlich 12.7
Dirk Baberowski 13.2
Tomoharu Saitou°° 13.3
Marijn Lybaert* 13.5
Anton Jonsson 13.5
Masashi Oiso 13.9
Olle Rade 13.9
Patrick Chapin* 14.3
Tomohiro Kaji 14.3
Willy Edel 14.9
Randy Buehler 14.9
Nicolai Herzog 15.0
Shuhei Nakamura 15.2
Zvi Mowshowitz 15.5
Paul Rietzl 15.7
Paul McCabe 16.3
Darwin Kastle 16.4
Shouta Yasooka 16.5
Scott Johns 16.6
Gary Wise 17.2
Jelger Wiegersma 17.2
Martin Juza 17.4
Tommi Hovi 17.7
David Sharfman 18.5
Marcio Carvalho 18.7
Mike Long 18.7
Andrew Cuneo* 18.8
Robert Jurkovic 18.8
Katsuhiro Mori 19.0
Osyp Lebedowicz* 19.4
Raphael Levy 19.4
Owen Turtenwald 19.5
Olivier Ruel 19.7
Nico Bohny 19.8
Justin Gary 19.8
Tsuyoshi Fujita 20.0
Matt Costa 20.4
Mike Turian 20.4
Alan Comer 20.4
David Ochoa 20.4
David Humpherys 20.7
Shingou Kurihara* 21.5
Rob Dougherty 21.7
Frank Karsten 21.7
Reid Duke 21.8
Ben Stark 22.4
Brian Hacker 23.1
Yuuya Watanabe 23.2
Sam Black 23.3
William Jensen 23.4
Richard Hoaen** 24.6
Ben Rubin 24.7
Antoine Ruel 25.3
Mark Herberholz 25.9
Yuuta Takahashi 26.8
Chris Pikula 27.5
Brian Kibler 27.6
Shahar Shenhar 28.2
Bob Maher 28.5
Jamie Parke* 29.1
Tsuyoshi Ikeda** 29.6
Kenny Oberg* 31.3
Bram Snepvangers 32.8
Brock Parker 32.8
Steven OMS 33.0
Gerard Fabiano** 34.8
Simon Görtzen 35.4
Andre Müller 35.9
Alexander Hayne 37.3
Ivan Floch 37.9
Robert v. Medevoort 39.3
Lee Shi Tian 40.0
Samuele Estratti 40.4
Adam Yurchick 41.0
Craig Wescoe 42.9
Gerry Thompson 47.1
Tzu-Ching Kuo 53.5
Michael Jacob 53.5
Melissa DeTora 62.8
Players in italics have fewer than 20 Pro Tour appearances.
* = One result is missing (either because I overlooked it or due to Wizards not having full PT stats).
** = Two results are missing.
°° = For Saitou I found that he played in 37 Pro Tours, but the Hall of Fame stats say he only played in 35.
I don’t want to say this measure is perfect, but especially the top slots capture the best of the best very well. Terry Soh being amongst those at the top is surprising, but although largely ignored by just about everybody, he has the results to warrant that placement. He actually did even better when we focused on the 20% Quantile alone, and his median is on par with Finkel’s. Even if he is not that good we might assume that he is one of the most underrated players around. Huey, Reid, and Owen do a bit worse than we might imagine, but we have to keep in mind that this picture is painted on results and thus the past. These three certainly have gotten even better since they have joined forces, and I imagine that their actual power level might be at least in the 10 to 15 region, but saying this is pure guesswork of course. Future results may show if that is actually true.
In the end I think that almost every way of measuring Pro Tour success is better than using the median, because, as I already said, the median measures mediocrity, and what we really want is a metric that enables us to better compare the frequency and the quality of success of a player. Using the 20% Quantile is a step in the right direction, but it is flawed, like any Quantile approach, in that it relies on very few data points. The measure I presented today is not perfect either, and I doubt that there is one perfect measure, but it does a reasonably good job of measuring success by focusing on the good, very good, and exceptional finishes of a player, while also being relatively robust to anomalies in the distribution of these finishes.
Finally, I would like to give you a short overview of who gets my vote for the Hall of Fame this year, and a brief explanation why.
Makihito Mihara: The one candidate that really needs no detailed explanation. Great deckbuilder, second-best stats, and in contrast to some others on the ballot he has never been heard to have done anything shady whatsoever.
Guillaume Wafo-Tapa: Like many other voters I am convinced that Wafo has paid his due for one bad decision. I value honesty and sportsmanship highly, but I am also a firm believer that all humans make mistakes, and redemption should be possible especially for non-repeat offenders. Otherwise Wafo has the best stats, is a great deckbuilder, and the best control mage of all time.
Scott Johns: I am always surprised by how many people don’t even take the time to say why they do not vote for him. Scott Johns has made the Top 8 of a Pro Tour five times, which was good enough for basically everybody else to get into the Hall of Fame, but Scott barely gets 10% of the votes. If the Pro Point system of his day had even remotely resembled what we have today, Scott would have been the first Pro Player of the Year, too. On top of that the guy did quite a bit of community work in his time, hosting several of the earlier, successful Magic websites, and eventually became the first editor of dailymtg.com. For me that is easily good enough to include him on my ballot. Some voters hold it against him that he probably wouldn’t come to Pro Tours, or prefer to vote for players that they think can make good use of the lifelong invitation. In my opinion this is exactly what the Hall of Fame is not about. If Scott doesn’t want to play on the Pro Tour, that is his personal decision. To be honest I find his approach way more honorable than Justin Gary’s approach of campaigning for himself, and saying “Hey, if I get in I might even come to the Pro Tour again.”
Paul Rietzl: Paul’s stats are more or less on par with some of the other players’ stats. The vote for Paul is mainly a subjective one. Whenever I see Paul play, be it live or on camera, I am impressed with the tightness of his play and his posture. He might not really be the best player in the world, he might not do as much for the community as Willy Edel, and I don’t care that he has to manage his job and Magic at the same time (at least not where my vote is concerned), but he does impress me as a player and that in my opinion is the single best reason to vote for someone.
Mark Justice: Okay, I figured it out. I cannot really vote for this guy any more, and that’s a shame. Not having by far the best player of the early Magic game in the Hall of Fame sucks! I could instead have given this vote to a bunch of other players like Marijn or Edel, but as much as I respect them I think their resumes fall short just a tiny little bit.
There are a bunch of other people that are clearly in consideration. My reason not to vote for any of them is generally one of the following three: Their resume is good, but not exceptional, their resume is exceptional, but lacks depth, or the gravest offense—significant negative community contributions. In most cases these reasons have been identified by others and discussed at length. I don’t think it helps going through them once again.
-Florian
P.S.: If you want to help me improve my database, and you happen to know where I missed a result (maybe because you are one of the players that I don’t have complete data on), feel free to check out my list of Pro Tour appearances and drop me a note in the comments, or get in touch via any kind of social media. Thanks!