Card evaluation is a key skill that allows you to navigate drafts confidently and arrive at a streamlined, coherent deck. It is also the area of limited play where leveling up is crucial if you want to improve your game. Over time, great Magic players have created and shared a plethora of ways to make the process of card evaluation easy(ish) and well organized. Until now, though, most of those methods have relied on theory and an intuitive understanding of the limited format. I would like to add a data component to those approaches, giving you a tool to update your card evaluations by looking at the impact each card has on win rates.
A good theory has several hallmarks. It is intuitive and simple, yet it has high predictive power. It is also easy to expand and build on to reach more subtle conclusions. In Magic, few concepts rival former Limited Resources co-host Brian Wong’s Quadrant Theory for card evaluation. For those unfamiliar, Quadrant Theory divides game states into four general categories:
- Development: the early stages of the game. You generally start on parity and try to gain an advantage, or at least not to be significantly behind, when the later game phases arrive. Development sets the scene for the midgame, and your position in the midgame will be greatly affected by the cards you play in this stage.
- Ahead: board states where you are advantaged and the opponent needs to catch up if they want to win.
- Parity: a board stall where neither player has a clear advantage, but one key play can break the balance.
- Behind: board states where you are playing catch-up with the opponent.
Wong proposed that cards should be evaluated based on their power level in each of those quadrants. To give an example from ZNR: a card like Luminarch Aspirant will be rated good to great in each of the quadrants. Aspirant is able to run away with the game when played early, solidify a win when you are ahead, break parity by making your threats larger, and even let you come from behind. A card doesn’t need to excel in every quadrant to be good; some cards can be extremely powerful while only being great in two or three of the quadrants. For example, Dream Trawler is not great in the development stage, but if you build your deck to survive the early pressure and manage to reach the late game, it lets you run away with the game by gaining life and card advantage while being resistant to removal.
Quadrant Theory is a great tool to evaluate cards in a vacuum, but certain cards can vary in power depending on the context of the set they are in. Ben Werne (AKA Mister Metronome of Lords of Limited fame) recently updated Quadrant Theory by adding a synergy component that takes context into account.
Understanding and applying Synergy Theory requires more in-depth knowledge of the current set, but it does improve your evaluation of cards that can’t be fully assessed within the Quadrant Theory framework. Importantly, it allows you to make better judgment calls during your draft by taking into account synergies already present in the deck you are drafting and cards that you anticipate seeing later in the draft. Applying synergy to Quadrant Theory makes your pick order more flexible and adaptable to a changing situation. Applying both concepts at the same time can be a major level up as you move from drafting the best cards to drafting great decks.
While both Quadrant Theory and Synergy Theory are great tools for card evaluation, they can fall prey to personal biases due to their subjective nature. Two great players will differ in their evaluation of the same card based on their personal view of the format and, paradoxically, both of them can be right. Some cards simply fit the play style of some players better than others.
Most of us, us being Magic content consumers, are not LSVs. We are closer to an above-average to good level in limited. There is nothing wrong with that! But we must honestly admit that this may mean that some evaluations of top limited minds will not always work for us. It is important to know how certain cards perform in the hands of players at a similar skill level and to use that data to inform our choices.
Enter 17lands data. 17lands is a data tracker specializing in limited data collection (thus the name). The idea behind 17lands is to collect data not in order to hoard it, but to release it to those interested, so they can benefit from a large data set that provides information about aspects of the game that may escape the pundits.
One of the recent features of the tracker, and the one I’d like to focus on today, is collecting data on how cards perform depending on whether they are drawn and when they are drawn. With this feature, data from a large set of limited games can finally let us quantify card power. We decided to look at 52k+ games of BO1 (so note that the numbers may differ in BO3), examine the trends across all the cards in the set, and try to assign a precise number to each card that reflects its ability to make you win more. In theory, this is a great way to confirm the conclusions you arrived at using your own card evaluation methods but, importantly, it is also great food for thought when your personal evaluation does not match the data.
But this is easier said than done. Measuring card quality is not a straightforward task.
At 17lands we set off to find a way of quantifying card power that can be computed from Arena draft logs and yields reliable and useful information. To do so, we tried several methods. First, we looked at the win rate of decks that contain a given card. But having a card in the deck does not mean you draw or play it. Just think of all those decks with an A+ level bomb you drafted only to never see the bomb (oh yes, our brain is much better at memorizing those drafts than at remembering that fluky topdeck that won you a badly played game).
Furthermore, bad cards will occasionally be put into otherwise good decks, which boosts the bad card’s win rate even though the card does not contribute to those wins on its own merit. This way of measuring is noisy in a game with as much variance as Magic, but it can be really telling at the beginning of the draft. The win rate when a card is in a deck tells you a bit about the card quality and a bit about the power of the color combinations this card is frequently played in. As the draft progresses, this value loses a lot of its predictive power, as you are less likely to deviate from your early draft path.
Another way to evaluate cards is to look at the win rate when a particular card was cast. This has an advantage over the previous method: you know that the card was drawn in any game where it was cast. At the same time, it creates a massive bias towards more costly cards. The problem with cards like Ugin, the Spirit Dragon is not winning once you cast it, but getting to eight mana to cast it. In many limited games you will die with Ugin in hand, short of the eight mana that would guarantee you the win. Therefore, its win rate when cast will be extremely high, but this does not mean that Ugin is a great card. In fact, in many decks with a more aggressive mana curve it can be unplayable, as your deck is not built to have enough resources to cast it.
So what’s a more reliable and informative way to look at single card win rates?
Enter the previously mentioned 17lands feature, another way of measuring card performance that could complement traditional tier lists based on Quadrant Theory and, in the later stages of the format, on Quadrant Theory with the Synergy Theory add-on.
The new way Arena logs its data allowed us to see not only if a card was in the deck, but also if it was drawn during each game. If a card is great, surely it will increase the win rate of the games where it was drawn. We calculated the win rates for the games where a given card was drawn, was in the opening hand, was drawn later in the game, or was never drawn. These four measures can give us a very simple but telling evaluation of card strength based on the differences in win rates.
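To make the four measures concrete, here is a minimal sketch of how they could be computed from per-game records. The `GameRecord` fields and the function names are assumptions made up for this illustration; they are not the actual 17lands schema or code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class GameRecord:
    """One game played with a deck containing the card (hypothetical schema)."""
    in_opening_hand: bool  # the card was in the opening hand
    drawn_later: bool      # the card was drawn after the opening hand
    won: bool              # the tracked player won the game


def win_rate(games: Iterable[GameRecord]) -> Optional[float]:
    """Fraction of games won, or None when there are no games to count."""
    games = list(games)
    if not games:
        return None
    return sum(g.won for g in games) / len(games)


def card_win_rates(games: List[GameRecord]) -> Dict[str, Optional[float]]:
    """Split games by when the card was seen and compute a win rate per group."""
    opening = [g for g in games if g.in_opening_hand]
    # "Drawn later" means drawn during the game but not in the opening hand.
    later = [g for g in games if g.drawn_later and not g.in_opening_hand]
    drawn = [g for g in games if g.in_opening_hand or g.drawn_later]
    never = [g for g in games if not (g.in_opening_hand or g.drawn_later)]
    return {
        "opening_hand": win_rate(opening),
        "drawn_later": win_rate(later),
        "drawn": win_rate(drawn),
        "not_drawn": win_rate(never),
    }
```

Returning `None` for an empty group keeps a card with no games in some category from being mistaken for a card with a 0% win rate.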
But what do those numbers mean? Let’s start with the win rate when the card was not drawn. This is a very important, if less exciting, measure. It is our benchmark for the decks containing a given card. Some color combinations are just more powerful than others, and this will be reflected here. It should also be evaluated carefully alongside the sample sizes of those decks: some color combinations will have high win rates, but that does not mean you should draft those cards aggressively. Sometimes a color combination has a high win rate because the deck comes together rarely, but is excellent when it does. You should take this into account and make sure you are convinced that the particular combination is open and that you are likely to get access to the key cards for this archetype.
Win rate when a card is drawn in the opening hand is a data-driven way of looking at the power of a card in the development stage of the game. If a card has a high win rate when present in the opening hand, it is very likely good in the early turns. Most cards that have a high win rate when present in your opening hand are going to be low cost creatures and spells that let you gain early advantage, but not all of them.
Some of them are good in your opening hand because they give you a clearer decision-making path in the early game. An example of such a card in ZNR is Zagras, Thief of Heartbeats. It is not a two drop, but knowing it is in your opening hand informs your decisions on how to play your party members and how to trade early to make sure it hits the board as early as possible.
But being good early does not mean that the card is generally good. Take Akoum Hellhound. This card is pretty good if you have it in your opening hand: a 1-drop that attacks as a 2/3 during the early turns. But in the late game it gets hit twice: first, a 2/3 is not impressive in the late game, and on top of that, it is not even guaranteed to be a 2/3 by then.
Win rate when a card was drawn later in the game but not in the opening hand is another measure. Here you can expect to see costly spells that are great when you can cast them but are dead in your hand in the early game. An example from ZNR is Vastwood Surge. This is a game-ending card if you kick it, but early in the game it either can’t be cast or tempts you to cast it for much less value without kicker. This measure is also very interesting for post-mulligan decisions. If you know a card has an unimpressive win rate in your opening hand but a great one in the late game, you should consider it as the card to put on the bottom of your library after you mulligan.
Lastly, we have win rate when drawn at any stage of the game. This is the most important measure. It tells you how much drawing a given card impacts your win rates, and more. It also lets you see if a good early drop is also good enough in the late game to net you an advantage when you put it in the deck. It also lets you understand if that Ugin of yours is good enough in the late game to compensate for it being a sitting duck in your hand for a large chunk of the games you play.
At this point, I want to caution that raw win rates can be deceiving. As the white mages among you know well, not all colors were made equal. Some colors have lower win rates than others, but that does not mean that a single card in that color can’t be a bomb. That is why we decided not to look at win rates only, but at the differences in win rates, using the win rate when a card was not drawn as a benchmark. If decks with a given card win 50% of their games when it is not drawn, but this increases to 60% once you draw the card, we are looking at an exceptionally good card. Just drawing it in a game makes you win more and overcomes the potential disadvantage of playing a color that is weaker than the others. The same goes for drawing a card in the opening hand: if your win rate is 50% when the given card is not in your opening hand and it jumps to 60% when it is, the card clearly is really good early in the game.
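The benchmark comparison above boils down to a single subtraction. As a small sketch, with `win_rate_gain` being a helper name invented for this example:

```python
def win_rate_gain(rate_with_card: float, baseline_rate: float) -> float:
    """Gain over the baseline, in percentage points, rounded to one decimal."""
    return round((rate_with_card - baseline_rate) * 100, 1)


# The example from the text: 50% when the card is not drawn, 60% when it is.
gain = win_rate_gain(0.60, 0.50)
print(f"{gain:+.1f} pp")  # +10.0 pp
```

A negative result means the games where the card showed up went worse than the games where it stayed in the library.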
But maybe it is best to look at specific examples to get a better idea of the results. Let’s start with a format all-star, the card that increases your win rate at all stages of the game, Luminarch Aspirant.
At a baseline, the decks with Aspirant win 50.5% of the time when Aspirant is not drawn. A relatively unimpressive result, but as soon as it shows up in your opening hand, your win rate skyrockets to 76.5%, a sweet 26 percentage point increase. This means that this card is incredible in the early game. But it is not terrible in the late game either: it still increases the win rate from 50.5% to 55.5% when drawn later. Taken together, drawing Aspirant at any point in a game increases your win rate by 20 percentage points. This one card is a complete game changer in your white deck, and you really are incentivized to make sure it makes your deck once you draft it.
On the other side of the spectrum we have Tormenting Voice. The decks containing it win 57.5% of the time when Tormenting Voice is not drawn. But if it is in your opening hand, the expected win rate goes down by 10.5 percentage points to 47%. When drawn late, it has a smaller impact, but you still lose 2.5 percentage points from your win rate. This adds up to a 49% win rate if you draw it at any stage, way lower than the baseline of the decks that contain it. In other words, putting a Tormenting Voice into your deck actively makes it worse. In this format you should really avoid playing Tormenting Voice and consider filling your deck with something else when making the last cuts.
There are several things to keep in mind when looking at this data. It looks at all decks, and therefore it will boost the results of generalist cards and cards with mana restrictions. Luminarch Aspirant is a good example of a generalist: it will be great in all decks you put it into. It has multiple synergies, it is good on its own, and it will find a home in every white archetype. Some cards are more specialized. Take Marauding Blight-Priest. It is way better in the WB Clerics deck than in any other shell. But sometimes it will find its way into other black archetypes, where it will most likely underperform. This will be reflected in its general score and most likely lower it. This is where mana-restricted cards have an advantage: multicolored cards are naturally positioned in the archetypes where they are supposed to shine, like one of the top performers, Soaring Thought-Thief. Unfortunately, this analysis does not have enough data to determine with confidence which color pairs are the best homes for each particular card, but we do plan to look at such interactions in future articles.
Another thing that this data set does not reflect is pockets of synergy. Yes, Scavenged Blade is generally a poor card, but in some decks it will shine. Without a more targeted analysis it is impossible to see where it overperforms. Again, this is a first attempt to look at this type of data, and I am sure that with time we will be able to pinpoint particular synergies and card combinations that, when drawn together, vastly improve a mediocre card’s performance.
All the curated data is available with a brief description on 17lands (link) for your pleasure. But it would be rude not to give you some info on the best and the worst cards of the format. Let’s start with the all-stars. This list should be quite intuitive: bombs are easy to evaluate and ZNR is no exception. The card with the biggest impact on your deck is the Luminarch Aspirant I mentioned earlier, increasing your win rate by a massive 20 percentage points (pp) when drawn. It is followed by a bunch of rares (Phylath, Maul of the Skyclaves, Zagras, green Inscription), but also some uncommons (Soaring Thought-Thief, Roost of Drakes, Cleric of Life’s Bond). Each of those increases your win rate by 10-15 pp if you draw it, but not necessarily in the same way. For example, Thought-Thief is particularly strong in your opening hand (18 pp increase in win rate), while Phylath boosts your win rate when in the opening hand by “only” 8 pp but excels when drawn in the late game.
Next come the 10 best and worst commons, as they are the bread and butter of limited decks. Among the best commons there is a clear trend: most cards there are cheap ways to get ahead in the race. Six of the top 10 commons (Into the Roil, Chilling Trap, Zulaport Duelist, Bubble Snare, Rabid Bite and Roil Eruption) let you switch the tempo and trade favorably, and all can be played for little mana. Turntimber Ascetic and Kor Celebrant, on the other hand, are ultimate stabilizers, with both gaining life while clogging the board. Cunning Geysermage can serve as a blocker early and will gain tempo value later in the game. The odd one out is Seafloor Stalker, which may be a competent blocker early and is a great finisher later.
The worst commons are dominated by red cards, which is in part a testament to the strength of the color. Spitfire Lagac, Inordinate Rage, Sizzling Barrage, Cleansing Wildfire, Scavenged Blade and Tormenting Voice are found in decks with a very high baseline win rate, and they just can’t keep up with the quality of Roil Eruptions and Grotag Bug-Catchers. Some of those cards are also most likely very synergy driven, and there is very likely a subset of decks where they do pretty well (think Scavenged Blade in WR Warriors with a couple of Kor Blademasters in a Bug-Catcher deck). The biggest surprise on that list for me was Tazeem Raptor, which I considered a decent card. This may have something to do with it being included in party decks where it doesn’t shine, and with it getting played too early, before it can accrue its full value.
How to use this data when navigating the draft?
I would suggest that for the early picks you rely on the gain in win rate when the card is drawn vs. not drawn, and use the generic win rate when the card is in the deck as a tiebreaker. This way you ensure that you pick the more impactful card, but at the same time you take into account the context of the format and, in case of doubt, put yourself in a potentially stronger color combination. As the draft progresses, the latter measure becomes less important because your path is clearer. You should then focus on the difference in drawn vs. not drawn win rates and, based on your game plan for the deck, supplement it with the opening hand win rate advantage (if you are more aggro) or the late game win rate advantage (if your deck wants to play long games). Numbers can only get you so far, so on top of this, do use your drafting experience and adjust these evaluations, which are based on a general population, to your own style.
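The early-pick rule of thumb above can be sketched as a two-level sort. The `CardStats` fields, the placeholder card names and all the numbers below are invented for illustration and are not real 17lands data.

```python
from typing import List, NamedTuple


class CardStats(NamedTuple):
    name: str
    drawn: float      # win rate in games where the card was drawn
    not_drawn: float  # win rate in games where the card was never drawn
    in_deck: float    # win rate of all decks that contain the card


def early_pick_order(cards: List[CardStats]) -> List[CardStats]:
    """Rank by drawn-vs-not-drawn gain; break ties with the in-deck win rate."""
    return sorted(
        cards,
        # Rounding treats near-identical gains as a tie, so the tiebreaker applies.
        key=lambda c: (round(c.drawn - c.not_drawn, 3), c.in_deck),
        reverse=True,
    )


# Made-up numbers for three placeholder cards.
pack = [
    CardStats("Card A", drawn=0.58, not_drawn=0.53, in_deck=0.55),
    CardStats("Card B", drawn=0.62, not_drawn=0.52, in_deck=0.56),
    CardStats("Card C", drawn=0.60, not_drawn=0.55, in_deck=0.57),
]
print([c.name for c in early_pick_order(pack)])  # ['Card B', 'Card C', 'Card A']
```

Card B comes first on the size of its gain alone, while Cards A and C have the same gain and are separated only by the in-deck win rate. How coarsely to round before calling two gains a tie is a judgment call, not something the data dictates.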
This is the first attempt to tackle such data and it isn’t yet refined. Think of it as Quadrant Theory presented in times when we already have a good understanding of synergy. Looking at the bulk performance of a card when we know it may have a great home in one particular archetype is not optimal, and we are fully aware of it. In future expansions we plan to look at the same metrics within each archetype to pinpoint which cards have a well-defined home and which are all-rounders. This is the data-driven version of expanding Quadrant Theory with Synergy Theory. What we hope to achieve this way is to help you understand what the composition of each available archetype should be, and so make decision making during the draft easier and more focused. Another thing we want to achieve is up to you. Some of the cards yield surprising results, results we do not have a good explanation for. This can mean two things: our metric is imperfect in some way, or some deeply held assumptions of card evaluation are not right. Both of those options are possible, and we would love for the community to discuss our data and give some personal perspective on the raw numbers we present. By all means, share your thoughts with us and help data-driven card evaluation become a useful tool for limited fans.
I would like to dedicate this article to JJ, my team mate, friend, and someone who was always challenging me to be more careful with data analysis. JJ, you are sadly no longer with us, and lack of your proof-reading certainly makes this article much worse than it could have been, just like my world is much worse without you being there.
All data was taken from 17lands.com, from BO1 human drafts. We looked at 52k+ games. We excluded all the cards with fewer than 1000 games available. Due to constraints of the Arena logs, all the data is from games where the 17lands user did not mulligan, so it does not represent a perfect sample of all games. Make sure you interpret the data cautiously. This is an early attempt at a novel way of grading cards, one we hope to perfect in the future, so if you think there are large errors, please let us know. Our intention is to expand the understanding of limited, and feedback is a key part of doing it well.
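For readers who want to apply the same sample-size cutoff to their own numbers, here is a minimal sketch. It assumes a plain mapping from card name to number of games; the function name and that data layout are hypothetical, and exactly how games are counted per card is not specified above.

```python
from typing import Dict

MIN_GAMES = 1000  # the threshold stated in the note above


def cards_with_enough_games(
    game_counts: Dict[str, int], min_games: int = MIN_GAMES
) -> Dict[str, int]:
    """Drop cards with fewer than `min_games` games; keep the rest."""
    return {name: n for name, n in game_counts.items() if n >= min_games}
```

A card with exactly 1000 games is kept, matching the "fewer than 1000" wording.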