PV’s Playhouse – Seven Playtesting Traps


Originally, I was going to write about the traps you can fall into when you’re deckbuilding, which was a suggestion I received on Twitter. As I wrote them, though, I found that they applied more to playtesting than to deckbuilding, so I changed the focus of the article. Of course to deckbuild you have to playtest, so it’s not that different, but hopefully it helps when you need to tweak a deck or even just figure out which of the already existing decks you like more.

Throughout the years, I’ve worked with different kinds of players, of many different levels. I’ve been a casual player, I’ve been a PTQ player working with other PTQ players, I’ve been a first time PTer working with other first time PTers, and I’ve been a Pro Player working with other Pro Players. For all their differences in skill and experience, most players in those groups (as well as myself) have fallen for the same playtesting traps:

Trap #1 – You Don’t Isolate Variables

When doing research, it’s important to make sure you’re isolating the correct variables. When A and B grow in proportion, it doesn’t necessarily mean one causes the other. It could be, for example, that B and C are related, but C is the one that actually causes A. It could be that C causes both A and B. It could be that they cause each other. It could be just random chance. Studies have shown, for example, that people who drink more coffee are also more likely to have lung cancer. If you just look at those numbers, you might think that drinking coffee actually causes lung cancer; this happens because you didn’t isolate the variable properly. In this case, people who smoke cigarettes also tend to drink more coffee, and it is the cigarettes, not the coffee, that cause lung cancer.

When you’re building a deck, it’s not very different. Imagine, for example, you have the following eight cards:

4 [card]Putrefy[/card]
4 [card]Wolfir Silverheart[/card]

You play against mono-red and you do badly. You think your removal is too expensive and your threats are not defensive enough. You then change it to:

4 [card]Abrupt Decay[/card]
4 [card]Thragtusk[/card]

Now you start winning against mono-red! Clearly, [card]Abrupt Decay[/card] and [card]Thragtusk[/card] are both better than [card]Putrefy[/card] and [card]Wolfir Silverheart[/card] against mono-red. Or are they?

Here are the possibilities:

1) [card]Abrupt Decay[/card] is better than [card]Putrefy[/card]. [card]Thragtusk[/card] is either the same as Silverheart or worse than Silverheart.
2) Thragtusk is better than Silverheart. Abrupt Decay is either the same as Putrefy or worse than Putrefy.
3) Thragtusk is better than Silverheart and Abrupt Decay is better than Putrefy.
4) Abrupt Decay and Thragtusk, in combination, are better than the combo of Putrefy and Wolfir Silverheart.
5) You got lucky or unlucky in some of those matches and replacing those cards had no significance.

The end result is the same (you win more versus mono-red), but it’s very important to know why this is happening—you don’t know which card, or combination of cards, is actually making you win more. If you win solely because of Thragtusk, well, then maybe you can afford to keep Putrefy, since it’s better against other decks. Maybe with Abrupt Decay you’d have enough time for the aggressiveness of Silverheart to come into play, and wouldn’t need the defense of the Thragtusk. Maybe if you only need the Abrupt Decays you can afford to side out some Thragtusks, and so on.

In this case, the best way to isolate the variables is to change them one at a time. With this particular example, you could start by simply switching the four Putrefies for Abrupt Decays. If that is not enough, go back to Putrefy and switch the Silverhearts for Thragtusks instead. If that’s not enough either, change both. If it is enough, though, try changing back and see how far you can go.

I think the best method, however, is to start with the full-blown switch and then remove the pieces one by one. The reason for this is that, if you don’t win with what you would assume is the single best configuration against them, you can drop the idea and move on without wasting any more time. You also get to play with the cards and get a general idea of which one is the most important, so you can make better tweaks for future versions.
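To make the idea concrete, here is a minimal sketch in Python of comparing the four possible builds. It is only an illustration: the function names are made up for this example, and the win-loss records are hypothetical numbers invented to show the shape of the comparison, not results from actual testing.

```python
from itertools import product

# The two swaps from the example; each one is a separate variable.
REMOVAL = ("Putrefy", "Abrupt Decay")
THREAT = ("Wolfir Silverheart", "Thragtusk")


def configurations():
    """Return all four removal/threat builds, so each swap can be tested alone."""
    return list(product(REMOVAL, THREAT))


def win_rate(record):
    """Convert a (wins, losses) pair into a win rate."""
    wins, losses = record
    return wins / (wins + losses)


def swap_effects(results):
    """Estimate how much each swap changes the win rate on its own.

    `results` maps (removal, threat) -> (wins, losses) against one opponent.
    """
    base = win_rate(results[("Putrefy", "Wolfir Silverheart")])
    decay_only = win_rate(results[("Abrupt Decay", "Wolfir Silverheart")])
    tusk_only = win_rate(results[("Putrefy", "Thragtusk")])
    both = win_rate(results[("Abrupt Decay", "Thragtusk")])
    return {
        "decay_alone": decay_only - base,
        "tusk_alone": tusk_only - base,
        "both": both - base,
        # Positive when the pair gains more together than the two separate gains add up to.
        "interaction": (both - base) - (decay_only - base) - (tusk_only - base),
    }


# HYPOTHETICAL records against mono-red, invented purely for illustration.
example = {
    ("Putrefy", "Wolfir Silverheart"): (3, 7),
    ("Abrupt Decay", "Wolfir Silverheart"): (4, 6),
    ("Putrefy", "Thragtusk"): (6, 4),
    ("Abrupt Decay", "Thragtusk"): (7, 3),
}
effects = swap_effects(example)
```

With these made-up numbers, most of the gain would come from the Thragtusk swap, which is exactly the situation in which you might keep Putrefy. Ten-game sets are far too small to trust differences like these on their own, so treat the output as a hint about what to watch for in the games rather than as an answer.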

Trap #2 – You Don’t Correct Assumptions After You’ve Changed Your Deck

We’ve all been through this before. You take your Putrefy/Wolfir Silverheart deck and you smash Jund—your guys are bigger and you have the perfect removal for Olivia. You make a mental check that the deck beats Jund, maybe you even write down the score of 9-1 and move on. You get to mono-red, and you start losing. No problem—just a few tweaks, some Abrupt Decays, Thragtusks, and now you beat mono-red. Since you beat mono-red and Jund, it’s clear that your deck is very good.

The problem is that you’ve never actually had a deck that beat both mono-red and Jund. Those two configurations are very far apart, and you’ve just removed the cards that were potentially making you win against Jund! Maybe you still smash it, but perhaps you now actually lose to it, and the only way to know is to play against Jund again with the new configuration. It’s very boring, I know—I hate Jund too—but you must do it. You would never assume your Bant Hexproof deck beats Jund because your mono-red deck does, so you can’t assume the same of your two completely different builds.

Trap #3 – You Don’t Consider the Sideboard

I’m extremely guilty of not playing many sideboarded games. A lot of the time, I go to a PT with a card in my sideboard that I’ve never even seen in play before. I wish I could change that, but we don’t have infinite time, and I usually take very long to decide on a deck, so you have to compromise. It is very important, however, not to completely ignore sideboarding, even if you haven’t played games with it. A deck is 75 cards, not 60, and even if you do not have your full 75 you should still be thinking about them. To give you an idea, for PT San Diego we played a Naya deck that had four [card]Cunning Sparkmage[/card] and a [card]Basilisk Collar[/card] in the sideboard. We all played it—over ten of us—and no one had ever played a game with those cards before. We just knew they would be very good, because we played games with the deck, and we could extrapolate how they played out after sideboard.

When you play your game 1s, try to figure out what you would want. Would this matchup be fixable after board? What would you need? What cards are bad? What kind of card would you even have in your sideboard? We often fall into the trap of playing a deck that has a horrible sideboard (most mono-color decks and most aggro decks, and especially mono-colored aggro decks—White Weenie I’m looking at you), because we never even stopped to think about what will happen to us when everyone else adds all these great cards and we add nothing.

The opposite also happens. Sometimes we give up because we can’t beat a certain deck, but that deck is actually quite vulnerable to a specific sideboard plan. If you lose and you can’t imagine how you would win post-board, go ahead and drop it. But if you can imagine, well, don’t give up on your deck. If you aren’t sure what’s going to happen, then you test that matchup after board—that’s better than just randomly jamming sideboarded games.

Trap #4 – You Want to Beat Everything

It’s very rare that a deck is good against everything. There are formats in which you absolutely must beat a deck for your deck to be competitive, but more often than not, we overestimate the presence of a particular deck—it is very rare that a deck exceeds 25% of a tournament.

I’m a big proponent of testing decks against what I like to think of as “the enemy” first (the deck that’s most popular and everyone knows it’s most popular), and if my deck gets demolished, it will be very hard to convince me the deck is good, but I understand that sometimes it’s OK to have a bad matchup or two. If a deck is 10% of the metagame, you’ll get paired against it once in your average tournament, so it’s not the end of the world to not be able to beat it. Sometimes it’s even possible to beat a deck after board, but you have to dedicate so much to it that it’s better to have a very low percentage against that deck and improve against everyone else.
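The “once in your average tournament” estimate can be checked with a bit of arithmetic. The sketch below assumes a ten-round tournament and assumes every round pairs you against a random deck from the field, independently of your record; both are simplifications of mine (real pairings depend on standings), and the function names are made up for this example.

```python
from math import comb


def expected_pairings(share, rounds):
    """Expected number of rounds played against a deck with the given metagame share."""
    return share * rounds


def chance_of_facing(share, rounds, at_least=1):
    """Probability of facing the deck in at least `at_least` rounds.

    Assumes each round is an independent draw from the field (binomial model).
    """
    fewer = sum(
        comb(rounds, k) * share**k * (1 - share) ** (rounds - k)
        for k in range(at_least)
    )
    return 1 - fewer


# ASSUMPTION: a 10-round tournament and a deck that is 10% of the field.
once_or_more = chance_of_facing(0.10, 10)
twice_or_more = chance_of_facing(0.10, 10, at_least=2)
average = expected_pairings(0.10, 10)
```

Under those assumptions you expect to face the deck once on average, you dodge it entirely in roughly a third of your tournaments, and you face it two or more times in roughly a quarter of them.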

Trap #5 – Inbreeding

When we playtest with the same people over and over, we get the feeling that their choices are more representative of the choices of the other players in the tournament than they actually are. This could be because they play with a different style (more conservatively or more aggressively), but most likely it shows up in deck selection and specific card choices.

We often have this problem on ChannelFireball, especially now that PTs are all new formats. The way it generally goes is, we all like a deck, then Kibler builds his midrange deck to beat that deck, then we all change it to beat Kibler’s deck, and when the tournament comes we are 50-50 versus each other and lose to everyone else.

In truth, sometimes people do follow the same thought process that your team does. At the last PT, for example, we all thought Esper was going to be a very popular deck because it seemed very good to us, and it did end up being very popular. We tweaked all our decks against Esper, and that worked. We also thought BG was going to be very popular, but that ended up not being the case. Why was it not the case if BG decks were also very good? I don’t know—maybe other teams didn’t think it was good, or maybe they were playing against a different metagame than us, against which the deck actually wasn’t good.

Sometimes, though, you’re radically off. PT Avacyn Restored was the best example of this—we had a deck that we figured would beat other aggro decks and the slow control decks. We thought those were the important decks to beat because those were the decks we thought were good, so that’s what everyone else would play. In the end, the entire tournament played a midrange deck that beat ours very easily. I don’t even think that deck was good, but it doesn’t matter what is actually good, only what is perceived as good by the other players, because that’s what they’ll play (unless the deck is so bad that it can never win, in which case just dodge it and win round 1). We assumed everyone would think the same things we did, and we acted accordingly, but when we were wrong on that basic premise it all fell apart.

When you’re playtesting, you have to make sure your builds are at least somewhat representative. You need to choose which deck it is that you’re testing. If the Jund deck wants to try this new tech of four [card]Barter in Blood[/card]s and the Hexproof deck wants to try the new tech of three Sigardas, well, you probably shouldn’t play those decks against each other; this goes back to isolating the variable. The correct way to do it is to play the matchup twice, once with each tweaked deck against a normal version of the opposing deck. Playing a brew versus a brew is useless, as is playing a tweaked version of a deck against a tweaked version of another.

Trap #6 – You Attribute Too Much Statistical Significance to a Small Sample

Let’s face it: we can’t brute force our way into a perfect deck. It’s impossible to test as much as we would need for that. You test not to find win percentages, but to understand what is happening in those matches. If you’ve played two sets and gone 6-4 and 7-3, congratulations, that means nothing. If you’ve played two sets, gone 6-4 and 7-3, and then tell me, “I think this matchup is favorable because of this, this, and this,” well, now we’ve gotten somewhere.

There have been many occasions in which I’ve played a matchup, gone 4-6 and 3-7 and concluded the matchup was favorable, because the games did not seem representative. When this is the case, I usually try to get other people to play the matchup as well. If they disagree with me, or if they agree but also keep losing, we try to figure out why.

The important thing here is that your 10-game set means a lot less than you think it does. As far as statistics go, it is not very relevant for figuring out which deck is better, or at least not as relevant as simply trying to understand what is going on.
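To see how little a couple of sets can tell you, here is a rough sketch that asks how often a perfectly even matchup would produce a record at least as good as 6-4 plus 7-3. It assumes every game is independent and has the same win probability, which real playtesting does not quite satisfy, and the function name is made up for this example.

```python
from math import comb


def prob_at_least(wins, games, true_win_rate=0.5):
    """Probability of winning at least `wins` of `games` at the given true win rate."""
    return sum(
        comb(games, k) * true_win_rate**k * (1 - true_win_rate) ** (games - k)
        for k in range(wins, games + 1)
    )


# The two sets from above, 6-4 and 7-3, pooled into 13 wins in 20 games.
p_even = prob_at_least(13, 20)
```

Under those assumptions, `p_even` comes out to roughly 13%: a matchup that is exactly 50-50 would hand you 13 or more wins in 20 games about one time in eight, which is why the record alone tells you so little.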

Trap #7 – You Get Too Attached to Your Deck

Aaand we finally get to what I think is the biggest challenge to overcome—emotional attachment to your own ideas. While I am most definitely guilty of every other single thing in this article, I think I’m actually very good at separating personal from professional in this regard. You can say whatever you want about my decks and card choices, and I won’t ever for a moment feel personally attacked. There are often critics in the forums of my draft videos, and I like to think that I am able to step back a little and analyze the plays with a clear, unbiased view—it doesn’t matter if I was the one making the play or not. If I say I think it was a good play, that’s because I honestly think it was the right play, not because I’m being defensive of my own choices.

You need to be able to take criticism from your teammates. Finding a deck for a tournament is not a competition, it is a cooperation—they are not going to shut you down because it was your idea and not theirs (or at least that’s the plan anyway—if they do that, find new teammates), so you shouldn’t defend it just because it is yours. If they criticize it, it’s because they think it’s the best way to get a better deck, not because they hate or disrespect you. In fact, if they didn’t respect you, they wouldn’t even bother to try and make you change your mind, they’d just let you do as you please. Taking criticism on your decks, of course, doesn’t mean believing it to be true—it just means not taking it personally. If what they’re saying doesn’t make sense to you, by all means keep testing—maybe you were right all along—but don’t hate them for saying it.

When you are playtesting and trying to find a deck for a tournament, your main priority should be to do well. Well, OK, it shouldn’t necessarily, but if it is your main priority, then make sure you’re treating it that way. If you tell me you really want to do well and then you show up with mono-blue Slivers, well, you’re either lying to me or to yourself. If your priority is to become famous, to be known for a weird deck, to write articles, to have fun, then go ahead and do it, but make sure you understand your priorities. Believe me, I know how that feels; I played [card]Battle of Wits[/card] in a GP after all, and it certainly wasn’t because I thought it was the best choice. You can’t fool me. Just don’t fool yourself! At some point, you should just admit that you like the deck because you built it, and not because you actually think it’s good.

So, how do I know if my deck idea (or my teammate’s deck idea) is bad or not? When should I give up? Well, the main way to tell if you’re wasting time with a deck or not is to identify a feature in it more substantial than “it’s different.” People often say that having a different deck can lead to some wins due to unprepared opponents, and though that is true, it’s way overstated and usually not worth playing an inferior deck for. If your deck is faster than the metagame, or attacks it at an angle it’s not prepared to fight, or can nullify all their kill conditions, in short if it does something extraordinary, then you should maybe invest in it. If it’s just different, well, don’t waste any more time.

Mike Flores is a proponent of the “you need to make sure your deck is not just the same as another deck but worse” principle (paraphrasing here, but you get the idea), and I think that he is absolutely right. For example, what is wrong with this Modern deck?

[deck]4 Arid Mesa
4 Flagstones of Trokair
1 Horizon Canopy
4 Marsh Flats
10 Plains
4 Ethersworn Canonist
4 Figure of Destiny
4 Knight of the White Orchid
2 Ranger of Eos
4 Steppe Lynx
4 Student of Warfare
4 Brave the Elements
4 Honor of the Pure
2 Mana Tithe
1 Path to Exile
4 Spectral Procession[/deck]

If you answered “mono-white, duh,” congratulations, you get bonus points. The truth, however, as much as it pains me to say this, is that there is nothing inherently wrong with this deck. It has a clear game plan, a curve, good mana, etc. In fact, this deck won PT Amsterdam in 2010. That PT was not Modern, though—it was Extended. The real problem with this Modern deck is that it’s not this deck:

[deck]4 Arid Mesa
1 Blood Crypt
1 Breeding Pool
1 Forest
1 Hallowed Fountain
1 Marsh Flats
4 Misty Rainforest
1 Plains
1 Sacred Foundry
4 Scalding Tarn
1 Steam Vents
1 Stomping Ground
1 Temple Garden
2 Kird Ape
3 Knight of the Reliquary
3 Snapcaster Mage
4 Steppe Lynx
4 Tarmogoyf
4 Wild Nacatl
4 Lightning Bolt
4 Lightning Helix
4 Path to Exile
2 Spell Pierce
4 Tribal Flames[/deck]

This is our list from Worlds 2011—though you could have picked any Zoo list as an example. The problem with the White Weenie list is not in the list itself, but it lies in the opportunity cost of playing mono-white. In this case, it’s not playing the Zoo list and the powerful cards in it—[card]Wild Nacatl[/card], [card]Tarmogoyf[/card], etc. If you wanted to play that WW list in 2011 Modern, you could have—all the cards were legal and the list is coherent—but you would simply be playing a much worse version of Zoo. In fact, the reason [card]Wild Nacatl[/card] was banned was not because it was super powerful but because it invalidated all the other aggro strategies—you were never able to play Boros, or RG, or this mono-white deck, or mono-red because the exact same deck with Wild Nacatl would turn out better. Of course it kind of backfired, and instead of making those Boros, RG, mono-w and mono-r decks viable, it simply eradicated non-Affinity aggro from the format, but that’s a different argument.

Very often someone ships you a list for evaluation and it’s just a bad version of a deck that already exists. For example, a UR deck as opposed to UWR, or 5-color Jund, or Jund-Reanimator as opposed to Junk-Reanimator. In those cases, it’s clear that the motivation is not to be good, but to be different. If the goal were to win, you would play the established deck that does what you want it to do.

When you come up with a new deck, again, try to compare it to existing decks. What qualities does it have that the existing decks don’t? How is it going to win games that the existing decks won’t? If the only answer you can come up with is “it’s different, so people won’t know what to expect,” then chances are you do not have a keeper.

Well, that’s what I have for today. I hope you’ve enjoyed it, and see you next week!

