I’m presently sorting through hundreds of cards for the SatW game, in the hopes of cutting the game down to a more reasonable size. I’ve played a lot of games with this prototype and have plenty of notes about each card. The most useful “at a glance” measure is that I’ve been ending playtests by asking “Which cards did you like? Which ones made the game worse?” and putting a plus or minus next to each card that gets mentioned (depending on whether it’s a positive or negative mention). This is less useful than you’d think.
I’d love to just add up the totals, throw all of the “positive overall” cards into the game and call it a day. That’d be super easy and I could feel like I did a great job running all of those playtests and using the data. It’d also generate a horrible game, for two reasons.
The first is the existence of controversial cards. There’s a difference between these two:
Australia (Daredevil) Gamble: Lose your hand and draw two cards. +
North Korea (Flashpoint) World War Three: Remove any character, that character’s owner may then use this ability. This can happen several times. +++--
They’re both a plus one overall, but where Australia (Daredevil) is a solid card that’ll almost certainly make it into the set, North Korea (Flashpoint) is more controversial. Three people loved it and two hated it. That makes the decision of whether to include it one which will shape the game.
It tends to be the case that controversial cards (ones that get a large number of marks, but that are mixed between positive and negative) have a substantial impact upon the direction of the game. That impact is loved by some players and hated by others.
Rather than a naive approach of “Include the cards with the most likes”, they require an overarching design philosophy – about what is important to this particular game – in order to determine whether they should be included or not.
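The difference between a net score and a contested one is easy to see in a small tally. This is only a sketch: the `Tally` class, the `tally` helper and the rule that any card with both likes and dislikes counts as controversial are assumptions made for illustration, not part of the playtest notes.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    name: str
    plus: int
    minus: int

    @property
    def net(self) -> int:
        # The "add up the totals" number.
        return self.plus - self.minus

    @property
    def mentions(self) -> int:
        # How many playtesters brought the card up at all.
        return self.plus + self.minus

    @property
    def controversial(self) -> bool:
        # Assumption: any card with both likes and dislikes is contested.
        return self.plus > 0 and self.minus > 0


def tally(name: str, marks: str) -> Tally:
    """Build a tally from a string of marks such as "+++--"."""
    return Tally(name, marks.count("+"), marks.count("-"))


cards = [
    tally("Australia (Daredevil)", "+"),
    tally("North Korea (Flashpoint)", "+++--"),
]

for card in cards:
    flag = "controversial" if card.controversial else "uncontested"
    print(f"{card.name}: net {card.net:+d} from {card.mentions} mentions ({flag})")
```

Both cards come out with the same net score, but only one is flagged, which is the information a simple total throws away.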
The second issue is the trickier of the two: The quality of cards varies with the context in which they’re played. Consider this character:
Sealand (Naive Friend) Hi!: Choose an opponent’s character, use their ability, then give them Sealand +
It’s another solid plus one, but unlike Australia (Daredevil), the experience players have of it depends upon the other cards in the game. Sealand is going to be an interesting card when other players have characters with abilities that are good enough to justify giving away a character in order to get them.
A game is built on interesting choices: the closer that trade-off is, and the more viable abilities there are to trade for, the more interesting the decision of how much to bid for Sealand (Naive Friend) becomes. So while it’s presently a positive experience, simply taking all of the positively marked cards might change the context in which Sealand (Naive Friend) is played and turn it into a boring card.
Understanding any one card in this manner is relatively straightforward. I could go through the set and rate each card 1-5 based on how interestingly it combos with Sealand (Naive Friend) and then, having made my decisions about the other cards, check those ratings and decide whether to include Sealand.
Unfortunately for me, I keep trying to make everything interesting, and I find that the key to replayability is making different combinations of effects interesting in a way that’s more than the sum of their parts. (It would be phenomenally expensive to create a game with enough cards that players didn’t see repeats in consecutive plays, though there are some cool tech approaches to this.) This means that I’d want to assess half of the cards in the game that way and couldn’t make a decision about any one until I’d made decisions about all of the others.
As an ex-academic I’m aware of statistical approaches that would let me take a network of “combo values” and collapse them down into a theoretical “best” combination of cards. I’d be curious to see what sort of game this approach would create, but I’m not convinced that a 1-5 rating can truly encapsulate the consequences of the decision. There’s something to be said for human judgement.
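To make the idea of collapsing a network of combo values concrete, here is a toy sketch. It uses plain brute force over every possible set rather than any particular statistical method, and that swap is an assumption made to keep the example short. The ratings are invented for illustration and are not playtest data, and “Card D” is a hypothetical placeholder rather than a real card.

```python
from itertools import combinations

# Hypothetical 1-5 combo ratings between pairs of cards. These numbers are
# made up for illustration; "Card D" is a placeholder, not a real card.
combo = {
    frozenset({"Sealand", "Australia"}): 2,
    frozenset({"Sealand", "North Korea"}): 5,
    frozenset({"Australia", "North Korea"}): 3,
    frozenset({"Sealand", "Card D"}): 4,
    frozenset({"Australia", "Card D"}): 1,
    frozenset({"North Korea", "Card D"}): 2,
}
cards = sorted({name for pair in combo for name in pair})


def set_score(chosen):
    """Sum the combo rating of every pair within the chosen cards."""
    return sum(combo[frozenset(pair)] for pair in combinations(chosen, 2))


def best_set(size):
    """Brute force: score every set of the given size and keep the best."""
    return max(combinations(cards, size), key=set_score)


winner = best_set(3)
print(winner, set_score(winner))
```

Trying every set stops being feasible long before a few hundred cards, and the result is only as good as the 1-5 ratings fed into it, which is the reservation raised above.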
There are probably a lot of approaches to untangling quality in context decisions. Here’s mine:
I look for “spikes”, moments in games that players haven’t just liked, but loved. There are a *lot* of games in the world and making something that has a place in someone’s collection isn’t about being vaguely likable – it’s about setting someone’s soul on fire. So I start by looking at moments in which players have had that kind of experience.
Then I pull the elements necessary for that moment, which might be a particular card, or a combination of cards or even simply there being enough cards representing a particular archetype. Those get locked into the game and act as its heart.
The game then needs a core to support the heart. Chances are, whatever those moments were, they were only possible with the support of particular cards. Sometimes “cards” is too specific; it can come down to “This works if it’s possible to steal a character a few times a game.” That doesn’t suggest a particular card, but does suggest a quantity of cards that need some kind of theft effect.
Then it’s a case of seeing what else the core can do, looking for things that will play off the core effects to generate other great moments. The more decisions that are made, the easier it is to make future decisions, since the chicken and egg problem is resolved.
At that point there’s just one thing left to do: Burn it all down.
There’s a danger that this approach generates a local maximum – that is to say something that’s good and can’t be improved with small changes, but that’s not as good as radically different solutions. The best defence against this is to look back at the early decisions and see how their flow has affected things, particularly looking for thoughts like “This card creates a fantastic experience, but the core needed to support it doesn’t support anything else, which makes the core cards feel really bland most of the time and locks off other experiences.”
It can be hard to think those thoughts, especially if you’re in love with the great interaction at the heart of it. So the best thing to do is take a few of the things that you like the most, have a look at a version in which those things don’t exist and honestly compare it to the first version.
It’s more work than adding the pluses and minuses like it’s a big game of fudge, but I think it makes for a better game in the long run.
Interesting dilemma you sketch! I wish I had a solution to this… It has made it clear that this is something to look out for in my own game.
Too bad it’s not possible to test all combinations of cards. I guess here game designer intuition comes into play.
Good luck!