The games that we’ve looked at so far are the simplest case of

basic games. In these games, we’ve got a payoff matrix, and both players

can see the whole matrix – the players have equal information,

and nothing is secret. The players move simultaneously – so neither player can wait to see what his opponent does before making his own decision. Finally, the game is played exactly once – there are no repetitions.

The first complication we can add is to make it an *iterative game* – that is, instead of each player going once, they’ll repeat the game 10 times in sequence. Everything else stays the same: perfect, equal information, simultaneous moves, etc.

This creates an interesting added dimension. Now, we’ve got two layers

of strategy: each iteration, each player selects a strategy; and then there’s an over-arching strategy that they use to select their strategy each iteration. That over-arching strategy is called a *grand strategy*

In an iterated game, the optimal solution for players can be different than in the single iteration. The goal is to maximize your total utility at the end of a series of iterations. It’s OK if some iterations result in a loss of utility, so long as by the end of the series of iterations, the final

sum of the utility of the series is maximal.

For example, look at the following game. (This example is copied

from the textbook The Compleat Strategyst (Complete Strategist): Being A Primer On The Theory Of Games Of Strategy.)

A | B | |
---|---|---|

1 | 3 | 6 |

2 | 5 | 4 |

Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B. For simplicity, this is set up so that player 2 always pays player 1, so the utility values in the game grid are written as payoffs to player 1. *(Originally, I didn’t include an explanation of the game grid payoffs. This confused a lot of readers. Sorry!)*

There’s no saddle point here. The maximin for player 1 is 4 (at 2,B); the

minimax for player 2 is 3 (at 1,A). So the simple solution formula from before

doesn’t work. If we’re only playing the game once then there’s no way for

either player to create an idea outcome for themselves. Look at it from the

perspective of player 1. If player 2 picks strategy A, then the best thing for

player 2 to do is to pick strategy 2. But if player 2 picks B, then player 1

is better off picking A. Since the players move simultaneously, player

1 cannot pick the best strategy.

If the game is played once, then there’s no way to play optimally. It’s

basically random. But if you *repeat* the game multiple times – that

is, you play it as an iterated sequence of games – then you can find an

optimal strategy for selecting strategies. That’s called a *grand
strategy*: in a game where you make multiple moves, each move is a

strategy, and the process of selecting strategies for each move is a

*grand strategy*.

There is a successful grand strategy in this kind of iterated

zero-sum game. What’s interesting about it is that our idea of

strategy is not really what works as a successful grand strategy: the successful grand strategy is *random*.

After all: if player 1 can figure out player 2’s grand strategy,

then he can easily pick an optimal strategy. So the best grand strategy

is one that the other player *can’t* figure out. If you’re playing randomly, then there’s no way that the other player can predict

what strategy you choose.

So, let’s look at an example of a series of iterations, with player 1 selecting randomly, with a uniform probability distribution, and player

2 playing minimax:

- Iteration one: Player one selects 1, player two selects A. Player one takes 3.
- Iteration two: Player one selects 2, player two selects A. Player one

wins 5.

It will continue like that. Player one will win 3 half the time, and 5

the other half – for an average of 4 – which is his maximin value. So

playing this way is at least as good for player 1 as maximin.

This is a pretty trivial game: for player 2, he’s guaranteed to

lose, and the only question is how quickly he’s going to lose. And

there’s not any strategy that’s going to really improve things much for player 2. There’s not much that 2 can do: playing strategy B is pretty much a lose for

him. So why would he ever do it?

If player 1 *knows* that player 2 is always going to play A, then he

could play strategy 2, and always win 5. So player two is motivated to

play B just often enough to make player one try 1 once in a while. The

key is, how often should he do that? What’s the best distribution for him,

to minimize his losses?

As you can see from the example, the key to finding the optimal strategy is

probability. You need to assign a probability distribution to the different

strategies. Each iteration, you pick a strategy randomly, according to the

distribution. If you can find the right probability distribution for your grand strategy, then you can optimize your winnings.

But is there always an optimal distribution? And how can you find it?

There *is* always an optimum. John vonNeumann (who managed to have his

fingers in an astonishing number of pies) proved that for every two player,

simultaneous move zero-sum iterated game, there is an optimal grand strategy based

on a probability distribution. And it’s computable. It took a while after

JvN to figure out the fast way of doing it – but it’s doable, and it’s doable

quickly, through a technique called linear programming.

Linear programming is a fascinating topic in itself. And it’s complicated

enough that it deserves a post of its own. (Or, more likely, several.)

pgVery interesting. I didn’t know the bit about an optimum always being computable. Hopefully we’ll get a chance to hear more about that!

I was a little confused at the start about the table used, since it has 1 and 2 as names for both strategies and players. Maybe use X and Y for the strategies, to mimic A and B?

I think I forgot to read “Player 1 needs to choose either 1 or 2; player 2 needs to choose A or B.” That said, there are likely other readers who will do the same and decide to stop reading when they get stalled.

JanneMaybe the payoff matrix is rendered wrong for me or something as the whole thing is incredibly compressed and hard to read (using Firefox 3), but I see only a single payoff value for each entry, not a pair, so I really don’t understand the explanation here (the previous game theory post suffers from the same thing). I guess the values I see are for player 1 but I don’t really know.

alan@Janne: This is a zero sum game, so the number given is the payoff Player 1 recieves, and thus the sum Player 2 loses.

@MCC: I would state this explicitly, as I got confused at the same point, and suspect many others will.

JanneOK, thanks; there was no info on it being a zero-sum game, and as the matrix is rendered so compressed I wasn’t sure what I was looking at.

StephenIf you have no idea about linear programming, couldn’t you just code the game, and have the computer try various probabilities and see which work? Computers are pretty fast. It might be quicker than actually knowing what’s going on.