# 10. Mixed strategies in baseball, dating and paying your taxes

Professor Ben Polak: All right, so last time we did something I

think substantially harder than anything we’ve done in the class

so far. We looked at mixed strategies,

and in particular, we looked at mixed-strategy

equilibria. There was a big idea last time.

The big idea was if a player is playing a mixed strategy in

equilibrium, then every pure strategy in the mix–that’s to

say every pure strategy on which they place some positive

weight–must also be a best response to what the other side

is doing. Then we used that trick.

We used it in this game here, to help us find Nash Equilibria

and the way it allowed us to find the Nash Equilibria is we

knew that if, in this case,

Venus Williams is mixing between left and right,

it must be the case that her payoff to left is equal to that of right

and we use that to find Serena’s mix.

Conversely, since we knew that Serena is mixing again between l

and r, we knew she must be indifferent between l and r and

we used that to find Venus’ mix. So I want to go back to this

example just for a few moments just to make one more point and

then we’ll move on, but we’ll still be talking

about mixed strategies throughout today.

So this was the mix that we found before we changed the

payoffs, we found that Venus’ equilibrium mix was .7,

.3 and Serena’s equilibrium mix was .6, .4.

And a reasonable question at this point would be,

how do we know that’s really an equilibrium?

We kind of found it but we didn’t kind of go back and

check. So what I want to do now is

actually do that, do that missing step.

We rushed it a bit last time because we wanted to get through

all the material. Let’s actually check that in

fact P* is a best response to Q*.

So what I want to do is I want to check that Venus’ mix P* is a

best response for Venus against Serena’s mix Q*.

The way I’m going to do that is I’m going to look at payoffs

that Venus gets now she knows – or rather now we know

she’s playing against Q*. So let’s look at Venus’

payoffs. I’m going to figure out her

payoffs for L, her payoffs for R,

and also her payoff for what she’s actually doing P*.

So Venus’ payoffs, if she chooses L against Q*

then she gets–very similar to what we had on the board last

week, but now I’m going to put in

what Q* is explicitly–she gets 50 times .6.

[This is Q* and this is 1-Q*.] So she gets 50 times .6 and 80

times 1 minus .6 which is .4, 80 times .4.

We can work this out, and I worked it out at home,

but if somebody has a calculator they can please check

me. I think this comes to .62.

Somebody should just check that. If Venus chose R–remember R

here means shooting to Serena’s right, to Serena’s forehand–if

she chose R then her payoffs are 90 Q*.

So 90(.6) plus 20(1-Q*) so 20(.4), so 90(.6) plus 20(.4),

and again I worked that out at home, and fortunately that also

comes out at .62. So what’s Venus’ payoff for P*?

We’ve got her payoff for both her pure strategies,

so her payoff from actually choosing P* is what?

Well, P* is .7, so .7 of the time she will

actually be playing L and when she plays L,

she’ll get a payoff of .62, and .3 of the time she’ll be

playing R, and once again, she’ll be getting a payoff of

.62 and–do I have a calculator? Sorry, thank you.

So P* is .7, yes, you’re absolutely right,

so this is P* and 1-P*, So let’s make that clearer.

I’ll show you what the equilibrium is but P* itself is

.7. So when Venus plays L with

probability of .7, then .7 of the time she’ll get

the expected payoff of .62 and .3 of the time she’ll get a

payoff again of .62 and that’s the kind of math I don’t have to

do at home, that’s going to come out at .62.

Again, assuming my math is correct.
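
The check above can be reproduced in a few lines. This is a minimal sketch, not part of the lecture: it assumes the payoffs 50, 80, 90 and 20 that appear in the sums are percentages, so the result 62 corresponds to the .62 quoted in the transcript. The names `VENUS_PAYOFF` and `pure_payoff` are illustrative.

```python
from fractions import Fraction

# Venus's payoffs (percent chance of winning the point), read off the sums in the lecture.
# Each entry is (payoff if Serena leans l, payoff if Serena leans r).
VENUS_PAYOFF = {"L": (50, 80), "R": (90, 20)}

Q_STAR = Fraction(6, 10)  # Serena's equilibrium weight on l
P_STAR = Fraction(7, 10)  # Venus's equilibrium weight on L


def pure_payoff(shot, q):
    """Venus's expected payoff from a pure strategy when Serena mixes (q, 1 - q)."""
    against_l, against_r = VENUS_PAYOFF[shot]
    return against_l * q + against_r * (1 - q)


payoff_L = pure_payoff("L", Q_STAR)
payoff_R = pure_payoff("R", Q_STAR)
# The mix P* is a weighted average of the two pure-strategy payoffs.
payoff_mix = P_STAR * payoff_L + (1 - P_STAR) * payoff_R

print(payoff_L, payoff_R, payoff_mix)
```

All three values coincide at 62, that is, a 62 percent chance of winning the point. Exact fractions are used so that the equality is not blurred by floating-point rounding.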

So all I’ve really done here is confirm what we did already last

time. We knew–we in fact chose

Serena’s mix Q to make Venus indifferent between L and R.

And that’s exactly what we found here, going left it’s .62,

going right it gets .62 and hence P* gets .62.

But I claim we can now see something a little bit else.

We can now ask the question, is P* in fact the best

response? Well, for it not to be a best

response, for this not to be an equilibrium, there would have to

be some deviation that Venus could make that would make her

strictly better off. Let me repeat that.

If this were not an equilibrium, there would have to

be some deviation for Venus, that would make her strictly

better off. By playing P* she’s getting a

return of .62. So one thing she could deviate

to, is playing L all the time. If she deviates to playing L

all the time, her payoff is still .62 so

she’s not strictly better off. That’s not a strictly

profitable deviation. Another thing she could deviate

to, is she could deviate to playing R.

If she deviates to playing R, her payoff will be .62.

Once again, she’s not strictly better off: she’s the same as

she was before, so that’s not a strictly

profitable deviation. So what have I shown so far?

I’ve shown that P* is as good as playing L,

and P* is as good as playing R. In fact that’s how we

constructed it. So deviating to L is not a

strictly profitable deviation and deviating to R is not a

strictly profitable deviation. But at this point,

somebody might ask and say, okay, you’ve shown me that

there’s no way to deviate to a pure strategy in a strictly

profitable way, but how about deviating to

another mixed strategy? So, so far we’ve shown–we’ve

shown just up here–we can see that Venus has no strictly

profitable pure-strategy deviation.

She has no strictly profitable pure-strategy deviation because

each of her pure strategies yields the same payoff as did

her mixed strategy, yields the same as P*.

But how do we know that she doesn’t have a mixed strategy

that would be strictly better? How do we know that?

Anybody? No hands going up;

oh, there was a hand up, good. Student: Any mix between

left and right will still yield .62.

Professor Ben Polak: Good, so any mix that Venus

deviates to, will be a mix between L and R,

and any mix between L and R will be a mix between .62 and

.62 and hence will yield .62. So we’re going to use again,

this fact we developed last week.

The fact we developed last week was that any mixed strategy

yields a payoff that is a weighted average of the pure

strategy payoffs, the payoffs to the pure

strategies in the mix. Any mixed strategy yields a

payoff that is a weighted average of the payoff to the

pure strategies in the mix. That was our key fact last week.

So here if we’ve shown that there’s no pure-strategy

deviation that’s strictly profitable,

then there can’t be any mixed strategy deviation that’s

strictly profitable. Why?

Because the mixed strategy deviations must yield payoffs

that lie among the pure strategy deviations.

So this is a great fact for us. What’s the lesson here?

The lesson is we only ever have to check for strictly profitable

pure-strategy deviations. That’s a good job.

Why? Because if we had to check for

mixed strategy deviations one by one, we’d be here all night,

because there’s an infinite number of possible

mixed-strategy deviations. But there aren’t so many pure

strategy deviations we have to check.
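
The weighted-average fact can be illustrated numerically. This is a rough sketch under stated assumptions: the grid of 101 mixes is only a sample of the infinitely many possible mixes, and the unequal payoffs 70 and 40 in the second case are made-up numbers used purely for contrast.

```python
from fractions import Fraction


def mixed_payoff(p, payoff_L, payoff_R):
    """Expected payoff from playing L with probability p and R with probability 1 - p."""
    return p * payoff_L + (1 - p) * payoff_R


# A grid of candidate mixes p = 0, 0.01, ..., 1 (the endpoints are the pure strategies).
grid = [Fraction(k, 100) for k in range(101)]

# Case from the lecture: against Q*, both pure strategies pay 62.
flat = {mixed_payoff(p, 62, 62) for p in grid}

# Hypothetical case with unequal pure payoffs (70 and 40 are made-up numbers).
uneven = [mixed_payoff(p, 70, 40) for p in grid]

print(flat)                      # a single value: every mix pays the same
print(max(uneven), min(uneven))  # attained at the pure strategies p = 1 and p = 0
```

In both cases no mix ever beats the best pure strategy, which is why checking the pure-strategy deviations is enough.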

Let’s just repeat the idea. Suppose there isn’t any

pure-strategy deviation that’s profitable, then there can’t be

any mixed strategy deviation that’s profitable,

because the highest expected return you could ever get from a

mixed strategy, is one of the pure strategies

in the mix, and you’ve already checked that none of those are

profitable. So this simple idea,

the simple idea we developed last time, not only helps us to

find Nash Equilibria, but also to check for Nash

Equilibria. Now a lot of people I gathered

from feedback from sections were left pretty confused last time.

It’s a hard idea. Actually I looked at the tape

over the weekend, I could see where it could be

confusing. But it’s actually,

I think what’s really confusing here–it wasn’t so much–I think

it wasn’t so much that I could have been clearer though I’m

sure I could have been. It’s that this is really a hard

idea, this idea of mixed strategies.

So we’re going to work on it again today, but I think one of

the ideas that gets people confused, is the following idea.

They say, look we found Venus’ equilibrium mix by choosing a P

and a 1-P to make Serena indifferent.

We found Serena’s equilibrium mix by finding a Q and a 1-Q to

make Venus indifferent and a natural question you hear people

ask then is, why is Venus “trying to make

Serena indifferent?” Why is Serena “trying to make

Venus indifferent?” That’s not really the point

here. It isn’t that Venus is trying

to make Serena indifferent. It’s that in equilibrium,

she is going to make Serena indifferent.

It isn’t her goal in life to make Serena indifferent between

l and r, and it isn’t Serena’s goal in life to make Venus

indifferent between L and R, but in equilibrium it ends up

that they make each other indifferent.
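
One way to see this numerically is to compute Serena's best response to different mixes by Venus. This is a sketch under an assumption not stated in this section: Serena's payoffs are taken to be 100 minus Venus's (one of them wins each point), which gives 50, 20, 10 and 80. The function names are illustrative.

```python
from fractions import Fraction

# ASSUMPTION: Serena's payoff is 100 minus Venus's payoff in each cell.
# Keys are (Venus's shot, Serena's lean).
SERENA_PAYOFF = {("L", "l"): 50, ("L", "r"): 20, ("R", "l"): 10, ("R", "r"): 80}


def serena_payoffs(p):
    """Serena's expected payoffs to l and to r when Venus plays L with probability p."""
    to_l = p * SERENA_PAYOFF[("L", "l")] + (1 - p) * SERENA_PAYOFF[("R", "l")]
    to_r = p * SERENA_PAYOFF[("L", "r")] + (1 - p) * SERENA_PAYOFF[("R", "r")]
    return to_l, to_r


def serena_best_response(p):
    """Return 'l', 'r', or 'indifferent' for a given weight p on L by Venus."""
    to_l, to_r = serena_payoffs(p)
    if to_l > to_r:
        return "l"
    if to_r > to_l:
        return "r"
    return "indifferent"


for p in (Fraction(8, 10), Fraction(7, 10), Fraction(6, 10)):
    print(p, serena_payoffs(p), serena_best_response(p))
```

Under this assumption, Serena leans left whenever Venus puts more than .7 on L, leans right whenever Venus puts less, and is willing to mix only at exactly .7.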

The way that we can see that is that if Venus puts–we said last

time it’s repeated–if Venus puts too much weight,

more than .7 on L, then Serena just cheats to the

left all the time, and that can’t possibly be an

equilibrium. And if Venus puts too much

weight on R, then Serena cheats to the right all the time and

that can’t be an equilibrium. So it has to be that what Venus

is doing is going to make Serena exactly indifferent and vice

versa. Now let’s see that idea in some

other applications. Let’s talk about this a bit

before we move on. So it turns out that some very

natural applications for mixed-strategy equilibria arise

in games, in sport. So let’s talk about a few now.

Can anybody suggest some other places where we see

randomization or at least mixed-strategy equilibria in

sporting events? Let me actually grab the mike

myself. Anybody here play football for

example, and we’re talking American football now,

the gridiron game, not the civilized type.

Anyone play? Yes, so some of you play

football. So where is the mixing involved

in playing football? Where in equilibrium would we

expect to see mixed strategies? There’s somebody down there can

we go on and get them. So shout it out.

Student: Running game and passing game.

Professor Ben Polak: All right, the running game and the

passing game. So a very simple idea whether

to run or whether to pass when you have the ball is likely to

end up as a mixed-strategy equilibrium.

The defense is also randomizing between, for example,

rushing the passer or playing a run defense.

Is that right–this is not exactly a game I know a lot

about, but I’m hoping I’m getting close enough here.

It couldn’t possibly be a pure-strategy equilibrium,

other than very extreme parts of the game,

like at the end of the game perhaps, but for most of the

game, it’s very unlikely to end up as a pure-strategy

equilibrium. Much more likely that the

offense is mixing between passing and running,

and for that matter between going to the left,

going to the right and going to the center, and the defense is

also mixing between–over its types of defense.

So we see that–for those people who were watching

yesterday–we see that in football games.

Where do we see it else in sport, some other sports?

I can’t have a room full of non-sports fans.

How many of you ever watch any sports?

Let’s raise some hands here–some of you do.

So this is baseball playoff season.

How many of you have been watching the baseball playoffs?

Raise your hands if you’ve been watching the baseball playoffs.

I’ll let you off, I know you should have been

doing my homework. How many of you have been

watching the playoffs instead? How many watched the Yankees

game last night? Quite a few of you.

So they haven’t been very exciting yet but we’re hoping

that it’s going to get more exciting.

So when you’re watching baseball what kind of things do

you see where you just know that there must be mixed strategies

involved? There must be randomization

involved. Now I’ve got a few more hands

out. Good, so you sir.

Student: Choosing how to pitch the ball.

Professor Ben Polak: Choosing how to pitch the ball.

Enlarge a little bit more, say a bit more.

Student: Fast ball versus slider,

versus change up, all sorts of different things.

Professor Ben Polak: All right, so there’s different ways

of throwing the ball, and there’s going to be

randomization from the pitcher, or at least it’s going to look

like there’s randomization by the pitcher over whether to

throw a fast ball or a curve ball or whatever.

How is the hitter randomizing there?

How is the hitter randomizing? Is the hitter randomizing at

all? What’s the hitter doing while

this is going on? Anybody?

Yeah. Student: He’s choosing

whether to swing or not to swing.

Professor Ben Polak: Okay, he’s choosing whether to

swing or not to swing, although presumably he can do

that just after the ball’s thrown.

So you sometimes hear the commentator say that that hitter

was looking for a fast ball, is that right?

Or looking for a curve ball. The hitter is trying to

anticipate the pitch, is that right?

This is not a game I played a lot of either–I played a little

bit. You’re trying to anticipate

where the ball is going to be thrown.

So the type of ball you throw in baseball and the way in which

the pitch being anticipated by the hitter, is likely to be a

mixed strategy. What else is likely to be a

mixed strategy in baseball? What else?

Anybody here on the Yale baseball team?

Okay, I’ve got one volunteer here.

So what else, stand up for a second.

Let’s have a Yale baseball team member, what’s your name?

Student: Chris. Professor Ben Polak:

Where do you play? Student: I’m a pitcher.

Professor Ben Polak: You’re a pitcher,

okay. So he’s not going to get on

base now, so he’s not going to answer this.

Suppose you did get on base, pitchers don’t often get on

base. Let’s assume that happens,

what might you randomize? There you are,

you’re standing on base, what might you randomize about?

Student: Whether to steal second or not.

Professor Ben Polak: Right, whether to steal or not,

whether to try and steal or not.

Stay up a second. So the decision whether to try

and steal or not is likely to end up being random.

If you’re the pitcher, what can you do in response to

that? Student: You can either

choose to try to pick them off or not.

Professor Ben Polak: What else?

So one thing you can try and pick him off.

What else? Student: You can be

quicker to the plate. Professor Ben Polak:

Quicker to the plate, what else?

Student: You can pitch out.

Professor Ben Polak: You can pitch out,

what else? At least those three things,

right? Student: Yeah.

Professor Ben Polak: At least those three things okay,

thank you. I have an expert here,

I’m glad I had an expert. So in this case we can see

there’s randomization going on from the runner whether he

attempts to steal the base or not,

and by the pitcher on whether he throws the pitch out or

whether he tries to throw, to get to the plate faster.

So we see this in sport. We don’t see it well

anticipated by sports commentators.

Let me put this down a second. So in baseball,

for example, you’ll sometimes see quite

sophisticated statistical analyses of baseball in which

somebody will have looked at base stealers across the major

leagues and they’ll look at all the instances in which a player

was on first base and in the position where you think they

might steal, and they’ll look at what

happened on every attempt to steal, whether they were in fact

caught stealing or not, and they’ll try and measure the

value of these things and they’ll see, the conclusion

they’ll come to is something like this.

They’ll conclude that whether the guy stole or not,

whether the guy attempted to steal or not,

sorry, or whether he just sat on first base doesn’t seem to

make much difference, they’ll say.

They’ll say that for even great base stealers, when you compare

attempting to steal, taking into account the

pick offs, with just staying put, it turns out the payoff in

terms of the impact on the game is roughly equal,

and then they’ll draw– these analysts will then draw the

following conclusion. They’ll say,

oh look, speed or the ability to steal bases is therefore

overrated in baseball. How have they made a mistake?

What’s the mistake they made there?

So the premise was, let’s give them the premise,

the premise was that when a base stealer is attempting to

steal or not the expected return in terms of outcome of the game

is roughly equal, whether they attempt to steal

or don’t attempt to steal. The conclusion is,

therefore stealing doesn’t seem such a big deal.

What’s the mistake they’ve made? Yeah, let me borrow it again,

sorry. Student: The pitcher has

to react differently in pitching when he knows that there’s a

fast guy on base. Professor Ben Polak:

Good, so our pitcher has to react differently.

Let’s talk to our pitcher again, so one thing our pitcher

said was he wants to get to the plate faster.

What does that mean getting to the plate faster?

–Shout out so people can hear you.

Student: It means just getting the ball to the catcher

as fast as possible so he has the best chance to throw out the

runner. Professor Ben Polak:

Right, so you’re going to pitch from, you’re not going to do

that funny windup thing, you’re not, thank you,

you’re going to pitch from the stretch, I knew there was a term

there somewhere. I’m learning American by being

here. And you’re more likely to throw

a fast ball, there’s some advantage in throwing a fast

ball rather than a curve ball. Both of these actions,

moving more towards fast balls and pitching

from the stretch, are actually costly for the pitcher.

But we’ll get there in a second, let’s just back up a

second, so that was good, that’s right.

But let’s just back up a second. The premise of these

commentators was what? It was that the return to

stealing, attempting to steal, seems to be roughly a wash.

It seems to be that the expected return when this great

base runner attempts to steal a base is roughly the same as the

return when they don’t attempt to steal the base.

But I claim we knew that was going to be the case.

We didn’t have to go and look at the data.

Why did we know that was going to be the case?

How did we know that we were bound to find a return in that

analysis that finds those things roughly equal?

Yeah. Student: If he is

randomizing that means that the returns will be equal.

If they weren’t equal he would just do one or the other all the

time. Professor Ben Polak:

Good, excellent. Since we’re in a mixed strategy

equilibrium, since he’s randomizing, it must be the case

that the returns are equal. That’s the big idea here,

that’s the thing we learned last time.

If the player, and these are professional

baseball players doing this, they’ve been very well trained,

a lot of money has been spent on getting the tactics right.

There’s people sitting there who are paid to get the tactics

right. If it was the case that the

return to base stealing wasn’t roughly equal when you attempt

to steal or didn’t attempt to steal,

then you shouldn’t be randomizing.

Since you are randomizing it must be the case that the

returns are roughly equal. So that’s the first thing to

observe and the second thing to observe is what we just pointed

out. In fact, the value of having a

fast base stealer on the team doesn’t show up in the expected

return on the occasions on which he attempts to steal,

or which he does not attempt to steal.

It shows up where? It shows up in the fact that

the pitching team changes their behavior to make it harder for

this guy to steal by going faster to the plate,

or throwing more fast balls. Where will that show up in the

statistics? If you’re just a statistician

like me, you just look at the data, where will that show up?

I mean suppose I can’t keep track of every single pitch,

I can’t actually observe all these fast balls,

where will I see the effect of all these extra fast balls and

pitching from the stretch, in the data?

Somebody? It’s going to show up in the

batting average of the guy who’s hitting behind the base stealer.

The guy hitting behind the base stealer is going to have a

higher batting average because he’s going to get more pitches

which are fast balls to hit, and more pitches out of the

stretch. So if you ignore that effect,

you’re going to be in trouble. But we know,

if we analyze this properly using Game Theory,

we know we’re in a mixed strategy equilibrium.

We know, in fact, the pitching team must be

reacting to it. We know there must be a cost in

doing that, and the cost turns up in the hitter behind.

So when you’re watching the playoffs in the last– now I’m

giving you permission to watch a bit of TV at night,

after you’ve done my homework assignment, but before anyone

else’s homework assignment–you can have a look at these

baseball games and have a go at being a little bit better than

the commentators who are working on them.

So one application for mixed strategies is in sports,

but not the only application. Let’s just talk about another

application, a slightly more scary application.

So after 9/11 there was a lot of talk in the U.S.

about the placement of baggage checking machines at airports.

Actually there’s still quite a lot of talk, but there was a lot

of talk then about the placement of machines to search the

luggage that goes onboard. The hand luggage was being

searched anyway, but to search luggage going

into the cabins. It was pointed out at the time,

this has changed since, there weren’t actually enough

machines in the U.S., on the day after 9/11,

to search every single bag that went into the hold.

You’d hear discussions of the following type.

You’d hear these experts on Nightline or whatever and they’d

say: look there’s no point trying to do this,

because if we put all our baggage searching machines at

Logan Airport in Boston, for example,

then the terrorists will simply move their attack to O’Hare and

if we put them at O’Hare, then they’ll move their attack

to Logan. If we have enough to do both

Logan and O’Hare then they’ll move their attack to some third

airport. So there was a sense of doom in

the air. It was kind of a depressing

time anyway. There was a sense of doom in

the air saying that if you put your baggage searching machines

somewhere, all you do is cause the

attempted terrorists, terrorists attempting to blow

up the planes, to go elsewhere.

And you hear the same things today about searching

individuals as they go on the plane.

For example, you’ll hear a discussion that

says, if we only search men traveling alone,

let’s say, then you’ll quickly end up with all the people

carrying bombs being couples or women.

Again, there’s this sense of doom, this sense that says it’s

hopeless. Whatever we do we’re just going

to force the terrorists to do something else but we won’t have

gained anything. So once again that’s wrong.

What’s wrong about that one? What should we be doing in that

setting? Let me come down again.

What should they have done–in fact they did do–with those

luggage/baggage searching machines when they were in short

supply after 9/11? What do they do with searching

people as they got on planes? What do they do?

Well here’s what they didn’t do, they didn’t just put them at

certain airports and announce they’re just at these airports.

That would have been a crazy thing to do.

That would have been hopeless–not entirely

hopeless–but not wise. What should they have done?

What did they do? Anybody want to guess?

Yeah. Student: In name they

randomized who they were checking.

Professor Ben Polak: Right, so when they’re checking

passengers, they’re going to randomly check passengers.

When they’re checking, when they think about the

baggage machines, a sensible thing to do is to

put a big metal box at every single airport and say:

we’re not going to tell you which of these boxes actually

have baggage checking machines, which effectively is

randomizing. From the point of view of the

terrorists, they’re not going to know where the baggage checks

are going on. That’s worth doing.

It doesn’t–It isn’t going to perfectly eliminate,

well unfortunately it isn’t probably going to perfectly

eliminate all terrorist attacks, but it does make it harder for

the terrorists. So randomization there–whether

it’s literally randomizing over who is checked,

or whether it’s “as it were” randomizing, by concealing where

in fact you have placed those machines–can be very effective.

The hard thing, both in sports and in these

military examples, is really mimicking

randomization. It’s very hard for us as humans

to do it, and there’s a famous story about a military

commander, actually an English military

commander during an insurgent war in, I think it was Malaysia

after World War II, where again he had to worry

about randomizing which convoys to protect.

And the way in which… He figured out

that randomizing was the right thing to do to try and protect

these convoys as well as he could with small numbers of

troops. And the way in which he

randomized was he literally randomized.

Every morning he put a bit of paper in his hand and he had

somebody, had one of his sergeants, pick which hand the

paper was in. So we do actually see these

random strategies used. The reason we have to literally

randomize is because it’s very difficult to do so unless you’re

a professional sports player. Okay, but it turns out that

mixed strategy equilibria, and mixed strategies in

general, are relevant beyond just these

contexts in which you think of people literally randomizing.

I want to look at a different context now.

So I want to go back to a game we started a few weeks ago.

This isn’t the same game. It’s a sequel.

It’s a follow up in our exciting adventure of our dating

couple in the classroom. Who were our dating couple,

do we still have them here? That was the guy,

who is the–yeah, there they are.

They’re even sitting closer. What a success here.

Can we get the camera on them a second?

Stand up a second, thank you. Your name was?

Student: David. Professor Ben Polak:

David. And look at this.

Is this romantic or what? David and your name is?

Student: Nina. Professor Ben Polak:

Nina and David, okay.

I think we pretty much figured out last time that Nina’s Player

I and David’s Player II, is that right?

As we remember last time, I’ll pick on you in a second,

you can sit down a second. So we figured out last time

that they were going to try and go on a date and they had

arranged to go to the movies. They picked out two,

in fact three movies, but two that remained viable,

and the problem was being typical Economics majors who

are, are you both Economics majors,

I think we figured that out? They are, look at that,

so being typical Economics majors who are just hopeless at

dating they had forgotten to tell each other which movie

they’re going to. So that, I don’t know if that

worked out well or not, but now that life has moved on,

they’re going to try it again, but this time taking advantage

of fall in New England, rather than go to a movie,

they’ve decided on some new activities.

So they might either go apple picking or they might go to the

Yale Rep and see a play. And so apple picking has its

advantages: the fall weather, its local flavor,

it has certain undertones about the Garden of Eden or something.

I don’t know if you can use the term flavor, local or otherwise,

for American apples but never mind.

And the Yale Rep, Yale Rep is a good thing to do

in New Haven, go to a play,

I think it’s Richard II is showing now, is that it?

Probably not a great “date play” but Economists are trying

to show they have culture, so there it goes.

And let’s assume the payoffs are like this.

Much as they were before, whereby we mean that Nina wants

to meet David but she would, given the choice she would

rather meet David in the apple fields.

And David who’s a dark personality, likes the sort of

darker side of Shakespeare. And he also wants to meet Nina

but he would rather meet at the Yale Rep.

If that’s backwards I apologize to their preferences.

But once again, because they’re still

incompetent Economics majors, they’ve again forgotten to tell

each other where they’re going. So let’s analyze this game

again, we’ve figured out this was a coordination game last

time or several weeks ago. And we know in this game,

we know what the pure strategy Nash Equilibria are,

so no prizes to be able to spot them.

One of the Nash Equilibria in pure strategies,

let’s put this in pure strategies,

so one of the pure strategy Nash Equilibria is for them both

to go apple picking and meet up in Bishop’s Orchard or whatever,

and another pure strategy equilibrium is for them both to

choose the Rep. We’d figured out that if they

were able to communicate, there’s really a pretty good

chance of them managing to coordinate at one of these

equilibria but we suspect, I think, that this is not all

that’s going on here. It looks quite likely that come

your next Saturday afternoon, when we send these guys out on

their date, they’re going to fail to meet,

it’s at least plausible. To test the plausibility of

that, let’s ask them, have you been,

have you managed to meet on a date yet?

No, haven’t managed to meet on a date.

See, so I’m proving the point that in fact they haven’t

managed to at least coordinate an equilibrium yet.

So it seems at least plausible that they’re going to fail to

coordinate. It’s plausible they’re going to

fail to coordinate. We’d like to sort of capture

that idea, and the way we’re going to capture that idea

is–let’s see if there’s another equilibrium in this game.

Well, there certainly isn’t another pure strategy

equilibrium in this game is there?

We know that. So if there’s another

equilibrium it better be mixed. So let’s try and find a mixed

Nash Equilibrium in this game, and remember this game is

called Battle of the Sexes, it’s a famous game.

This is Battle of the Sexes revisited.

So how are we going to go about finding this mixed Nash

Equilibrium in the game? We’ll interpret it later but

let’s just work on finding it. So, in particular,

I’m going to postulate the idea that Nina is going to mix P,

1 – P and David is going to mix Q, 1 – Q.

So how do we go about finding David’s equilibrium mix Q,

1–Q? What’s our trick from last week?

Should be able to cold call at this point, but let’s not have

to. How am I going to find Q,

the equilibrium Q? Somebody?

Thank you, they can use Venus’ payoffs, good.

So to find–it isn’t Venus’ payoffs–it’s Nina’s payoffs.

Fair enough, sorry. So to find the Nash Equilibrium

Q, to find the mix that David’s using we use Nina’s payoffs.

So let’s do that. So, in particular,

for Nina, if she goes apple picking then her payoff is 2

with probability Q if she meets David and 0 otherwise.

If she goes to the Rep then her payoff is 1 if she meets David,

sorry, need to be careful, let’s do it again.

If she goes to the Rep her payoff is 0 if David goes apple

picking with probability Q, and her payoff is 1 if she

meets David at the Rep, which happens with probability

1 – Q, is that correct?

So this is her payoff from apple picking and this is her

payoff from seeing Richard II. And what do we know if Nina is

indeed mixing, what do we know about these two

payoffs? They must be equal.

If Nina is in fact mixing, then these two things must be

equal. And that means:

what we’re saying is 2Q equals 1(1-Q) or Q equals 2/3,

I guess it is. No it’s 1/3 sorry.

Is that right? Q is 1/3.
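
Written out, the indifference condition for Nina looks as follows. This is a worked restatement of the board algebra using the payoffs 2, 0, 0 and 1 given above; the symbol $U_N$ for Nina's expected payoff is notation introduced here, not the lecture's.

$$
\begin{aligned}
U_N(\text{Apple}) &= 2Q + 0\,(1-Q) = 2Q \\
U_N(\text{Rep}) &= 0\cdot Q + 1\,(1-Q) = 1-Q \\
2Q = 1-Q \;&\Longrightarrow\; 3Q = 1 \;\Longrightarrow\; Q = \tfrac{1}{3}, \quad 1-Q = \tfrac{2}{3}
\end{aligned}
$$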

Okay, so our guess is that if there’s a mixed strategy

equilibrium it must be the case that David is assigning a

probability 1/3 to going apple picking,

which means he’s assigning probability 2/3 to his more

favored activity which is going to see Richard II.

What about, I’m going to pull these both down,

okay, how do we find Nina’s mix?

So to find the Nash Equilibrium P, to find Nina’s mix what do we

do? What’s the trick?

Somebody? Use David’s payoffs.

So David’s payoffs, if he goes apple picking then

he gets a payoff of 1 if he meets Nina there and 0 otherwise

and if he goes to the Rep he gets a payoff of 0 if Nina’s

gone apple picking, and he gets a payoff of 2 if he

meets Nina at the Rep. Once again, if David is

indifferent it must be that these are equal.

So if these are– if David is in fact mixing between apple

picking and going to the Rep–it must be that these two are equal

and if we set this out carefully we’ll get,

let’s just see, we’ll get 1(P) equals 2(1-P),

which is P equals 2/3 and 1-P equals 1/3.
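
Written out, the two indifference conditions just described are as follows. This is only a restatement of the board work, with $Q$ as David's probability of apple picking and $P$ as Nina's:

$$
\begin{aligned}
\text{Nina indifferent:}\quad & 2Q + 0(1-Q) = 0\cdot Q + 1(1-Q) \;\Rightarrow\; 3Q = 1 \;\Rightarrow\; Q = \tfrac{1}{3},\\
\text{David indifferent:}\quad & 1\cdot P + 0(1-P) = 0\cdot P + 2(1-P) \;\Rightarrow\; 3P = 2 \;\Rightarrow\; P = \tfrac{2}{3}.
\end{aligned}
$$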

So here we have Nina assigning 2/3 to going apple picking,

which in fact is her more favored thing and 1/3 to going

to the Rep. Okay, so we just used the same

trick as last time, let’s check that this is in

fact an equilibrium. So, in particular,

let’s check that it is in fact an equilibrium for Nina to

choose 2/3,1/3. Let’s check.

So check that P equals 2/3 is in fact the best response for

Nina. Let’s go back to Nina’s payoffs.

For Nina, if she chose to go apple picking,

her payoff now is 2 times Q but Q is equal to 1/3 plus 0(1-Q)

and if she chooses to go to the Rep then her payoff is 0 with

probability 1/3 and 1 with probability now 2/3.

All I’ve done is I’ve taken the lines I had before and

substituted in now what we know must be the correct Q and 1-Q

and this gives her a payoff of 2/3 in either case.

If she chooses P, her payoff to P will be 2/3 of

the time she’ll get the payoff from apple picking which is 2/3

and 1/3 of the time she’ll get the payoff from going to the Rep

which is 2/3 for a total of 2/3. So Nina’s payoff from either of

her pure strategies is 2/3. Her payoff from our claimed

equilibrium mixed strategy is 2/3, so neither of her possible

pure strategy deviations were profitable.

They didn’t lose her anything either, but they weren’t

profitable, and by the lesson we started the class with,

that means there cannot be any strictly profitable mixed

deviation either, so indeed,

for Nina, P is a best response to Q.
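
The check above can be reproduced in a few lines. This is a minimal sketch using exact fractions; the names `NINA`, `expected`, `pure` and `mixed` are illustrative choices, not anything from the lecture.

```python
from fractions import Fraction

# Nina's payoffs in Battle of the Sexes, as given in the lecture.
# Each tuple is (payoff if David picks apples, payoff if David goes to the Rep).
NINA = {"apple": (2, 0), "rep": (0, 1)}

q = Fraction(1, 3)  # David's probability of apple picking
p = Fraction(2, 3)  # Nina's probability of apple picking


def expected(payoffs, q):
    """Expected payoff of one pure strategy against David's mix (q, 1 - q)."""
    return payoffs[0] * q + payoffs[1] * (1 - q)


# Payoff of each of Nina's pure strategies against David's mix.
pure = {name: expected(payoffs, q) for name, payoffs in NINA.items()}

# Payoff of Nina's claimed equilibrium mix (p, 1 - p).
mixed = p * pure["apple"] + (1 - p) * pure["rep"]

print(pure["apple"], pure["rep"], mixed)  # 2/3 2/3 2/3
```

Since neither pure strategy beats the mix, no mixture of them can either.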

We can do the same for David but let’s not bother,

it’s symmetric. So in this game we found

another equilibrium. The other equilibrium,

the new equilibrium is Nina mixed 2/3,1/3 and David mixed

1/3,2/3 and we also know the payoff from this equilibrium.

The payoff from this equilibrium, for both players,

was 2/3. There are three equilibria in

this game. They managed to meet at apple

picking in which case the payoffs are 2 and 1.

They managed to meet at the Rep, that’s the second pure

strategy equilibrium, in which case the payoffs are 1

and 2, or they mixed, both of them mixed in this way,

and their payoffs are 2/3,2/3. Why is the payoff so bad in

this mixed strategy equilibrium? Does everyone agree,

this is a pretty lousy payoff? The other equilibrium payoffs

the worst you got was 1 and you sometimes got 2,

but now here you are playing a different equilibrium and at

this different equilibrium you’re only getting 2/3.

Why are you only getting–what happened?

Why have these payoffs got pushed down so far?

What’s happening to our poor hapless couple?

Or not hapless I don’t know. What’s happening to our couple?

Student: Sometimes they don’t meet.

Professor Ben Polak: Yeah, they’re failing to meet.

The reason, what’s forcing these payoffs down is they’re

not meeting very often. How often are they actually

meeting? How often are they meeting?

Let’s have a look. Let’s go back to the previous

board. Here it is. So they meet when

they end up in this box or this box, is that right?

So what’s the probability of them ending in those boxes?

Well ending up in this box has probability 2/3 times 1/3 and ending

up in this box has probability 1/3 times 2/3, is that right?

You end up meeting apple picking, the 2/3 of the time

when Nina goes there times the 1/3 of the time when David goes

there. And you end up meeting at the

Rep the 1/3 of the time Nina goes there times the 2/3 of the

time that David goes there. So this is the total

probability of meeting and it’s equal to 4/9,

is that right? So 4/9 of the time they’re

meeting, but 5/9 of the time–more than half the

time–they’re screwing up and failing to meet.

This is why I call them a hapless dating couple.
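
The meeting probability and the resulting payoffs can be recomputed directly from the two mixes. This is a small sketch with illustrative variable names, assuming the two players randomize independently:

```python
from fractions import Fraction

p = Fraction(2, 3)  # Nina: probability of apple picking
q = Fraction(1, 3)  # David: probability of apple picking

# They meet only if both pick the same venue (independent choices assumed).
meet_apple = p * q
meet_rep = (1 - p) * (1 - q)
meet = meet_apple + meet_rep

# Payoffs are (2, 1) at apple picking, (1, 2) at the Rep, and 0 otherwise.
nina_payoff = 2 * meet_apple + 1 * meet_rep
david_payoff = 1 * meet_apple + 2 * meet_rep

print(meet, 1 - meet)             # 4/9 5/9
print(nina_payoff, david_payoff)  # 2/3 2/3
```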

So this is a very bad equilibrium, but it captures

something which is true about the game.

What is surely true about this game is that if they just played

this game, they wouldn’t meet all the time.

In fact what we’re arguing here is they’d meet less than half of

the time. But certainly this idea that

we’re given from the pure strategy equilibria,

that they would magically always manage to meet seems very

unlikely, so this does seem to add a

little bit of realism to this analysis of the game.

However, it leads to a bit of an interpretation problem.

You might ask the question why on Earth are they randomizing in

this way. Why are they doing this?

It’s bad for everybody. Why are they doing this?

This leads us to think about a second interpretation for what

we think mixed strategy equilibria are.

Rather than thinking of them literally as randomizing,

it’s probably better in this case to think about the

following idea. We need to think about David’s

mixture as being a statement about what Nina believes David’s

going to do. David may not be literally

randomizing. But his mixture Q,

1–Q, we could think of as Nina’s belief about what David’s

going to do. Conversely, Nina may not

literally be randomizing. But her P, 1 – P,

we could think of as David’s belief about what Nina’s going

to do. And what we’ve done is we’ve

found the beliefs such that these players are exactly

indifferent over what they do. We found the beliefs for David

over what Nina’s going to do, such that David doesn’t really

quite know what to do. And we found the beliefs that

Nina holds about what David’s going to do such that Nina

doesn’t quite know what to do. That make sense?

So it’s probably better here to think about this not as people

literally randomizing but these mixed strategies being a

statement about what people believe in equilibrium.

We’ll come back and look at this game some more later on,

so our couple I’m afraid are not quite out of the woods yet.

But I want to spend the rest of today looking at yet another

interpretation of mixed strategy equilibria.

So, so far we have two, we have people are literally

randomizing. We have thinking of these as

expressions about what people believe in equilibrium rather

than what they’re literally doing.

And now I’m going to give you a third interpretation.

So for now we can get rid of the Venus and Serena game.

So to motivate this third idea I want to think about tax

audits. So none of you here have ever,

probably ever, had to fill out a tax form,

except for the fact that there seems to be a lot of parents in

the room today, is it parents weekend,

is that what’s going on? So where are the parents in the

room? Wave your arms in the air if

you’re a parent here. So at least these guys at

probably some point in their life filled out a tax form.

So come tax day, the parents in the room face a

choice, and the choice is are they going to honestly fill out

their taxes, or are they going to cheat?

I’m not going to ask them what they, well maybe I will,

but for now I won’t ask them what they did.

So they can choose one of two things.

They can choose to pay their taxes honestly–we’ll call that

H–or to cheat. This is the tax payer,

the parent. And at the same time the audit

office, the auditor, has to make a choice,

and the auditor’s choice is whether to audit you or not and

it’s not literally true because literally the auditor can wait

until your tax return comes in and then decide whether to audit

you. But for now let’s think of

these choices being made simultaneously,

and we’ll see why that makes it more interesting.

So let me put down some payoffs here and then I’ll explain them.

So (2, 0), (4, -10), (4, 0) and (0, 4). So how do we interpret this?

Let’s look at the auditor’s payoffs first of all.

So the auditor is very happy not having to audit your parents

and having your parents pay taxes, so we’ll give that a

payoff of 4. It’ll turn out,

in this game, we’ve decided in the payoffs,

that the auditor is equally happy if she actually audits

your parents in the year that they cheated.

We’ll say that makes the auditor equally happy.

Now the auditor is not so happy if she audits your parents when

they’re honest because audits are costly.

The auditor is really unhappy if she fails to audit when the

parents cheated. Let’s look at the–I keep

wanting to call them parents–I should stop calling them

parents, let’s call them taxpayers.

So for the taxpayers, what are their payoffs?

Well we’ll normalize things, so if they’re honest we’ll give

them a payoff of 0. That means they correctly fill

in their tax form and pay what they’re supposed to pay,

but if they can conceal some of their income,

they pretend to have whatever it is,

a third child, then they might be in trouble

if they’re audited. If they’re audited they’re

going to have to pay a big fine, maybe even go to jail,

so that’s -10. Of course if they’re not

audited they get to keep a chunk of money so we’ll call that 4.

Everyone understand the basic idea of this game?

In reality, we could add more complications,

we could think of different ways to cheat on your taxes,

but I don’t want to give tutorials on how to cheat on

your taxes here. So it’s not going to take long

staring at this game to figure out that there are no pure

strategy equilibria in this game.

Let’s just do that, so from the taxpayer’s point of

view, if they’re going to be audited,

then they’d rather pay their taxes than not,

and if they’re not going to be audited then according to these

payoffs they’d rather cheat. From the auditor’s point of

view, if they knew everyone was going to pay taxes,

then they wouldn’t bother auditing and if they knew

everyone was going to cheat, then they’d of course audit.

So you can quickly see that there’s no box in which the best

responses coincide, there’s no pure strategy Nash

Equilibria. For those people who are

thinking this is seeming otherworldly, you will have to pay

taxes in a couple of years, and trust me your parents are

paying taxes now. So what we want to do here is

we’re going to solve out and find a mixed strategy

equilibrium, but we’re going to give it a

different interpretation to the equilibria we found so far.

But the basic, initial exercise is what?

We’re going to find–we’re going to try and find the

equilibrium here. So to find the Nash Equilibrium

here we know it’s going to be mixed.

So to find the probability with which taxpayers pay their

taxes–and let me already start getting ahead of myself and just

say to find the proportion of taxpayers who are going to pay

their taxes–what do we do? What must be true of that

equilibrium proportion Q of taxpayers who pay their taxes?

How am I going to find that Q? Shout it out somebody.

Yeah look at the auditor’s payoffs.

So from the auditor’s point of view, if the auditor audits,

their payoff is 2Q plus 4(1-Q) and if they don’t audit their

payoff is 4Q plus 0(1-Q). Everyone see how I do this,

this is 2Q plus 4(1-Q) and this is 4Q plus 0(1-Q).

And if indeed the auditor is mixing, then these must be

equal. And if they’re equal,

let’s just do a little bit of algebra here and we’ll find that

2Q equals 4(1-Q) so Q equals 2/3, is that right?

So our claim is to make the auditor exactly indifferent

between whether to audit or not, it must be the case that 2/3 of

the parents of the kids in the room, are going to be paying

their taxes honestly, which means 1/3 aren’t,

which is kind of worrying, but never mind.

Let’s have a look at the taxpayer.

To find, sorry. We found the taxpayer,

we found the proportion of taxpayers who are paying their

taxes, now I want to find out the probability of being

audited. How do I figure out the

equilibrium probability of being audited in this model?

How do I work out the equilibrium probability of being

audited? Shout it out.

So the equilibrium probability of being audited are going to

use P and 1-P, so P is going to be the

probability of being audited, how do I find P?

Yeah, I’m going to look at the taxpayer’s payoffs.

So from the taxpayer’s point of view, if the taxpayer pays their

taxes, their payoff is just 0, and if they cheat their

payoff is -10P plus 4(1-P). And if indeed the taxpayers are

mixing–or in other words, we are saying that not all

taxpayers are cheating and not all taxpayers are honestly

paying their taxes–then these must be equal.

So if these are equal I’m going to get 4P equals 14–no it

didn’t– I’m going to get 4 equals 14P,

let’s try again, 4 equals 14P,

that was a bit worrying, 4 equals 14P,

which is the same as saying P equals 2/7.

If somebody can just check my algebra I think that’s right.

So my claim is that the equilibrium here is for 2/3 of

the taxpayers to pay their taxes and for the audits,

the auditor, to audit 2/7 of the time.
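
Both indifference conditions have the same shape, so one small helper covers them. This is a hedged sketch: `indifference_mix` is a hypothetical name, and it assumes a 2x2 game in which the mixing player's two pure strategies pay `a*x + b*(1-x)` and `c*x + d*(1-x)` against the other side's mix `(x, 1-x)`.

```python
from fractions import Fraction


def indifference_mix(a, b, c, d):
    """Solve a*x + b*(1-x) = c*x + d*(1-x) for x.

    x is the probability the *other* player puts on their first strategy;
    (a, b) and (c, d) are the indifferent player's payoffs from each of
    their own two pure strategies.
    """
    return Fraction(d - b, (a - b) - (c - d))


# Auditor: audit pays 2Q + 4(1-Q), not auditing pays 4Q + 0(1-Q).
Q = indifference_mix(2, 4, 4, 0)

# Taxpayer: honest pays 0P + 0(1-P), cheating pays -10P + 4(1-P).
P = indifference_mix(0, 0, -10, 4)

print(Q, P)  # 2/3 2/7
```

Note that the auditor's payoffs pin down the taxpayers' mix `Q`, and the taxpayers' payoffs pin down the audit rate `P`.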

Now we could go back in here and we could check,

I could do what I did before, I could plug the Ps and Qs in

here and check that in fact this is an equilibrium,

but trust me that I’ve done that, trust me that it’s okay.

So here we have an equilibrium, let’s just write down what it

is. From the auditor’s point of

view it is that they audit 2/7 of the time, or 2/7 of the

population, and from the taxpayers’ point

of view, it’s that they pay their taxes honestly 2/3 of the

time and not otherwise. Now without focusing too much

on these exact numbers for a second, I want to focus first

for a minute on how do we interpret this mixed strategy

equilibrium. So from the point of view of

the auditor we’re really back where we were before with the

base stealer or the person who’s searching baggage at the

airport. We could think of the auditor

literally as randomizing. In fact, there’s some truth to

that. It actually is the case that by

law, the auditors literally have to randomize.

So this 2/7,5/7 this has the same interpretation as we had

before. This is really a randomization.

But this 2/3,1/3 has a different interpretation and a

potentially exciting interpretation.

It isn’t that we think that your parents get to tax day,

work out what their taxes would be and then toss a coin.

They may be doing that, I’m looking at the parents and

I don’t think that’s what they’re doing.

The interpretation here is that the parents, some parents are

paying their taxes and some parents aren’t paying their

taxes. There’s a lot of parents out

there, a lot of potential taxpayers, and in the

population, in equilibrium,

if these numbers were true, 2/3 of parents would be paying

their taxes and 1/3 would be cheating.

So this is a randomization by a player, and this is a mixture in

the population. The new interpretation here is,

we could think of the mixed strategy not as players

randomizing, but as a mix in a large

population of which some people are doing one thing and the

other group are doing the other. It’s a proportion of people

paying taxes. So I don’t know if this 2/3,1/3

is an accurate number for the U.S.

It’s probably not very far off actually.

For Italy I’m ashamed to say the number of people who pay

taxes is more like 40%, maybe even lower now,

and there are countries I think where it gets as high as 90%.

I think the U.S. rate when they end up auditing

is a little higher than this but not much.

So again, we’re going to think of this not as randomization but

as a prediction of the proportion of American taxpayers

who are going to pay their taxes.

Now, I want to use this example in the time we have left,

to actually think about a policy experiment.

So let’s put this up somewhere we can see it.

Let’s think about a new tax policy.

So suppose that Congress gets fed up with all these newspaper

reports about how only 2/3 of Americans pay their taxes

or whatever the true proportion is,

I think it’s actually a little higher than that but never mind.

They get fed up with all these reports and they say,

this isn’t fair, we should make people pay their

taxes so we’re going to change the law and instead of

paying–instead of being in jail for ten years,

or the equivalent of a fine of -10 if you’re caught cheating,

we’re going to raise the fine or the time in jail so that it’s

now -20. So the policy experiment is

let’s raise the fine for cheating to -20, and the aim

of this policy is to try to deter cheating, right?

It seems a plausible thing for a government to want to do.

Let’s redraw the matrix, so here’s the game: (2, 0),

(4, -20), (4, 0), (0, 4), with audit or not audit, and pay

honestly or cheat. So here’s our new payoffs and

let’s ask the question, with this new fine in place,

now we’ve raised the fine for being caught not paying your

taxes, in the long run once things have worked their way

back into equilibrium again, after a few years,

do we expect American taxpaying compliance to go up or to go

down, or what do we expect?

What do we think is going to happen?

So who thinks it’s going to go up?

Who thinks it’s going to go down?

Who thinks it’s going to stay the same?

Who’s abstaining here? I notice the parents are

abstaining. You’re not really meant to

abstain. You have to vote here.

Well how are we going to figure this out?

How are we going to figure out what’s going to happen to

compliance? What happens to tax compliance?

Tax compliance, remember that was our P–no it

wasn’t, sorry, it was our Q.

The only way we’re going to figure this out is to work out,

so let’s work out the new Q in equilibrium.

Let’s do this, so to find out the new Q in

equilibrium, once again, we’re going to have to look at

the auditor’s payoffs, and the auditor’s payoffs if

they audit, they’re going to get 2Q plus 4(1-Q),

and if they don’t audit they’re going to get 4Q plus 0(1-Q),

and if the auditor is indifferent,

if they’re mixing, it must still be the case that

these are equal. And now I want to ask you a

question, where have you seen that equation before?

Yeah, it’s still there right, I didn’t delete it.

It’s the same equation that sits up there.

Is that right? From the auditor’s point of

view, given the payoffs to the auditors nothing has changed,

so the tax compliance rate that makes the auditor exactly

indifferent between auditing your parents and not auditing

your parents, is still exactly the same as it

was before at 2/3. In equilibrium,

tax compliance hasn’t changed at all.

Let me say that again, the policy was we’re going to

double the fines for being caught cheating and in

equilibrium it made absolutely no difference whatsoever to the

equilibrium tax compliance rate. Now why did it make no

difference? Well let’s have a techie answer

and then a better, a more intuitive answer.

The techie answer is this, what determines the equilibrium

tax compliance rate, what determines the equilibrium

mix for the column player is what?

Is the row’s payoffs. What determines the equilibrium

mix for the column player are the row’s payoffs–row player’s

payoffs. We didn’t change the row

player’s payoffs, so we’re not going to change

the equilibrium mix for the column player.

Say again, we changed one of the payoffs for the column

player but the column player’s equilibrium mix depends on the

row player’s payoffs and we haven’t changed the row player’s

payoffs, so we won’t change the

equilibrium compliance rate, the equilibrium mix by the

column player. What will have changed here?

What will have changed in the new equilibrium?

So we’ve pretty much established that people are

cheating as much as they were before in equilibrium.

Rahul, can I get Henry here? Student: Probability has

changed. Professor Ben Polak: Say

again. Student: The probability

of audit would have changed. Professor Ben Polak: The

probability of audit will have changed.

What’s going to change is not the Q but the P,

the probability with which you’re audited is going to

change in this model. Let’s just check it,

to find the new P, I need to look at the

taxpayer’s payoffs and the taxpayer’s payoffs are now 0,

–sorry, if they pay their taxes

honestly then they get 0, and if they cheat they get -20

with probability P and 4 with probability 1-P.

If they’re mixing, if some of them are paying and

some of them are not, this must be the same,

and I’m being more careful than I was last time I hope,

this gives me 24P is equal to 4 or P equals 1/6.

So the audit rate has gone down from 2/7 to 1/6.
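
The policy experiment can be rerun for both fines side by side. This is a minimal sketch of the toy model only; the function name and its keyword parameters are illustrative, with defaults set to the lecture's payoffs.

```python
from fractions import Fraction


def audit_game_equilibrium(fine, gain=4,
                           audit_honest=2, audit_cheat=4,
                           skip_honest=4, skip_cheat=0):
    """Mixed equilibrium (Q, P) of the toy audit game.

    Q: share of taxpayers who are honest; P: probability of an audit.
    """
    # Auditor indifferent:
    #   audit_honest*Q + audit_cheat*(1-Q) = skip_honest*Q + skip_cheat*(1-Q)
    catch_gain = audit_cheat - skip_cheat
    audit_cost = skip_honest - audit_honest
    q = Fraction(catch_gain, catch_gain + audit_cost)

    # Taxpayer indifferent: 0 = -fine*P + gain*(1-P)
    p = Fraction(gain, gain + fine)
    return q, p


for fine in (10, 20):
    q, p = audit_game_equilibrium(fine)
    print(f"fine={fine}: compliance Q={q}, audit rate P={p}")
# fine=10: compliance Q=2/3, audit rate P=2/7
# fine=20: compliance Q=2/3, audit rate P=1/6
```

The fine never enters the line that computes `q`, which is the point of the example.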

I’m guessing that probably wasn’t the goal of the policy

although it isn’t necessarily a bad thing.

There is some benefit for society here,

because audits are costly, both to do for the auditor and

they’re unpleasant to be audited,

so the fact that we’ve managed to lower the audit rate from 2/7

to 1/6 is a good thing, but we didn’t manage to raise

the compliance rate. So I don’t want to take this

model too literally, because it’s just a toy model,

but nevertheless, let’s try and draw out some

lessons from this model. So here what we did was we

changed the payoff to cheating, we made it worse.

But a different kind of change is we could have changed,

sorry, we changed the payoff negatively to being caught

cheating. But a different change we could

have done is we could have left the -10 in place and we could

have raised the payoff to cheating and not getting caught.

We could have left this -10 in place and changed this 4 to, let’s

say, a 6 or an 8. We’d have increased the benefits to

cheating if you’re not caught. What would that have done in

equilibrium? So I claim, once again,

that would have done nothing in equilibrium to the probability

of people paying their taxes, but that would have done what

to the audit rate? The audit rate would have gone

up, the equilibrium audit rate would have gone up.
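
To see why, here is the taxpayer's indifference condition in general form. The symbols are introduced only for illustration: $f$ is the size of the fine and $g$ is the gain from cheating without being caught.

$$
0 = -fP + g(1-P) \quad\Longrightarrow\quad P = \frac{g}{g+f}.
$$

With $f = 10$, this gives $P = 4/14 = 2/7$ for $g = 4$, $P = 6/16 = 3/8$ for $g = 6$, and $P = 8/18 = 4/9$ for $g = 8$. The audit rate rises with the gain from cheating, while the compliance rate, which depends only on the auditor's payoffs, stays at 2/3.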

Let’s tell that story a second. So rich people,

people who are well paid, have a little bit more to gain

from cheating on their taxes if they’re not caught,

there’s more money at stake. So my colleagues who are

finance professors in the business school have more money

on their tax returns than I do, so in principle,

they gain more if they cheat. Does that mean that they cheat

more than me in equilibrium? No, it doesn’t mean that they

cheat more than me in equilibrium.

What does it mean? It means they get audited more

often. In equilibrium,

richer people aren’t necessarily going to cheat more,

but they are going to get audited more,

and that’s true. The federal audit rates are

designed so they audit the rich more than they audit the poor.

Again, it’s not because they think the rich are inherently

less honest, or the poor are inherently more honest,

or anything like that, it’s simply that the gains to

cheating and not getting caught are bigger if you’re rich,

so you need to audit more to push back into equilibrium.

Now, suppose we did in fact want to use the policy of

raising fines to push down, to push up the compliance rate,

to push down cheating. How would we change the law?

Suppose we want to raise the fines for cheating,

we don’t like people cheating so we raise the fines,

but we’re worried about this result that didn’t push up

compliance rates, how could we change the law or

change the incentives in the game so that it actually would

change compliance rates? What could we do?

Yeah. Student: If we changed

the payoffs of auditing to 4, from 2 to 4.

Professor Ben Polak: Good, if we want to change the

compliance rates we should change the payoffs to the

auditor. The problem with the way the

auditor is paid here is that the auditor is paid more if they

manage to catch people, but audits are costly.

The problem with that is when you raise the fine on the other

side, all that happens is the auditors audit less often in

equilibrium. So if you want to get a higher

compliance rate, one thing you could do is

change the payoffs to the auditor to make auditing less

costly for them, or making catching people nicer

for them, give them a reward, or you could simply take it out

of Game Theory altogether. You could enforce,

you could have congressional law that sets the audit rates

outside of equilibrium, and that’s been much discussed

in Congress over the last five years.
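
Both remedies can be sketched in the toy model. The alternative numbers below (an audit-honest payoff of 3, a catch payoff of 6, and fixed audit rates of 1/2 and 1/10) are hypothetical values chosen purely for illustration, and the function names are not from the lecture.

```python
from fractions import Fraction


def compliance(audit_honest=2, audit_cheat=4, skip_honest=4, skip_cheat=0):
    """Compliance rate Q that leaves the auditor indifferent."""
    catch_gain = audit_cheat - skip_cheat    # gain from auditing a cheat
    audit_cost = skip_honest - audit_honest  # loss from auditing an honest payer
    return Fraction(catch_gain, catch_gain + audit_cost)


def taxpayer_choice(audit_rate, fine=10, gain=4):
    """Taxpayer's best response when the audit rate is fixed from outside."""
    cheat_payoff = -fine * audit_rate + gain * (1 - audit_rate)
    if cheat_payoff > 0:
        return "cheat"
    if cheat_payoff < 0:
        return "honest"
    return "indifferent"


print(compliance())                # 2/3, the lecture's baseline
print(compliance(audit_honest=3))  # 4/5, audits made less costly (hypothetical)
print(compliance(audit_cheat=6))   # 3/4, a reward for catching cheats (hypothetical)

print(taxpayer_choice(Fraction(1, 2)))   # honest
print(taxpayer_choice(Fraction(2, 7)))   # indifferent
print(taxpayer_choice(Fraction(1, 10)))  # cheat
```

In this sketch, changing the auditor's payoffs moves the equilibrium compliance rate, and an audit rate fixed above 2/7 makes honesty the taxpayer's best response.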

Somebody setting audit rates, as it were, exogenously by

Congress. Why might that not be a great

idea? Leaving aside Economic Theory

for a second, leaving aside Game Theory,

why might it not be a great idea to have Congress set the

audit rates rather than some office?

Student: Most members in Congress have a lot of money so

they’re going to lower the audit rates so that they don’t get

audited. Professor Ben Polak: So

the lady in the front is saying that a lot of Congressmen are

rather rich, so maybe they have particular

incentives here. I don’t want to take a

particular political stance here, but it could be whatever

side of the political spectrum you guys sit on,

it could be that you might not trust Congress to get this

right. You might think there are going

to be political considerations going on in Congress other than

just having an efficient tax system.

Okay, so what do I want to draw out as lessons here?

The big lessons from this class are there are three different

ways to think about randomization in equilibrium or

out of equilibrium. One is it’s genuinely

randomization, another is it could be

something about people’s beliefs,

and a third way and a very important way is it could be

telling us something about the proportion of people who are

doing something in society, in this case the proportion of

people who are paying tax. A second important lesson I

want to draw out here, beyond just finding equilibria,

two other things we drew out today, one lesson was when

you’re checking equilibria, checking mixed strategy

equilibria, you only have to check for pure strategy

deviations. Be careful, you have to check

for all possible pure strategy deviations, not just the pure

strategies that were involved in the mix.

If the guy has seven strategies and is only mixing on two,

you have to remember to check the other five.
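
Here is a small sketch of that warning. The three-strategy payoffs below are a made-up example, not a game from the lecture: the player is exactly indifferent between A and B, so the mix looks fine until the unused strategy C is checked.

```python
from fractions import Fraction


def pure_payoffs(payoff_rows, opp_mix):
    """Expected payoff of each pure strategy against the opponent's mix."""
    return [sum(u * w for u, w in zip(row, opp_mix)) for row in payoff_rows]


def is_best_response(own_mix, payoff_rows, opp_mix):
    """True if no pure strategy, used in the mix or not, does strictly better."""
    pure = pure_payoffs(payoff_rows, opp_mix)
    mixed = sum(w * u for w, u in zip(own_mix, pure))
    return mixed >= max(pure)


# Hypothetical payoffs: one row per pure strategy, columns = opponent's choices.
ROWS = [
    (2, 0),                            # A
    (0, 2),                            # B
    (Fraction(3, 2), Fraction(3, 2)),  # C, not used in the candidate mix
]
opp_mix = (Fraction(1, 2), Fraction(1, 2))
candidate = (Fraction(1, 2), Fraction(1, 2), 0)  # mixes only on A and B

print(pure_payoffs(ROWS[:2], opp_mix))              # A and B both pay 1
print(is_best_response(candidate, ROWS, opp_mix))   # False
```

Checking only A and B would have passed, but C pays 3/2 against the same mix, so the candidate is not a best response.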

The third lesson I want to draw out today is that, because of the way

mixed strategy equilibria work,

if I change the column player’s payoffs it changes the row

player’s equilibrium mix, and if I change the row

player’s payoffs, it changes the column player’s

equilibrium mix. Next time, we’re going to pick

up this idea that mixed strategies can be about

proportions of people playing things and take it to a totally

different setting, namely evolution.

So on Wednesday we’ll start talking about evolution.