Outwitting the Prisoner's Dilemma

synctext · on April 25, 2012

This blog post+Youtube video+Amazon preview made me order the book. Good stuff.

For many years I've been working as a scientist to break the Prisoner's Dilemma problem. Case study is swarming-based downloading, where cooperation is needed for speed and anonymity. However, for over a decade scientists/developers seems have largely failed to do better then tit-for-tat (with unbounded scalability in number of users).

Book preview contains interesting text on paying taxes, parasite behavior and broad view of trust.

Anderkent · on April 25, 2012

Of course that's not a real prisoners dilemma, because the payouts are wrong. In the real prisoners dilemma Defect-Defect gives you a better payout than Cooperate-Defect. So when your opponent pre-commits to defect, you must defect.

robertskmiles · on April 25, 2012

The biggest reason it's not a real prisoner's dilemma is because the prisoners can communicate. Key to the standard prisoner's dilemma is that there is no way for the prisoners to talk to one another.

Like the post says, it's a variant.

dbecker · on April 25, 2012

Inability to communicate is not required in the classical prisoners dilemma. It's inability to credibly commit.

The distinction is illustrated with the common application of the prisoner's dilemma to cartels. Cartel members can tell each other they will restrict production, but they do not observe each other's factories. So even if they agree ahead of time to reduce production, that is cheap talk... and they can produce whatever they want.

It's quite similar to this game show, though as a previous commenter pointed out, the relative payoffs are different.

robertskmiles · on May 6, 2012

I suppose this will have been discussed in other comments, but it seems to me the ability to credibly commit here isn't bad. I'm not sure if verbal contracts are legally binding in the UK, but they certainly have a lot of witnesses, and the social consequences for breaking a very public verbal contract can be substantial.

yonran · on April 25, 2012

I believe you meant Defect-Cooperate gives the defector a better payout than Cooperate-Cooperate. At any rate, I’m not sure that it makes a difference in this video.

jgeralnik · on April 25, 2012

No, he means Defect-Cooperate should give the person who lost less than he would have gotten from Defect-Defect.

For example, in the Prisoner's Dilemma you might get 0 years of jail time if only you defect, 1 year if you both cooperate, 3 years if you both defect, and 10 years if only the other person defects. Then, knowing that the other person will defect it is still in your best interest to defect as well.

In this challenge, if you know that the other person will defect you automatically get $0, whether you defect or not. In that case there is no particular reason for you to defect (except out of spite) and you are better off with the hopes that your opponent will keep his promise than with nothing.

davidjohnstone · on April 25, 2012

That strategy only works in that particular version of the game where it is possible to split the reward. You can't split jail terms 50-50.

That said, that was a great bit of TV. And Nick's double-take when Ibrahim said what he'd do with the money was gold. (In case you missed it, he said he'd respray his yacht.)

ghshephard · on April 25, 2012

re: "You can't split jail terms 50-50." - sure you can. You can both be sent to prison for 1/2 the expected time if you both keep mum. But, if one of you squeals, then that person gets rewarded with no time, and the person who keeps mum goes to jail for the full sentence. But, if you are both stupid enough to squeal, you both go to jail for the full sentence.

jimworm · on April 25, 2012

I believe what was meant is that you can't split jail terms post-sentencing, and let someone else serve half your term.

ghshephard · on April 25, 2012

Thanks to my in-depth research of criminology (I watched two seasons of Boardwalk Empire, and several seasons of The Wire) - I can relate that criminals will sometimes agree to have one person do the time, while the other pays a bonus to them/makes it up somehow.

davidjohnstone · on April 25, 2012

Indeed. I considered editing my original post just after I wrote it to clear up that ambiguity, but I figured people would understand what I meant...

zorbo · on April 25, 2012

Schneier comments:

> The game is turned on its head

Is that really true? What has changed? Player B still can't trust that player A will give him 50% of the money outside of the game. Player could turn the tables by claiming he too will steal. Wouldn't that put them back to the original situation?

dsr_ · on April 25, 2012

You are assuming that social pressure either doesn't exist or is in favor of dishonesty. The first is not true, and the second is only true for limited peer groups.

Consider how your family would treat you if you reneged on a verbal commitment you made on TV. For the vast majority of people, this is a major negative factor. Now add in your coworkers, employers, friends...

Most people are honest most of the time, and especially so when there is a high likelihood of being caught cheating.

ecmendenhall · on April 25, 2012

Exactly. This is an example of a commitment strategy (http://plato.stanford.edu/entries/game-theory/#Com), and it was only credible because others were watching. Reputation is a very common way we make credible commitments when we deal with one another, but the results might have been different without an audience.

It's funny that the one-shot prisoner's dilemma is used so often as an example of rational self-interest leading to suboptimal outcomes or formal modeling failing to capture real behavior. This is true only for the most stylized models and simplest games. Really rational players will choose to restructure the rules and play a better game! Since prisoner's dilemmas were bad games, we figured out how to make credible commitments and overcome them. Since talk is cheap, we developed things like punishment and social norms to enforce commitments.

Adding things like iteration, reputation, and punishment to simple games leads to complex cooperative outcomes that reflect real behavior—and we can model them by adding a few simple rules. I think it's a fascinating and optimistic view of human history: modern market-exchange societies are the outcome of a long process of figuring out how to turn bad-outcome games that encourage defection into positive-sum cooperative ones that benefit everyone. Yet we are still biased to ignore all the cooperation around us and see selfishness (here's one of my favorite papers on this subject: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=929048).

I've also found it a very valuable personal insight. When you find yourself playing a bad game, don't settle for choosing the least harmful payoff. Figure out how to change the rules.

drostie · on April 25, 2012

I like to add to any discussion of Prisoner's Dilemma a similar note, something like:

This is why contracts matter. It's clear that both parties would love to sign a contract saying "I'll scratch your back and you scratch my back," if the consequences of not obeying that contract were sufficiently severe, because they see that they can get a mutually better reward. The reason why criminals do better than game theory professors on the prisoner's dilemma game is not necessarily because they act irrationally, but because they're part of a social order which creates and enforces those sorts of cooperative systems, and which doesn't do business with folks who break them.

Natsu · on April 25, 2012

That exact line of reasoning is brought up during the discussion in this video of the end of another game:

https://www.youtube.com/watch?v=y6GhbT-zEfc&feature=relm...

That said, you're right about the fact that reputations matter. They just don't matter to everyone.

Natsu · on April 25, 2012

Assuming he can be believed, and we might note that there's no benefit to lying about choosing to steal, you change the game so the other guy's payoffs are zero for stealing and 1/2 the money (as modified by the chance he's telling the truth about that). So it changes the incentive from wanting to defect, to either getting nothing or choosing to trust him and maybe get something.

Note also that he was going to steal at first, but changed his ball at the last second to cooperate. And the other guy lied about stealing, having chosen to split the whole time, never changing his ball.

skore · on April 25, 2012

This is great. The Show offers each player with three possible outcomes, which makes negotiation hard because the highest incentive is on getting all the money (choosing Steal while tricking the opponent into choosing Split), while the lowest incentive is on nobody getting anything.

What Nick does by saying that he will definitely Steal is reducing the choices for Abraham to two: Split or Loose. Actually just one: Loose, albeit with a promise to also split.

The cunning bit about the plan, of course, is that Nick never intends to choose Steal. Choosing Steal is foolish to begin with: There is a 50% chance that nobody will get anything if you choose steal, so by default, players should stay away from it. "Count your blessings", so to speak.

What is clever about this is that Nick has replaced the technical bet with a social one: He has done something that appeals to his opponent, by challenging his intellect, turning the game on its head. For Nick, the chances are now that either he gets half or nothing. He, internally, accepts a lower possible payout to himself to maximize the probability that the "group" will cash out. In any outcome, this would mean that it's no longer "He tried to get the money", but "He tried to make sure we get the money". I would say there is a pretty good chance that Abraham would have decided to give him half, even had he decided to Steal, basically matching Nicks generosity. This is supported by the fact that Nick really did end up displaying the Split - Had Abraham revealed a Steal at that point, he surely would have felt like quite bad about this ("He did it to help us both, after all!").

Nick has simply maximized the chances of a Split on all available vectors.

sopooneo · on April 25, 2012

Morality aside, choosing steal is foolish unless it works, then it seams pretty sensible.

benmmurphy · on April 25, 2012

choosing steal is the dominant strategy in a prisoners dilemma. it is not foolish it is the correct strategy.

skore · on April 25, 2012

In a prisoners dilemma maybe, but as others have pointed out - this show is not a good example for one.

AznHisoka · on April 25, 2012

I think this game is actually pretty close to a prisoner's dilemma. Why wouldn't you choose steal if you've managed to convince someone to choose split? If they didn't chose split, you'd still lose anyway.

skore · on April 25, 2012

The main difference is that in the prisoner's dilemma, you don't get to talk and try to persuade your fellow prisoner.

skore · on April 25, 2012

Superficially, choosing either Steal or Split gives you two sets of equal, 50% chances:

Steal: 50% Everybody looses, 50% I get everything

Split: 50% Everybody wins, 50% Opponent gets everything

In very basic, almost evolutionary terms, the dumbest thing to do is both choosing Steal, because that would mean that there are two losers. Only choosing split yourself makes sure that there is a winner.

Steal only seems sensible if you care about winning yourself.

sopooneo · on April 25, 2012

I think you are making a big assumption with your number of "50%". It's true that there are two possible outcomes for either choice I could make. But what makes you say they are equally likely? To my mind, the estimating of that probability is actually the crux of the game.

skore · on April 25, 2012

Yes, that was what I was talking about - If you make a choice and then depend on another person making a choice, your chances are precisely 50/50. That's the prerequisite of the game.

Of course, it gets a lot more complicated after that, particularly because you can talk to your partner. (That's kind of why I wrote "superficially", but it seems that word has triggered the downvote police.) Not sure whether it changes much of the math, though - after all, anything that you weigh in favor of something could always be a lie.

I was pointing out that of the basic choices that are offered to you, one is very disadvantageous, so it should be an advantage to prevent it from even possibly happening by choosing Split. You can literally prevent 1 in 4 outcomes with your decision. This realization, paired with some very social engineering in the example is what made this so impressive to me.

benmmurphy · on April 25, 2012

in the case of the show there is a third player that loses when you both split so it is not clear that steal is only sensible if you care about winning yourself. :)

philh · on April 25, 2012

Given the selection mechanism, I'd be tempted to try a blind strategy. "I'm not going to look at my ball before I pick it, but if you split and I steal, I'll give you half the money."

It wouldn't be good against a pure money-maximiser, because his expected $value is higher for steal (1/2 versus (1+P(I'm honest))/4), but if he's risk-averse and trusting it might work out.

thenonsequitur · on May 6, 2012

It's not clear that this strategy is possible within the the framework of the game show though. The host instructs both players to look at the concealed choices before they're allowed to talk to each other. To employ the blind strategy you'd have to explicitly not look at them (and probably say something at that point to make it clear you don't plan on looking at them). This would be directly contradicting the host's instructions. While the host might allow it, I think it's more likely they'll say "sorry, but you have to look at them".

judofyr · on April 25, 2012

He's only outwitting the Prisoner's Dilemma in the sense that he's shifted his main goal to get some money (instead of everything).

There are more certain strategies if that's your goal though: You can get the other part to join a mutual agreement: if one of you steals (while the other splits) you agree to share the money. If the other part doesn't agree, you say you'll steal and take all the money (if he splits).

The other part can only get money if he agrees with you, and every choice except for (steal, steal) is going to split the money between you two. Split is the only choice that makes sense.

It would be much more interesting if they changes the stakes a bit: Stealing money gives you more than the sum of splitting.

stevenkovar · on April 25, 2012

What's interesting about this show is the pair forced into a Prisoner's Dilemma are given the chance to discuss the matter with each other, adding another layer to its complexity. In most examples I've seen personally, the two subjects are given their options in private and told to make a decision without having contact.

sopooneo · on April 25, 2012

If you assume the players in the classical prisoner's dilemma have common knowledge of the rules and that neither of their choices are revealed until both have made their choice, I don't think their being allowed to talk makes much difference at all to the underlying dynamics.

codehotter · on April 25, 2012

If you don't think you can trust the other participant, you could do the following:

Point at one of the other's balls. Say "I will choose steal unless you pick that one. If you do pick that one, you may choose the ball I'll pick."

Unfortunately, your expected value is only 37.5% with this strategy, not 50%. You can still do better if you trust each other.

thejteam · on April 25, 2012

When the person comes to collect your answer, make them believe you left through the window and then hide and run out the door while they are investigating. And lock it.

At least it worked in "The Mysterious Benedict Society and the Prisoner's Dilemma".

AznHisoka · on April 25, 2012

I would think threatening the other person to split would be a dominant strategy. I wouldn't wanna steal if I knew the other person was gonna harass me the rest of my life.

chris_wot · on April 25, 2012

That was awesome. But did you see Nick's face at the end at what Ibrahim said? Lol!

Natsu · on April 25, 2012

A little bit of game theory goes a long way. But something like that wouldn't work for very long before someone came along who was only faking it.

There are a bunch of related videos of this show as well. It can be interesting to try to predict the outcomes to see just how good your internal lie detector really is.

EDIT: Also, don't miss one of the comments. BCR came up with the clever solution of passing one of their balls to the other player. That's quite the game breaker, though, so it probably wouldn't be allowed. I guess I should also mention that this is technically a variant of the PD, and that I change the title because almost nobody would know what the story was about otherwise.