The Bayesian meets Monty Hall
Posted by Tom Moertel Sat, 01 Jan 2011 23:15:00 GMT
The Monty Hall Problem is a probability-theory classic. Here’s how it goes.
You are on Monty’s game show, trying to find a car hidden behind one of three doors. Monty knows which one; you don’t. He asks you to choose a door, and you do. But he doesn’t open your door. Instead, he goes to the other two doors and, per the rules of the game, opens one that he knows does not contain the car. After making this revelation, he asks if you want to stay with your first choice or switch to the other remaining door.
Should you stay or switch?
The answer – spoiler alert! – is that you should switch: it doubles your chances of finding the car.
This conclusion, however, is hard for many people to believe. According to Wikipedia, “when the Monty Hall problem appeared in Parade, approximately 10,000 readers, including nearly 1,000 with PhDs, wrote to the magazine claiming the published solution – switch! – was wrong.” Because the problem seems to defy intuition, it has been the subject of much study.
But most explanations I’ve seen are unsatisfying because, at one point or another, they require an intuitive leap that many people can’t make. The reader either has the necessary spark of realization at the right moment, or is left behind.
I’m going to try to offer a more satisfying explanation by working through the problem using nothing but Bayesian probability theory and showing every step. Using this method, there are no leaps to be made: everything follows naturally from the theory itself and from the knowledge we have of the problem.
Ready? Let’s get started!
The Bayesian approach
To begin, let’s define some propositions to formalize our discussion of the problem:
- C=i: the car is behind door i (for i = 1, 2, or 3)
- Y=i: you first choose door i
- M=i: Monty reveals that the car is not behind door i
- K: Our prior knowledge about life, the universe, and everything, including the rules for the television game show and our understanding of Monty’s incentives
(For a refresher on propositions and probabilities and our notation, refer to our earlier discussion, More on the evidence of a single coin toss.)
Now let’s formalize our knowledge about these propositions.
Representing our initial knowledge
To represent our knowledge, we’ll use probability equations. First, given our prior knowledge K, we have no reason to believe the car is more likely to be hidden behind any of the three doors; therefore, we must consider each C=i proposition equally likely:
P(C=1|K) = P(C=2|K) = P(C=3|K).
Further, the car must be hidden behind one of the three doors:
P((C=1 ∨ C=2 ∨ C=3)|K) = 1,
and by the sum rule for mutually exclusive propositions:
P(C=1|K) + P(C=2|K) + P(C=3|K) = 1.
Therefore, solving for the individual probabilities, we must assign 1/3 to each:
P(C=1|K) = P(C=2|K) = P(C=3|K) = 1/3.
Adjusting our knowledge in light of you choosing a door
Now, let’s say you choose a door. We are free to call it door 1 because, so far, we haven’t assigned door numbers to physical doors. So we make the first assignment now: the door you chose is door 1.
Therefore, we now know Y=1. Let’s adjust our probabilities in light of this new evidence.
The quick way is to realize, since you don’t know anything about where the car is hidden, that knowing which door you chose cannot provide us with any new knowledge about which door the car is behind. Therefore, this new evidence shouldn’t change the corresponding probabilities:
P(C=1|Y=1∧K) = P(C=2|Y=1∧K) = P(C=3|Y=1∧K) = 1/3.
But, if we want to do the math, we see that the Bayesian probability adjustments give the same result. Recall the formula for updating a probability in light of new evidence:
(new plausibility) = (old plausibility) × (evidence adjustment)
For door 1, then,
P(C=1|Y=1∧K)
= (old plausibility) × (evidence adjustment)
= P(C=1|K) × [ P(Y=1|C=1∧K) / P(Y=1|K) ]
= P(C=1|K) × [ P(Y=1|K) / P(Y=1|K) ] { since you don’t know where the car is, your choice can’t depend on C=1 }
= P(C=1|K) × 1
= P(C=1|K)
= 1/3.
The calculations for the other two doors work out identically.
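As a quick check on the arithmetic, here is a minimal Python sketch of the update rule applied to your choice of door 1. The function name `update` is hypothetical, and the value 1/3 for P(Y=1|K) is an assumption made only for illustration (that you pick uniformly at random). Any choice rule that ignores the car's location gives the same evidence adjustment of 1.

```python
from fractions import Fraction

def update(prior, likelihood, evidence):
    """(new plausibility) = (old plausibility) x (likelihood / evidence)."""
    return prior * likelihood / evidence

prior = Fraction(1, 3)          # P(C=i | K)

# Illustrative assumption: you pick door 1 with probability 1/3 no matter
# where the car is, so the likelihood equals the evidence term.
p_y1_given_car = Fraction(1, 3)  # P(Y=1 | C=i, K)
p_y1 = Fraction(1, 3)            # P(Y=1 | K)

posterior = {door: update(prior, p_y1_given_car, p_y1) for door in (1, 2, 3)}
print(posterior)
```

Exact fractions are used instead of floats so the result can be compared directly with the 1/3 derived by hand.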
Adjusting our knowledge in light of Monty’s revelation
Now Monty opens a door to show you that it wasn’t hiding the car. We’ll call it door 2. Thus, we now know M=2. Let’s update our beliefs in light of this new evidence.
First, in light of Monty’s revelation, we know that the car is not behind door 2:
P(C=2|M=2∧Y=1∧K) = 0.
But, for the remaining doors, 1 and 3, let’s turn to the adjustment formula for guidance. Before doing any calculations, however, let’s write the formulas for both doors:
P(C=1|M=2∧Y=1∧K) = P(C=1|Y=1∧K) × [ P(M=2|C=1∧Y=1∧K) / P(M=2|Y=1∧K) ]
P(C=3|M=2∧Y=1∧K) = P(C=3|Y=1∧K) × [ P(M=2|C=3∧Y=1∧K) / P(M=2|Y=1∧K) ]
Note that the prior probabilities are equal: P(C=1|Y=1∧K) = P(C=3|Y=1∧K) = 1/3. Also, the evidence-adjustment factors have the same denominator, P(M=2|Y=1∧K). Thus, we can cancel these terms if we divide the second equation by the first, to arrive at the ratio of our adjusted probabilities. This ratio will tell us how strongly we should prefer door 3 to door 1.
After canceling those terms, the calculations are straightforward, in light of our knowledge of the game. We take particular advantage of the knowledge that, after you make your choice, Monty must reveal to you that one of the remaining doors does not hide the car. That is, he must open one of the other two doors, but if one of them is hiding the car, he cannot open that one.
P(C=3|M=2∧Y=1∧K) / P(C=1|M=2∧Y=1∧K)
= P(M=2|C=3∧Y=1∧K) / P(M=2|C=1∧Y=1∧K) { by the adjustment formula and canceling }
= 1 / P(M=2|C=1∧Y=1∧K) { given K, we know Monty must open 2 if C=3 ∧ Y=1 }
= 1 / (1/2) { given K, we know of no reason for Monty to prefer 2 to 3 if C=1 ∧ Y=1 }
= 2.
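Therefore, our final belief that the car is behind door 3 should be twice as strong as our belief that it is behind door 1. And since door 2 is ruled out, those two probabilities must sum to 1, which makes them 2/3 for door 3 and 1/3 for door 1. Switch!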
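The same update can be carried out mechanically. The sketch below is one possible Python rendering, not the only way to do it: the helper names `monty_likelihood` and `posterior_after_reveal` are hypothetical, and the likelihoods encode the assumptions stated in the derivation (Monty never opens your door or the car's door, and has no known preference when both other doors are available).

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def monty_likelihood(monty, car, you):
    """P(M=monty | C=car, Y=you, K) under the rules described above."""
    if monty == you or monty == car:
        return Fraction(0)       # Monty never opens your door or the car's
    if car == you:
        return Fraction(1, 2)    # two doors available, no known preference
    return Fraction(1)           # only one door he is allowed to open

def posterior_after_reveal(you, monty):
    """Return P(C=i | M=monty, Y=you, K) for each door i."""
    prior = Fraction(1, 3)       # P(C=i | Y=you, K)
    joint = {car: prior * monty_likelihood(monty, car, you) for car in DOORS}
    evidence = sum(joint.values())   # P(M=monty | Y=you, K)
    return {car: p / evidence for car, p in joint.items()}

post = posterior_after_reveal(you=1, monty=2)
print(post)                # posterior for doors 1, 2, 3
print(post[3] / post[1])   # odds of door 3 against door 1
```

Summing the joint terms computes the shared denominator P(M=2|Y=1∧K) explicitly, so the sketch yields the individual probabilities as well as their ratio.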

The most compact and compelling explanation I use is the following: modify the problem slightly so that you have to choose the stay or switch strategy before playing. Note that these two strategies are exhaustive: in any particular round, exactly one of them will find the car. So P(stay) + P(switch) = 1. If you choose the stay strategy, it doesn’t matter what Monty does, so we can omit that step and just immediately open the door you chose. Most people have no trouble seeing that you only have a 1/3 chance of getting the car. Therefore, by exhaustiveness, there’s a 2/3 chance that the switch strategy would have worked.
The way I most commonly see it explained, which also requires no intuition, is simply to list the six (modulo symmetry, 18 total) game trees and count how often each strategy works.
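The enumeration approach can be sketched in a few lines of Python. This version weights each branch by its probability rather than simply counting branches, because the branches are not all equally likely (Monty has two options when you picked the car and only one otherwise). The uniform car placement, uniform first choice, and uniform choice by Monty among his allowed doors are assumptions of the sketch, and `enumerate_games` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)

def enumerate_games():
    """Yield (probability, car, you, monty) for every branch of the game."""
    for car, you in product(DOORS, DOORS):
        allowed = [d for d in DOORS if d != you and d != car]
        for monty in allowed:
            # 1/9 for the (car, you) pair, split evenly over Monty's options
            yield Fraction(1, 9) / len(allowed), car, you, monty

stay = Fraction(0)
switch = Fraction(0)
for prob, car, you, monty in enumerate_games():
    other = next(d for d in DOORS if d not in (you, monty))
    if you == car:
        stay += prob      # staying wins on this branch
    if other == car:
        switch += prob    # switching wins on this branch

print(stay, switch)
```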
Shouldn’t theories, Bayesian or otherwise, be retained based on how well they sync with intuition? In practice, does switching actually swing the probabilities in a statistically significant way, even the way predicted by the theory?
@Derek Elkins: That’s an interesting way of looking at the problem: noting that, for the case when you choose to stay, the game where you must pre-announce your final choice is identical to the original game and, therefore, P(stay) in both games must be the same, 1/3. Knowing this, and knowing that P(stay) + P(¬stay) must equal 1 in both games, we can compute P(switch) = P(¬stay) for both games, as well: 1 – 1/3 = 2/3.
@The 27th Comrade: Why should we judge theories by their correspondence to intuition’s predictions? Isn’t intuition often wrong? And, when it is, shouldn’t a theory prove its merit by corresponding to reality instead?
On your second question, if you write a simulation, you’ll find that switching actually does improve your win frequency to the predicted 2/3.
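For anyone who wants to try it, here is a minimal simulation sketch in Python. The function names `play` and `win_rate`, the trial count, and the fixed seed are all illustrative choices; the sketch also assumes the car and the first pick are uniformly random and that Monty picks at random when he has two doors to choose from.

```python
import random

def play(switch, rng):
    """Play one round; return True if the final pick hides the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither your pick nor the car.
    monty = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != monty)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win frequency of a strategy over many rounds."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

With this many trials, the two estimates should land close to 1/3 and 2/3, though the exact digits depend on the seed.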
you have an infinite number of doors to choose from, you initially choose door #1, monty opens door #2, and then gives you the option to choose all of the remaining doors instead of door #1. do you switch?
If Derek’s take on this is sound (I can’t judge that myself), that’s the first time somebody has laid it out so plainly for me, and I literally thought ‘obviously’.
Some additions:
I don’t know what hal is trying to get at. Adding infinity complicates things, and either way, given the choice of opening many doors versus one, of course you’ll take it, but that doesn’t really have much impact on the Monty Hall problem. However, it does suggest an n-door Monty Hall problem. Let’s set n to 100 for concreteness. You choose a door, Monty opens 98 of the remaining doors, and then you can switch or stay with your original choice. By any of the three methods described for calculating the probability of each strategy, you’ll arrive at a 1% chance of getting the car with the stay strategy and a 99% chance with the switch strategy. In general, 1/n and 1-1/n.
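The n-door version is easy to check numerically. The following Python sketch assumes the variant described above, in which Monty opens all but one of the remaining doors and never reveals the car; `play_n` and `win_rate_n` are hypothetical names, and the trial count and seed are arbitrary.

```python
import random

def play_n(n, switch, rng):
    """One round of the n-door game where Monty opens n-2 doors."""
    car = rng.randrange(n)
    pick = rng.randrange(n)
    if pick == car:
        # Monty may leave any one of the other doors closed.
        left_closed = rng.choice([d for d in range(n) if d != pick])
    else:
        # Monty must leave the car's door closed.
        left_closed = car
    final = left_closed if switch else pick
    return final == car

def win_rate_n(n, switch, trials=50_000, seed=1):
    """Estimate the win frequency of a strategy in the n-door game."""
    rng = random.Random(seed)
    return sum(play_n(n, switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate_n(100, switch=False))
print("switch:", win_rate_n(100, switch=True))
```

For n = 100 the estimates should sit near 1/n = 0.01 for staying and 1 - 1/n = 0.99 for switching.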
Once people accept these probabilities, their next question is: what are we learning from Monty that makes us prefer the door he didn’t open so much? Well, let’s say we’re doing the (n+1)-door Monty Hall problem. If you get lucky and choose the door with the car, then Monty has to choose n-1 out of n doors. There are n choose (n-1) = n ways of doing this. However, if you didn’t choose the door with the car, then Monty has to choose n-1 doors out of n-1 doors. There is only (n-1) choose (n-1) = 1 way to do this, namely open every door but the one with the car. So it is much more likely (particularly for higher n) that Monty left a particular door closed because he had to rather than because he just happened to.
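The counting argument can be checked with Python's `math.comb`. This is only a sketch of the counts described above; `ways_to_open` is a hypothetical helper, and n follows the (n+1)-door convention used in the paragraph.

```python
from math import comb

def ways_to_open(n, picked_car):
    """Count the door sets Monty may open in the (n+1)-door game.

    After your pick, n doors remain and Monty opens n-1 of them,
    never revealing the car.
    """
    openable = n if picked_car else n - 1   # remaining doors without the car
    return comb(openable, n - 1)

for n in (2, 9, 99):
    lucky = ways_to_open(n, picked_car=True)
    unlucky = ways_to_open(n, picked_car=False)
    print(f"{n + 1} doors: {lucky} ways if you picked the car, {unlucky} otherwise")
```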
exactly. infinity (or n=100) makes it easy to explain why the stay strategy is wrong.
...as monty is known to open a door that does not have a car behind it, your initial choice will continue to have a 1/n chance of being correct. with a large n it becomes abundantly clear you should switch.
To clarify hal’s original message, you can modify the original (n-door) Monty Hall problem to an equivalent one where, instead of Monty immediately opening all but one of the remaining doors, he lets you choose which doors to open and you get the car if it’s behind any of the doors. The stay strategy corresponds to refusing this offer, and the switch strategy corresponds to accepting it. In the 100-door case, the switch strategy corresponds to getting 98 guesses, corresponding to the 98 doors Monty opens, plus 1, corresponding to the door you ultimately choose.
Intuition is correct, and a good arbiter of what is worthy, more often than we give it credit for. (It is intuition that we rely on when we say “run a simulation; and, if it agrees, the theory is sound”.) I am caught in wondering how to write such a simulation, as it seems I need some genuine counterfactuals, where /dev/random won’t suffice. :-D Perhaps I have to use humans; but God Does Not Play Dice™!
Extremely interesting historical perspective on the paradox! Thanks a lot…