The theoretical formalism of contemporary physics is a probability calculus. The probability algorithms it places at our disposal — state vectors, wave functions, density matrices, statistical operators, and what have you — all serve the same purpose, which is to calculate the probabilities of possible measurement outcomes on the basis of actual measurement outcomes. That’s reason enough to put together what we already know and what we need to know about probabilities.
Probability is a measure of likelihood ranging from 0 to 1. If an event has a probability equal to 1, it is certain that it will happen; if it has a probability equal to 0, it is certain that it will not happen; and if it has a probability equal to 1/2, then it is as likely as not that it will happen.
Tossing a fair coin yields heads with probability 1/2. Casting a fair die yields any given natural number between 1 and 6 with probability 1/6. These are examples of the principle of indifference, which states: if there are n mutually exclusive and jointly exhaustive possibilities (possible events), and if we have no reason to consider any one of them more likely than any other, then each possibility should be assigned a probability equal to 1/n.
(Saying that events are mutually exclusive is the same as saying that at most one of them happens. Saying that events are jointly exhaustive is the same as saying that at least one of them happens. Since one of them is sure to happen, the n probabilities must add up to 1. Since the principle requires that the individual probabilities be equal, each probability must be 1/n.)
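The principle of indifference can be sketched directly in code: n mutually exclusive, jointly exhaustive possibilities each receive probability 1/n, so the assignments necessarily sum to 1. (A minimal illustration, using exact fractions rather than floating-point numbers; the function name `indifference` is of course just a label chosen here.)

```python
from fractions import Fraction

def indifference(possibilities):
    """Assign probability 1/n to each of n mutually exclusive,
    jointly exhaustive possibilities."""
    n = len(possibilities)
    return {event: Fraction(1, n) for event in possibilities}

# A fair die: six possibilities, each with probability 1/6.
die = indifference([1, 2, 3, 4, 5, 6])
assert die[1] == Fraction(1, 6)
assert sum(die.values()) == 1  # the probabilities add up to 1
```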
There are two kinds of situations in which we may have no reason to consider one possibility more likely than another. In situations of the first kind, there are objective matters of fact that would make it certain, if we knew them, that a particular event will happen, but we don’t know any of the relevant matters of fact. The probabilities we assign in this case, or whenever we know some but not all relevant facts, are in an obvious sense subjective. They are ignorance probabilities. They have everything to do with our (lack of) knowledge of relevant facts, but nothing to do with the existence of relevant facts. Therefore they are also known as epistemic probabilities.
In situations of the second kind, there are no objective matters of fact that would make it certain that a particular event will happen. There may not even be any objective matter of fact that would make it more likely that one event will occur rather than another. There isn’t any relevant fact that we are ignorant of. The probabilities we assign in this case are neither subjective nor epistemic. They have every right to be considered objective. Quantum-mechanical probabilities are essentially of this kind.
Probabilities and relative frequencies
Until the advent of quantum mechanics all probabilities were thought to be subjective. This had two unfortunate consequences. The first is that probabilities came to be thought of as something intrinsically subjective. The second is that something that was not a probability at all, namely a relative frequency, came to be referred to as an “objective probability.”
Relative frequencies are useful in that they allow us to measure the likelihood of possible events, at least approximately, provided that trials can be repeated under conditions that are identical in all relevant respects. We obviously cannot measure the likelihood of heads by tossing a single coin. But since we can toss a coin any number of times, we can count the number NH of heads and the number NT of tails obtained in N tosses and calculate the fraction NH/N of heads and the fraction NT/N of tails. And we can expect the magnitude of the difference NH − NT to increase significantly more slowly than the sum N = NH + NT, so that the magnitude of the difference NH/N − NT/N approaches 0 as N “goes to infinity”:
|NH/N − NT/N| → 0 as N → ∞.
We can therefore expect the relative frequencies NH/N and NT/N to approach the respective probabilities pH and pT in this “limit”:
NH/N → pH and NT/N → pT as N → ∞.
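This expected behaviour is easy to illustrate numerically. The following sketch (a simulation, not a proof) tosses a simulated fair coin and shows the relative frequency of heads settling near 1/2 as the number of tosses grows:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def heads_frequency(n_tosses):
    """Toss a fair coin n_tosses times; return the relative frequency of heads."""
    n_heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return n_heads / n_tosses

# The deviation of NH/N from 1/2 shrinks roughly like 1/sqrt(N).
for n in (100, 10_000, 1_000_000):
    print(n, heads_frequency(n))
```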
Adding and multiplying probabilities
Suppose you roll a die, and suppose you win if you throw either a 1 or a 6 (no matter which). Since there are six equiprobable outcomes, two of which make you win, your chances of winning are 2/6 = 1/3. In this example it is appropriate to add probabilities:
p(1 v 6) = p(1) + p(6).
Sum rule. Given n mutually exclusive and jointly exhaustive events (such as the possible outcomes of a measurement), and given m of these n events, the probability that one of the m events takes place (no matter which) is the sum of the individual probabilities of these m events. (One nice thing about relative frequencies is that they make a rule such as this virtually self-evident.)
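The sum rule can be checked on the die example above: out of six equiprobable, mutually exclusive outcomes, the probability of throwing a 1 or a 6 is the sum of the two individual probabilities. (A sketch with exact fractions; `prob_any` is a name chosen here for illustration.)

```python
from fractions import Fraction

outcomes = {k: Fraction(1, 6) for k in range(1, 7)}  # a fair die

def prob_any(events):
    """Sum rule: probability that one of the listed mutually
    exclusive events occurs, no matter which."""
    return sum(outcomes[e] for e in events)

assert prob_any([1, 6]) == Fraction(2, 6) == Fraction(1, 3)
assert prob_any(list(outcomes)) == 1  # jointly exhaustive events
```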
Suppose now that you roll two dice. And suppose that you win if your total equals 12. Since there are now 6×6 equiprobable outcomes, only one of which makes you win, your chances of winning are 1/36. In this example it is appropriate to multiply probabilities:
p(6 & 6) = p(6) × p(6).
The general rule is:
Product rule. The joint probability p(e1&…&em) of m independent events e1,…,em (that is, the probability with which all of them happen) is the product p(e1)×…×p(em) of the probabilities of the individual events.
It must be stressed that the product rule only applies to independent events. Saying that two events e1 and e2 are independent is the same as saying that the probability of e1 is independent of whether or not e2 happens, and vice versa.
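Brute-force enumeration of all 36 ordered outcomes confirms the product rule for the two-dice example, since the two rolls are independent. (A sketch; the helper `prob` simply counts favourable outcomes among the 36 equiprobable ones.)

```python
from fractions import Fraction
from itertools import product

# All 36 equiprobable ordered outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(predicate):
    """Probability that an outcome satisfies the predicate."""
    hits = sum(1 for o in outcomes if predicate(o))
    return Fraction(hits, len(outcomes))

p_first_6 = prob(lambda o: o[0] == 6)   # 1/6
p_second_6 = prob(lambda o: o[1] == 6)  # 1/6
p_both_6 = prob(lambda o: o == (6, 6))  # 1/36

# The two rolls are independent, so the joint probability factorizes.
assert p_both_6 == p_first_6 * p_second_6 == Fraction(1, 36)
```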
If e1 and e2 are not independent, one has to distinguish between marginal probabilities, which are assigned to either event regardless of whether the other event happens, and conditional probabilities, which are assigned to either event depending on whether the other event happens. In that case their joint probability is given by
p(e1 & e2) = p(e1|e2) p(e2) = p(e2|e1) p(e1),
where p(e1) and p(e2) are marginal probabilities, while p(e1|e2) is the probability of e1 conditional on the occurrence of e2, and p(e2|e1) is the probability of e2 conditional on the occurrence of e1.
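This identity can be verified by enumeration for a pair of dependent events. In the sketch below (the two events are chosen here purely for illustration), e1 is “the total of two dice is at least 10” and e2 is “the first die shows 6”; conditional probabilities are computed by counting within the outcomes where the conditioning event holds.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice, 36 outcomes

def prob(pred):
    """Marginal probability of an event over the 36 equiprobable outcomes."""
    return Fraction(sum(1 for o in outcomes if pred(o)), len(outcomes))

def cond_prob(pred_a, pred_b):
    """p(a|b): probability of a among the outcomes where b holds."""
    b_outcomes = [o for o in outcomes if pred_b(o)]
    return Fraction(sum(1 for o in b_outcomes if pred_a(o)), len(b_outcomes))

e1 = lambda o: o[0] + o[1] >= 10  # total is at least 10
e2 = lambda o: o[0] == 6          # first die shows 6

# p(e1 & e2) = p(e1|e2) p(e2) = p(e2|e1) p(e1), as in the formula above.
joint = prob(lambda o: e1(o) and e2(o))
assert joint == cond_prob(e1, e2) * prob(e2)
assert joint == cond_prob(e2, e1) * prob(e1)
```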
e1 and e2 are said to be correlated if (and only if)
p(e1|e2) ≠ p(e1|ē2),
where ē2 is the event that takes place if (and only if) e2 does not take place. Saying that the two events are independent is thus the same as saying that they are uncorrelated, for
p(e1 & e2) = p(e1) × p(e2)
holds if and only if
p(e1|e2) = p(e1|ē2)
holds, in which case both sides equal the marginal probability p(e1).
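The correlation criterion can likewise be checked by counting. In the sketch below (events chosen for illustration), the dependent pair from before is correlated, while the two rolls of the dice themselves are uncorrelated, with both conditionals equal to the marginal 1/6.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice

def cond_prob(pred_a, pred_b):
    """p(a|b) by counting within the outcomes where b holds."""
    b_outcomes = [o for o in outcomes if pred_b(o)]
    return Fraction(sum(1 for o in b_outcomes if pred_a(o)), len(b_outcomes))

e1 = lambda o: o[0] + o[1] >= 10   # total is at least 10
e2 = lambda o: o[0] == 6           # first die shows 6
not_e2 = lambda o: not e2(o)       # ē2: first die does not show 6

# Correlated: conditioning on e2 versus ē2 changes the probability of e1.
assert cond_prob(e1, e2) != cond_prob(e1, not_e2)

# Uncorrelated: the two rolls are independent of each other.
f1 = lambda o: o[0] == 6           # first die shows 6
f2 = lambda o: o[1] == 6           # second die shows 6
not_f2 = lambda o: not f2(o)
assert cond_prob(f1, f2) == cond_prob(f1, not_f2) == Fraction(1, 6)
```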