Let’s take stock. We learned that measurement outcomes are represented by the subspaces of a vector space. Because subspaces correspond one-to-one to projectors, this is the same as saying that

- Measurement outcomes are represented by the projectors of a vector space.

We also learned that

- The outcomes of compatible elementary tests correspond to commuting projectors.

Finally, we decided that if the subspaces **A** and **B** represent two possible outcomes of a measurement with three possible outcomes, then p(**A**∪**B**) = p(**A**) + p(**B**). If we further take into account that

- two subspaces corresponding to different outcomes of the same measurement are orthogonal (operationally, this means that the probability of obtaining different outcomes in one and the same measurement is zero; formally, it means that every vector in one subspace is orthogonal to every vector in the other),
- projectors are said to be orthogonal if the corresponding subspaces are,
- the sum of two orthogonal projectors is another projector,

we arrive at the following:

- If **P**_{A} and **P**_{B} are orthogonal projectors, then the probability of the outcome represented by the sum of projectors **P**_{A} + **P**_{B} is the sum of the probabilities of the outcomes represented by **P**_{A} and **P**_{B}, respectively.
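The claim that two orthogonal projectors add up to another projector can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`outer`, `matmul`, `add`, `close`) and the example vectors are illustrative assumptions, not part of the text above. It also shows that the sum of two projectors that are not orthogonal fails to be a projector.

```python
import math

def outer(u, v):
    """Matrix |u><v| for vectors given as lists (hypothetical helper)."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise up to a small tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
a = [s, s, 0]      # unit vector spanning the subspace A (example choice)
b = [s, -s, 0]     # unit vector orthogonal to a
c = [1, 0, 0]      # unit vector that is NOT orthogonal to a

P_A, P_B, P_C = outer(a, a), outer(b, b), outer(c, c)
zero = [[0] * 3 for _ in range(3)]

# Orthogonal projectors: their products vanish and their sum squares to itself.
products_vanish = close(matmul(P_A, P_B), zero) and close(matmul(P_B, P_A), zero)
S = add(P_A, P_B)
sum_is_projector = close(matmul(S, S), S)

# Non-orthogonal projectors: the sum is not a projector.
T = add(P_A, P_C)
non_orthogonal_sum_is_projector = close(matmul(T, T), T)
```

Plain lists are used here only to keep the sketch free of dependencies; a numerical library would serve equally well.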

These three items (or “postulates”) allow us^{[1]} to prove Gleason’s theorem^{[2]}, which holds for vector spaces with at least three dimensions. (More recently the validity of Gleason’s theorem has been extended to include 2-dimensional vector spaces.) The theorem states that the probability of obtaining the outcome represented by the projector **P** is given by

(*Trace Rule*) p(**P**) = Tr(**WP**),

where **W** is a unique operator, known as the *density operator*, whose properties will be listed presently. To obtain the *trace* of an operator **X**, we apply **X** to the vectors **a**_{1}, **a**_{2}, **a**_{3},… of an orthonormal basis, take the inner product of each resulting vector with the same basis vector, and sum over the basis vectors:

Tr(**X**) = <**a**_{1}|**Xa**_{1}> + <**a**_{2}|**Xa**_{2}> + <**a**_{3}|**Xa**_{3}> + ···
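As a rough illustration of this recipe, the sketch below evaluates the sum for one operator in two different orthonormal bases and obtains the same number both times. The operator `X`, the two bases, and the helper names are assumptions chosen for the example.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(X, v):
    """Apply the operator X (nested lists) to the vector v."""
    return [sum(x * c for x, c in zip(row, v)) for row in X]

def trace(X, basis):
    """Tr(X) = sum over k of <a_k|X a_k> for an orthonormal basis a_1, a_2, ..."""
    return sum(inner(a, apply(X, a)) for a in basis)

# An arbitrary example operator on a 2-dimensional complex vector space.
X = [[2, 1j],
     [-1j, 3]]

s = 1 / math.sqrt(2)
standard_basis = [[1, 0], [0, 1]]
rotated_basis = [[s, s], [s, -s]]   # a second orthonormal basis

trace_standard = trace(X, standard_basis)
trace_rotated = trace(X, rotated_basis)
```

Both calls return the sum of the diagonal entries of `X`, which is why the trace can be spoken of without naming a basis.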

If **P** projects into a 1-dimensional subspace containing the unit vector **v**, the trace rule reduces to

p(**P**) = <**v**|**Wv**>.
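The reduction can be spelled out in one line. As a sketch, assume the orthonormal basis used for the trace is chosen so that its first vector is **v** itself (a choice made here for convenience; the trace does not depend on it). Then **P** leaves **a**_{1} = **v** unchanged and annihilates every other basis vector, so only one term of the sum survives:

```latex
\begin{align*}
\mathbf{P}\mathbf{a}_1 &= \mathbf{v}, \qquad \mathbf{P}\mathbf{a}_k = 0 \quad (k \ge 2), \\
\operatorname{Tr}(\mathbf{W}\mathbf{P})
  &= \sum_k \langle \mathbf{a}_k | \mathbf{W}\mathbf{P}\,\mathbf{a}_k \rangle
   = \langle \mathbf{a}_1 | \mathbf{W}\mathbf{v} \rangle
   = \langle \mathbf{v} | \mathbf{W}\mathbf{v} \rangle .
\end{align*}
```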

The properties of the density operator (and the reasons why it has them) are as follows:

- **W** is linear. This ensures that the third postulate is satisfied: p(**P**_{1} + **P**_{2}) = p(**P**_{1}) + p(**P**_{2}), where **P**_{1} and **P**_{2} are orthogonal.
- **W** is self-adjoint. This ensures that the probability <**v**|**Wv**> is a real number. (What could be the meaning of a complex probability?)
- **W** is positive. This ensures that <**v**|**Wv**> does not come out negative. (What could be the meaning of a negative probability?)
- The trace of **W** equals 1. This ensures that the probabilities of the possible outcomes of a measurement add up to 1. Together with the positivity of the density operator, it ensures that no probability comes out greater than 1. (What could be the meaning of a probability greater than 1?)
- **W**^{2} = **W** or **W**^{2} < **W**.
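These properties can be tested mechanically for a given matrix. The sketch below does so for 2×2 matrices only, under two stated assumptions: positivity is checked with the shortcut that a self-adjoint 2×2 matrix is positive exactly when its trace and determinant are both non-negative, and the inequality **W**^{2} < **W** is read as "**W** − **W**^{2} is positive and not zero". The function names and example matrices are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise up to a small tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_positive_2x2(M, tol=1e-12):
    """Positivity test, valid only for self-adjoint 2x2 matrices."""
    tr = complex(M[0][0] + M[1][1]).real
    det = complex(M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    return tr >= -tol and det >= -tol

def check_density_operator(W):
    """Report which of the listed properties a 2x2 matrix W satisfies."""
    W2 = matmul(W, W)
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(W, W2)]
    self_adjoint = close(W, dagger(W))
    if close(W2, W):
        kind = "pure"        # W^2 = W
    elif is_positive_2x2(diff):
        kind = "mixed"       # W^2 < W
    else:
        kind = "neither"
    return {
        "self_adjoint": self_adjoint,
        "positive": self_adjoint and is_positive_2x2(W),
        "unit_trace": abs(W[0][0] + W[1][1] - 1) < 1e-12,
        "kind": kind,
    }

W_pure = [[0.5, 0.5], [0.5, 0.5]]      # projector onto the line through (1, 1)
W_mixed = [[0.75, 0.0], [0.0, 0.25]]   # trace 1, but W^2 differs from W
W_bad = [[1.5, 0.0], [0.0, -0.5]]      # trace 1, but not positive

report_pure = check_density_operator(W_pure)
report_mixed = check_density_operator(W_mixed)
report_bad = check_density_operator(W_bad)
```

The third matrix shows why positivity is needed in addition to the unit trace: its diagonal entries sum to 1, yet one of them would be a negative "probability".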

The equality of an operator with its own square is characteristic of a projector. Since the trace of **W** equals 1, and since the trace of a projector equals the dimension of the subspace into which it projects, the equality **W**^{2} = **W** tells us that **W** projects into a 1-dimensional subspace. Since we began by upgrading from a point in a phase space to a line (or 1-dimensional subspace) in a vector space, we are not surprised by this result. If that subspace contains the unit vector **w**, the trace rule reduces to

p(**P**) = <**w**|**Pw**>,

and if **P** projects into a 1-dimensional subspace containing the unit vector **v**, it further reduces to

(*Born’s Rule*) p(**P**) = |<**v**|**w**>|^{2}.
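The chain of reductions from the trace rule to Born's rule can be verified on a small example. In this sketch the vectors `w` and `v` are arbitrary unit vectors chosen for illustration, and the helpers are hypothetical; the three quantities computed at the end correspond to Tr(**WP**), <**w**|**Pw**> and |<**v**|**w**>|^{2}.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def outer(u, v):
    """Matrix |u><v|."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def apply(X, v):
    """Apply the operator X to the vector v."""
    return [sum(x * c for x, c in zip(row, v)) for row in X]

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of X."""
    return sum(X[i][i] for i in range(len(X)))

w = [math.sqrt(3) / 2, 0.5j]   # state vector (unit length, example choice)
v = [1, 0]                     # unit vector defining the outcome

W = outer(w, w)                # density operator of the pure state
P = outer(v, v)                # projector onto the line containing v

p_trace_rule = trace(matmul(W, P)).real    # Tr(WP)
p_state_vector = inner(w, apply(P, w)).real  # <w|Pw>
p_born_rule = abs(inner(v, w)) ** 2        # |<v|w>|^2
```

All three expressions agree up to rounding, as the derivation above requires.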

If the density operator satisfies the equality **W**^{2} = **W**, it is known as (or said to describe) a *pure state*, and the unit vector **w** is the so-called *state vector*. (Although **W** is uniquely determined by **w**, the converse is not true. If **w**_{1} and **w**_{2} differ only by an overall phase factor, they determine the same density operator and, hence, yield the same probabilities. They are therefore physically equivalent.)
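The parenthetical remark about phases can be illustrated as follows. In this sketch, `theta` is an arbitrary angle and the vectors are example choices; multiplying the whole state vector by the phase factor leaves the density operator unchanged. For contrast, and as an addition not discussed in the text, the sketch also shows that changing the phase of only one component does produce a different density operator.

```python
import cmath
import math

def outer(u, v):
    """Matrix |u><v|."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def close(A, B, tol=1e-12):
    """True if two matrices agree entrywise up to a small tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

theta = 0.7                          # arbitrary phase angle (assumption)
phase = cmath.exp(1j * theta)

w1 = [math.sqrt(3) / 2, 0.5j]        # example state vector
w2 = [phase * c for c in w1]         # same vector times an overall phase
w3 = [w1[0], phase * w1[1]]          # phase applied to one component only

W1, W2, W3 = outer(w1, w1), outer(w2, w2), outer(w3, w3)

overall_phase_is_invisible = close(W1, W2)
relative_phase_is_visible = not close(W1, W3)
```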

If the density operator satisfies the inequality **W**^{2} < **W**, it is known as (or said to describe) a *mixture* or *mixed state*. A pure state defines probability distributions. It is a machine with inputs and outputs: insert the possible outcomes of the measurement you are going to make, insert the time of the measurement, and get the probabilities with which those outcomes are obtained. A mixed state defines a probability distribution over probability distributions. It adds a second layer of uncertainty to the uncertainty inherent in a pure state.
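One common way to build a mixed state, assumed here for illustration, is as a weighted sum of pure-state density operators, with weights that are themselves probabilities. The sketch below uses hypothetical weights `q1`, `q2` and example vectors, ignores the time dependence mentioned above, and checks two things: the probability assigned by the mixture is the weighted average of the probabilities assigned by the pure states, and the trace of **W**^{2} falls below 1.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def outer(u, v):
    """Matrix |u><v|."""
    return [[a * complex(b).conjugate() for b in v] for a in u]

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of X."""
    return sum(X[i][i] for i in range(len(X)))

# Two pure states and the weights with which they are mixed (example values).
w1, w2 = [1, 0], [0, 1]
q1, q2 = 0.75, 0.25

W1, W2 = outer(w1, w1), outer(w2, w2)
W = [[q1 * a + q2 * b for a, b in zip(ra, rb)] for ra, rb in zip(W1, W2)]

# Outcome represented by the projector onto the line containing v.
v = [math.sqrt(3) / 2, 0.5]
P = outer(v, v)

p1 = abs(inner(v, w1)) ** 2            # Born's rule for the first pure state
p2 = abs(inner(v, w2)) ** 2            # Born's rule for the second pure state
p_mixed = trace(matmul(W, P)).real     # trace rule for the mixture

weighted_average = q1 * p1 + q2 * p2
trace_of_W_squared = trace(matmul(W, W)).real
```

Here `p_mixed` coincides with `weighted_average`, which is the sense in which a mixture is a probability distribution over probability distributions.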

There are situations in which this additional uncertainty is subjective in the same sense in which probability distributions over a classical phase space are subjective: the uncertainty arises from a lack of knowledge of relevant facts. But there are also situations in which the additional uncertainty is due to a lack of relevant facts. In these situations it represents an additional objective fuzziness, over and above that associated with the individual algorithms. Later we shall come across examples of this kind of uncertainty.

1. Peres, A. (1995). *Quantum Theory: Concepts and Methods*, Kluwer, p. 190.

2. Gleason, A.M. (1957). Measures on the closed subspaces of a Hilbert space, *Journal of Mathematics and Mechanics* 6, 885–894.