2.1. States and wave functions
Let us remind ourselves of the quantum mechanics of a single
particle on the real line \(x\in\mathbb{R}\). In the formulation of
quantum mechanics using wave functions, this system is described by a
complex-valued wave function of space and time, \(\psi(x, t)\). The
probability of finding a particle in a region \(x\in[a,b]\) at some moment
in time \(t\) is given by
\begin{equation}
P(a,b; t) = \int_{a}^{b} \big|\psi(x, t)\big|^2\, {\rm d}x\,.
\end{equation}
This probability is clearly \(\geq 0\) because of the properties of the
norm of a complex number. We normalise the wave function such that
\begin{equation}
\label{e:normalisation}
P(-\infty, +\infty; t) = 1 \,,
\end{equation}
that is to say, the probability of finding the particle somewhere is
one.
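For concreteness, here is a minimal numerical sketch of these formulas in Python (assuming NumPy is available); the Gaussian wave packet and the finite grid standing in for the real line are arbitrary illustrative choices.
\begin{verbatim}
import numpy as np

# Discretise (a finite portion of) the real line.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]

# A normalised Gaussian wave packet at some fixed time t (illustrative choice).
sigma = 1.5
psi = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4 * sigma**2))

# P(a, b; t) = integral over [a, b] of |psi(x, t)|^2, here a simple Riemann sum.
def probability(a, b):
    mask = (x >= a) & (x <= b)
    return np.sum(np.abs(psi[mask]) ** 2) * dx

print(probability(-np.inf, np.inf))   # ~ 1: the normalisation condition
print(probability(-sigma, sigma))     # ~ 0.68: particle within one sigma of the origin
\end{verbatim}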
The time-evolution of the wave function is given by the
Schrödinger equation,
\begin{equation}
i\hbar \frac{\partial\psi(x,t)}{\partial t}
= -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t)
+ V(x) \psi(x,t)\,.
\end{equation}
The right-hand side equals the Hamiltonian operator acting on the wave
function, \(\hat{H} \psi(x,t) = \hat{K} \psi(x,t) + \hat{V} \psi(x,t)\),
where \(\hat{K},\hat{V}\) are the kinetic and potential energy operators respectively. If we
impose \eqref{e:normalisation} for a single moment in time \(t=t_0\),
then the Schrödinger equation guarantees that it will remain valid for
arbitrary other times.
The function \(\psi(x)\) is said to describe the
state of the
system. Rather than knowing exactly where a particle is, as we do
classically, we only know the probability density \(P(x)\) (or more
precisely, the amplitude \(\psi(x)\)) of finding the particle somewhere
on the real line. Because the Schrödinger equation is
linear,
any linear superposition of wave functions is also a solution. So you
can have a wave function strongly peaked on earth, and another one
strongly peaked at the moon, and the linear combination is still
a valid quantum mechanical wave function,
\begin{equation}
\psi(x) = \psi_{\text{earth}}(x) + \psi_{\text{moon}}(x)\,.
\end{equation}
In quantum mechanics, only measurement will force the system into one
of the two classical configurations.
2.2. The Dirac notation (bra-ket)
Because the space of wave functions is linear (wave functions can be
superposed) and because we have a norm, we can view each wave function
as a
vector in a (complex) vector space:
Hilbert
space. Typically, this vector space will be infinite-dimensional,
because there is an infinite, or even a continuum, of possible
classical configurations. But this is not necessary, and in fact we
will consider in this module mainly systems for which we have only a
finite (but possibly large) number of classical configurations.
In Dirac notation, the ket \(|\psi\rangle \in {\cal H}\)
corresponds to the physical state. The dual bra vectors
\(\langle\phi|\) live in \({\cal H}^*\) and together they can form the
inner product \(\langle\phi|\psi\rangle\).
The Dirac notation
consists in writing \(|\psi\rangle\) for a vector in Hilbert space
corresponding to the wave function \(\psi(x)\). And instead of calling
it a vector, we call it a
ket, for reasons that will become
clear shortly. The fact that we no longer write the \(x\) label is
significant. Compare the situation in linear algebra. There, we can
have a physical, arrow-like object which we call a vector (let's say
\(\vec{v}\)). To write down concretely which vector we mean, we choose a
basis of unit vectors, and then write down the
components of
the vector on that basis, e.g. \(\vec{v} = (2, 3)\). But changing the
basis does not change the vector itself, only its components. With
wave functions a similar thing happens. The representation \(\psi(x)\)
refers to the “basis” of position eigenstates labelled by the
position \(x\). But it is perfectly possible to write down the wave
function in a different basis, for instance the basis of momentum
eigenstates.
So we use \(|\psi\rangle\) from now on, as a more abstract way of expressing
the vector in Hilbert space. For any two such vectors, we have a
positive definite inner product for the corresponding wave functions,
\begin{equation}
\text{inner product}\big(\phi, \psi\big)
= \int_{-\infty}^{\infty} \phi^*(x,t) \psi(x,t)\, {\rm d}x
=: \langle \phi | \psi \rangle\,.
\end{equation}
On the right-hand side we have introduced the inner product
in the Dirac notation, \(\langle \phi | \psi\rangle\). It requires that
we have access to the
dual vector \(\langle \phi|\), which as you
can see from the explicit integral representation, is simply related
to the complex conjugate of the wave function \(\phi\). This new object
\(\langle \phi|\) is called a
bra, so that the inner product
(or bracket) between two states reads bra-ket.
In this module we will almost always consider finite-dimensional
Hilbert spaces. That means that we can define the Hilbert space by
specifying a finite set of basis states. Of course, there is not a
unique choice of basis states, and it will often be useful to consider
different choices (related by changes of basis matrices). Commonly we
will consider orthonormal bases, i.e. ones where all the basis states
have norm \(1\) and are mutually orthogonal.
To better understand the bra symbols we need to introduce the concept
of
dual of a vector space \(V\). Formally, the dual \(V^*\) of a
vector space \(V\) is the vector space of linear functionals from \(V\)
into \(\mathbb{C}\). Or in formulas,
\begin{multline}
V^* = \{\Phi \,:\,V\to \mathbb{C}\,\,\mbox{s.t.} \,\, \Phi(a
{\mathbf{z}} + b{\mathbf{w}}) = a \,\Phi({\mathbf{z}}) +
b\,\Phi({\mathbf{w}}), \\[1ex] \,\forall
\,a\,,b\in\mathbb{C}\,,\,\mbox{and}\,\forall\,
{\mathbf{z}}\,,{\mathbf{w}}\in V\}\,.
\end{multline}
Show that this is indeed a vector space over the complex numbers and
show that if \(V\) has dimension \(n\) so does \(V^*\).
But this dual space is nothing mysterious. If \(V=\mathbb{C}^n\),
vectors in the standard basis are simply \(n\)-dimensional column
vectors, i.e. \(n\times 1\) matrices if you wish. Similarly, you can
think of \(V^*\) as the vector space of \(1\times n\) matrices, i.e. row
vectors. Furthermore, whenever \(V\) is endowed with a complex inner
product \(\langle\cdot,\cdot\rangle\), precisely as our Hilbert space of
states \(\mathcal{H}\), to each vector \(\mathbf{z}\in V\) we can
associate an element \(\Phi_\mathbf{z} \in V^*\), schematically
\(\Phi_\mathbf{z} = \langle \mathbf{z},\cdot\rangle\), defined by
\[
\Phi_\mathbf{z} ( \mathbf{w}) \doteq \langle \mathbf{z},\mathbf{w}\rangle
\]
for all \(\mathbf{w} \in V\). It is easy to check (just write down
what it means) that \(\Phi_\mathbf{z}(\cdot)\) just defined is indeed a
linear functional from \(V\) into \(\mathbb{C}\), hence \(\Phi_\mathbf{z}
\in V^*\). This means that as soon as our vector space \(V\) has a
complex inner product we have immediately an isomorphism between \(V\)
and \(V^*\)
\[
V\ni \mathbf{z} \mapsto \Phi_{\mathbf{z}} (\cdot) = \langle \mathbf{z},\cdot\rangle \,\in V^*\,.
\]
A ket \(|\psi\rangle\) can be thought of as a column-vector, and
then the bra \(\langle \phi|\) is a row vector, so that their inner
product is a scalar.
In quantum mechanics we have vectors, i.e. elements of \(\mathcal{H}\),
denoted by ket vectors \(\ket{\psi}\), and thanks to the complex inner
product for each vector \(\ket{\psi}\) we can consider the corresponding
element in the dual space \(\bra{\psi} \in \mathcal{H}^*\). A bra vector
\(\bra{\psi}\) applied to a ket vector \(\ket{\phi}\) gives precisely the
inner product \(\ip{\psi}{\phi}\).
Note in particular that it does
NOT make any sense
to consider \(\ket{\psi} + \bra{\phi}\) since one is a column vector
that cannot be added to the other object which is a row vector!
Suppose our quantum mechanical system is described by a
three-dimensional Hilbert space \(\mathcal{H}\) written in terms of the
orthonormal basis \(\ket{0},\ket{1},\ket{2}\), i.e. \(\mathcal{H} =
\mbox{span}\{\ket{0},\ket{1},\ket{2}\}\). We can represent any vector
in \(\mathcal{H}\) as a three dimensional column vector using the
standard basis
\begin{align*}
&\ket{0}\mapsto \left(\begin{matrix}1 \\ 0 \\0
\end{matrix}\right)\,, \ket{1}\mapsto \left(\begin{matrix}0 \\ 1 \\0
\end{matrix}\right)\,, \ket{2}\mapsto \left(\begin{matrix}0 \\ 0 \\1
\end{matrix}\right)\,,\\[1ex]
&\ket{\psi} = a \ket{0}+b\ket{1}+c\ket{2} \mapsto \left(\begin{matrix}a \\ b \\c
\end{matrix}\right)\,,\\
&\bra{\psi} = a^* \bra{0}+b^* \bra{1}+c^* \bra{2} \\
&\quad \mapsto \left(\begin{matrix}a^* & b^* &c^*
\end{matrix}\right) = \left(\begin{matrix}a \\ b \\c
\end{matrix}\right)^\dagger\,.
\end{align*}
To understand the second line, let us remember that we are told that the
three basis vectors are orthonormal, hence we know that the matrix that
represents the inner product in this basis is given by the identity
matrix. At this point it is very simple to compute the inner product
between two states, say the inner product of \(\ket{\phi} = d
\ket{0}+e\ket{1}+f\ket{2}\) with \(\ket{\psi} = a
\ket{0}+b\ket{1}+c\ket{2}\)
\[
\ip{\phi}{\psi} = \left(\begin{matrix} d^* & e^* & f^* \end{matrix}\right)
\left(\begin{matrix}a\\b\\c\end{matrix}\right) = \left(\begin{matrix}d\\e\\f\end{matrix}\right)^\dagger\left(\begin{matrix}a\\b\\c\end{matrix}\right)\,,
\]
where we recall that \(A^\dagger = (A^*)^T\) denotes the complex conjugate transpose.
In particular we also see that if the ket \(\ket{\psi} = a
\ket{0}+b\ket{1}+c\ket{2} \) is represented by the column vector
\(\left( \begin{matrix} a\\b\\c\end{matrix}\right)\), then the bra
\(\bra{\psi}\) can really be thought of as \(\bra{\psi} =
(\ket{\psi})^\dagger\) and represented by the row vector
\(\left( \begin{matrix} a\\b\\c\end{matrix}\right)^\dagger = (a^*,b^*,c^*)\).
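The same manipulations are easy to check numerically; a minimal sketch in Python with NumPy (the coefficients below are arbitrary illustrative choices):
\begin{verbatim}
import numpy as np

# |psi> = a|0> + b|1> + c|2> and |phi> = d|0> + e|1> + f|2> as column vectors
# in the orthonormal basis |0>, |1>, |2>.
a, b, c = 1 + 2j, 0.5, -1j
d, e, f = 2j, 1.0, 3.0
psi = np.array([a, b, c])
phi = np.array([d, e, f])

# The bra <phi| is the conjugate transpose of the ket, i.e. a row vector.
bra_phi = phi.conj()

# <phi|psi> = (row vector) times (column vector).
print(bra_phi @ psi)
print(np.vdot(phi, psi))   # vdot conjugates its first argument, giving the same number
\end{verbatim}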
2.3. Hilbert space formalities
You can now write this all up in formal language if you want. A
quantum mechanical system is described by a ket \(|\psi\rangle\) in
Hilbert space \(\mathcal{H}\). A Hilbert space is a (complex) vector
space with Hermitian inner product. This means that for any
\(\ket{\psi} \in \mathcal{H}\) and \(\ket{\phi} \in \mathcal{H}\):
For any complex numbers \(a\) and \(b\), \((a\ket{\psi} + b\ket{\phi}) \in \mathcal{H}\)
(linear combinations of vectors = quantum superposition).
The inner product of \(\ket{\psi}\) with \(\ket{\phi}\) is a complex number
denoted
\[\ip{\psi}{\phi} \in\mathbb{C}.\]
The inner product is Hermitian,
\[\ip{\psi}{\phi} = \big(\ip{\phi}{\psi}\big)^*.\]
The inner product is linear in the second state (and so anti-linear in
the first state). I.e. if \(\ket{\phi} = c_1\ket{\phi_1} + c_2\ket{\phi_2}\) then
\begin{eqnarray*}
\ip{\psi}{\phi} & = & c_1\ip{\psi}{\phi_1} + c_2\ip{\psi}{\phi_2} \,,\\[1ex]
\ip{\phi}{\psi} & = & c_1^*\ip{\phi_1}{\psi} + c_2^*\ip{\phi_2}{\psi} .
\end{eqnarray*}
In other words, the inner product is linear in the
second
factor, and anti-linear in the
first; it is
sesquilinear.
Note that in your linear algebra module you might have seen a slightly
different definition for a Hermitian inner product which is linear in
the
first term! This is just a convention, and in this module
we will keep the inner product linear in the second
term. Combining linearity in the second term with Hermiticity tells us
that the inner product is not linear but anti-linear (conjugate-linear) in the first term.
In particular the inner product of a state with itself is real, \(\ip{\psi}{\psi} \in
\mathbb{R}\). However, we also have a physical state condition
(and we will only consider such states in this module):
\(\ip{\psi}{\psi} \ge 0\) and \(\ip{\psi}{\psi} = 0 \iff \ket{\psi} =
0\). We will use the notation \(\norm{\psi} \equiv
\sqrt{\ip{\psi}{\psi}}\) for the norm of \(\ket{\psi}\).
Finally, states which differ only by a
normalisation factor are physically equivalent, i.e.
\[\ket{\psi} \sim c \ket{\psi}\]
for any non-zero \(c \in
\mathbb{C}\). There are two ways to work with this equivalence
relation. One is to ignore the normalisation but then include
appropriate factors of the norms of states in formulae. The other,
which we will usually assume, is to always work with
normalised
states, i.e. unless indicated otherwise a state \(\ket{\psi}\) will
be assumed to have \(\norm{\psi} = 1\). If you have a state which is
not normalised, just divide it by its norm to get a normalised
state.
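In practice normalising a state is a one-line operation; for instance, in Python with NumPy (the state below is an arbitrary illustrative choice):
\begin{verbatim}
import numpy as np

psi = np.array([1 + 1j, 2.0, -3j])        # a generic, unnormalised state
norm = np.sqrt(np.vdot(psi, psi).real)    # ||psi|| = sqrt(<psi|psi>)
psi = psi / norm                          # the normalised representative

print(np.vdot(psi, psi).real)             # ~ 1
\end{verbatim}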
Physical states are
rays in Hilbert space, as the
normalisation is irrelevant (and often tuned so the norm of the
vector is one).
Note that normalisation does not fix a unique representative of
the equivalence class of states since multiplying by a phase \(\exp(i
\theta)\) for any real phase \(\theta\) does not change the norm,
i.e. \(\norm{\psi} = 1\) if and only if \(\vert\vert e^{i\theta}
\ket{\psi} \vert \vert = 1\).
Sometimes (pure) quantum mechanical states are called rays in
the Hilbert space because of the equivalence \(\ket{\psi} \sim c
\ket{\psi}\) with \(c\in\mathbb{C}\) non-zero.
NOTE: Obviously the zero state cannot be normalised, but that
is OK as it does not describe the state of a physical system, and
there is no physical process transforming a non-zero state into the zero
state. Do not confuse the zero state (meaning the
unique vector with norm zero) with a state labelled by
zero, i.e. \(\ket{0} \ne 0\)! The norm of \(\ket{0}\) is always non-zero,
\(|| \ket{0}|| =1\), while the norm of the \(0\) vector is always
vanishing, \(||0||=0\).
2.4. Operators
In quantum mechanics we work with linear operators acting on the
states in a Hilbert space. Such operators are used to describe the
time-evolution of the system and to describe measurements. If
\(\hat{A}\) is a linear operator then acting on linear combinations of
states we have
\[ \hat{A} (a\ket{\psi} + b \ket{\phi}) = a(\hat{A}\ket{\psi}) + b(\hat{A}\ket{\phi})\,, \]
i.e. it is linear.
Also, products and linear combinations of linear operators are again linear
operators.
The
adjoint (also commonly called the Hermitian conjugate) of \(\hat{A}\) is denoted \(\hat{A}^{\dagger}\) and defined by
\[ \ipop{\psi}{\hat{A}^{\dagger}}{\phi} = \ipop{\phi}{\hat{A}}{\psi}^* \]
for all states \(\ket{\psi}\) and \(\ket{\phi}\).
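This defining property is easy to test numerically once operators are represented by matrices (anticipating Section 2.5): the adjoint is the conjugate transpose. A minimal sketch in Python with NumPy, where the random operator and states are purely illustrative choices:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# A generic linear operator on a 3-dimensional Hilbert space, as a complex matrix.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A_dag = A.conj().T                      # matrix representing the adjoint

psi = rng.normal(size=3) + 1j * rng.normal(size=3)
phi = rng.normal(size=3) + 1j * rng.normal(size=3)

# <psi| A^dagger |phi>  versus  ( <phi| A |psi> )^*
lhs = np.vdot(psi, A_dag @ phi)
rhs = np.conj(np.vdot(phi, A @ psi))
print(np.allclose(lhs, rhs))            # True
\end{verbatim}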
In quantum mechanics, we usually focus on two types of linear operators:
- self-adjoint operators
(or Hermitian) meaning \(\hat{H}^{\dagger} = \hat{H}\). Self-adjoint operators correspond
to observables, i.e. quantities which can be measured. E.g. \(\hat{X}\) position operator, \(\hat{P}\) momentum operator, \(\hat{H}\) Hamiltonian operator, \(\hat{S}\) spin operator.
The reason is that Hermitian operators have real eigenvalues, and the outcome of a measurement is always a real number.
- unitary operators
meaning \(\hat{U}^{\dagger}\hat{U} = \hat{I}\).
Unitary operators are used to describe time-evolution in quantum mechanics.
Show that the eigenvalues of a self-adjoint operator \(\hat{H}\) must be
real. Show that eigenstates of a self-adjoint operator corresponding
to different eigenvalues are automatically orthogonal to each other.
Note: when we pass to matrices, self-adjoint operators become
Hermitian matrices, i.e. \(H^\dagger = (H^*)^T = H\), while unitary operators become
unitary matrices, i.e. \(U^\dagger U = U U^\dagger =I\). Do not confuse these two properties!
Given the matrices
\begin{align*}
A & = \left( \begin{matrix} 1 & i \\ -i & 2 \end{matrix} \right)\,,\\
B & = \left( \begin{matrix} \cos(\sqrt{5}) + \frac{2i \sin(\sqrt{5})}{\sqrt{5}} & \frac{\sin(\sqrt{5})}{\sqrt{5}} \\ - \frac{\sin(\sqrt{5})}{\sqrt{5}} & \cos(\sqrt{5}) - \frac{2i \sin(\sqrt{5})}{\sqrt{5}} \end{matrix} \right)\,,\\
C & = \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right)\,,\\
D &= \left(\begin{matrix} 1 & i \\ 1+i & 3 \end{matrix} \right)\,,
\end{align*}
check that \(A\) is hermitean, \(B\) is unitary, \(C\) is both unitary and hermitean, \(D\) is neither unitary nor hermitean.
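If you want to verify your computations, these properties can also be checked numerically; a minimal sketch in Python with NumPy:
\begin{verbatim}
import numpy as np

s5 = np.sqrt(5)
A = np.array([[1, 1j], [-1j, 2]])
B = np.array([[np.cos(s5) + 2j*np.sin(s5)/s5,  np.sin(s5)/s5],
              [-np.sin(s5)/s5,                 np.cos(s5) - 2j*np.sin(s5)/s5]])
C = np.array([[0, 1], [1, 0]])
D = np.array([[1, 1j], [1 + 1j, 3]])

def is_hermitian(M):
    return np.allclose(M, M.conj().T)

def is_unitary(M):
    return np.allclose(M.conj().T @ M, np.eye(len(M)))

for name, M in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, "hermitian:", is_hermitian(M), "unitary:", is_unitary(M))

# Eigenvalues of the hermitian matrix A are real (cf. the previous exercise).
print(np.linalg.eigvals(A))   # imaginary parts are zero up to rounding
\end{verbatim}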
Find a basis for the vector space (check that it is indeed a vector
space, over the real numbers!) of the \(2\times 2\) Hermitian matrices.
We also define the commutator of two operators \(\hat{A}\) with \(\hat{B}\) as:
\[ \com{A}{B} \equiv \hat{A}\hat{B} - \hat{B}\hat{A} . \]
There is also a similar definition of the anti-commutator
\[ \acom{A}{B} \equiv \hat{A}\hat{B} + \hat{B}\hat{A} . \]
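In matrix language these are one-liners; a small Python/NumPy sketch, where the Pauli matrices are used purely as illustrative input:
\begin{verbatim}
import numpy as np

def commutator(A, B):
    return A @ B - B @ A

def anticommutator(A, B):
    return A @ B + B @ A

sigma1 = np.array([[0, 1], [1, 0]])
sigma2 = np.array([[0, -1j], [1j, 0]])

print(commutator(sigma1, sigma2))       # = 2i sigma3: they do not commute
print(anticommutator(sigma1, sigma2))   # = 0: distinct Pauli matrices anticommute
\end{verbatim}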
The expectation value of an operator gives the average
outcome of the measurement of the corresponding observable, if we
prepare the same state and repeat the measurement many times over.
The
expectation value of an observable \(\hat{A}\) on a state
\(\psi\) denoted by \(\langle A\rangle_\psi\) is given by
\[
\langle A\rangle_\psi = \bra{\psi} \hat{A} \ket{\psi}\,.
\]
In most cases if there is no confusion regarding which state we are considering we will drop the subscript and simply write \(\langle A\rangle\).
This expectation value can really be interpreted as the average outcome of many measurements of the same observable \(\hat{A}\) on the same state \(\ket{\psi}\), i.e. prepare \(1000\) times the same state \(\ket{\psi}\), measure \(1000\) times the same observable \(\hat{A}\) and then take the average.
Note that the expectation value of a self-adjoint operator is clearly a real number
\begin{equation}
\begin{aligned}
\langle A\rangle_\psi^* &= (\bra{\psi} \hat{A} \ket{\psi})^* = (\bra{\psi} \hat{A}^\dagger \ket{\psi}) \\[1ex]
& = (\bra{\psi} \hat{A} \ket{\psi})=\langle A\rangle_\psi\,,
\end{aligned}
\end{equation}
where the hermiticity of the inner product, i.e. \({\ip{\psi}{\phi}}^* = \ip{\phi}{\psi}\), and the hermiticity of \(\hat{A}\), \(\hat{A}^\dagger= \hat{A}\), both play a crucial role.
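Numerically, the expectation value is just a sandwich of the matrix between the conjugated and unconjugated component vectors; a minimal Python/NumPy sketch, with an illustrative operator and normalised state:
\begin{verbatim}
import numpy as np

A = np.array([[1, 1j], [-1j, 2]])          # a self-adjoint operator
psi = np.array([1.0, 1j]) / np.sqrt(2)     # a normalised state

expectation = np.vdot(psi, A @ psi)        # <psi| A |psi>
print(expectation)                         # imaginary part is zero up to rounding
print(expectation.real)                    # the physically meaningful real number
\end{verbatim}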
2.5. Matrix representation
As already mentioned above, for an \(N\)-dimensional Hilbert space, we
can represent states using ket vectors or complex \(N\)-component column
vectors, similarly for bra vectors we need \(N\)-component row vectors.
A standard choice is to represent the basis states by column vectors
with all components zero, except a single ‘one’. Then, provided we are
using an orthonormal basis (so that the matrix that represents the
inner product is given just by the identity matrix), the inner product
of two states represented by column vectors \(\vec{u}\) and \(\vec{v}\) is
given by standard matrix multiplication as \(\vec{u}^{\dagger}
\vec{v}\). Linear operators are then represented by \(N \times N\)
matrices.
By writing out the action of an operator on each of the
basis vectors in Hilbert space, we can construct its
matrix
representation.
In such a representation self-adjoint operators
\(\hat{H}^\dagger=\hat{H}\) are indeed Hermitian matrices \(H^\dagger=
H\), and unitary operators \(\hat{U}^\dagger \hat{U} = \hat{I}\) are
unitary matrices \(U^\dagger U = I\) where \(I\) is the identity matrix of
the appropriate dimension. Note that in most cases we will keep the
hat symbol \(\hat{\phantom{A}}\), as in \(\hat{A}\), to denote the
abstract operator, i.e. before having picked any particular basis to
be represented as the standard one. Once we choose a particular
orthonormal basis to be represented via the standard one, we will refer
to the matrix representing the operator \(\hat{A}\) in this basis with
the same letter but without the hat, i.e. just \(A\). Note that we will
be working with many different bases, so although there is a single abstract
operator \(\hat{A}\), it might be represented by different
matrices \(A_1\,,\,A_2\,,...\) according to which basis we choose!
However we should remember from Linear Algebra I that if we change
basis, say from \(\{\vec{v}_1,...,\vec{v}_n\}\) to
\(\{\vec{w}_1,...,\vec{w}_n\}\), then the matrix \(A_2\) representing the
linear transformation \(\hat{A}\) in the new basis is related to the
matrix \(A_1\) representing the same linear transformation \(\hat{A}\) but
in the old basis via
\[
A_2 =S^{-1} A_1 S\,,
\]
where \(S\) is the change of basis matrix to go from the new basis to the old one.
Let \(\mathcal{H} = \mbox{span}\{ \ket{0},\ket{1}\}\) be a \(2\)-dimensional Hilbert space of states, and assume that the basis vectors are orthonormal.
We are given a linear operator \(\hat{A}\) defined on this basis
\begin{multline}
\hat{A} \ket{0} = a \ket{0} + b\ket{1}\,, \qquad\hat{A}\ket{1} = c \ket{0}+ d \ket{1}\,,
\end{multline}
with \(a,b,c,d\in\mathbb{C}\).
First of all we can pass to the vector/matrix representation by using
coordinates. We choose to represent the first basis vector \(\ket{0}
\to \left(\begin{matrix} 1\\0\end{matrix}\right)\) and the second one
as \(\ket{1} \to \left(\begin{matrix} 0\\1\end{matrix}\right)\). Once
we make this choice of basis the operator \(\hat{A} \) can be
represented as the \(2\times 2\) matrix
\[
A= \left(\begin{matrix} a & c \\b & d\end{matrix}\right)\,,
\]
and the abstract form \(\hat{A} \ket{0} = a \ket{0} + b\ket{1}\) can be simply stated in matrix language as
\(A\left(\begin{matrix} 1\\0\end{matrix}\right) = \left(\begin{matrix} a & c \\b & d\end{matrix}\right)\left(\begin{matrix} 1\\0\end{matrix}\right) = \left(\begin{matrix} a\\b\end{matrix}\right)\), and similarly for the other basis vector.
Let us now compute the adjoint \(\hat{A}^\dagger\) of \(\hat{A}\) and find which conditions we have to impose on the coefficients \(a,b,c,d\) such that \(\hat{A}\) becomes self-adjoint.
If we want to compute \(\hat{A}^\dagger\) we need to know what this operator does on a basis, i.e. we need to compute \(\hat{A}^\dagger \ket{0}\) and \(\hat{A}^\dagger\ket{1}\).
So we can write
\[
\hat{A}^\dagger \ket{0} = \alpha \ket{0} +\beta\ket{1}\,,\qquad \hat{A}^\dagger \ket{1} = \gamma \ket{0}+\delta\ket{1}\,,
\]
for some yet undetermined \(\alpha,\beta,\gamma,\delta\in\mathbb{C}\).
To fix these coefficients we need to remember that \(\ket{0},\ket{1}\) form an orthonormal basis so we have
\begin{align*}
&\bra{0} \hat{A}^\dagger \ket{0} =\bra{0}\left( \alpha \ket{0} +\beta\ket{1}\right) = \alpha\,,\\
&\bra{1} \hat{A}^\dagger \ket{0} =\bra{1}\left( \alpha \ket{0} +\beta\ket{1}\right) = \beta\,,\\
&\bra{0} \hat{A}^\dagger \ket{1} =\bra{0}\left( \gamma \ket{0} +\delta\ket{1}\right) = \gamma\,,\\
&\bra{1} \hat{A}^\dagger \ket{1} =\bra{1}\left( \gamma \ket{0} +\delta\ket{1}\right) = \delta\,.
\end{align*}
Finally we need to remember the definition \(\ipop{\psi}{\hat{A}^{\dagger}}{\phi} = \ipop{\phi}{\hat{A}}{\psi}^* \)
so we have
\begin{align*}
&\alpha = \bra{0} \hat{A}^\dagger \ket{0} =( \bra{0} \hat{A} \ket{0} )^*=a^* \,,\\
&\beta=\bra{1} \hat{A}^\dagger \ket{0} =( \bra{0} \hat{A} \ket{1} )^* = c^*\,,\\
&\gamma= \bra{0} \hat{A}^\dagger \ket{1}=( \bra{1} \hat{A} \ket{0} )^* = b^* \,,\\
&\delta=\bra{1} \hat{A}^\dagger \ket{1}=( \bra{1} \hat{A} \ket{1} )^*=d^*\,.
\end{align*}
Note the order of the vectors being flipped!
We have then
\[
\hat{A}^\dagger \ket{0} = a^* \ket{0} + c^* \ket{1}\,,\qquad \hat{A}^\dagger \ket{1} = b^* \ket{0}+d^*\ket{1}\,.
\]
Using the same basis as above this operator can be represented as the \(2\times 2\) matrix
\[
A^\dagger= \left(\begin{matrix} a^* & b^* \\c^* & d^*\end{matrix}\right)\,.
\]
When passing to coordinates the matrix representing the adjoint operator \(\hat{A}^\dagger \) is exactly \(A^\dagger = (A^*)^T\), i.e. the transpose complex conjugate of the matrix \(A\) representing the operator \(\hat{A}\).
Finally if we want the operator to be self-adjoint we must have \(\hat{A}^\dagger =\hat{A}\) which imposes \(a=a^*,\,b=c^*,\,d=d^*\). These conditions are identical to imposing that the matrix representing \(\hat{A}\) is an hermitian matrix, i.e. \(A=A^\dagger\).
Once we realise we are just doing linear algebra we can easily understand what happens when we change basis.
Suppose for example we are given the operator
\[
\hat{B} \ket{0} = 2i \ket{0} + 5 \ket{1}\,,\qquad\hat{B}\ket{1} = -3 \ket{0}+ (1+i) \ket{1}\,,
\]
which can be represented in the same basis as above by the matrix \(B= \left(\begin{matrix}2i & -3\\5& 1+i
\end{matrix}\right)\).
We want to use now a new orthonormal basis defined by \(\ket{\pm} = \frac{1}{\sqrt{2}} (\ket{0}\pm\ket{1})\)
(check that this is indeed an orthonormal basis) or equivalently \(\ket{0} = \frac{1}{\sqrt{2}}(\ket{+}+\ket{-}),\,\ket{1} = \frac{1}{\sqrt{2}}(\ket{+}-\ket{-})\).
We can proceed in two ways.
One possibility is to rewrite the action of \(\hat{B}\) in this new basis
\begin{align*}
\hat{B} \ket{+} &= \frac{1}{\sqrt{2}} \hat{B}( \ket{0}+\ket{1}) \\
& = \frac{1}{\sqrt{2}}((-3+2i) \ket{0} +(6+i)\ket{1})\\
& = \frac{(-3+2i)}{2}(\ket{+}+\ket{-}) + \frac{(6+i)}{2} (\ket{+}-\ket{-}) \\
& = \frac{3+3i}{2}\ket{+}+\frac{-9+i}{2}\ket{-}\,,\\
\hat{B} \ket{-} &= \frac{1}{\sqrt{2}} \hat{B}( \ket{0}-\ket{1}) \\
& = \frac{1}{\sqrt{2}}((3+2i) \ket{0} +(4-i)\ket{1})\\
& = \frac{(3+2i)}{2}(\ket{+}+\ket{-}) + \frac{(4-i)}{2} (\ket{+}-\ket{-}) \\
& = \frac{7+i}{2}\ket{+}+\frac{-1+3i}{2}\ket{-}\,.
\end{align*}
Hence in the new basis represented by \(\ket{+} \to \left(\begin{matrix} 1\\0\end{matrix}\right),\,\ket{-} \to \left(\begin{matrix} 0\\1\end{matrix}\right)\) the new matrix \(\tilde{B}\) representing the
same operator \(\hat{B}\) now takes the form
\[
\tilde{B} = \frac{1}{2}\left(\begin{matrix}3+3i & 7+i\\
-9+i & -1+3i \end{matrix}\right)\,.
\]
We could have reached the same conclusion noting that we are just making a change of basis from \(\{\ket{0},\ket{1}\}\) to \(\{\ket{+},\ket{-}\}\) and the change of basis matrix is simply given by
\[
S= \frac{1}{\sqrt{2}} \left(\begin{matrix} 1 & 1 \\
1 & -1\end{matrix}\right)\,,
\]
hence the matrix representation \(\tilde{B}\) of the same operator \(\hat{B}\) but in the new basis, is simply given by \(S^{-1} B S\) where \(B\) is the matrix representation of \(\hat{B}\) in the old basis. This matrix multiplication produces exactly the same matrix \(\tilde{B}\) just computed above.
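The whole computation can be checked in a few lines of Python with NumPy:
\begin{verbatim}
import numpy as np

# Matrix of B-hat in the {|0>, |1>} basis, as given above.
B = np.array([[2j, -3], [5, 1 + 1j]])

# Change of basis matrix from the new basis {|+>, |->} to the old basis {|0>, |1>}.
S = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

B_tilde = np.linalg.inv(S) @ B @ S
print(np.round(B_tilde, 10))
# expected: (1/2) * [[3+3i, 7+i], [-9+i, -1+3i]]
\end{verbatim}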
Change of basis will play a crucial role in the discussion of qubits.
We will have a set of privileged operators and we will keep on
changing from a basis of eigenvectors for one such operator to a basis
of eigenvectors for another of these operators. Every time we change
basis the matrix representing these operators will change according to
a change of basis transformation, i.e. \(S^{-1} B S\).
2.6. Time-evolution
In quantum mechanics the time-evolution of the system is governed by a self-adjoint
operator called the Hamiltonian \(\hat{H}\). In this module we work in
the Schrödinger picture, so a state in the system evolves with time,
and we can consider a state \(\ket{\psi(t)}\) (think of it as a
time-dependent vector). The time-evolution is described by the
Schrödinger equation
\[ i\hbar\frac{{\rm d}}{{\rm d} t}\ket{\psi(t)} = \hat{H} \ket{\psi(t)} . \]
This can also be written in an integrated form defining the state at a later time,
say at \(t>0\), in terms of the state at \(t=0\):
\[ \ket{\psi(t)} = \hat{U}_t \ket{\psi(0)} \]
where \(\hat{U}_t\) is a unitary operator. In the case where the Hamiltonian
operator is not time-dependent we have
\[ \hat{U}_t = \exp \left( -\frac{i}{\hbar} t \,\hat{H} \right) . \]
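As a concrete illustration, one can build \(\hat{U}_t\) from a given Hamiltonian matrix with a matrix-exponential routine; a minimal sketch in Python, assuming NumPy and SciPy are available, with an illustrative Hamiltonian and units where \(\hbar = 1\):
\begin{verbatim}
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[1.0, 0.5], [0.5, -1.0]])   # a self-adjoint Hamiltonian (illustrative)
t = 2.0

U_t = expm(-1j * t * H / hbar)

print(np.allclose(U_t.conj().T @ U_t, np.eye(2)))   # True: U_t is unitary

psi0 = np.array([1.0, 0.0])                         # state at t = 0
psi_t = U_t @ psi0                                  # state at time t
print(np.vdot(psi_t, psi_t).real)                   # ~ 1: the norm is preserved
\end{verbatim}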
In quantum information we usually assume complete control over a
quantum system or subsystem. This means that we can interact with the
system in a arbitrary way, e.g. by rotating it, applying electric or
magnetic fields etc. In terms of time evolution this means that we
assume we have the ability to transform the state of the system
\[ \ket{\psi} \rightarrow \hat{U} \ket{\psi} \]
using any unitary operator \(\hat{U}\) we want. As such we will usually talk
about transformations by a unitary operator \(\hat{U}\), rather than in terms
of a Hamiltonian operator with evolution for some specific period of time.
2.6.1. Exponential of operators
In this Section we defined the time evolution operator in terms of the exponential of the Hamiltonian operator.
This is a general concept:
Def:
The exponential of a matrix \(A\), or more generally of an operator \(\hat{A}\), is defined by the Taylor series
\[
\exp( \hat{A}) = \sum_{n=0}^\infty \frac{\hat{A}^n}{n!} = \hat{I} + \frac{\hat{A}}{1!}+ \frac{\hat{A}^2}{2!} +...\,,
\]
where \(\hat{I}\) denotes the identity operator. Note that for the operators we will consider this series will always converge.
Suppose \(A = \mbox{diag}(\lambda_1,...,\lambda_N)\) is a diagonal \(N\times N\) matrix with \(\lambda_i \in\mathbb{C}\).
Let us compute \(e^A\). To this end we need to compute \(A^n\), which is \(A \cdot A \cdot...\cdot A\), \(n\) times.
In general this is a difficult task, but for \(A\) diagonal it is actually very simple: \(A^n = \mbox{diag}(\lambda_1^n,...,\lambda_N^n)\), hence
\begin{align*}
e^A &= \sum_{n=0}^\infty \frac{A^n}{n!} = \sum_{n=0}^\infty \frac{1}{n!} \mbox{diag}(\lambda_1^n,...\lambda_N^n) \\
&= \mbox{diag}( \sum_{n=0}^\infty \frac{\lambda_1^n}{n!},\sum_{n=0}^\infty \frac{\lambda_2^n}{n!},...,\sum_{n=0}^\infty \frac{\lambda_N^n}{n!}) \\
&= \mbox{diag}( e^{\lambda_1},...,e^{\lambda_N})\,.
\end{align*}
So \(e^A\) is once again diagonal with diagonal elements simply given by the exponential of the diagonal elements of \(A\).
Consider
\[N= \left(\begin{matrix} 0 & 1 &0 \\ 0 & 0 & 1\\ 0& 0& 0\end{matrix}\right)\]
and compute \(e^{t N}\) with \(t\in\mathbb{R}\).
First we notice that \(N\) is a nilpotent matrix, i.e. we can find \(m\in\mathbb{N}\) such that \(N^m = 0\), in particular in this case
\[N^2 = \left(\begin{matrix} 0 & 0 &1 \\ 0 & 0 & 0\\ 0& 0& 0\end{matrix}\right)\]
and \(N^3 = 0\), i.e. the \(3\times 3\) zero matrix.
In this case the exponential series truncates after finitely many terms
\[
e^{t N} = \sum_{n=0}^\infty \frac{t^n N^n}{n!} = \mathbb{I}_3+ t N +\frac{t^2 N^2}{2!} = \left(\begin{matrix} 1 & t &\frac{t^2}{2} \\ 0 & 1 & t\\ 0& 0& 1\end{matrix}\right)\,.
\]
Note in particular that \(e^{t N}\) is
NOT simply given by the exponential of each entry of \(t N\)!
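This is easy to verify numerically; a short Python sketch (assuming NumPy and SciPy) comparing the matrix exponential with the naive entrywise exponential:
\begin{verbatim}
import numpy as np
from scipy.linalg import expm

t = 0.7
N = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)

# Closed form obtained above from the truncated series.
closed_form = np.array([[1, t, t**2 / 2],
                        [0, 1, t],
                        [0, 0, 1]])

print(np.allclose(expm(t * N), closed_form))    # True
print(np.allclose(np.exp(t * N), closed_form))  # False: entrywise exp is not the matrix exponential
\end{verbatim}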
Let \(\sigma_2 = \left(\begin{matrix} 0 & -i \\ i & 0\end{matrix}\right)\). We want to compute
\(U_2(\alpha) = e^{i \alpha \sigma_2}\) and show that this is a unitary matrix for every \(\alpha\in\mathbb{R}\).
First we notice that \((i \sigma_2)^2= -\mathbb{I}_2\) hence \((i \sigma_2)^{2n} = (-1)^n \mathbb{I}_2\) while
\((i \sigma_2)^{2n+1} = (-1)^n (i \sigma_2)\).
This alternating pattern between even and odd powers allows us to evaluate the exponential
\begin{align*}
U_2(\alpha) &= e^{i \alpha \sigma_2} = \sum_{n=0}^\infty \frac{\alpha^n (i \sigma_2)^n}{n!} \\
&= \sum_{n\,\text{even}} +\sum_{ n\,\text{odd}} = \sum_{n=0}^\infty \frac{(-1)^n \alpha^{2n}}{(2n)!} \mathbb{I}_2 + \sum_{n=0}^\infty \frac{(-1)^n \alpha^{2n+1}}{(2n+1)!} (i\sigma_2) \\
&= \cos(\alpha) \mathbb{I}_2+ i\sigma_2 \sin(\alpha) = \left(\begin{matrix}\cos(\alpha) & \sin(\alpha)\\ -\sin(\alpha) & \cos(\alpha) \end{matrix}\right)\,.
\end{align*}
In the last line we have used the Taylor expansion for the sine and cosine functions
\begin{align*}
\cos(\alpha) &= \sum_{n=0}^\infty \frac{(-1)^n \alpha^{2n}}{(2n)!}\,,\\
\sin(\alpha) &= \sum_{n=0}^\infty \frac{(-1)^n \alpha^{2n+1}}{(2n+1)!}\,.\\
\end{align*}
Finally it is easy to check that \(U_2(\alpha)^\dagger U_2(\alpha) = U_2(\alpha) U_2(\alpha)^\dagger = \mathbb{I}_2\) so the matrix \(U_2(\alpha)\) is indeed a unitary matrix and hence a possible time evolution operator of a \(2\)-dimensional system. We will see that this type of unitary time evolution will be crucial for the study of the qubit systems later on.
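Again, a quick numerical check (Python with NumPy and SciPy, for one illustrative value of \(\alpha\)):
\begin{verbatim}
import numpy as np
from scipy.linalg import expm

alpha = 0.3
sigma2 = np.array([[0, -1j], [1j, 0]])

U = expm(1j * alpha * sigma2)
rotation = np.array([[ np.cos(alpha), np.sin(alpha)],
                     [-np.sin(alpha), np.cos(alpha)]])

print(np.allclose(U, rotation))                  # True: matches the series computation
print(np.allclose(U.conj().T @ U, np.eye(2)))    # True: U_2(alpha) is unitary
\end{verbatim}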
A final comment: from these examples it should be clear
that, in general, to compute the exponential of a matrix you
CANNOT simply compute the exponential of each entry! For more
on exponentials of matrices and the various ways to compute them, see
e.g. [4].