Complex Analysis II (Michaelmas)

Katie Gittins

2022-2023

1 The Complex Plane and Riemann Sphere

1.1 Complex numbers

A complex number \(z\) is a quantity of the form \(z=x+iy\), where \(x,y\) are real numbers and \(i\) is the imaginary unit. We denote by \(\mathbb{C}\) the set of all complex numbers.

Algebra

We can add, subtract and multiply complex numbers: If \(z_{1}=x_{1}+iy_{1}\) and \(z_{2}=x_{2}+iy_{2}\) then \[z_{1}\pm z_{2}:=(x_{1}\pm x_{2})+i(y_{1}\pm y_{2}),\] \[z_{1}z_{2}:=(x_{1}x_{2}-y_{1}y_{2})+i(x_{1}y_{2}+x_{2}y_{1}).\] Notice that addition simply corresponds to adding the individual components. In general we denote by \(\mathrm{Re}(z)=x\) the real part of \(z\), and by \(\mathrm{Im}(z)=y\) the imaginary part of \(z\). By the definition of multiplication we have \(i^{2}=-1\), and using this we see that multiplication corresponds to ‘multiplying out the brackets’: \((x_{1}+iy_{1})(x_{2}+iy_{2})=(x_{1}x_{2}+i^{2}y_{1}y_{2})+i(x_{1}y_{2}+x_{2}y_{1}).\)

We can also divide complex numbers. For \(z_{2}\neq0\) (here we use the shorthand \(0=0+0i\)) we have \[\frac{z_{1}}{z_{2}}=\frac{x_{1}+iy_{1}}{x_{2}+iy_{2}}=\frac{(x_{1}+iy_{1})(x_{2}-iy_{2})}{(x_{2}+iy_{2})(x_{2}-iy_{2})}=\frac{x_{1}x_{2}+y_{1}y_{2}}{x_{2}^{2}+y_{2}^{2}}+i\frac{x_{2}y_{1}-x_{1}y_{2}}{x_{2}^{2}+y_{2}^{2}}\quad\in\mathbb{C}.\] The quantity we used to make the denominator real is important. In general, for \(z=x+iy\) we call \(\bar{z}:=x-iy\) the complex conjugate of \(z\).

For \(z\neq0\) we immediately have a multiplicative inverse \[z^{-1}:=\frac{1}{z}=\frac{x}{x^{2}+y^{2}}-i\frac{y}{x^{2}+y^{2}}.\] WARNING: While most of the nice properties of \(\mathbb{R}\) hold in \(\mathbb{C}\), we do not have notions of \(\leq\), \(<\), \(\geq\) or \(>\); the set \(\mathbb{C}\) is not ‘ordered’ and expressions like \(z_{1}<z_{2}\) have no meaning.
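The formulas for multiplication and division can be checked numerically. The following is a minimal Python sketch, not part of the course material: the helper names `mul` and `div` are illustrative assumptions, and complex numbers are represented as pairs \((x,y)\) so that the code follows the definitions literally.

```python
def mul(z1, z2):
    """Multiply z1 = (x1, y1) and z2 = (x2, y2) using the definition."""
    x1, y1 = z1
    x2, y2 = z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)


def div(z1, z2):
    """Divide z1 by z2 != 0 by multiplying through by the conjugate of z2."""
    x1, y1 = z1
    x2, y2 = z2
    denom = x2 * x2 + y2 * y2  # z2 times its conjugate, a non-negative real
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0")
    return ((x1 * x2 + y1 * y2) / denom, (x2 * y1 - x1 * y2) / denom)


i_squared = mul((0, 1), (0, 1))      # i * i, i.e. the pair (-1, 0)
quotient = div((1, 2), (3, 4))       # (1 + 2i) / (3 + 4i) via the formula
builtin = (1 + 2j) / (3 + 4j)        # the same quotient with Python's complex type
print(i_squared, quotient, builtin)
```

The pair returned by `div` should agree with Python's built-in complex division up to rounding error.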

How do we visualise the complex numbers?

There are various ‘models’ for the complex numbers. The most commonly used, and most intuitive, is to think of \(\mathbb{C}\) as a copy of \(\mathbb{R}^{2}\) equipped with a map \[\mathbb{R}^{2}\times\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}:\left((x_{1},y_{1}),(x_{2},y_{2})\right)\mapsto(x_{1}x_{2}-y_{1}y_{2},x_{1}y_{2}+x_{2}y_{1}).\] So, a copy of \(\mathbb{R}^{2}\) with a way of multiplying (and dividing) vectors! Indeed there is an obvious bijection \(f:\mathbb{R}^{2}\rightarrow\mathbb{C}\) given by \(f((x,y))=x+iy\). As a result, we often draw complex numbers on the usual \((x,y)\)-plane: such a picture is called an Argand diagram.

On \(\mathbb{R}^{2}\) there is a natural notion of size, and we use it in \(\mathbb{C}\): we call the quantity \(|z|:=\sqrt{x^{2}+y^{2}}\) the modulus or absolute value of \(z\) \((=x+iy)\).

[Important Properties of Complex numbers]  

  1. \(z_{1}z_{2}=0\quad\iff z_{1}=0\) or \(z_{2}=0,\)

  2. \(|z|=\sqrt{z\bar{z}},\)

  3. \(\mathrm{Re}(z)=\frac{z+\bar{z}}{2}\) and \(\mathrm{Im}(z)=\frac{z-\bar{z}}{2i},\)

  4. \(z^{-1}=\frac{\bar{z}}{|z|^{2}}.\)

Property \(1.\) is very important, and makes \(\mathbb{C}\) an integral domain (see Algebra II).

Polar form/coordinates of a complex number

Now that we have a notion of distance we can, as in \(\mathbb{R}^{2}\), change variables from \((x,y)\) to polar coordinates \((r,\theta)\). Let \(r=|z|\) and let \(\theta\) denote the anticlockwise angle measured from the positive real axis. (Angles measured clockwise will be considered negative.) We call \(\theta\) the argument of \(z\) (for \(z\neq0\)) and write \(\arg(z)=\theta\). We then have the following polar form of \(z\): \[z=r(\cos\theta+i\sin\theta),\] which we write in shorthand as \(z=re^{i\theta}\).

For example, \(i=e^{i\pi/2}\) and \(1+i=\sqrt{2}e^{i\pi/4}\). We have \(|-1/\sqrt{2}-i\sqrt{3}/\sqrt{2}|=\sqrt{2}\), so \[-1/\sqrt{2}-i\sqrt{3}/\sqrt{2}=\sqrt{2}(-1/2-i\sqrt{3}/2)=\sqrt{2}e^{-i2\pi/3}.\] Note that \(\arg(z)\) is only defined up to multiples of \(2\pi\); for example \(i=e^{i\pi/2}=e^{i5\pi/2}=e^{-i3\pi/2}\). Strictly speaking \(\arg(i)=\pi/2+2\pi k\), for any \(k\in\mathbb{Z}\) (and so \(\arg\) is a one-to-many function!). As a result, we need to be careful; we choose a fixed interval in which to express the argument: the principal value of \(\arg(z)\) is the value in the interval \((-\pi,\pi]\) and will be denoted \(\mathrm{Arg}(z)\). So \(\mathrm{Arg}(i)=\pi/2\) and \(\mathrm{Arg}(-1)=\pi\) for example.
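The polar form and the principal value of the argument can be explored with Python's standard `cmath` module. This is only an illustrative sketch: `cmath.polar` returns the pair \((|z|,\theta)\) and `cmath.phase` returns an argument in \([-\pi,\pi]\), which agrees with \(\mathrm{Arg}\) for the inputs used here (the endpoint \(-\pi\) can only occur for inputs whose imaginary part is a negative floating-point zero).

```python
import cmath
import math

# The example from the text: z = -1/sqrt(2) - i*sqrt(3)/sqrt(2)
z = -1 / math.sqrt(2) - 1j * math.sqrt(3) / math.sqrt(2)
r, theta = cmath.polar(z)        # modulus and principal argument
print(r, theta)                  # compare with sqrt(2) and -2*pi/3

# Rebuild z from its polar form r * e^{i*theta}
z_again = cmath.rect(r, theta)

# Principal arguments of i and -1
arg_i = cmath.phase(1j)
arg_minus_one = cmath.phase(-1)
print(arg_i, arg_minus_one)      # compare with pi/2 and pi
```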

[Properties of argument] We have the following properties of the argument:

  1. \(\arg(z_{1}z_{2})=\arg(z_{1})+\arg(z_{2})\bmod2\pi\)

  2. \(\arg(1/z)=-\arg(z)\bmod2\pi\)

  3. \(\arg(\bar{z})=-\arg(z) \bmod2\pi\) .

When we say two real numbers are equal \(\bmod2\pi\) we mean they differ by an integer multiple of \(2\pi\).

It is nice to have a geometric picture of what the algebraic operations on complex numbers mean.

Geometrically, multiplication in \(\mathbb{C}\) is given by a dilated rotation; i.e., if \(z_{1}=r_{1}e^{i\theta_{1}}\) and \(z_{2}=r_{2}e^{i\theta_{2}}\) then \[z_{1}z_{2}=r_{1}r_{2}\left(\cos(\theta_{1}+\theta_{2})+i\sin(\theta_{1}+\theta_{2})\right)=r_{1}r_{2}e^{i(\theta_{1}+\theta_{2})}.\] In particular, multiplying by \(z_{2}\) constitutes an (anticlockwise) rotation of \(z_{1}\) through the angle \(\theta_{2}\), followed by a dilation with factor \(r_{2}\). Furthermore, addition represents a translation and conjugation represents a reflection in the real axis. Taking the real or imaginary part of a complex number \(z\) represents a projection of \(z\) onto the real or imaginary axis respectively.

By the standard angle addition formulae we have \[\begin{aligned} z_{1}z_{2} & =r_{1}r_{2}\left(\cos\theta_{1}+i\sin\theta_{1}\right)\left(\cos\theta_{2}+i\sin\theta_{2}\right)\\ & =r_{1}r_{2}\left((\cos\theta_{1}\cos\theta_{2}-\sin\theta_{1}\sin\theta_{2})+i(\sin\theta_{1}\cos\theta_{2}+\sin\theta_{2}\cos\theta_{1})\right)\\ & =r_{1}r_{2}\left(\cos(\theta_{1}+\theta_{2})+i\sin(\theta_{1}+\theta_{2})\right).\end{aligned}\] The geometric interpretations of addition, conjugation, and real/imaginary parts are fairly obvious.

  1. \(|z_{1}z_{2}|=|z_{1}|\,|z_{2}|,\)

  2. De Moivre’s formula: \((\cos\theta+i\sin\theta)^{n}=\cos(n\theta)+i\sin(n\theta)\).
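As a numerical sanity check of the dilated-rotation picture and of De Moivre's formula, here is a short Python sketch. The sample values and the helper name `de_moivre_gap` are arbitrary choices made for illustration only.

```python
import cmath
import math


def de_moivre_gap(theta, n):
    """Distance between the two sides of De Moivre's formula for given theta, n."""
    lhs = (math.cos(theta) + 1j * math.sin(theta)) ** n
    rhs = math.cos(n * theta) + 1j * math.sin(n * theta)
    return abs(lhs - rhs)


# z1 = 2 e^{0.3 i} and z2 = 1.5 e^{1.1 i}: moduli multiply, arguments add
z1 = cmath.rect(2.0, 0.3)
z2 = cmath.rect(1.5, 1.1)
product = z1 * z2
print(abs(product), cmath.phase(product))   # compare with 3.0 and 1.4
print(de_moivre_gap(0.7, 5))                # should be at rounding-error level
```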

The modulus also has the following important properties.

  1. (Triangle inequality) \(|z_{1}+z_{2}|\leq|z_{1}|+|z_{2}|\)

  2. \(|z|\geq0\text{ and }|z|=0\iff z=0\)

  3. \(\max(|\mathrm{Re}(z)|,|\mathrm{Im}(z)|)\leq|z|\leq|\mathrm{Re}(z)|+|\mathrm{Im}(z)|\)

The first two properties above along with Corollary [cor:norm-mult] make the modulus a norm on \(\mathbb{C}\) (see later, Definition [def:norm]).

Simple complex functions and geometry (examples)

  • We can also use functions to define regions in the complex plane. Consider the set of points \(z\) which satisfy the inequality \(|z-i|<|z+i|\). These are precisely the points in \(\mathbb{C}\) whose distance to \(i\) is strictly smaller than their distance to \(-i\). Thus, the inequality represents the upper half plane \(\mathbb{H}:=\{z\in\mathbb{C}:\mathrm{Im}(z)>0\}\).

  • Note that the equation \(|z-i|=1\) represents a circle centred at \(i\) of radius \(1\).
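The description of the region \(|z-i|<|z+i|\) can be tested on a grid of sample points. This is only an illustrative check, not a proof; the function name `closer_to_i` and the choice of grid are assumptions made for the sketch.

```python
def closer_to_i(z):
    """True when z is strictly closer to i than to -i."""
    return abs(z - 1j) < abs(z + 1j)


# Grid of points x + iy with x, y in {-3, -2.5, ..., 3}
grid = [complex(a / 2, b / 2) for a in range(-6, 7) for b in range(-6, 7)]

# On every grid point the inequality should hold exactly when Im(z) > 0
agrees = all(closer_to_i(z) == (z.imag > 0) for z in grid)
print(agrees)
```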

1.2 Exponential and trigonometric functions

[Complex exponential] We define the complex exponential function \(\exp:~\mathbb{C}\rightarrow\mathbb{C}\) by \[\exp(z):=e^{x}(\cos y+i\sin y).\qquad\qquad(z=x+iy)\] As shorthand we write \(\exp(z)=e^{z}\).

We will see later that \(e^{z}=\sum_{n=0}^{\infty}\tfrac{z^{n}}{n!}\) as in real analysis. We could have started with this as the definition instead.

We have the following properties of the complex exponential function:

  1. \(e^{z}\ne0\quad\) for all \(z\in\mathbb{C}\),

  2. \(e^{z_{1}+z_{2}}=e^{z_{1}}e^{z_{2}}\),

  3. \(e^{z}=1\quad\Longleftrightarrow\quad z=2\pi ik\) for some \(k\in\mathbb{Z}\),

  4. \(e^{-z}=1/e^{z}\),

  5. \(|e^{z}|=e^{\mathrm{Re}(z)}\).

Most are straightforward to check. \(3.\) is very important: \(\exp(z)=1\iff e^{x}\cos y=1\) and \(e^{x}\sin y=~0\). Since \(e^{x}>0\), the latter is equivalent to \(\sin y=0\) and so \(y=n\pi\) for some \(n\in\mathbb{Z}\). Thus we have \(\exp(z)=1\iff e^{x}\cos(n\pi)=1\iff e^{x}(-1)^{n}=1\iff n\) is even and \(e^{x}=1\iff x=0\) and \(y=2k\pi\) (\(k\in\mathbb{Z}\)).

We have \(\exp(2\pi i)=1\) and \(\exp(\pi i)=-1\). The latter is often known as Euler’s identity.

The complex exponential function is \(2\pi i\)-periodic; that is, \(\exp(z+2k\pi i)=\exp(z)\) for any \(k\in\mathbb{Z}\).

The above implies \(\exp\) is determined entirely by the values it takes in any horizontal strip of width \(2\pi\) in the complex plane. Note also that \(\exp\) is ‘unbounded’, since by \(5.\) the modulus \(|\exp(z)|\) gets arbitrarily large as \(\mathrm{Re}(z)\) increases.
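The definition of \(\exp\) and the properties above can be checked against Python's built-in `cmath.exp`. The function `exp_def` below is a hypothetical helper that implements the definition \(e^{x}(\cos y+i\sin y)\) directly; the sample point is arbitrary.

```python
import cmath
import math


def exp_def(z):
    """exp(z) := e^x (cos y + i sin y) for z = x + iy."""
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))


z = 0.5 - 2.0j
print(abs(exp_def(z) - cmath.exp(z)))                  # agreement with the library
print(abs(exp_def(z + 2j * math.pi) - exp_def(z)))     # 2*pi*i-periodicity
print(abs(exp_def(z)), math.exp(z.real))               # |e^z| = e^{Re(z)}
print(exp_def(1j * math.pi))                           # close to -1
```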

[Trigonometric functions] As functions \(\mathbb{C}\rightarrow\mathbb{C}\), we define \[\begin{aligned} \sin(z) & :=\frac{1}{2i}(e^{iz}-e^{-iz})\qquad & \cos(z):=\frac{1}{2}(e^{iz}+e^{-iz})\\ \sinh(z) & :=\frac{1}{2}(e^{z}-e^{-z})\qquad & \cosh(z):=\frac{1}{2}(e^{z}+e^{-z})\end{aligned}\] (For \(z=x\) real these coincide with the real functions. We will eventually get power series expressions for them from the one for the exponential function.)

All the usual double angle formulae and equations relating the functions hold: e.g., \(\cos^{2}(z)+\sin^{2}(z)=1\). Additionally, notice that we have \(\cosh(iz)=\cos(z)\) and \(\cos(iz)=\cosh(z)\), while \(\sinh(iz)=i\sin(z)\) and \(\sin(iz)=i\sinh(z)\). All four functions are unbounded.
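The identities relating the trigonometric and hyperbolic functions can be spot-checked numerically. The sketch below uses the standard `cmath` functions at one arbitrarily chosen point; the dictionary `gaps` is an illustrative name, and each entry should be at rounding-error level.

```python
import cmath

z = 0.8 + 1.3j

# Each value is |left-hand side - right-hand side| for the named identity
gaps = {
    "cos^2 + sin^2 = 1": abs(cmath.cos(z) ** 2 + cmath.sin(z) ** 2 - 1),
    "cosh(iz) = cos(z)": abs(cmath.cosh(1j * z) - cmath.cos(z)),
    "cos(iz) = cosh(z)": abs(cmath.cos(1j * z) - cmath.cosh(z)),
    "sinh(iz) = i sin(z)": abs(cmath.sinh(1j * z) - 1j * cmath.sin(z)),
    "sin(iz) = i sinh(z)": abs(cmath.sin(1j * z) - 1j * cmath.sinh(z)),
}
for name, gap in gaps.items():
    print(name, gap)

# Unboundedness: along the imaginary axis |cos(iy)| = cosh(y) grows rapidly
print(abs(cmath.cos(10j)))
```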

1.3 Logarithms and complex powers

We use the notation \(\mathbb{C}^{*}=\mathbb{C}-\{0\}\), i.e. the set of nonzero complex numbers.

[Inverting the exponential function] For every \(w\in\mathbb{C}^{\ast}\), the equation \[e^{z}=w\] has a solution \(z\). Furthermore, if we write \(w=|w|e^{i\phi}\) with \(\phi=\mathrm{Arg}(w)\), then all solutions of this equation are given by \[z=\log|w|+i(\phi+2k\pi)\quad\text{ for }k\in\mathbb{Z}.\] Here, \(\log|w|\) is the usual natural logarithm of the real number \(|w|\). Note that there are infinitely many solutions.

If \(z\) is of the form displayed above for some given \(k\in\mathbb{Z}\), then \[e^{z}=e^{\log|w|+i(\phi+2k\pi)}=e^{\log|w|}e^{i(\phi+2k\pi)}=e^{\log|w|}e^{i\phi}=|w|e^{i\phi}=w,\] by Proposition [prop:exponential-properties], Part \(2.\) and Corollary [cor:Euler]. Thus, \(z\) is a solution.

To see all solutions are of the given form, first write \(z=x+iy\) and assume \(e^{z}=w\). Since \(e^{x}e^{iy}=e^{z}=w=|w|e^{i\phi}\), we have \(|e^{z}|=e^{x}=|w|\). Thus \(x=\log|w|\). Moreover, dividing both sides by \(|w|\) we have \(e^{iy}=e^{i\phi}\) and so \(e^{i(y-\phi)}=1\). It follows from Proposition [prop:exponential-properties], Part \(3.\) that all solutions are given by \(i(y-\phi)=2k\pi i\) for some \(k\in\mathbb{Z}\); in other words, \(y=\phi+2k\pi\).
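To make the family of solutions concrete, the following Python sketch lists a few of them for a sample \(w\) and checks that each satisfies \(e^{z}=w\). The function name `exp_solutions` and the choice \(w=-1+i\) are illustrative assumptions.

```python
import cmath
import math


def exp_solutions(w, ks):
    """Solutions z = log|w| + i(Arg(w) + 2*k*pi) of e^z = w for the integers k in ks."""
    if w == 0:
        raise ValueError("e^z = 0 has no solutions")
    phi = cmath.phase(w)  # principal argument of w
    return [complex(math.log(abs(w)), phi + 2 * k * math.pi) for k in ks]


w = -1 + 1j
sols = exp_solutions(w, range(-2, 3))
for z in sols:
    print(z, abs(cmath.exp(z) - w))   # the second number should be tiny
```

Consecutive solutions differ by exactly \(2\pi i\), reflecting the periodicity of \(\exp\).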

We now come to an important topic called branch cuts.

First we give a high-level description of why branch cuts are necessary. We have just described for fixed \(w_{0}\neq0\), exactly what the possible numbers are such that \(e^{z}=w_{0}\). These values of \(z\) could all reasonably be called \(\log(w_{0})\). Can we make this choice of solution to \(e^{z}=w_{0}\) vary nicely as we move \(w_{0}\) a little bit? Of course, we will run into trouble at \(0\), since \(e^{z}=0\) has no solutions. Bearing this in mind, could we at least define a ‘\(\log\)’ function that is ‘continuous’ on \(\mathbb{C}\setminus \{0\}\)? Since we don’t know the definition of continuous yet, let us just ask that the function should not jump abruptly when we move from a point to a nearby one.

The answer is no, and let’s see why by trying to come up with one.

Let’s suppose \(w_{0}=1\) and we pick a solution to \(e^{z}=w_{0}\). The obvious one is \(z=0\) so let’s pick that. Now let \(w\) be close to \(1\). If we’ve found a solution to \(e^{z}=w\) then we know from Lemma [lem:Invertingexp] that it must be of the form

\[z=\log|w|+i(\mathrm{Arg}(w)+2\pi k)\] for some \(k\in\mathbb{Z}\). Since \(w\) is close to \(1\), we know \(\log|w|\) is close to \(0\) and \(\mathrm{Arg}(w)\) is close to \(0\). Now, if \(k\) is not zero, then \(z\) will not be close to \(0\) (any integer that is not zero has absolute value at least one!). So \(k\) must be zero for \(z\) to be close to \(0\). We just argued that if we move \(w\) a little, and if we want \(\log\) to be ‘continuous’, we must choose the argument of \(w\) ‘continuously’.

Now let’s take our idea a little further. If we move \(w\) on a path beginning at \(1\), following the unit circle anticlockwise, until we reach \(1\) again, what happens to our solutions to \(e^{z}=w\) if we are choosing them continuously as above? In other words, what happens to the argument of \(w\)? Since we are moving anticlockwise and we are choosing the argument continuously, it is increasing as we go around the unit circle. So just before we complete the circle, the argument of \(w\) is just below \(2\pi\). This is a big problem, since it means there are values of \(w\) just below \(1\) on the unit circle where we have been forced to set \(\log(w)\) very close to \(2\pi i\). On the other hand, we began by assuming \(\log(1)=0\). So the way we have tried to do things, our \(\log\) function is going to have a jump discontinuity below 1.

No matter what way we try to define \(\log\) continuously on \(\mathbb{C}\setminus \{0\}\), we will run into a similar problem. On the other hand, if \(R_{\theta}\) is any ray of the form

\[R_{\theta}=\{re^{i\theta}\::\:r\in\mathbb{R},\:r\geq0\}\subset\mathbb{C}\] then it is possible to define a continuous function \(\log(z)\) on \(\mathbb{C}\setminus R_{\theta}\). (One can think of cutting out this ray as cutting out the points where \(\log\) will have a jump discontinuity).

[Complex logarithm functions]

For any two real numbers \(\theta_{1}<\theta_{2}\) with \(\theta_{2}-\theta_{1}=2\pi\), let \(\arg\) be the choice of argument function with values in \((\theta_{1},\theta_{2}]\). Then the function

\[\log(z):=\log|z|+i\arg(z)\] is called a branch of logarithm. It has a jump discontinuity along the ray \(R_{\theta_{1}}=R_{\theta_{2}}\). This ray is called a branch cut.

If we choose \(\arg(z)=\mathrm{Arg}(z)\in(-\pi,\pi]\), then we obtain a branch of logarithm called the principal branch of log. We write \(\mathrm{Log}\) for this principal branch: it is given by the formula \[\mathrm{Log}(z):=\log|z|+i\mathrm{Arg}(z).\] The principal branch of logarithm has a ‘jump discontinuity’ along the ray given by the non-positive real axis \(\mathbb{R}_{\leq0}\).

  • Any time one talks about a function called \(\log\), one has to declare which branch of log we use. This is normally done simply by stating the interval \((\theta_{1},\theta_{2}]\) where \(\arg(z)\) lives.

  • As soon as we define continuous functions, we will easily be able to see that the branch of log corresponding to \(\arg(z)\in(\theta_{1},\theta_{2}]\) is continuous on \(\mathbb{C}\setminus R_{\theta_{1}}\).

  • The principal branch, \(\mathrm{Log}\), agrees with the natural logarithm \(\log\) on the real line; that is, for \(x>0\) we have \(\mathrm{Log}x=\log x\). For this reason we will always use the principal branch unless otherwise stated.
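The jump across a branch cut can be seen numerically. The sketch below is a minimal illustration under stated assumptions: `log_branch` is a hypothetical helper that implements the branch with \(\arg(z)\in(\theta_{1},\theta_{1}+2\pi]\) by shifting the principal argument, and the offset `eps` is an arbitrary small number.

```python
import cmath
import math


def log_branch(z, theta1):
    """Branch of log with arg(z) chosen in (theta1, theta1 + 2*pi]."""
    if z == 0:
        raise ValueError("log is not defined at 0")
    arg = cmath.phase(z)  # an argument of z in [-pi, pi]
    # Shift by multiples of 2*pi until the argument lies in the chosen interval
    while arg <= theta1:
        arg += 2 * math.pi
    while arg > theta1 + 2 * math.pi:
        arg -= 2 * math.pi
    return complex(math.log(abs(z)), arg)


# Principal branch (theta1 = -pi): evaluate just above and just below -1
eps = 1e-9
above = log_branch(complex(-1, eps), -math.pi)
below = log_branch(complex(-1, -eps), -math.pi)
jump = above - below
print(jump)   # close to 2*pi*i

# The branch with arg in (pi, 3*pi] evaluated at 2
print(log_branch(2, math.pi))   # close to log 2 + 2*pi*i
```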

[Properties of logarithms] We have the following properties when using any given branch of logarithm:

  1. \(e^{\log z}=z\) for any \(z\in\mathbb{C}\setminus \{0\}\), but,

  2. in general, \(\log(zw)\neq\log z+\log w\), and

  3. in general, \(\log(e^{z})\neq z\).
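Concrete counterexamples for the principal branch are easy to produce. In the sketch below `cmath.log` plays the role of \(\mathrm{Log}\) (an assumption that holds for the inputs used here), and the sample values are chosen for illustration.

```python
import cmath
import math

# Property 1 holds: exp(Log z) = z
z0 = -1 + 2j
print(cmath.exp(cmath.log(z0)))

# Property 2: Log(zw) and Log z + Log w differ for z = w = -1
z = w = -1
lhs = cmath.log(z * w)              # Log(1) = 0
rhs = cmath.log(z) + cmath.log(w)   # i*pi + i*pi
print(lhs, rhs)

# Property 3: Log(e^u) is not u for u = 3*pi*i
u = 3j * math.pi
round_trip = cmath.log(cmath.exp(u))
print(round_trip)                   # close to i*pi rather than 3*i*pi
```

In both failing cases the two sides differ by an integer multiple of \(2\pi i\), as the lemma on inverting the exponential predicts.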

We would now like to define functions giving powers of complex numbers. We already know from the world of real numbers that to define a function giving for example, a square root of a positive real number, we have to make a choice of whether to take the positive or negative root. To take a root of a complex number, we have to make a similar choice, but we have more freedom.

[Complex powers] For \(w\in\mathbb{C}\) fixed, by choosing any branch of \(\log\) we can define a branch of the function \(z\mapsto z^{w}\) by the expression \[z^{w}:=\exp({w\log z}).\] For example, if \(w=1/n\) and we use the principal branch we get \[z^{1/n}=e^{(\log|z|+i\mathrm{Arg}(z))/n}=|z|^{1/n}e^{i\mathrm{Arg}(z)/n}.\]

Warning: different branches of log can give different power functions! So we must always specify which branch of \(\log\) we are using.

Now that we have defined complex powers, we should check that our exponential function matches up with the concept of ‘raising \(e\) to the power \(z\)’ for a suitable choice of \(\log e\). The natural choice of \(\log e\) is \(1\). Then, \(e\) raised to the power \(z\) should agree with computing \(\exp(z\log e)=\exp(z)\) as we expected.

Finding values of log and complex powers (examples)

\((a)\) Using the principal branch of \(\log\), we find \(\log(1-i)\) and \((1-i)^{1/2}\). We have \(|1-i|=\sqrt{2}\) and \(\mathrm{Arg}(1-i)=-\pi/4\). Thus, \(1-i=\sqrt{2}e^{-i\pi/4}\). Therefore, \(\mathrm{Log}(1-i)=\log|1-i|+i\mathrm{Arg}(1-i)=\log\sqrt{2}-i\pi/4\), and \[(1-i)^{1/2}=\exp\left(\frac{1}{2}\,\mathrm{Log}(1-i)\right)=\exp\left(\frac{1}{2}\,\left(\log\sqrt{2}-i\frac{\pi}{4}\right)\right)=\exp\left(\log\sqrt[4]{2}-i\frac{\pi}{8}\right)=\sqrt[4]{2}e^{-i\frac{\pi}{8}}.\]

\((b)\) Using the principal branch and the previous example \[(1-i)^{i}=\exp(i\mathrm{Log}(1-i))=\exp\left(\frac{\pi}{4}+i\log\sqrt{2}\right)=e^{\pi/4}e^{i\log\sqrt{2}}.\]

\((c)\) Again, using the principal branch \[2^{1/2}=\exp\left(\frac{1}{2}\log2\right)=\exp(\log\sqrt{2})=\sqrt{2}.\] What about the other root? It comes from using a different branch: if we let \(\log\) be the branch of logarithm corresponding to \(\arg(z)\in(\pi,3\pi]\) then \(\log(z)=\log|z|+i(\mathrm{Arg}(z)+2\pi)\), so we have \(\log(2)=\log2+i2\pi\) and \[2^{1/2}=\exp\left(\frac{1}{2}(\log2+i2\pi)\right)=\exp(\log\sqrt{2}+i\pi)=\sqrt{2}e^{i\pi}=-\sqrt{2}.\]

All \(n\)th roots can be found this way (see Sheet 1 Q23). In particular, for \(z\in\mathbb{C}^*\) all \(n\)th roots of \(z\) are of the form \[z^{1/n}=|z|^{1/n}\exp\left(i\frac{\mathrm{Arg}(z)}{n}+\frac{2k\pi i}{n}\right)\quad\text{ for }k=0,\ldots,n-1.\]
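The formula for the \(n\)th roots translates directly into code. The following is a minimal Python sketch; the function name `nth_roots` is an illustrative choice and `cmath.phase` is used for \(\mathrm{Arg}\).

```python
import cmath
import math


def nth_roots(z, n):
    """All n-th roots |z|^{1/n} exp(i*Arg(z)/n + 2*k*pi*i/n), k = 0, ..., n-1."""
    if z == 0:
        raise ValueError("the formula is stated for nonzero z")
    r = abs(z) ** (1.0 / n)
    phi = cmath.phase(z)
    return [cmath.rect(r, (phi + 2 * k * math.pi) / n) for k in range(n)]


print(nth_roots(2, 2))            # approximately sqrt(2) and -sqrt(2)
cube_roots = nth_roots(1 - 1j, 3)
for w in cube_roots:
    print(w, abs(w ** 3 - (1 - 1j)))   # the second number should be tiny
```

The root with \(k=0\) is the one given by the principal branch, so `nth_roots(1 - 1j, 2)[0]` should reproduce example \((a)\).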

Visualizing complex functions

The ‘graph’ of a complex-valued function \(f:\mathbb{C}\to\mathbb{C}\) is 4-dimensional, so difficult to visualise - we certainly can’t draw it. We can employ other techniques to get a grasp on complex functions:

  • We can graph the real-valued function \(|f|:\mathbb{C}\to\mathbb{R}\). For example, consider the complex function \(f(z)=\cos z\). When \(z=x\) is purely real, we have that \(|f(z)|=|\cos(x)|\) is just the modulus of the real cosine function. But for \(z=iy\) purely imaginary we have \(|f(z)|=|\cos(iy)|=|\cosh y|=\cosh y.\) So in the imaginary direction \(|f|\) simply looks like cosh!

  • It is often useful to visualise complex functions by considering how they map regions of the complex plane. Consider the image of the ‘right half-plane’ \(\mathbb{H}_{R}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>0\}\) under the map \(f(z)=z^{2}\). Note that \(\mathbb{H}_{R}=\{z\in\mathbb{C}:-\pi/2<\mathrm{Arg}(z)<\pi/2\}\). If \(z=re^{i\theta}\in\mathbb{H}_{R}\) then \(z^{2}=zz=r^{2}e^{i2\theta}\) has argument \(2\theta\in(-\pi,\pi)\). Thus, \(f\) maps \(\mathbb{H}_{R}\) to \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\), where \(\mathbb{R}_{\leq0}\) denotes the negative real axis (including \(0\)). The map is onto, since for every \(w=se^{i\phi}\in\mathbb{C}\setminus\mathbb{R}_{\leq0}\) (so \(\phi\in(-\pi,\pi)\)) we can find \(z\in\mathbb{H}_{R}\) such that \(f(z)=w\); namely we can choose \(z=\sqrt{s}e^{i\phi/2}\in\mathbb{H}_{R}\).

    Similarly, the left half plane \(\mathbb{H}_{L}=\{z\in\mathbb{C}:\mathrm{Re}(z)<0\}\) is mapped to \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\). Moreover, \(f\) maps both the strictly positive imaginary axis \(i\mathbb{R}_{>0}\) and the strictly negative imaginary axis \(i\mathbb{R}_{<0}\) (given by \(i\mathbb{R}_{>0}=\{iy\in\mathbb{C}:y>0\}\) and \(i\mathbb{R}_{<0}=\{iy\in\mathbb{C}:y<0\}\) respectively) to the strictly negative real axis \(\mathbb{R}_{<0}\).

    Adding the observation that \(f(0)=0\), we have that \(f(z)=z^{2}\) in essence maps \(\mathbb{C}\) to two copies of itself (except for the origin, which is only attained once in the image - remember this different behaviour at the origin later!)

  • Branches of log. Let \(\log\) be the branch of logarithm corresponding to \(\arg(z)\in(\theta_{1},\theta_{2}]\). Then \(\log\) maps \(\mathbb{C}\setminus R_{\theta_{1}}\) to the infinite horizontal strip \[\{z\in\mathbb{C}\::\:\theta_{1}<\mathrm{Im}(z)\leq\theta_{2}\}.\] Infinite rays emanating from \(0\) map to horizontal lines, and circles centred at zero, minus their intersection with \(R_{\theta_{1}}\), map to vertical line segments.
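The surjectivity argument for \(f(z)=z^{2}\) on the right half-plane can be illustrated numerically. In this sketch, `preimage_in_right_half_plane` is a hypothetical helper implementing the choice \(z=\sqrt{s}e^{i\phi/2}\) from the text, and the sample points are arbitrary points off the non-positive real axis.

```python
import cmath
import math


def preimage_in_right_half_plane(w):
    """Return z = sqrt(s) e^{i*phi/2}, where w = s e^{i*phi} and phi = Arg(w)."""
    s, phi = cmath.polar(w)
    return cmath.rect(math.sqrt(s), phi / 2)


# Sample points with nonzero imaginary part, so none lies on the cut
samples = [complex(a, b) for a in (-2.0, -0.5, 1.0, 3.0) for b in (-1.5, 0.25, 2.0)]
for w in samples:
    z = preimage_in_right_half_plane(w)
    print(w, z, z.real > 0, abs(z * z - w))
```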

1.4 The Riemann Sphere and extended complex plane

It is very useful at various points of the course to extend the complex plane by adding a point ‘at infinity’. To do this, we create a new object called ‘infinity’, denoted \(\infty\), and consider the set

\[\hat{\mathbb{C}}:=\mathbb{C}\cup\{\infty\}.\] At the moment, we have accomplished nothing really. What will be useful later is that we can think of the point \(\infty\) as being glued ‘nicely’ onto \(\mathbb{C}\). The correct way to do this is by introducing the Riemann sphere.

The Riemann sphere

Consider the unit sphere \(S^{2}:=\{(x,y,s)\in\mathbb{R}^{3}:x^{2}+y^{2}+s^{2}=1\}\) in \(\mathbb{R}^{3}\) and consider a copy of \(\mathbb{C}\) embedded in \(\mathbb{R}^{3}\) by identifying \(\mathbb{C}=\mathbb{R}^{2}\) with the \((x,y)\)-plane. Explicitly, a point \(x+iy\in\mathbb{C}\) corresponds to the point \((x,y,0)\in\mathbb{R}^{3}\).

Let \(N=(0,0,1)\in S^{2}\) denote the ‘north pole’. For any point \(v\in S^{2}\setminus\{N\}\), there is a unique straight line \(L_{N,v}\) passing through \(N\) and \(v\). Since \(v\neq N\), this line is not parallel to the \((x,y)\)-plane. Hence it intersects the \((x,y)\)-plane in a unique point \((x,y,0)\). This corresponds to the point \(x+iy\in\mathbb{C}\). We have defined a map \(P:S^{2}\setminus\{N\}\to\mathbb{C}\) by \(P(v)=x+iy\) in the notation of the preceding discussion. The map \(P\) is called the stereographic projection (from the north pole).

What is the formula for stereographic projection? Let \((x,y,s)\in S^{2}\setminus\{N\}\). Note that \(s\neq1\). The equation of the line passing through the point \((x,y,s)\in S^{2}\) and the North Pole \(N=(0,0,1)\in S^{2}\) is given by \[\gamma(t)\:=\:N+\left(\begin{pmatrix}x\\ y\\ s \end{pmatrix}-N\right)\,t\:=\:\begin{pmatrix}0\\ 0\\ 1 \end{pmatrix}+\begin{pmatrix}x\\ y\\ s-1 \end{pmatrix}\,t,\quad\quad\quad(t\in\mathbb{R}).\] This clearly intersects the plane when \(t=\frac{1}{1-s}\). Thus \[P:(x,y,s)\mapsto\frac{x}{1-s}+\frac{iy}{1-s}.\]

It is possible to find an inverse to \(P\): given any point \(z\in\mathbb{C}\), draw the straight line passing through \(N\) and \(z\) (thinking of \(\mathbb{C}\) as the \((x,y)\)-plane as before). This straight line meets \(S^{2}\setminus\{N\}\) in exactly one point. Hence \(P\) is a bijection that identifies \(S^{2}\setminus\{N\}\) with \(\mathbb{C}\).

Now that we have identified \(\mathbb{C}\) with \(S^{2}\setminus\{N\}\) via the map \(P\), we have a natural way to view the added point \(\infty\) of \(\hat{\mathbb{C}}\): it should correspond to adding the north pole back in to \(S^{2}\setminus\{N\}\). In other words, we should think of \(\hat{\mathbb{C}}\) simply as the entire sphere \(S^{2}\)!

In fact, one can show that we have the following correspondences/mappings: \[\begin{array}{rcl} \underline{\text{In } S^{2}} & & \underline{\text{In } \hat{\mathbb{C}}}\\ N & \longleftrightarrow & \infty\\ S & \longleftrightarrow & 0\\ \text{Equator} & \longleftrightarrow & \text{Unit circle }\{z\in\mathbb{C}:|z|=1\}\\ \text{(open) Southern hemisphere} & \longleftrightarrow & \mathbb{D}:=\{z\in\mathbb{C}:|z|<1\}\\ \text{(open) Northern hemisphere} & \longleftrightarrow & \hat{\mathbb{C}}\setminus\overline{B}_{1}(0)=\hat{\mathbb{C}}\setminus\{z\in\mathbb{C}:|z|\leq1\}\\ & \text{Algebraically:} & \\ (x,y,s) & \longleftrightarrow & \frac{x+iy}{1-s}\quad\text{(Stereo. Proj.)}\\ \frac{1}{|z|^{2}+1}\left(2\mathrm{Re}(z),2\mathrm{Im}(z),|z|^{2}-1\right) & \longleftrightarrow & z\quad\text{(Inverse Stereo. Proj.)}\end{array}\]
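The two algebraic formulas in the table can be checked against each other. The following Python sketch uses the hypothetical names `stereo` and `stereo_inv` for the projection and its inverse; points of \(S^{2}\) are represented as triples \((x,y,s)\).

```python
def stereo(v):
    """Stereographic projection from the north pole: (x, y, s) -> (x + iy)/(1 - s)."""
    x, y, s = v
    if s == 1:
        raise ValueError("the north pole has no image in C")
    return complex(x, y) / (1 - s)


def stereo_inv(z):
    """Inverse projection: z -> (2 Re z, 2 Im z, |z|^2 - 1) / (|z|^2 + 1)."""
    d = abs(z) ** 2 + 1
    return (2 * z.real / d, 2 * z.imag / d, (abs(z) ** 2 - 1) / d)


z = 0.3 - 1.7j
v = stereo_inv(z)
print(v, sum(c * c for c in v))   # the point lies on the unit sphere
print(stereo(v))                  # recovers z up to rounding error
print(stereo_inv(0j))             # the south pole
print(stereo_inv(1j))             # a point on the equator
```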

Note that we could have used the south pole \(S=(0,0,-1)\), rather than \(N\), to define the projection. In that case we would have the correspondence \((x,y,s)\mapsto\frac{x+iy}{1+s}\) (and you can check that the map \(f(z)=1/\bar{z}\) takes \(\frac{x+iy}{1-s}\) to \(\frac{x+iy}{1+s}\)).

The Riemann sphere is the unit sphere \(S^{2}\subset\mathbb{R}^{3}\) along with the stereographic projections from the north and south pole.

Later in your studies you might learn that the Riemann sphere is a special example of a Riemann surface. The purpose of considering the two stereographic projection maps as part of the definition is that any point in \(S^{2}\) is in the domain of one of the projections, so informally speaking, the maps allow us to think of a region nearby to any point in \(S^{2}\) as a region inside \(\mathbb{C}\).

2 Metric Spaces

2.1 Metric spaces

Now we have another way of thinking about \(\hat{\mathbb{C}}\) - as a sphere in \(\mathbb{R}^{3}\) - it looks like there might be (at least!) two natural ways to measure the distance between two points \(z\) and \(w\) in the extended complex plane (or indeed in \(\mathbb{C}\)).

In both cases, the set is the same, but there is a different notion of distance. A metric space is a set together with a ‘distance’ function that satisfies certain axioms.

[Metric spaces] A metric space is a set \(X\) together with a function \(d:X\times X\to\mathbb{R}_{\geq0}\) such that for all \(x,y,z\in X\)

  • (D1) Positivity. \(d(x,y)\geq0\) and \(d(x,y)=0\Longleftrightarrow x=y\);

  • (D2) Symmetry. \(d(x,y)=d(y,x)\);

  • (D3) Triangle inequality. \(d(x,y)\leq d(x,z)+d(z,y).\)

The function \(d\) is called a metric and we will often denote a metric space by \((X,d)\).

Examples of Metrics

  1. The metric induced by the modulus function \(|\,.\,|\) on \(\mathbb{R}\) or \(\mathbb{C}\). We can define a distance function \(d\) on \(\mathbb{R}\times\mathbb{R}\) or \(\mathbb{C}\times\mathbb{C}\) by the formula \(d(x,y)=|x-y|\). This metric satisfies (D1)-(D3) by the properties \(1.\) and \(2.\) of the modulus we gave just after Corollary [cor:norm-mult].

  2. The Euclidean norm on \(\mathbb{R}^n\) or \(\mathbb{C}^n\). For vectors \(\boldsymbol{x},\boldsymbol{y}\) in \(\mathbb{R}^{n}\) (or \(\mathbb{C}^{n}\)), the function \[d(\boldsymbol{x},\boldsymbol{y}):=\|\boldsymbol{x}-\boldsymbol{y}\|_{2}=\sqrt{\sum_{i=1}^{n}|x_{i}-y_{i}|^{2}}\] is a metric. It is easy to check properties \((D1)-(D3)\). Note that this norm comes from an inner-product. For example, for \(n=2\), the real Euclidean norm on \(\mathbb{R}^{2}\) comes from the usual dot product \(\|\boldsymbol{x}\|_{2}=\sqrt{\boldsymbol{x}\cdot\boldsymbol{x}}\) and the complex Euclidean norm on \(\mathbb{C}^{2}\) comes from the inner product \(\left<z,w\right>=z_{1}\bar{w}_{1}+z_{2}\bar{w}_{2}\); that is \(\|z\|_{2}=\sqrt{\left<z,z\right>}\).

    • (a) Metrics induced from inner products in vector spaces

      More generally, given any finite dimensional real vector space \(V\) with a (positive definite) inner product \(\left<\,.\,,\,.\,\right>\), the function \[d(v,w):\,=\,\|v-w\|\,=\,\sqrt{\left<v-w,v-w\right>}\quad\quad(v,w\in V)\] is a metric. Properties (D1) and (D2) are obvious; property (D3) follows from the Cauchy-Schwarz inequality \(|\left<v,w\right>|\leq\|v\|\cdot\|w\|\) (see sheet 2).

    • (b) Metrics induced from norms in vector spaces

      Even more generally, so long as a vector space has a ‘nice’ notion of the ‘size’ of each vector, we can define a metric in the obvious way. Such a notion is in generality referred to as a norm:

      [Norms and normed vector spaces] Given any real or complex vector space \(V\), a function \(\|\,.\,\|:V\to\mathbb{R}_{\geq0}\) is a norm if it satisfies (for \(v,w\in V\))

      • (N1) \(\|v\|\geq0\) and \(\|v\|=0\quad\Longleftrightarrow\quad v=0;\)

      • (N2) \(\|\lambda v\|=|\lambda|\cdot\|v\|\quad\) for \(\lambda\in\mathbb{R}\) or \(\mathbb{C}\);

      • (N3) \(\|v+w\|\leq\|v\|+\|w\|\) (the triangle inequality).

      Note that (N3) implies \(\|v-w\|\geq\left|\,\|v\|-\|w\|\,\right|\) (the reverse triangle inequality). A vector space equipped with a norm is called a normed vector space. The function \(d(v,w):=\|v-w\|\) then always defines a metric (it is easy to check properties (D1)-(D3)). In particular, since the modulus function on \(\mathbb{C}\) is a norm, the metric we get from the modulus function comes from a norm.

  3. \(\ell_p\)-norm on \(\mathbb{R}^n\) or \(\mathbb{C}^n \, (p \geq 1)\)

    The above suggests that a vector space could be home to many different norms (so many different metrics). But not all norms arise from an inner product as in (a) above; for example, for vectors \(\boldsymbol{x}\) in \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\), the function \[\|\boldsymbol{x}\|_{p}:=\sqrt[p]{\sum_{i=1}^{n}|x_{i}|^{p}}\] defines a norm for every \(p\geq1\), called the \(\boldsymbol{\ell_{p}}\)-norm. For \(p=2\) this is the Euclidean norm, but for \(p\ne2\) (and \(n\geq2\)) it does not arise from an inner product. When \(p=1\), the \(\ell_{1}\)-norm is simply given by the sum of the sizes of the components \(\|\boldsymbol{x}\|_{1}=\sum_{i=1}^{n}|x_{i}|\) and is sometimes referred to as the Taxicab norm.

  4. \(\ell_\infty\)-norm on \(\mathbb{R}^n\) (or \(\mathbb{C}^n\))

    The function \[\|\boldsymbol{x}\|_{\infty}:=\max_{i=1,\ldots,n}{|x_{i}|}\] also defines a norm, called the \(\boldsymbol{\ell_{\infty}}\)-norm (or the sup-norm), thus it also defines a metric. It is in some sense the ‘limiting notion’ of the \(\ell_{p}\) norms.

  5. Riemannian (chordal) metric on \(\hat{\mathbb{C}}\)

    Let \(f:\hat{\mathbb{C}}\to S^{2}\) be the (inverse of the) stereographic projection. Then the function \[d(z,w):=\|f(z)-f(w)\|_{2}\quad\quad(z,w\in\hat{\mathbb{C}}),\] where \(\|\,.\,\|_{2}\) is the Euclidean norm in \(\mathbb{R}^{3}\) (so, the \(\ell_{2}\)-norm), is a metric on \(\hat{\mathbb{C}}\). It is called the Riemannian metric (or chordal metric). Note that with respect to this metric, the distance from \(0\) to \(i\) is the same as the distance from \(i\) to \(\infty\), for example!

  6. Discrete metric

    Let \(X\) be a non-empty finite set. Then (for \(x,y\in X\)) the function \[d(x,y):=\begin{cases} 0\ & \text{ if }\quad x=y,\\ 1\ & \text{ if }\quad x\neq y, \end{cases}\] defines a metric, called the discrete metric. It is easy to check (D1)-(D3). In this case, \((X,d)\) is called a discrete metric space.

  7. Function spaces

    There are many of these, such as the space \(X=C([a,b])\) of continuous functions on an interval \([a,b]\). The function \[\|f\|:=\max_{x\in[a,b]}|f(x)|\] defines a norm, and thus a metric (see Analysis III for more examples).

  8. Subspace metric

    Any non-empty subset \(Y\subset X\) of a metric space \(X\) is itself a metric space with respect to the same metric (this is easy to check). The metric restricted to the set \(Y\) is then called the subspace metric. [There is actually much more than meets the eye with this metric - see sheets 2 and 3.]
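Several of the metrics in the list above are easy to experiment with. The following Python sketch is illustrative only: the names `lp_norm`, `to_sphere`, `chordal` and `discrete` are hypothetical, and the sentinel object `INF` is an assumption used to represent the added point \(\infty\) of \(\hat{\mathbb{C}}\).

```python
import math

INF = object()  # hypothetical marker for the point at infinity


def lp_norm(x, p):
    """The l_p norm of a finite sequence x, with p = math.inf giving the max norm."""
    if p == math.inf:
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)


def to_sphere(z):
    """Inverse stereographic projection, sending INF to the north pole."""
    if z is INF:
        return (0.0, 0.0, 1.0)
    d = abs(z) ** 2 + 1
    return (2 * z.real / d, 2 * z.imag / d, (abs(z) ** 2 - 1) / d)


def chordal(z, w):
    """Chordal distance: Euclidean distance in R^3 between the points on the sphere."""
    a, b = to_sphere(z), to_sphere(w)
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))


def discrete(x, y):
    """Discrete metric: 0 if the points coincide and 1 otherwise."""
    return 0 if x == y else 1


x = (3, -4)
print(lp_norm(x, 1), lp_norm(x, 2), lp_norm(x, math.inf))
print(chordal(0, 1j), chordal(1j, INF))   # the two distances should agree
print(chordal(0, INF))                    # diameter of the unit sphere
```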

2.2 Open and closed sets

Since we have a general notion of distance in any metric space \(X\), we can define balls in the space. These will be the key to understanding the topology of the space.

[Balls in a metric space] Let \((X,d)\) be a metric space, \(x\in X\) and let \(r>0\) be a real number. Then:

  • The open ball \(\boldsymbol{B_{r}(x)}\) of radius \(\boldsymbol{r}\) centred at \(\boldsymbol{x}\) is \[B_{r}(x):=\{y\in X:\:d(x,y)<r\}.\]

  • The closed ball \(\boldsymbol{\bar{B}_{r}(x)}\) of radius \(\boldsymbol{r}\) centred at \(\boldsymbol{x}\) is \[\bar{B}_{r}(x):=\{y\in X:\:d(x,y)\leq r\}.\]

Visualizing balls (examples)

  1. Let \(X=\mathbb{C}\) and \(d(z,w)=|z-w|\), then \(B_{1}(0)=\mathbb{D}=\{z\::\:|z|<1\}\) as before. More generally \(B_{r}(z_{0})\) is the usual ball of radius \(r\) around \(z_{0}\), not including its boundary circle. \(\bar{B}_{r}(z_{0})\) is the ball of radius \(r\) around \(z_{0}\), including its boundary circle. This is the most important example from the point of view of Complex Analysis.

  2. Let us consider the unit ball \(B_{1}(\mathbf{0})\) in \(\mathbb{R}^{2}\) with respect to the \(\ell_{p}\)-norms, for \(p=1,2\) and \(\infty\).

    For \(p=2\) the unit ball \(B_{1}(\mathbf{0})\) is the usual Euclidean ball - so the inside of the unit circle centred at the origin. For \(p=\infty\), the inequality \(\max\{|x|,|y|\}<1\) (for \((x,y)\in\mathbb{R}^{2}\)) clearly defines the interior of a square with vertices \((1,1),(-1,1),(1,-1)\) and \((-1,-1)\).

    For the \(\ell_{1}\)-norm a little care is needed. We are interested in the points \((x,y)\in\mathbb{R}^{2}\) with \(|x|+|y|<1\). In the 1st quadrant this means \(y<1-x\), in the 2nd it means \(y<1+x\), in the 3rd we have \(y>-1-x\) and in the 4th it is \(y>x-1\). Thus, the unit ball is the interior of a diamond with vertices \((1,0),(0,1),(-1,0)\) and \((0,-1)\).
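To see how the three unit balls nest inside one another (diamond inside disc inside square), one can test a few points numerically. This is a small sketch; the function names `lp_norm` and `in_open_unit_ball` are introduced here purely for illustration.

```python
import math


def lp_norm(v, p):
    """l_p norm of a point v = (x, y) in R^2, for p >= 1 or p = math.inf."""
    x, y = v
    if p == math.inf:
        return max(abs(x), abs(y))
    return (abs(x) ** p + abs(y) ** p) ** (1.0 / p)


def in_open_unit_ball(v, p):
    """True if v lies in the open unit ball B_1(0) for the l_p norm."""
    return lp_norm(v, p) < 1


# Points on the diagonal y = x, moving outwards from the origin.
for point in [(0.4, 0.4), (0.6, 0.6), (0.9, 0.9)]:
    membership = {p: in_open_unit_ball(point, p) for p in (1, 2, math.inf)}
    print(point, membership)
```

The point \((0.4,0.4)\) lies in all three balls, \((0.6,0.6)\) has left the diamond but is still in the disc and the square, and \((0.9,0.9)\) lies only in the square.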

[Open/closed sets in a metric space] Let \((X,d)\) be a metric space. Then:

  • A subset \(U\subseteq X\) is open (in \(\boldsymbol{X}\)) if for every point \(x\in U\) there exists \(\epsilon>0\) such that \(B_{\epsilon}(x)\subset U\).

  • A subset \(U\subseteq X\) is closed (in \(\boldsymbol{X}\)) if its complement \(X\setminus U\) is open.

Sets in a metric space can be open and closed at the same time! For example, the empty set \(\emptyset\) and the whole metric space \(X\) are both open and closed. Such sets are referred to as ‘clopen’.

[Open balls are open] In a metric space, the open ball \(B_{r}(x)\) is open!

Let \(y\in B_{r}(x)\) with \(d(x,y)=s\) (and so \(s<r\)). We need to show there exists \(\epsilon>0\) such that \(B_{\epsilon}(y)\subseteq B_{r}(x)\). Simply take \(\epsilon=r-s>0\). Then for every \(z\in B_{\epsilon}(y)\) we have \[d(x,z)\:\stackrel{(D3)}{\leq}\:d(x,y)+d(y,z)\:<\:s+\epsilon\:=\:r.\] Thus, \(z\in B_{r}(x)\) as required.
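The choice \(\epsilon=r-s\) in this proof can be sanity-checked numerically in \(\mathbb{C}\) with the standard metric. The sketch below samples random points of \(B_{\epsilon}(y)\) and confirms that each one lies in \(B_{r}(x)\); the function name is hypothetical, and the factor 0.999 is an assumption added only to keep sampled points safely away from the boundary, where floating-point rounding could mislead.

```python
import cmath
import math
import random


def inner_ball_stays_inside(x, r, y, trials=1000, seed=0):
    """Sample points z with |z - y| < eps = r - |x - y| and test |z - x| < r."""
    s = abs(x - y)
    if not s < r:
        raise ValueError("y must lie in the open ball B_r(x)")
    eps = r - s  # the radius chosen in the proof
    rng = random.Random(seed)
    for _ in range(trials):
        # Random radius in [0, 0.999 * eps) and random angle in [0, 2*pi).
        rho = 0.999 * eps * rng.random()
        theta = 2 * math.pi * rng.random()
        z = y + cmath.rect(rho, theta)
        if abs(x - z) >= r:
            return False
    return True


print(inner_ball_stays_inside(0j, 1.0, 0.3 + 0.4j))
```

Sampling is of course no substitute for the triangle inequality argument; it only illustrates that the radius \(r-s\) is the right size.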

It can also be shown that in a metric space the closed ball \(\overline{B}_{r}(x)\) is closed (see Sheet 2).

Open sets (examples/warnings)

  1. All of the previously encountered subsets \(\mathbb{H},\mathbb{D},\mathbb{C}^{\ast}\) and \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\) of the complex plane are open (see sheet 2). The 1st quadrant \(\Omega_{1}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>~0,\mathrm{Im}(z)>0\}\) is open. To see this, for \(z\in\Omega_{1}\) consider the ball \(B_{r}(z)\) where \(r=\min(\mathrm{Re}(z),\mathrm{Im}(z))/2\).

  2. Let \(X\) be a discrete metric space, so \(d(x,y):=\begin{cases} 0\ & \text{ if }\quad x=y,\\ 1\ & \text{ if }\quad x\neq y. \end{cases}\quad\) Then, for \(x\in X\) and \(r>0\) we have \[B_{r}(x)=\begin{cases} \{x\}\ & \text{ if }\quad r\leq1,\\ X\ & \text{ if }\quad r>1. \end{cases}\] Therefore, (by Lemma [lem:open-balls-are-open]) every singleton \(\{x\}\) is an open set with respect to the discrete metric. Moreover, the complement \(X\setminus\{x\}\) is also open, since for any \(y\in X\setminus\{x\}\) (that is, any \(y\neq x\) in \(X\)) and any \(r<1\) the open ball \(B_{r}(y)=\{y\}\) is contained in \(X\setminus\{x\}\). Thus, all balls are clopen with respect to the discrete metric! In fact, any subset \(Y\subseteq X\) of a discrete metric space is clopen!

  3. Sets don’t have to be either open or closed. For example, \([0,1)\) is neither open nor closed in \(\mathbb{R}\) (with respect to the standard metric \(|\,.\,|\)) - simply check the point \(x=0\) in \([0,1)\) and the point \(x=1\) in the complement \((-\infty,0)\cup[1,\infty)\). However, recall that any subset of a metric space is itself a metric space (w.r.t the same metric) - the subspace metric. Thus the pair \(([0,1),|\,.\,|)\) is a metric space - but then (by the remark after Definition [def:open-and-closed-sets]) the set \([0,1)\) is open!

    Key: Open and closed sets are really relative notions, depending on the ambient space (as well as the metric).

    Notation: When we say that a subset of \(\mathbb{R}\) or \(\mathbb{R}^{n}\) or \(\mathbb{C}\) is open/closed, we will mean with respect to the standard norms \(|\,.\,|\) and \(\|\,.\,\|_{n}\) and \(|\,.\,|\) respectively. Most sets we encounter do not simply look like open/closed balls, so it will be useful to have rules for union and intersection:

[Unions and intersections of open sets] Let \((X,d)\) be a metric space. Then:

  1. Arbitrary unions of open sets are open; that is \[\bigcup_{i\in I}\,U_{i}\quad\text{is open,}\quad\text{ for any (possibly infinite) collection of open sets }U_{i}.\]

  2. Finite intersections of open sets are open; that is \[\bigcap_{i=1}^{n}\,U_{i}\quad\text{is open,}\quad\text{ for any finite collection of open sets }U_{i}.\quad\quad\quad\quad\quad\quad\quad\]

  1. Let \(x\in\bigcup_{i\in I}U_{i}\). Then, by definition, it must be contained in the set \(U_{j}\) for some \(j\in I\). Since \(U_{j}\) is open there must exist a ball \(B_{\epsilon}(x)\) centred at \(x\) lying in \(U_{j}\). But then \(B_{\epsilon}(x)\,\subseteq\,U_{j}\,\subseteq\,\bigcup_{i\in I}U_{i}\) as required.

  2. Let \(x\in\bigcap_{i=1}^{n}U_{i}\). By definition \(x\in U_{i}\) for every \(i=1,\ldots,n\). But, since they are all open, for every \(U_{i}\) there must exist \(r_{i}>0\) such that \(B_{r_{i}}(x)\subset U_{i}\). Now simply take \(\epsilon={\displaystyle \min_{i=1,\ldots n}(r_{i})}\). Then for every \(i\) we have \(B_{\epsilon}(x)\subseteq B_{r_{i}}(x)\) and so \[B_{\epsilon}(x)\,\subseteq\,\bigcap_{i=1}^{n}B_{r_{i}}(x)\,\subseteq\,\bigcap_{i=1}^{n}U_{i}.\]

[Unions and intersections of closed sets] Let \((X,d)\) be a metric space. Then:

  1. Finite unions of closed sets are closed.

  2. Arbitrary intersections of closed sets are closed.

De Morgan’s laws together with Lemma [lem:metrics-give-topology].

We have the following remarks about open and closed sets.

  • An infinite intersection of open sets is not necessarily open (see Sheet 2). Similarly, an infinite union of closed sets is not necessarily closed: e.g., the union of closed intervals in \(\mathbb{R}\), \[\bigcup_{i=1}^{\infty}\left[\frac{1}{i},\:1-\frac{1}{i}\right]\,=\,(0,1)\quad\text{ is open in }\mathbb{R}.\]

  • The next generalisation of a metric space you will encounter (see Topology III) is called a Topological space \(\mathcal{T}\). There, the only stipulations are the existence of open sets such that \[(i)\:\emptyset\text{ and }\mathcal{T}\text{ are open};\quad\quad(ii)\:\text{Lemma [lem:metrics-give-topology] holds.}\] We have a hierarchy: \[\text{Inner-product space}\,\Longrightarrow\,\text{Normed space}\,\Longrightarrow\,\text{Metric space}\,\Longrightarrow\,\text{Topological space}.\]

  • Why have we been looking at examples in \(\mathbb{R}^{2}\) rather than \(\mathbb{C}\)? It turns out that both these spaces are ‘topologically equivalent’, that is, they have the same open sets - this is obvious since the complex modulus is essentially just the Euclidean norm on \(\mathbb{R}^{2}\).
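The identity \(\bigcup_{i=1}^{\infty}[1/i,\,1-1/i]=(0,1)\) from the first remark can be made concrete: for any \(x\in(0,1)\) one can compute the first interval that contains it. The sketch below uses exact rational arithmetic to avoid rounding issues; the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def first_interval_containing(x):
    """Smallest i >= 1 with 1/i <= x <= 1 - 1/i, or None if there is none."""
    x = Fraction(x)
    if not (0 < x < 1):
        return None  # the endpoints 0 and 1 are never reached
    # We need i >= 1/x and i >= 1/(1 - x); take the smallest such integer.
    return max(ceil(1 / x), ceil(1 / (1 - x)))


def in_interval(x, i):
    """Membership test for the closed interval [1/i, 1 - 1/i]."""
    return Fraction(1, i) <= Fraction(x) <= 1 - Fraction(1, i)


for x in (Fraction(1, 2), Fraction(1, 1000), Fraction(999, 1000), 0, 1):
    print(x, first_interval_containing(x))
```

Every point of \((0,1)\) is captured at some finite stage, but the required index grows without bound as \(x\) approaches \(0\) or \(1\), which is why no finite union suffices and why the endpoints are missed.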

As we have seen, some sets are neither open nor closed. It will be useful to ask what the largest possible open set is inside a given set. Similarly, what is the smallest closed set containing a given set?

[Interior points, closure, boundary, exterior] Let \(A\) be a subset of a metric space \((X,d)\).

  • The interior \(A^{0}\) of \(A\) is defined by \[A^{0}:=\{x\in A:\,\text{there exists an open set }U\subseteq A\text{ such that }x\in U\}.\]

  • The closure \({\bar{A}}\) of \(A\) is the complement of the interior of the complement: \[\bar{A}:=\{x\in X:\,U\cap A\neq\emptyset\text{ for every open set }U\text{ with }x\in U\}.\]

  • The boundary \({\partial A}\) of \(A\) is the closure without the interior: \[\partial A:=\bar{A}\setminus A^{0}\quad\quad\left[=X\setminus(A^{0}\cup(X\setminus A)^{0})\right].\]

  • The exterior \({A^{e}}\) of \(A\) is the complement of the closure: \[A^{e}:=X\setminus\bar{A}\quad\quad\left[=X\setminus(A^{0}\cup\partial A)=(X\setminus A)^{0}\right].\]

  • Don’t confuse closure with conjugation! Closure concerns sets in any metric space, conjugation concerns points in \(\mathbb{C}\).

  • Clearly the interior and exterior are open and clearly the boundary is closed. The closure is also closed (see sheet 2) - in fact, it is often defined more simply as \(\bar{A}:=X\setminus(X\setminus A)^{0}\), from which the closedness is obvious - it is the complement of an open set.

  • The boundary matches our naive notion. Broadly speaking, the interior of a set consists of all the points that are not on its ‘edge’, and to form the closure of a set you simply add all the missing edge points. Indeed, we have the following additional properties of a subset \(A\subset X\) (see sheet 2): \[\begin{aligned} (a) & \:A\text{ is open } & \iff & \:\:\partial A\cap A=\emptyset & \iff & \:A=A^{0}; & \text{In fact } & \quad A^{0}={\displaystyle \bigcup_{\substack{U\subseteq A\\ U\text{ open} } }U;}\\ (b) & \:A\text{ is closed } & \iff & \quad\partial A\subseteq A & \iff & \:A=\bar{A}; & \text{In fact } & \quad\bar{A}={\displaystyle \bigcap_{\substack{A\subseteq F\\ F\text{ closed} } }F;\ }\end{aligned}\] That is, the interior \(A^{0}\) is the largest open set contained in \(A\) and the closure \(\bar{A}\) is the smallest closed set containing \(A\). Convince yourself that all the definitions reflect your intuitive notions, say, for the plane!

  • In \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\) for simple sets we only have to replace strict inequalities with non-strict ones (or vice versa) to obtain the closure (or interior). For example, for \(A = \{ z \in \mathbb{C} : 1 < \vert z \vert \leq 3\}\), we have \(A^{0} = \{ z \in \mathbb{C} : 1 < \vert z \vert < 3\}\), \(\bar{A} = \{ z \in \mathbb{C} : 1 \leq \vert z \vert \leq 3\}\), and \(\partial A = \{ z \in \mathbb{C} : \vert z \vert =1 \} \cup \{ z \in \mathbb{C} : \vert z \vert =3\}\).

    Similarly \[\overline{\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|<1\}}=\{z\in\mathbb{C}:\,1\leq\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|\leq1\}\] and \[\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|<1\}^{0}=\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)<3,|\mathrm{Im}(z)|<1\}.\] In fact, \(\overline{B_{r}(x)}=\bar{B}_{r}(x)\) for any ball in \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\).

    However, this is not true in every metric space (see sheet 2) - there are metric spaces for which \(\overline{B_{r}(x)}\neq\bar{B}_{r}(x)\); that is, the smallest closed set containing the open ball \(B_{r}(x)\) is not the closed ball \(\bar{B}_{r}(x)\)! [Hint: what if the open ball is already closed!?]
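For the annulus \(A=\{z\in\mathbb{C}:1<|z|\leq3\}\) above, the decomposition of the plane into interior, boundary and exterior can be written as a small classifier. This is a sketch only: the function names are made up, and the exact comparisons `m == 1` and `m == 3` are reliable only for simple inputs whose modulus is computed without rounding.

```python
def in_A(z):
    """Membership in A = {z : 1 < |z| <= 3}."""
    return 1 < abs(z) <= 3


def classify(z):
    """Return which of A^0, the boundary of A, or A^e contains z."""
    m = abs(z)
    if 1 < m < 3:
        return "interior"   # A^0: both inequalities strict
    if m == 1 or m == 3:
        return "boundary"   # the two circles |z| = 1 and |z| = 3
    return "exterior"       # complement of the closure


def in_closure(z):
    """The closure is the union of the interior and the boundary."""
    return classify(z) != "exterior"


for z in (2, 1, 3j, 0, 5):
    print(z, classify(z), in_A(z), in_closure(z))
```

Note that the boundary point \(1\) is not in \(A\) while the boundary point \(3i\) is, which matches the fact that \(A\) is neither open nor closed.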

2.3 Convergence and continuity

With our general notion of distance (a metric) comes a natural notion of convergence.

[Limits and convergence in a metric space] We say a sequence \(\{x_{n}\}\) in a metric space \((X,d)\) converges to \(x\in X\) if we have \[{\displaystyle \lim_{n\to\infty}d(x_{n},x)=0.}\] That is, \[\text{ for every }\epsilon>0\text{ there exists }N\in\mathbb{N}\text{ such that }d(x_{n},x)<\epsilon\text{ for every }n>N.\] We write “\(x_{n}\rightarrow x\) as \(n\rightarrow\infty\)”, or “\({\displaystyle \lim_{n\to\infty}x_{n}=x}\)”.

Convergent sequences (example)

As mentioned, the chordal metric on \(\hat{\mathbb{C}}\) is \(d(z,w)=\|f(z)-f(w)\|_{2}\), where \(\|\,.\,\|_{2}\) is the Euclidean norm on \(\mathbb{R}^{3}\) and \(f\) is the inverse Stereographic projection given by \[f(z)=\left(\frac{2\mathrm{Re}(z)}{|z|^{2}+1},\:\frac{2\mathrm{Im}(z)}{|z|^{2}+1},\:\frac{|z|^{2}-1}{|z|^{2}+1}\right).\] Show that with respect to this metric the sequence \(\{ki\}_{k\in\mathbb{N}}\) in \(\hat{\mathbb{C}}\) converges to \(\infty\in\hat{\mathbb{C}}\).

Since \(|ki|=k\) and \(f(\infty)=(0,0,1)\) we have \[\begin{aligned} d(ki,\infty)&=\|f(ki)-f(\infty)\|_{2} = \left\Vert \left(0,\frac{2k}{k^{2}+1},\frac{k^{2}-1}{k^{2}+1}\right)-(0,0,1)\right\Vert _{2}\\ & = \left\Vert \left(0,\frac{2k}{k^{2}+1},-\frac{2}{k^{2}+1}\right)\right\Vert _{2}\\ & = \sqrt{\left(\frac{2k}{k^{2}+1}\right)^{2}+\left(\frac{-2}{k^{2}+1}\right)^{2}}\quad\longrightarrow\quad0\quad\text{ as }k\rightarrow\infty.\end{aligned}\] Thus, the sequence indeed converges to \(\infty\). This is quite an odd notion as we are used to saying sequences ‘diverge’ if they tend to infinity. The key is that convergence depends on the metric being used.
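The computation above is easy to reproduce numerically. The sketch below implements the inverse stereographic projection \(f\) and the chordal metric, representing the point \(\infty\) by a sentinel value (the name `INFINITY` and the use of `None` are choices made for this illustration only). Simplifying the last line of the computation gives \(d(ki,\infty)=2/\sqrt{k^{2}+1}\), which is printed alongside for comparison.

```python
import math

INFINITY = None  # hypothetical sentinel standing for the point at infinity


def inverse_stereographic(z):
    """Map a point of the extended plane to the unit sphere in R^3."""
    if z is INFINITY:
        return (0.0, 0.0, 1.0)  # the north pole
    z = complex(z)
    m2 = abs(z) ** 2
    return (2 * z.real / (m2 + 1), 2 * z.imag / (m2 + 1), (m2 - 1) / (m2 + 1))


def chordal(z, w):
    """Chordal distance: Euclidean distance between the images on the sphere."""
    return math.dist(inverse_stereographic(z), inverse_stereographic(w))


# The sequence ki approaches infinity in the chordal metric.
for k in (1, 10, 100, 1000):
    print(k, chordal(k * 1j, INFINITY), 2 / math.sqrt(k * k + 1))

# The earlier observation: 0 and infinity are equally far from i.
print(chordal(0, 1j), chordal(1j, INFINITY))
```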

Limits in \(\mathbb{C}\) with the standard metric (Very important!).

The above definition of limit in a metric space also gives us an example of limits in \(\mathbb{C}\) with the standard metric. This says that if \(z_{n}\) is a sequence of complex numbers, then \(\lim_{n\to\infty}z_{n}=z\) if and only if

‘for all \(\epsilon>0\), there exists \(N>0\) such that for all \(n\geq N\), \(|z_{n}-z|<\epsilon\)’.

Note this is the same definition as in Analysis I, but replacing the absolute value on the real line by the modulus on \(\mathbb{C}\). Importantly, by the same proofs as in Analysis I, limits in the complex plane follow the COLT rules.

Furthermore there is a very important link between convergence in the complex plane and real convergence (see Analysis I for the proof). Let \(\{z_{n}\}_{n\in\mathbb{N}}\) be a sequence of complex numbers \(z_{n}=x_{n}+iy_{n}\). Then, for any fixed \(z_{0}=x_{0}+iy_{0}\in\mathbb{C}\) we have \[\lim_{n\rightarrow\infty}z_{n}=z_{0}\quad\iff\quad\lim_{n\rightarrow\infty}x_{n}=x_{0}\:\:\text{ and }\:\:\lim_{n\rightarrow\infty}y_{n}=y_{0}.\] In other words, the sequence \(\{z_{n}\}\) converges iff the real sequences \(\{\mathrm{Re}(z_{n})\}\) and \(\{\mathrm{Im}(z_{n})\}\) converge (see Sheet 3).

Let’s return to our general setting of metric spaces and prove some properties of limits.

[Limits and open sets] Let \((X,d)\) be a metric space. Then:

  1. A sequence can have at most one limit.

  2. We have \[\lim_{n\to\infty}x_{n}=x\quad\Longleftrightarrow\quad\forall\text{ open }U\text{ with }x\in U,\:\exists\,N\in\mathbb{N}\text{ such that }\:\forall n>N\quad x_{n}\in U.\] Hence the notion of a limit in a metric space can be stated in terms only of its open sets.

  1. Assume \({\displaystyle \lim_{n\to\infty}x_{n}=x}\) and \({\displaystyle \lim_{n\to\infty}x_{n}=y}\). Then we have for each \(n\), by the triangle inequality \(d(x,y)\leq d(x,x_{n})+d(x_{n},y)\), so taking the limit as \(n\to\infty\) gives \[d(x,y)\leq\lim_{n\to\infty}d(x,x_{n})+\lim_{n\to\infty}d(x_{n},y)=0+0=0,\] so \(d(x,y)=0\), hence \(x=y\) by property (D1) of metric spaces.

  2. (\(\Rightarrow\)): Assume \({\displaystyle \lim_{n\to\infty}x_{n}=x}\) and that \(U\) is open with \(x\in U\). By definition there exists \(r>0\) such that \(B_{r}(x)\subseteq U\) and an \(N\in\mathbb{N}\) such that \(d(x_{n},x)<r\) for every \(n>N\). Thus, \(x_{n}\in B_{r}(x)\subseteq U\) for every \(n>N\).

    (\(\Leftarrow\)): Let \(\epsilon>0\). We wish to find \(N\in\mathbb{N}\) such that \(d(x_{n},x)<\epsilon\) for \(n>N\). Consider the ball \(B_{\epsilon}(x)\). It is open and contains \(x\), and so there exists \(N\in\mathbb{N}\) such that \(x_{n}\in B_{\epsilon}(x)\) for every \(n>N\). This is precisely the statement that \(d(x_{n},x)<\epsilon\) for \(n>N\).

The key to completing proofs of this type is to write down the definitions in your assumptions and also write precisely what you need to prove. Usually, doing this leads very quickly to the proof.

We can now define what it means for a function between two metric spaces to be continuous (this will incorporate many of the functions we have already encountered; e.g., \(f:\mathbb{C}\to\mathbb{C};\:\mathbb{R}\to\mathbb{C};\) or \(\mathbb{C}\to\mathbb{R}\)).

[Continuity] A map \(f:(X_{1},d_{1})\to(X_{2},d_{2})\) between two metric spaces is called continuous at \(x_{0}\in X_{1}\) if \[\forall\,\epsilon>0\quad\exists\,\delta>0\quad\text{such that}\quad\forall\,x\in X_{1}\quad\text{we have}\quad d_{1}(x,x_{0})<\delta\quad\Rightarrow\quad d_{2}(f(x),f(x_{0}))<\epsilon.\] We say a function \(f\) is continuous on \(X_{1}\) if it is continuous at every point \(x_{0}\in X_{1}\). [Note that this is exactly the same as in Analysis I.]

Equivalently, one could write \[\forall\,\epsilon>0\quad\exists\,\delta>0\quad\text{such that}\quad x\in B_{\delta}(x_{0})\Rightarrow\quad f(x)\in B_{\epsilon}(f(x_{0})),\] where the first ball is in \(X_{1}\) and the second ball is in \(X_{2}\).
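As a concrete instance of the definition, take \(f(z)=z^{2}\) on \(\mathbb{C}\) with the standard metric. If \(|z-z_{0}|<\delta\leq1\) then \(|z^{2}-z_{0}^{2}|=|z-z_{0}||z+z_{0}|<\delta(2|z_{0}|+1)\), so \(\delta=\min(1,\epsilon/(2|z_{0}|+1))\) works. The sketch below (with hypothetical function names) samples the ball \(B_{\delta}(z_{0})\) to illustrate this choice; it is a numerical check, not a proof.

```python
import cmath
import math
import random


def delta_for_square(z0, eps):
    """A delta that works for f(z) = z^2 at z0, from the estimate above."""
    return min(1.0, eps / (2 * abs(z0) + 1))


def square_maps_ball_into_ball(z0, eps, trials=2000, seed=1):
    """Sample z in B_delta(z0) and test that z^2 lies in B_eps(z0^2)."""
    delta = delta_for_square(z0, eps)
    rng = random.Random(seed)
    for _ in range(trials):
        offset = cmath.rect(delta * rng.random(), 2 * math.pi * rng.random())
        z = z0 + offset
        if abs(z * z - z0 * z0) >= eps:
            return False
    return True


print(square_maps_ball_into_ball(1 + 2j, 0.1))
print(square_maps_ball_into_ball(-3j, 0.5))
```

Notice that \(\delta\) depends on \(z_{0}\) as well as on \(\epsilon\), which the definition of continuity at a point allows.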

[Basic properties of continuous functions]

  1. Sums, products and quotients of real/complex valued continuous functions on a metric space \(X\) are continuous. E.g., if \(f:X\to\mathbb{C}\) and \(g:X\to\mathbb{C}\) are continuous, then \(f+g\) and \(fg\) and \(f/g\) are continuous (where defined).

  2. Compositions of continuous functions are continuous. I.e., if \(f:X_{1}\to X_{2}\) and \(g:X_{2}\to X_{3}\) are continuous maps between metric spaces, then \(g\circ f:X_{1}\to X_{3}\) is continuous.

Almost word-for-word from Analysis I.

Examples of continuous functions on the complex plane (with the standard metric)

  • The identity function is continuous.

  • Constant functions are continuous.

  • The functions \(\mathrm{Re},\mathrm{Im}:\mathbb{C}\to\mathbb{R}\) are continuous.

  • The complex conjugation \(z\mapsto\bar{z}\) is continuous as a map from \(\mathbb{C}\to\mathbb{C}\).

  • The modulus function \(z\mapsto|z|\) is continuous as a map from \(\mathbb{C}\to\mathbb{R}\).

  • All of \(\exp,\sin,\cos,\sinh\) and \(\cosh\) are continuous on \(\mathbb{C}\), as are all polynomials.

  • If \(\arg\) is the choice of argument function with values in \((\theta_{1},\theta_{2}]\) then \(\arg\) is continuous on \(\mathbb{C}\backslash R_{\theta_{1}}\) (recall \(R_{\theta_{1}}\) is the ray with angle \(\theta_{1}\)).

  • If \(\log\) is a branch of \(\log\) corresponding to an argument function as above, then \(\log\) is continuous on \(\mathbb{C}\backslash R_{\theta_{1}}\).

As with limits, it will be useful to restate continuity in terms of open sets. First, recall that for any function \(f:X_{1}\to X_{2}\) and any set \(U\subseteq X_{2}\) we define the preimage \(f^{-1}(U)\) of \(U\) under \(f\) by \(f^{-1}(U):=\{x\in X_{1}:f(x)\in U\}\).

[Continuity via open sets] Let \(X_{1}\) and \(X_{2}\) be metric spaces. Then: \[\begin{aligned} f:X_{1}\to X_{2}\;\text{continuous} & \quad\Longleftrightarrow\quad f^{-1}(U) \text{ is open in } X_{1} \text{ for every open set } U \text{ in } X_{2}\\ & \quad\Longleftrightarrow\quad f^{-1}(F) \text{ is closed in } X_{1} \text{ for every closed set } F \text{ in } X_{2}.\end{aligned}\]

For open sets:

  • (\(\Rightarrow\)): Let \(U\) be open in \(X_{2}\), and pick \(x\in f^{-1}(U)\). Since \(U\) is open there exists \(\epsilon>0\) such that \(B_{\epsilon}(f(x))\subseteq U\) (since \(f(x)\in U\)). But \(f\) is continuous, so by definition there exists \(\delta>0\) such that if \(y\in B_{\delta}(x)\) we have \(f(y)\in B_{\epsilon}(f(x))\). Thus \(f(y)\in U\), and so \(y\in f^{-1}(U)\); since this is true for every \(y\in B_{\delta}(x)\) we have shown \(B_{\delta}(x)\subseteq f^{-1}(U)\) and so the set \(f^{-1}(U)\) is open.

  • (\(\Leftarrow\)): Let \(x\in X_{1}\) and \(\epsilon>0\). We need to find a \(\delta>0\) such that \(y\in B_{\delta}(x)\Rightarrow f(y)\in B_{\epsilon}(f(x))\). First notice that (by Lemma [lem:open-balls-are-open]) the ball \(B_{\epsilon}(f(x))\) is open. By assumption, the preimage \(f^{-1}(B_{\epsilon}(f(x)))\) of this ball is also open. The point \(x\) must be in this preimage (because the centre \(f(x)\) is certainly in \(B_{\epsilon}(f(x))\)). Then, since the preimage is open there must exist an open ball \(B_{\delta}(x)\) around \(x\) contained in \(f^{-1}(B_{\epsilon}(f(x)))\). But, this is precisely the statement that \(y\in B_{\delta}(x)\Rightarrow f(y)\in B_{\epsilon}(f(x))\) as required.

For closed sets see sheet 3.

  • From the proof we see that we can be slightly more precise than the statement of the theorem. We have, for example \[\begin{aligned} f:X_{1}\to X_{2}\;\text{continuous} & \quad\Longleftrightarrow\quad & f^{-1}(U) \text{ is open in } X_{1} \text{ for every open set } U \text{ in } X_{2}\\ \text{at }x\in X_{1} & & \text{containing }f(x).\end{aligned}\]

  • Note that for the \((\Leftarrow)\) direction we only actually needed the fact that \(f^{-1}(B)\) was open for any open ball \(B\) in \(X_{2}\). It turns out that the open balls ‘generate’ all the open sets (via Lemma [lem:metrics-give-topology]) - see Topology/Analysis III!

  • Note that, by the second part of Theorem [thm:continuity-open-sets], if a function \(f:X_{1}\to X_{2}\) is continuous then \(f^{-1}(\{x\})\) is closed for any \(x\in X_{2}\).

As with limits, it turned out that continuity depends only upon the open sets in the respective metric spaces. This means we can use the continuity of known functions to prove the openness of very complicated sets.

Showing sets are open using continuity (examples)

  • Show the following set is open: \[U=\{(x,y)\in\mathbb{R}^{2}:(x^{2}+y^{2})\sin^{3}(\sqrt{x^{2}+7})>2\}.\] Well, the function \[f:\mathbb{R}^{2}\to\mathbb{R}:\:(x,y)\mapsto(x^{2}+y^{2})\sin^{3}(\sqrt{x^{2}+7})\] is continuous by Lemma [lem:combining-continuous-function], because it is the product/composition of real valued continuous functions. Moreover, \[U=\{(x,y)\in\mathbb{R}^{2}:f((x,y))>2\}\:=\:f^{-1}((2,\infty)).\] Since \((2,\infty)\) is open in \(\mathbb{R}\) (see sheet 2), the set \(U\) is the preimage of an open set under a continuous map and by Theorem [thm:continuity-open-sets] it is therefore open.

  • We can actually do a little more using the following useful properties of the preimage (from Analysis I):

    Useful properties of preimage

    • \(f^{-1}(A\cup B)=f^{-1}(A)\cup f^{-1}(B)\).

    • \(f^{-1}(A\cap B)=f^{-1}(A)\cap f^{-1}(B)\).

    • \(f^{-1}(A\setminus B)=f^{-1}(A)\setminus f^{-1}(B)\).

    Show the following set is open: \[U=\{(x,y)\in\mathbb{R}^{2}:xy>1,\:x^{2}+y^{2}>3\}.\] Both \[f(x,y)=xy\quad\text{and}\quad g(x,y)=x^{2}+y^{2}\] are continuous as functions \(\mathbb{R}^{2}\to\mathbb{R}\), and \(U=f^{-1}((1,\infty))\:\cap\:g^{-1}((3,\infty))\).

    Since \(f\) and \(g\) are continuous and both \((1,\infty)\) and \((3,\infty)\) are open in \(\mathbb{R}\), the preimages \(f^{-1}((1,\infty))\) and \(g^{-1}((3,\infty))\) are open (by Theorem [thm:continuity-open-sets]). By Lemma [lem:metrics-give-topology] we have that \(U\) is open, since it is the intersection of two open sets.
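The way \(U\) is assembled from preimages can be mirrored directly in code, which helps when decomposing more complicated sets. In this sketch the helper `preimage_of_ray` (a made-up name) returns the membership test for \(h^{-1}((a,\infty))\), and \(U\) is the intersection of two such preimages.

```python
def preimage_of_ray(h, a):
    """Membership test for the preimage h^{-1}((a, infinity)) in R^2."""
    return lambda p: h(*p) > a


def f(x, y):
    return x * y


def g(x, y):
    return x * x + y * y


in_f_preimage = preimage_of_ray(f, 1)  # {xy > 1}
in_g_preimage = preimage_of_ray(g, 3)  # {x^2 + y^2 > 3}


def in_U(p):
    """U is the intersection of the two preimages."""
    return in_f_preimage(p) and in_g_preimage(p)


for p in [(2, 1), (-2, -1), (1.1, 1.0), (3, 0)]:
    print(p, in_f_preimage(p), in_g_preimage(p), in_U(p))
```

The code only tests membership; the openness of \(U\) still comes from the theorem, since each factor is the preimage of an open ray under a continuous function.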

Showing functions are not continuous using open sets (example)

  • We can also use openness to prove a function is not continuous. Indeed, for \(f :X_1 \to X_2\), if there exists an open set \(U\) in \(X_2\) such that \(f^{-1}(U)\) is not open in \(X_1\), then \(f\) is not continuous. For example, the function \(f:\mathbb{R}^{2}\to\mathbb{R}\) defined by \[f(x,y)=\begin{cases} \frac{xy}{x^{2}+y^{2}}, & \text{ if }(x,y)\neq(0,0),\\ 0 & \text{ otherwise},\ \end{cases}\quad\] isn’t continuous at \((0,0)\). Why? Consider the preimage \(f^{-1}((-\epsilon,\epsilon))\). Claim: This preimage is not open for \(\epsilon\) sufficiently small.

    First note that the preimage in question contains \((0,0)\) since \(f((0,0))=0\). To show the preimage is not open it is enough to show that no open ball in \(\mathbb{R}^{2}\) centred at \((0,0)\) is contained in \(f^{-1}((-\epsilon,\epsilon))\): Let \(\epsilon<1/4\), say, and for any \(\delta>0\) consider the ball \(B_{\delta}((0,0))\) centred at \((0,0)\). The point \((\delta/2,\delta/2)\) is in \(B_{\delta}((0,0))\) since \(\|(\frac{\delta}{2},\frac{\delta}{2})-(0,0)\|_{2}=\|(\frac{\delta}{2},\frac{\delta}{2})\|_{2}=\sqrt{\frac{\delta^{2}}{4}+\frac{\delta^{2}}{4}}=\frac{\delta}{\sqrt{2}}<\delta\). But \[f\left(\left(\frac{\delta}{2},\frac{\delta}{2}\right)\right)\quad=\quad\frac{\frac{\delta}{2}\cdot\frac{\delta}{2}}{\frac{\delta^{2}}{4}+\frac{\delta^{2}}{4}}\quad=\quad\frac{1}{2}\quad>\quad\epsilon,\] so \((\delta/2,\delta/2)\) is not in \(f^{-1}((-\epsilon,\epsilon))\). Thus, for any \(\delta>0\) the ball \(B_{\delta}((0,0))\) is not contained in \(f^{-1}((-\epsilon,\epsilon))\) and so this preimage is not open.

    Since \((-\epsilon,\epsilon)\) is open in \(\mathbb{R}\) it follows from Theorem [thm:continuity-open-sets] that \(f\) is not continuous.
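The witness points \((\delta/2,\delta/2)\) used in this argument can be evaluated directly. The sketch below shows that however small \(\delta\) is, the ball \(B_{\delta}((0,0))\) contains a point at which \(f\) takes the value \(1/2\), even though \(f((0,0))=0\).

```python
import math


def f(x, y):
    """The function from the example: xy / (x^2 + y^2) away from the origin."""
    if (x, y) == (0, 0):
        return 0.0
    return x * y / (x * x + y * y)


for delta in (1.0, 1e-3, 1e-6, 1e-9):
    p = (delta / 2, delta / 2)
    distance_to_origin = math.hypot(*p)  # equals delta / sqrt(2) < delta
    print(delta, distance_to_origin < delta, f(*p))
```

So for any \(\epsilon<1/2\), and in particular for the \(\epsilon<1/4\) chosen above, no ball around the origin is mapped into \((-\epsilon,\epsilon)\).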

Why preimages?

Note that the use of preimages in Theorem [thm:continuity-open-sets], rather than images, is important. The same result is not true of images. E.g., the function \(f(z)=|z|\) is continuous as a function \(\mathbb{C}\to\mathbb{R}\), but it maps the open set \(\mathbb{D}\) in the complex plane onto \(f(\mathbb{D})=[0,1)\), an interval that is neither open nor closed in \(\mathbb{R}\).

Note that this \(f\) is actually a bijection from \(\mathbb{R}_{\geq 0}\) to \(\mathbb{R}_{\geq0}\), and in \(\mathbb{R}_{\geq0}\) the interval \([0,1)\) is open! So, is the problem that we need the function to be bijective? No. For example, consider the metric spaces \(X_{1}=[0,1)\cup[2,3]\) and \(X_{2}=[0,2]\) with the usual (subspace) metric coming from the absolute value on \(\mathbb{R}\). Define \[f:X_{1}\to X_{2}:\:x\mapsto\begin{cases} x, & \text{ if }x\in[0,1).\\ x-1, & \text{ if }x\in[2,3].\ \end{cases}\] It is easy to check that \(f\) is a bijection and is continuous on its domain: (Continuity is trivial for \(x\neq2\). For \(x=2\), pick \(\epsilon>0\), then for any \(0<\delta<1\) we have \(B_{\delta}(2)=[2,2+\delta)\). Note that \(f(2)=1\) and so \(B_{\epsilon}(f(2))=(1-\epsilon,1+\epsilon)\). To show \(f\) is continuous we must therefore find a \(\delta\) so that \(f(x)\in(1-\epsilon,1+\epsilon)\) if \(x\in[2,2+\delta)\). Simply pick any \(\delta<\epsilon\), for then: \[x\in[2,2+\delta)\quad\Rightarrow\quad f(x)=x-1\in[1,1+\delta)\subset(1-\delta,1+\delta)\subset(1-\epsilon,1+\epsilon),\] as required.) But, the set \([2,3]\) is open in \(X_{1}\) (see sheet 3) and its image \(f([2,3])=[1,2]\) is not open in \(X_{2}\).
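The failure of openness for the image \(f([2,3])=[1,2]\) happens at the point \(1=f(2)\): every ball around \(1\) in \(X_{2}\) contains points just below \(1\), which are not in the image. The sketch below produces such a witness for any radius \(r>0\), using exact rationals so that very small radii are handled correctly; the helper names are hypothetical.

```python
from fractions import Fraction


def f(x):
    """The piecewise map X1 = [0,1) u [2,3] -> X2 = [0,2] from the example."""
    if 0 <= x < 1:
        return x
    if 2 <= x <= 3:
        return x - 1
    raise ValueError("x is not in X1")


def in_X2(y):
    return 0 <= y <= 2


def in_image_of_closed_piece(y):
    """Membership in f([2,3]) = [1,2]."""
    return 1 <= y <= 2


def witness_near_one(r):
    """A point of X2 within distance r of 1 that is not in f([2,3])."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("the radius must be positive")
    return 1 - min(r, Fraction(1)) / 2


for r in (Fraction(1), Fraction(1, 10), Fraction(1, 10**30)):
    y = witness_near_one(r)
    print(r, y, in_X2(y), abs(y - 1) < r, in_image_of_closed_piece(y))
```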

Thus, we genuinely do need to use preimages. When can we use images of continuous functions to preserve properties of the sets in question? When can we find the maximum/minimum value taken by a function on a set? It turns out a key concept is that of compactness.

2.4 Compactness

[Compactness] A non-empty subset \(K\) of a metric space \(X\) is called (sequentially) compact if for any sequence \(\{x_{k}\}_{k\in\mathbb{N}}\) in \(K\) there exists a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit in \(K\).

Note that the initial sequence in the definition does not have to converge. But what if it does? What is the link between a convergent sequence and its subsequences?

If \(\{x_{k}\}_{k\in\mathbb{N}}\) is a convergent sequence in a metric space \(X\), then any subsequence converges to the same limit.

Assume \(x_{k}\to x\). Let \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) be a subsequence (here \(n_{1}<n_{2}<\cdots\)). We know that for every \(\epsilon>0\) there exists \(N\in\mathbb{N}\) such that \(x_{k}\in B_{\epsilon}(x)\) for every \(k>N\). So simply note that \(n_{k}\geq k\), whence \(n_{k}>N\) and so \(x_{n_{k}}\in B_{\epsilon}(x)\) for \(k>N\). Thus, \(x_{n_{k}}\to x\).

It seems quite daunting to have to check every sequence in a set for convergent subsequences, so it will be useful to re-express compactness in terms of our basic building blocks; that is, open and closed sets. However, exploring openness doesn’t seem particularly fruitful; for example, consider the set \((0,1)\) in \(\mathbb{R}\); the sequence \(\{1/n\}\) lies in \((0,1)\), but its limit is \(0\notin(0,1)\). Moreover, any subsequence must have the same limit \(0\) by Lemma [lem:subsequences-have-same-limit], and so \((0,1)\) is not compact. Let’s try closedness.

[Closed sets and limits of sequences] We have \[\begin{aligned} F\subset X \text{ is closed}\quad\Longleftrightarrow & \quad\text{Every sequence in }F \text{ which converges in }X\text{ has its limit point in }F.\\ & \quad\text{(that is, if }x_{n}\in F \text{ and }x_{n}\to x\text{ for some }x\in X\text{, then }x\in F.)\end{aligned}\]

\(\quad\)

  • (\(\Rightarrow\)): Assume \(F\) is closed and let \(\{x_{n}\}_{n\in\mathbb{N}}\) be a sequence with \(x_{n}\in F\) that converges to \(x\in X\). We wish to show \(x\in F\). For a contradiction, assume \(x\notin F\); that is, assume \(x\in X\setminus F\). We know \(X\setminus F\) is open, so by definition there must exist an open ball \(B_{\epsilon}(x)\) centred at \(x\) and contained in \(X\setminus F\). But \(x_{n}\to x\), so there exists \(N\in\mathbb{N}\) such that \(x_{n}\in B_{\epsilon}(x)\) for \(n>N\). Thus \(x_{n}\in X\setminus F\) for \(n>N\) (that is, \(x_{n}\notin F\) for \(n>N\)), which is a contradiction.

  • (\(\Leftarrow\)): We need to show \(X\setminus F\) is open. Let \(x\in X\setminus F\). We wish to show we can always find a ball \(B_{\epsilon}(x)\) centred at \(x\) and contained in \(X\setminus F\). If for some \(n\in\mathbb{N}\) we have \(B_{1/n}(x)\subseteq X\setminus F\) we are done. Otherwise, pick an element \(x_{n}\in B_{1/n}(x)\cap F\) for each \(n\). But then \(x_{n}\to x\) and \(x_{n}\in F\), but by assumption \(x\notin F\). This contradicts the hypothesis that every sequence in \(F\) which converges in \(X\) has its limit in \(F\).

[Relationship between compactness and closedness]

  1. Compact sets are closed.

  2. Any closed subset of a compact subset is compact.

  1. If \(K\) is compact and \(\{x_{k}\}_{k\in\mathbb{N}}\) is a convergent sequence with \(x_{k}\in K\) and limit \(x\in X\), then by definition there must exist a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit \(x_{0}\in K\). But, by Lemma [lem:subsequences-have-same-limit], these limits must be the same; i.e., \(x=x_{0}\). Thus \(x\in K\) and by Proposition [prop:closed-sets-by-convergence] the set \(K\) is closed.

  2. Assume \(F\subseteq K\) is closed and \(K\) is compact. Let \(\{x_{k}\}_{k\in\mathbb{N}}\) be any sequence in \(F\). Since each \(x_{k}\in K\) there must exist a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit \(x\in K\). But, by Proposition [prop:closed-sets-by-convergence] this limit must be in \(F\). This shows \(F\) is compact.

Are all closed sets compact? No: e.g., \([0,\infty)\) is closed in \(\mathbb{R}\), but \(x_{n}=n\) has no convergent subsequence. The problem here is that \([0,\infty)\) is unbounded.

[Bounded sets] A subset \(A\subseteq X\) of a metric space \(X\) is bounded if \[\text{there exists }R>0\text{ and }x\in X\text{ such that }A\subseteq B_{R}(x).\]

[Compact sets are bounded] Let \(K\subseteq X\) be a compact subset of a metric space \(X\). Then \(K\) is bounded.

We employ a contrapositive argument. Assume \(K\) is not bounded and fix \(x\in K\). For each \(k\in\mathbb{N}\) we can therefore find a point \(x_{k}\in K\) such that \(d(x_{k},x)\geq k\) (since \(B_{k}(x)\) does not contain \(K\)). However, the sequence \(\{x_{k}\}_{k\in\mathbb{N}}\) cannot have a convergent subsequence; given any potential limit point \(x_{0}\) of \(x_{n_{k}}\) we have \[d(x_{n_{k}},x_{0})\quad\stackrel{(D3)}{\geq}\quad d(x_{n_{k}},x)-d(x,x_{0})\quad\geq\quad n_{k}-d(x,x_{0})\quad\to\infty\text{ as }k\to\infty.\]

So, we know that compact sets are closed and bounded. Is that enough? Recall the Bolzano-Weierstrass theorem from Analysis I. It states precisely that the closed and bounded intervals \([a,b]\) in \(\mathbb{R}\) are compact.

[Heine-Borel for \(\mathbb{R}^{n}\)] \[\text{A subset }K\text{ of }\mathbb{R}^{n}\text{ is compact}\qquad\Longleftrightarrow\qquad K\text{ is closed and bounded.}\]

Heine-Borel does not hold for arbitrary metric spaces.

The proof is simply induction on \(n\), starting from the base ‘\(n=1\)’ case for subsets of \(\mathbb{R}\). For this reason (and since this is a Complex Analysis course), we will prove it for \(\mathbb{C}\) and leave the proof in higher dimensions for the enthusiastic reader - the statement for \(\mathbb{C}\) corresponds to that of \(\mathbb{R}^{2}\) in the statement of Theorem [thm:Heine-Borel], since \(\mathbb{C}\) with its standard metric can be identified with \(\mathbb{R}^{2}\) with its standard metric.

[Heine-Borel for \(\mathbb{C}\)] \[\text{A subset }K\text{ of }\mathbb{C}\text{ is compact}\qquad\Longleftrightarrow\qquad K\text{ is closed and bounded.}\]

We already know that compact sets are closed and bounded, so it remains to prove the converse. Let \(K\subseteq\mathbb{C}\) be a closed and bounded subset of \(\mathbb{C}\). It is easy to show that if \(K\) is bounded then there exists \(R>0\) such that \(K\subseteq B_{R}(0)\). Let \(R>0\) be such a number (so that \(|z|<R\) for every \(z\in K\)) and let \(\{z_{k}\}_{k\in\mathbb{N}}\) be a sequence in \(K\) with \(z_{k}=x_{k}+iy_{k}\). Note that \(|x_{k}|<R\) and \(|y_{k}|<R\). We wish to show there is a convergent subsequence with limit in \(K\).

The real interval \([-R,R]\) is closed and bounded, so by Heine-Borel for \(\mathbb{R}\) the interval \([-R,R]\) is compact. Notice that \(x_{k}\in(-R,R)\subset[-R,R]\). Since \(\{x_{k}\}_{k\in\mathbb{N}}\) is a sequence in \([-R,R]\) there must be a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit in \([-R,R]\), say \(x_{n_{k}}\to x\in[-R,R]\).

Consider the corresponding complex subsequence \(\{z_{n_{k}}\}_{k\in\mathbb{N}}\), where \(z_{n_{k}}=x_{n_{k}}+iy_{n_{k}}\). Its imaginary part, the real sequence \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\), also lies in the compact set \([-R,R]\). So, it also must have a convergent subsequence \(\{y_{n_{m_{k}}}\}_{k\in\mathbb{N}}\) (so a subsequence of a subsequence!!) which converges to some \(y\in[-R,R]\).

Finally, take the (sub)subsequence \(\{z_{n_{m_{k}}}\}_{k\in\mathbb{N}}\), where \(z_{n_{m_{k}}}=x_{n_{m_{k}}}+iy_{n_{m_{k}}}\). The imaginary part converges to \(y\) and, by Lemma [lem:subsequences-have-same-limit], the real part converges to \(x\). Then, by the facts we discussed about convergence of complex sequences, the subsequence \(\{z_{n_{m_{k}}}\}_{k\in\mathbb{N}}\) converges with limit \(x+iy\). Since \(K\) is closed by assumption, it follows from Proposition [prop:closed-sets-by-convergence] that this limit \(x+iy\) is in \(K\). This proves \(K\) is compact.

  • The complex plane \(\mathbb{C}\) is not compact with respect to the standard metric; e.g., the sequence \(\{ik\}_{k\in\mathbb{N}}\) has no convergent subsequence. (Neither is \(\mathbb{R}^{n}\).)

  • The Riemann sphere \(S^{2}\) in \(\mathbb{R}^{3}\) is compact as a subset of \(\mathbb{R}^{3}\) (with the usual Euclidean metric). Hence \(\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}\) is compact with respect to the chordal metric. (One way to prove this formally will be Theorem [thm:cont-image-of-compact-is-compact] below).

  • The orthogonal group \({\rm O}(n)\) and the unitary group \({\rm U}(n)\) are compact; \({\rm SL}_{n}(\mathbb{R})\) and \({\rm GL}_{n}(\mathbb{R})\) are not (the same holds for matrices with entries in \(\mathbb{C}\)) - see sheet 3.

Finally, before stating the connection between continuous functions and compact sets, we restate continuity in terms of convergent sequences:

A function \(f:X\to Y\) between two metric spaces is continuous at \(x\in X\) if and only if \[\lim_{n\to\infty}f(x_{n})=f(x)\quad\text{ for every convergent sequence }\{x_{n}\}_{n\in\mathbb{N}}\text{ in }X\text{ with }x_{n}\to x.\]

See sheet 3.

Recall (from Analysis I) that non-empty compact sets in \(\mathbb{R}\) have a minimal and maximal element. [This is because \(f(x)=x\) is continuous - the statement you have seen says continuous functions on compact sets attain their max/min.]

Let \(f:X\to Y\) be a map between two metric spaces. Then, \[K\subset X\text{ is compact and }f\text{ is continuous}\quad\Longrightarrow\quad\text{the image }f(K) \text{ is compact in }Y.\] In particular, for \(Y=\mathbb{R}\), any continuous real-valued function on a metric space \(X\) attains minima and maxima on compact sets.

Let \(\{y_{k}\}_{k\in\mathbb{N}}\) be a sequence in \(f(K)\), say with \(f(x_{k})=y_{k}\). We wish to show it has a convergent subsequence with limit in \(f(K)\). Since \(K\) is compact, there must be a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) (of the sequence \(\{x_{k}\}_{k\in\mathbb{N}}\)) with some limit \(x\) in \(K\). Since \(f\) is continuous, we have by Lemma [lem:continuity-by-convergent-sequences] that \(x_{n_{k}}\to x\) implies \(y_{n_{k}}=f(x_{n_{k}})\to f(x)\). Thus \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\) has limit \(f(x)\), and because \(x\in K\) we must have \(f(x)\in f(K)\).

It follows that if \(K\) is a compact subset of \(\mathbb{C}\), then \(\mathrm{Re}(z),\mathrm{Im}(z)\) and \(|z|\) all attain maximum and minimum values on \(K\). In term 2, we will find out a stronger statement under the additional assumptions that \(f\) is (complex) differentiable and \(K\) is ‘nice’; then the maximum modulus of a function occurs on the boundary of \(K\). This is called the maximum modulus theorem.

3 Complex Differentiation

3.1 Complex differentiability

Differentiation of a complex function is defined in a similar way to that in \(\mathbb{R}\), except using complex limits.

[Complex differentiability] A function \(f:U\rightarrow\mathbb{C}\) defined on an open set \(U\) in \(\mathbb{C}\) is (complex) differentiable at \(\boldsymbol{z_{0}\in U}\) if \[\lim_{z\to z_{0}}\frac{f(z)-f(z_{0})}{z-z_{0}}\quad\text{exists}.\] We call this limit the derivative of \(\boldsymbol{f}\) at \(\boldsymbol{z_{0}}\) and write \(\boldsymbol{f'(z_{0})}\) for the limit, i.e. \[f'(z_{0})=\lim_{h\to0}\frac{f(z_{0}+h)-f(z_{0})}{h}.\]

  • In the second formulation the quantity \(h\) is a complex number (not a real number), so the limit must exist from every direction. The second is often the more useful expression of the two.

  • Note that if a function \(f\) is complex differentiable at \(z\) then it is continuous at \(z\).

Differentiating complex functions from first principles (examples)

  1. Rules for differentiating polynomials are the same. For example, consider \(f(z)=z^{2}\) on \(\mathbb{C}\). For any \(z\in\mathbb{C}\) we have \[\lim_{h\to0}\frac{(z+h)^{2}-z^{2}}{h}=\lim_{h\to0}\frac{z^{2}+2hz+h^{2}-z^{2}}{h}=\lim_{h\to0}(2z+h)=2z.\] Thus, \(f\) is differentiable on \(\mathbb{C}\) and \(f'(z)=2z\) as expected.

  2. Consider \(f(z)=\bar{z}\). For it to be differentiable we must obtain the same limit from every direction. But, considering limits from the purely real and purely imaginary directions, for every \(z\in\mathbb{C}\) we have \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{\overline{z+h}-\bar{z}}{h}=\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{h}{h}=1,\] yet \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{\overline{z+ih}-\bar{z}}{ih}=\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{-ih}{ih}=-1.\] Since \(z\) was arbitrary this shows \(f\) is not differentiable anywhere.

  3. As in the real case, sums/products/quotients of complex differentiable functions are complex differentiable where defined (e.g., all polynomials/rational functions). In particular, the product and quotient rules hold for complex derivatives.

  4. Compositions of complex differentiable functions are complex differentiable where defined. In particular, the chain rule holds for complex derivatives. The proofs of \(3.\) and \(4.\) here are almost identical to those from Analysis I, so are excluded.

  5. Generally, non-constant purely real/imaginary functions are not complex differentiable; e.g., \(\mathrm{Re}(z),\mathrm{Im}(z),|z|\) are nowhere differentiable as functions from \(\mathbb{C}\to\mathbb{C}\).
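The direction-dependence in examples 1 and 2 can be seen numerically. The following is a minimal sketch, not part of the course material: the helper names `difference_quotient` and `directional_quotients`, the test point `z0` and the step size are all illustrative choices. It evaluates the difference quotient for a small increment \(h\) approaching \(0\) along several directions.

```python
import cmath

def difference_quotient(f, z, h):
    """Return (f(z + h) - f(z)) / h for a complex increment h."""
    return (f(z + h) - f(z)) / h

def directional_quotients(f, z, size=1e-6, directions=8):
    """Evaluate the difference quotient with h of modulus `size`
    pointing in `directions` equally spaced directions."""
    quotients = []
    for k in range(directions):
        h = size * cmath.exp(2j * cmath.pi * k / directions)
        quotients.append(difference_quotient(f, z, h))
    return quotients

z0 = 1 + 2j  # arbitrary test point (an assumption of this sketch)
square = directional_quotients(lambda z: z * z, z0)
conj = directional_quotients(lambda z: z.conjugate(), z0)

# For z^2 every direction gives a value close to 2*z0.
print(max(abs(q - 2 * z0) for q in square))
# For the conjugate the quotient is conj(h)/h, which depends on the direction:
# index 0 is the real direction, index 2 the imaginary direction.
print(conj[0], conj[2])
```

A numerical check of this kind can only suggest differentiability; it can, however, expose non-differentiability, as it does for \(\bar{z}\).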

3.2 Cauchy-Riemann equations

We have a basic way of determining whether a function is complex differentiable, but for more complicated functions we wish to find some nicer conditions. We consider a complex valued function \(f\), defined for \(z=x+iy\) in some set \(U\subseteq\mathbb{C}\), as a function of the two real variables \(x\) and \(y\). We write \[f(z)=u(x,y)+iv(x,y)\] and call \(\mathrm{Re}(f)=u(x,y)\) the real part of \(f\) and \(\mathrm{Im}(f)=v(x,y)\) the imaginary part of \(f\).

For example, if \(f(z)=z^{2}\) then \(z^{2}=(x+iy)^{2}=x^{2}-y^{2}+i2xy\), so \(u(x,y)=x^{2}-y^{2}\) and \(v(x,y)=2xy\).

When considered as real functions \(\mathbb{R}^{2}\to\mathbb{R}\) there are naturally two partial derivatives associated with each of \(u\) and \(v\): \[\begin{aligned} u_{x}(x,y)=\frac{\partial}{\partial x}u(x,y):=\lim_{h\to0}\frac{u(x+h,y)-u(x,y)}{h},\quad\quad & u_{y}(x,y):=\lim_{h\to0}\frac{u(x,y+h)-u(x,y)}{h},\\ v_{x}(x,y):=\lim_{h\to0}\frac{v(x+h,y)-v(x,y)}{h},\quad\quad & v_{y}(x,y):=\lim_{h\to0}\frac{v(x,y+h)-v(x,y)}{h}.\end{aligned}\] Note that the limits above are standard real limits in \(\mathbb{R}\) (so \(h\) is real).

[Cauchy-Riemann equations] Let \(f=u+iv\) be complex differentiable at \(z_{0}\). Then the real partial derivatives \(u_{x},u_{y},v_{x},v_{y}\) exist at \(z_{0}\) and satisfy the Cauchy-Riemann equations: \[u_{x}(z_{0})=v_{y}(z_{0})\qquad\qquad u_{y}(z_{0})=-v_{x}(z_{0}).\] Furthermore, the derivative of \(f\) at \(z_{0}\) can be written as \[\begin{aligned} f'(z_{0}) & =u_{x}(z_{0})+iv_{x}(z_{0})=v_{y}(z_{0})-iu_{y}(z_{0})\\ & =u_{x}(z_{0})-iu_{y}(z_{0})=v_{y}(z_{0})+iv_{x}(z_{0}).\end{aligned}\]

Since \(f\) is complex differentiable at \(z_{0}=x_{0}+iy_{0}\), the limit in the definition must exist and agree no matter which way we approach \(z_{0}\). Choosing the purely real and purely imaginary directions we have \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{f(z_{0}+h)-f(z_{0})}{h}\quad=\quad f'(z_{0})\quad=\quad\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{f(z_{0}+ih)-f(z_{0})}{ih}.\] We may split each limit into real and imaginary parts, by standard properties of complex limits. Since, when \(h\) is real, \[\begin{aligned} f(z_{0}+h) & =f((x_{0}+h)+iy_{0})=u(x_{0}+h,y_{0})+iv(x_{0}+h,y_{0}),\quad\text{ and}\\ f(z_{0}+ih) & =f(x_{0}+i(y_{0}+h))=u(x_{0},y_{0}+h)+iv(x_{0},y_{0}+h),\end{aligned}\] the above reads \[\begin{aligned} & \quad\lim_{h\rightarrow0}\frac{u(x_{0}+h,y_{0})-u(x_{0},y_{0})}{h}+i\lim_{h\rightarrow0}\frac{v(x_{0}+h,y_{0})-v(x_{0},y_{0})}{h}\quad & =\quad f'(z_{0})\\ = & \quad\frac{1}{i}\lim_{h\rightarrow0}\frac{u(x_{0},y_{0}+h)-u(x_{0},y_{0})}{h}+\frac{i}{i}\lim_{h\rightarrow0}\frac{v(x_{0},y_{0}+h)-v(x_{0},y_{0})}{h}.\end{aligned}\] In other words, \(u_{x}(z_{0})+iv_{x}(z_{0})=f'(z_{0})=(1/i)u_{y}(z_{0})+v_{y}(z_{0})\). Since \(1/i=-i\) the statement follows by comparing real and imaginary parts.

Returning to the function \(f(z)=z^{2}\), note that the C-R equations indeed hold: We have \(u(x,y)=x^{2}-y^{2}\) and \(v(x,y)=2xy\) and so \[u_{x}=2x,\quad v_{y}=2x,\quad u_{y}=-2y,\quad v_{x}=2y.\]

[Warning] Note that Proposition [prop:complex-diff-implies-CR] provides a way of showing a function is not differentiable at a given point. It cannot be used to prove a function is differentiable. I.e., it is only a necessary condition, not a sufficient one.

It turns out, by considering a complex function \(f:\mathbb{C}\to\mathbb{C}\) as a real function \(f:\mathbb{R}^{2}\to\mathbb{R}^{2}:(x,y)\mapsto(u(x,y),v(x,y))\), that we have precisely \[f\text{ is complex differentiable}\quad\iff\quad f\text{ is real differentiable}+\text{ C-R equations hold;}\] so indeed we do have a converse:

Let \(f=u+iv\) be defined on an open subset \(U\) of \(\mathbb{C}\). Assume the partial derivatives \(u_{x},u_{y},v_{x},v_{y}\) exist, are continuous, and satisfy the Cauchy-Riemann equations at \(z_{0}\in U\). Then \(f\) is complex differentiable at \(z_{0}\).

Omitted. (If you’re interested to see a proof of this theorem, then you can find one in Section 5.6, page 59, of the book “Introduction to Complex Analysis” by H. A. Priestley, Second Edition, Oxford University Press, 2009.)

Determining complex differentiability via Cauchy-Riemann (examples)

  1. Let \(f(z)=\exp(z)=e^{x}\cos y+ie^{x}\sin y\). Then \[u_{x}=e^{x}\cos y,\quad v_{y}=e^{x}\cos y,\quad u_{y}=-e^{x}\sin y,\quad v_{x}=e^{x}\sin y.\] All these functions are continuous as real functions (see Calculus I/AMV II) and the C-R equations hold. Thus, by Theorem [thm:Cauchy-Riemann-implies-complex-differentiability], \(\exp\) is differentiable everywhere in \(\mathbb{C}\) and by Proposition [prop:complex-diff-implies-CR] \[\exp'(z)=u_{x}+iv_{x}=e^{x}\cos y+ie^{x}\sin y=\exp(z).\]

  2. By the chain rule, \(f(z)=e^{iz}\) is differentiable and \(f'(z)=ie^{iz}\). Since they are just sums of \(\exp\), all of the functions \(\sin,\cos,\sinh,\cosh\) are differentiable everywhere in \(\mathbb{C}\) and you can verify \[\begin{aligned} \sin'(z)=\cos z,\quad\cos'(z)=-\sin z,\quad\sinh'(z)=\cosh z,\quad\cosh'(z)=\sinh z.\end{aligned}\] Similarly, all polynomials/rational functions are differentiable with the usual formulae: e.g., for \(a_{0}\ldots a_{n}\) complex; \[(a_{n}z^{n}+\cdots+a_{2}z^{2}+a_{1}z+a_{0})'=na_{n}z^{n-1}+\cdots+2a_{2}z+a_{1}.\] For the branch of log corresponding to arguments in \((\theta_{1},\theta_{2}]\), the function \(\log(z)\) is differentiable in \(\mathbb{C}\setminus R_{\theta_{1}}\), i.e. at all points outside the branch cut. At these points, the derivative is given by \(\log'(z)=1/z\) (see Sheet 4 Q7 where this is proved for the principal branch).

  3. \(f(z)=\sin(z)/z^{2}\) is differentiable everywhere except \(z=0\), since it is the quotient of two complex differentiable functions (and is not defined when \(z^{2}=0\)). Furthermore, \[\left[\sin(z)/z^{2}\right]'=\frac{(\cos z)(z^{2})-(\sin z)(2z)}{(z^{2})^{2}}=\frac{z\cos z-2\sin z}{z^{3}}.\]

  4. Let \(f(z)=f(x+iy)=(x^{3}+3x^{2}y-y^{3}-x^{2}-2y^{2})+i(-x^{3}+3xy^{2}-y^{3}+4xy+3y)\). Then, \[\begin{aligned} u_{x}=3x^{2}+6xy-2x,\quad\quad & v_{y}=6xy-3y^{2}+4x+3,\\ u_{y}=3x^{2}-3y^{2}-4y,\quad\quad & v_{x}=-3x^{2}+3y^{2}+4y.\end{aligned}\] Thus, the partial derivatives exist and are continuous everywhere. We have \(u_{y}=-v_{x}\) everywhere, but \(u_{x}=v_{y}\) if and only if \(3x^{2}-2x=-3y^{2}+4x+3\); that is, when \((x-1)^{2}+y^{2}=2\). So, \(f\) is differentiable only on the circle of radius \(\sqrt{2}\) centred at \(1\) in the complex plane!
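The Cauchy-Riemann computations in examples 1 and 4 can be cross-checked numerically. The sketch below is illustrative only: the helpers `partials` and `cr_residuals`, the step size, and the sample points are assumptions, and the partial derivatives are estimated by central differences rather than computed exactly.

```python
import math

def partials(g, x, y, h=1e-5):
    """Central-difference estimates of (g_x, g_y) at (x, y)."""
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

def cr_residuals(u, v, x, y):
    """Return (u_x - v_y, u_y + v_x); both vanish where C-R holds."""
    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return ux - vy, uy + vx

# Example 1: exp(z) = e^x cos y + i e^x sin y
exp_u = lambda x, y: math.exp(x) * math.cos(y)
exp_v = lambda x, y: math.exp(x) * math.sin(y)

# Example 4
poly_u = lambda x, y: x**3 + 3*x**2*y - y**3 - x**2 - 2*y**2
poly_v = lambda x, y: -x**3 + 3*x*y**2 - y**3 + 4*x*y + 3*y

# A point on the circle (x-1)^2 + y^2 = 2 and a point off it.
on_circle = (1 + math.sqrt(2) * math.cos(0.7), math.sqrt(2) * math.sin(0.7))
off_circle = (0.0, 0.0)

print(cr_residuals(exp_u, exp_v, 0.3, -1.2))    # both residuals near 0
print(cr_residuals(poly_u, poly_v, *on_circle))  # both residuals near 0
print(cr_residuals(poly_u, poly_v, *off_circle)) # first residual far from 0
```

At the origin the first residual is \(u_{x}-v_{y}=0-3\), in agreement with the formulae for \(u_{x}\) and \(v_{y}\) in example 4.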

Holomorphicity

In example \(4.\) above, the function is only differentiable on a 1-dimensional subset of \(\mathbb{C}\). This is a similar situation to a real function on \(\mathbb{R}\) being differentiable only at a single point - not a very interesting function to work with from an analytical perspective. It will be useful for us to consider functions that are differentiable on genuine \(2\)-dimensional sets in \(\mathbb{C}\), for this will allow us to (later) express the functions using Taylor series. Such functions will turn out to have some quite remarkable properties.

[Holomorphic functions] A function \(f:U\to\mathbb{C}\) defined on an open set \(U\subset\mathbb{C}\) is holomorphic on \(\boldsymbol{U}\) if it is complex differentiable at every point in \(U\).

We say \(f\) is holomorphic at \(\boldsymbol{z_{0}\in U}\) if it is holomorphic on some open ball \(B_{\epsilon}(z_{0})\); (in other words, if there exists \(\epsilon>0\) such that \(f\) is complex differentiable at every point in \(B_{\epsilon}(z_{0})\)).

  • By the previous examples, exp, trig functions, and polynomials are holomorphic on all of \(\mathbb{C}\). Logarithms and complex powers are holomorphic at points outside their branch cuts.

  • The function in the previous example \(4.\) is nowhere holomorphic despite being differentiable on a circle. To see this, simply note that any (open) ball in \(\mathbb{C}\) centred at a point \(z\) on this circle must contain a point \(w\) not on the circle. The function is not complex differentiable at \(w\), so is not holomorphic at \(z\). Thus, being holomorphic is a strictly stronger property than being complex differentiable.

3.3 Connected sets and zero derivatives

We explore what it means for the derivative to be zero for a complex function. Consider, for example, the function \(f:U\to\mathbb{C}\), where \(U=\{z\in\mathbb{C}:|z|\neq1\}\) and \[f(z)=\begin{cases} 1\ & \text{ if }\quad|z|<1,\\ 2\ & \text{ if }\quad|z|>1. \end{cases}\] The set \(U\) is open, the function \(f\) is clearly holomorphic on \(U\) and has zero derivative on \(U\). However, it takes two different values - it is not constant. The problem is that \(f\) lives on two ‘unconnected’ subsets of the complex plane. We wish to find conditions on the set \(U\) under which we can conclude \(f\) is constant (as we are used to for real functions).

First, we need some terminology:

[Paths & path-connectedness]

  1. A path or curve (from \(\boldsymbol{a\in\mathbb{C}}\) to \(\boldsymbol{b\in\mathbb{C}}\)) is a continuous function \(\gamma:[0,1]\to\mathbb{C}\) with \(\gamma(0)=a\) and \(\gamma(1)=b\). We say the path/curve is closed if \(a=b\) (in this case, the endpoints of the path join up).

  2. A path/curve is smooth if it is continuously differentiable. (Our definition of the derivative of a function from \(\mathbb{R}\to\mathbb{C}\) is the ‘obvious’ one.)

  3. We say a subset \(U\subseteq\mathbb{C}\) is path-connected if for every pair of points \(a,b\in U\) there exists a smooth path \(\gamma\) from \(a\) to \(b\) such that \(\gamma(t)\in U\) for every \(t\in[0,1]\).

  • [Advanced/Warning] These definitions are not universal to all of mathematics. Outside of complex analysis, people often use ‘smooth’ to refer to the property of having infinitely many derivatives. Also, usually people use ‘path-connected’ to mean you can connect two points simply by a continuous curve. We used a stronger definition to make our life easier.

Intuitively, a subset is path-connected if it has ‘one piece’; i.e., if you can draw a line between any two points in the set without the pencil leaving the subset (or the paper!).

[Domains] A domain \(D\) is an open, path-connected subset of \(\mathbb{C}\). Some people call domains regions.

Showing sets are domains (examples)

  1. We know both \(\mathbb{C}\) and \(B_{r}(z)\) (for any \(z\in\mathbb{C}\) and \(r>0\)) are open. They are also path-connected: Simply choose the line segment between \(a\) and \(b\) given by \(\gamma(t)=a+(b-a)t\). This is clearly a smooth path, so \(\mathbb{C}\) and \(B_{r}(z)\) are domains.

  2. The set \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\), on which the principal branches of \(\log\) and complex powers are defined, is path-connected: Choose arbitrary \(a,b\in\mathbb{C}\setminus\mathbb{R}_{\leq0}\). If \(a\) lies on the positive real axis then the line segment again works (since it avoids the negative real axis). Otherwise, we could try letting \[\gamma_{1}(t)=|a|e^{i\mathrm{Arg}(a)(1-t)},\quad\quad\gamma_{2}(t)=|a|+(b-|a|)t.\] The first path takes \(a\) to \(|a|\), avoiding \(\mathbb{R}_{\leq0}\) by tracing around the circle of radius \(|a|\) centred at the origin. The second joins \(|a|\) to \(b\) via a straight line. (You can reparametrize to make \(t\) run from \(0\) to \(1\) on the combined path if you like.) But the combined path running through \(\gamma_1\) then \(\gamma_2\) is not differentiable at the point where the paths \(\gamma_1\) and \(\gamma_2\) meet! One possibility for a differentiable path joining the points \(a\) and \(b\) would be the arc of the circle through \(a\) and \(b\) that does not cross \(\mathbb{R}_{\leq0}\).

    We know \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\) is also open, so it is a domain.

  3. The set \(\{z\in\mathbb{C}:|z|\neq1\}\) is not a domain. It is open, but it is not path-connected. Pick a point \(a\) with \(|a|<1\) and \(b\) with \(|b|>1\). We cannot draw a continuous path from \(a\) to \(b\) without crossing the circle \(|z|=1\); i.e., you cannot draw a continuous path between the points without leaving the set.
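The two-stage path from example 2 can be sampled to confirm that it stays inside \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\). This is a rough sketch under stated assumptions: the function names, the reparametrisation onto \([0,1]\), and the sample points \(a,b\) are illustrative, and the check only concerns the continuous path (it says nothing about smoothness at the junction).

```python
import cmath

def two_stage_path(a, b, t):
    """Point at parameter t in [0, 1] on the arc-then-segment path from a to b.

    The first half follows gamma_1 (circular arc from a to |a|),
    the second half follows gamma_2 (segment from |a| to b).
    """
    r = abs(a)
    if t <= 0.5:
        s = 2 * t          # rescale [0, 1/2] to [0, 1]
        return r * cmath.exp(1j * cmath.phase(a) * (1 - s))
    s = 2 * t - 1          # rescale [1/2, 1] to [0, 1]
    return r + (b - r) * s

def in_slit_plane(z):
    """True when z does not lie on the non-positive real axis."""
    return z.imag != 0 or z.real > 0

a, b = -1 + 2j, -3 - 0.5j  # sample points on either side of the cut
samples = [two_stage_path(a, b, k / 1000) for k in range(1001)]
print(all(in_slit_plane(z) for z in samples))
```

Note that the straight segment from \(a\) to \(b\) would cross the negative real axis for this choice of points, which is why the detour via \(|a|\) is needed.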

Previously we have seen that there is a chain rule for the composition of two complex differentiable functions. There is also the chain rule for the composition of a complex differentiable function and a smooth path:

[Chain rule] Let \(U\subset\mathbb{C}\) be an open set, \(f:U\to\mathbb{C}\) be a holomorphic function on \(U\) and \(\gamma:[0,1]\to U\) be a smooth path. Then for \(t_{0}\in[0,1]\) we have

\[(f\circ\gamma)'(t_{0})=f'(\gamma(t_{0}))\gamma'(t_{0}).\]

Similar to the proof of the chain rule in Analysis I.

Let \(f:D\to\mathbb{C}\) be holomorphic on a domain \(D\subseteq\mathbb{C}\). If \(f'(z)=0\) for every \(z\in D\) then \(f\) is constant on \(D\).

Since \(D\) is path-connected, it is enough to show \(f\) is constant on any smooth path \(\gamma\) lying in \(D\). In other words we wish to show \(f\circ\gamma\) is a constant function of \(t\).

We have \((f\circ\gamma)'(t)=f'(\gamma(t))\gamma'(t)=0\) by the chain rule. Since \(f=u+iv\), we have

\[0=(f\circ\gamma)'(t)=(u\circ\gamma)'(t)+i(v\circ\gamma)'(t),\] which means both \((u\circ\gamma)'(t)\) and \((v\circ\gamma)'(t)\) are equal to \(0\) for all \(t\). Hence \(u\) and \(v\) are constant along \(\gamma\), so \(f\) is too.

The same statement holds if “\(f'=0\)” is replaced with “\(f\) is purely real/imaginary”, or “\(f\) has constant real/imaginary part”, or “\(f\) has constant modulus” (see Sheet 4). So none of \(\mathrm{Re}(z),\mathrm{Im}(z),|z|\) or \(\mathrm{Arg}(z)\) are holomorphic anywhere (since they are non-constant real-valued functions).

3.4 The angle-preserving properties of holomorphic functions

Given a smooth path \(\gamma:[0,1]\to\mathbb{C}\), we can talk about the tangent vector to \(\gamma\) at a point \(z_{0}=\gamma(t_{0})\) on the path.

[Conformal maps] We say a (real differentiable) map \(f:D\to\mathbb{C}\) on a domain \(D\subseteq\mathbb{C}\) is conformal at \(\boldsymbol{z_{0}}\) if it preserves the angle and orientation between any two tangent vectors at \(z_{0}\). This is exactly the same as saying that it preserves the angle and orientation between any two smooth curves passing through \(z_{0}\). We say that \(f\) is conformal if it is conformal at all points in \(D\).

It turns out that there is a close relationship between holomorphic and conformal maps.

[Holomorphic maps are conformal] A holomorphic map \(f\) with \(f'(z_{0})\ne0\) is conformal at \(z_{0}\).

Take a smooth path \(\gamma:[0,1]\to\mathbb{C}\) that passes through \(z_{0}\), and consider the tangent vector to \(\gamma\) at the point \(z_{0}=\gamma(t_{0})\) on the path. What happens to this tangent vector under the application of a holomorphic function \(f\)?

The path \(f\circ\gamma\) is the path obtained by mapping \(\gamma\) under \(f\). So to calculate what happens to the tangent vector \(\gamma'(t_{0})\) under \(f\), we should calculate the tangent vector \((f\circ\gamma)'(t_{0})\). This is done using the chain rule: \[(f\circ\gamma)'(t_{0})=f'(\gamma(t_{0}))\gamma'(t_{0})=f'(z_{0})\gamma'(t_{0}).\] This describes what happens to tangent vectors geometrically: \(f\) transforms the tangent vector \(\gamma'(t_{0})\) by multiplying it by the complex number \(f'(z_{0})\).

But we understand what multiplication by complex numbers means geometrically: if \(f'(z_{0})\neq0\) then multiplication by \(f'(z_{0})\) is dilation (by \(|f'(z_{0})|\)) followed by a rotation (by \(\mathrm{Arg}(f'(z_{0}))\)).

We now notice the following: both dilations and rotations preserve the angles and orientations between vectors! This completes the proof.
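The argument above can be illustrated numerically by comparing the signed angle between two paths before and after applying \(f\). The sketch below makes several assumptions that are not in the notes: the test function \(f(z)=z^{3}-2z\), the point \(z_{0}=1+i\) (where \(f'(z_{0})=-2+6i\neq0\)), the two paths, and the use of central differences to estimate tangent vectors.

```python
import cmath

def tangent(path, t, h=1e-6):
    """Central-difference estimate of the tangent vector path'(t)."""
    return (path(t + h) - path(t - h)) / (2 * h)

def signed_angle(v, w):
    """Signed angle from the vector v to the vector w, in (-pi, pi]."""
    return cmath.phase(w / v)

f = lambda z: z ** 3 - 2 * z   # hypothetical test function, f'(z) = 3z^2 - 2
z0 = 1 + 1j

# Two smooth paths passing through z0 at t = 0.
gamma1 = lambda t: z0 + t
gamma2 = lambda t: z0 + (1 + 2j) * t + t ** 2

before = signed_angle(tangent(gamma1, 0.0), tangent(gamma2, 0.0))
after = signed_angle(tangent(lambda t: f(gamma1(t)), 0.0),
                     tangent(lambda t: f(gamma2(t)), 0.0))
print(before, after)  # the two angles agree up to discretisation error
```

Because both tangent vectors are multiplied by the same non-zero number \(f'(z_{0})\), the quotient of the image tangents equals the quotient of the original tangents, which is exactly what `signed_angle` measures.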

Determining where maps are conformal (examples)

  • Let \(f(z)=z^{2}\). Here, \(f'(z)=2z\) so \(f'(z)=0\iff z=0\) and so \(f\) is conformal on \(\mathbb{C}^{\ast}=\mathbb{C}\setminus\{0\}\) (since it is holomorphic everywhere). We don’t yet know whether \(f\) is conformal at the origin, but we can check: we know that \(f'(z)=2z\) so \(f'(0)=0\) and hence the action of \(f\) on tangent vectors at the origin is to multiply by \(0\), hence sending any tangent vector to zero.

  • Let \(f(z)=\frac{2}{3}z^{3}+\frac{1}{2}(1-2i)z^{2}-iz+2-1\). This function is clearly holomorphic as it is just a polynomial. We have \[f'(z)=2z^{2}+(1-2i)z-i=(2z+1)(z-i),\] which has zeros at \(-\frac{1}{2}\) and \(i\), so \(f\) is conformal on \(\mathbb{C}\setminus\{-\frac{1}{2},i\}\). [Note, we can explicitly check what \(f\) does to tangent vectors at each point in this set: For example, at \(z=i/2\) we have \(f'(i/2)=(i+1)(-i/2)=(1-i)/2=\frac{1}{\sqrt{2}}e^{-i\pi/4}.\) Thus, at \(z=i/2\) the function \(f\) rotates tangent vectors by \(\mathrm{Arg}(f'(i/2))=-\pi/4\) radians, then dilates by a factor of \(|f'(i/2)|=1/\sqrt{2}\).]

    Rather than having to then check two paths through \(z=-1/2\) and \(z=i\) respectively, to determine whether \(f\) is conformal there, it will be convenient to have a complete description in terms of holomorphicity. The converse to Lemma [lem:A-holomorphic-map-is-conformal] holds:

[Conformal maps are holomorphic] Let \(D\) be a domain. If \(f\) is conformal at \(z_{0}\in D\) then \(f\) is complex differentiable at \(z_{0}\) and \(f'(z_{0})\neq0\). Therefore, if \(f\) is conformal on \(D\), then \(f\) is holomorphic on \(D\) and \(f'(z)\ne0\) for all \(z\in D\). Thus \[f\text{ is conformal on }D\quad\iff\quad f\text{ is holomorphic with }f'(z)\neq0\text{ for all } z\in D.\]

(Non-examinable) Sketch: we know \(f=u+iv\) is real differentiable when considered as a map \(f:\left(\begin{smallmatrix}x\\ y \end{smallmatrix}\right)\mapsto\left(\begin{smallmatrix}u(x,y)\\ v(x,y) \end{smallmatrix}\right)\). The total differential of this map at a point \(z_{0}=x_{0}+iy_{0}\) (see AMV II) is the Jacobian matrix \(D_{z_{0}}=\left(\begin{smallmatrix}u_{x}(x_{0},y_{0}) & u_{y}(x_{0},y_{0})\\ v_{x}(x_{0},y_{0}) & v_{y}(x_{0},y_{0}) \end{smallmatrix}\right)\). To preserve angles it must be of the form \(c\left(\begin{smallmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{smallmatrix}\right)\) for some \(c\in\mathbb{R}_{\neq0}\). Thus \(D_{z_{0}}\) is of the form \(\left(\begin{smallmatrix}a & -b\\ b & a \end{smallmatrix}\right)\) and the C-R equations are satisfied at \(z_{0}\). Since \(f\) is real differentiable by assumption, C-R equations are actually enough to show that \(f\) is complex differentiable at \(z_{0}\) (we have not stated exactly this result before, but it holds). Since \(\det D_{z_{0}}\neq0\) we have \(f'(z_{0})=u_{x}(x_{0},y_{0})+iv_{x}(x_{0},y_{0})\neq0\).

[Advanced] The idea of this proof, in reverse, can be adapted to give a more sophisticated proof of the CR equations!

Determining where maps are conformal (examples continued)

  • Let \(f(z)=xy+iy^{2}\). Then \(u_{x}=y,v_{y}=2y,u_{y}=x,v_{x}=0\). So, the C-R equations only hold at \(y=x=0\). Hence \(f\) is not conformal at any point \(z_{0}\neq0\), since it cannot be complex differentiable at \(z_{0}\) (as the C-R equations fail to hold), and hence it is not conformal at \(z_{0}\) by Proposition [prop:conf-to-holo]. What happens for \(z_{0}=0\)? There, \(f'(0)=u_{x}(0,0)+iv_{x}(0,0)=0+i0=0\). So \(f\) cannot be conformal at \(0\) either by Proposition [prop:conf-to-holo]. Thus, \(f\) is nowhere conformal.

Visualising conformal maps

Here is a useful result, helpful in visualising the action of conformal maps:

Any conformal map maps orthogonal grids in the \((x,y)\)-plane to orthogonal grids.

[Important warning] We must allow our grids to be made up of arbitrary smooth curves, not necessarily straight lines (Corollary [cor:conformal-preserves-grids] is false if they are required to be made of straight lines).

How to visualise conformal maps (examples)

  • Let \(f(z)=z^{2}=x^{2}-y^{2}+i2xy\) and consider the simple grid in the \((x,y)\)-plane made up of lines parallel to the real/imaginary axes, separated by distance \(1\).

    Given a line (not passing through the origin) in such a grid, say \(x=a\,(\neq0)\), the function \(f\) takes this line to the points \((a^{2}-y^{2},2ay)\) in the \((u,v)\)-plane. If \(u=a^{2}-y^{2}\) and \(v=2ay\), then \(v^{2}=4a^{2}y^{2}\) and so \(u=a^{2}-(\frac{v}{2a})^{2}\) and the image defines a parabola.

    Similarly, the line \(y=b\,(\neq0)\) is taken to the parabola \(u=(\frac{v}{2b})^{2}-b^{2}\) in the \((u,v)\)-plane. By sketching these on the same axes we see that the parabolas indeed cross at right angles.

  • For the same example, consider the level curves \(u(x,y)=a\) and \(v(x,y)=b\) with \(a,b\neq0\). These trace out the curves \(y^{2}=x^{2}-a\) and \(y=\frac{b}{2x}\) respectively. By sketching these curves on a graph (for, say, \(u=\pm1,v=\pm1\)) we see that they are perpendicular. (They have to be, since \(f(z)\) is conformal on \(\mathbb{C}-\{0\}\) and they map to perpendicular straight lines.)
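The right-angle claims in both examples can be verified with a short computation. The sketch below is illustrative (the function names and sample values of \(a,b\) are assumptions): it differentiates the parametrisations \(y\mapsto(a^{2}-y^{2},2ay)\) and \(x\mapsto(x^{2}-b^{2},2bx)\) by hand, and uses the gradients \(\nabla u=(2x,-2y)\), \(\nabla v=(2y,2x)\) for the level curves.

```python
def image_tangents(a, b):
    """Tangent vectors, at f(a + ib), of the images of the lines
    x = a and y = b under f(z) = z^2."""
    along_vertical = (-2 * b, 2 * a)   # d/dy of (a^2 - y^2, 2ay) at y = b
    along_horizontal = (2 * a, 2 * b)  # d/dx of (x^2 - b^2, 2bx) at x = a
    return along_vertical, along_horizontal

def level_curve_normals(x, y):
    """Gradients of u = x^2 - y^2 and v = 2xy at (x, y); each is normal
    to the corresponding level curve through that point."""
    return (2 * x, -2 * y), (2 * y, 2 * x)

def dot(p, q):
    """Euclidean dot product of two vectors in the plane."""
    return p[0] * q[0] + p[1] * q[1]

for a, b in [(1, 1), (2, -3), (-1, 4)]:
    t1, t2 = image_tangents(a, b)
    n1, n2 = level_curve_normals(a, b)
    # Both dot products vanish: the image parabolas and the level
    # curves each meet at right angles.
    print((a, b), dot(t1, t2), dot(n1, n2))
```

Since the normals of the level curves are orthogonal, so are their tangents, which matches the picture obtained by sketching.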

3.5 Biholomorphic maps

We wish to build a dictionary of conformal maps as a way of getting back and forth between domains of the complex plane. To do this, we need our maps to be invertible.

[Biholomorphic maps] Let \(D\) and \(D'\) be domains. We say that \(f:D\to D'\) is biholomorphic if \(f\) is holomorphic, a bijection, and the inverse \(f^{-1}:D'\to D\) is also holomorphic. A biholomorphic map \(f\) is called a biholomorphism. When \(f\) as above exists, we say that the domains \(D\) and \(D'\) are biholomorphic and write \(f:D\xrightarrow{\sim}D'\).

[Advanced] It is possible to prove that if \(f\) as above is holomorphic and bijective, then it is automatic that its inverse is holomorphic.

Finding biholomorphisms (examples)

  1. The function \(\exp:\mathbb{C}\to\mathbb{C}^{\ast}\) is not biholomorphic since it is not injective (e.g., \(e^{z}=e^{z+2n\pi i}\)). How can we make it biholomorphic? We simply need to restrict it to a smaller domain where it will be injective. It is not too hard to check that \[\exp(z_{1})=\exp(z_{2})\] if and only if \(\mathrm{Re}(z_{1})=\mathrm{Re}(z_{2})\) and \(\mathrm{Im}(z_{1})-\mathrm{Im}(z_{2})\in2\pi\mathbb{Z}\). So if we rule out different \(z_{1}\) and \(z_{2}\) whose imaginary parts differ by a non-zero integer multiple of \(2\pi\) then we might be ok. We can do this by setting \[D:=\{z\in\mathbb{C}:\mathrm{Im}(z)\in(-\pi,\pi)\:\}.\] If \(z=x+iy\in D\) then \(\exp(z)=e^{x}e^{iy}\) so the principal argument of \(\exp(z)\) is never equal to \(\pi\). On the other hand, it is not hard to check that \(\exp\) maps \(D\) onto \(\mathbb{C}-\mathbb{R}_{\leq0}\). Moreover, by our choice of \(D\), \(\exp\) is injective on \(D\). So \(\exp\) is a bijection between \(D\) and \(\mathbb{C}-\mathbb{R}_{\leq0}\). Therefore it has an inverse function, and we already know what it is: the principal branch of \(\mathrm{Log}\). The function \(\mathrm{Log}\) maps \(\mathbb{C}-\mathbb{R}_{\leq0}\) to \(D\) and is differentiable at every point of \(\mathbb{C}-\mathbb{R}_{\leq0}\), so it is holomorphic on \(\mathbb{C}-\mathbb{R}_{\leq0}\). In summary, when restricted to \(D\), \(\exp\) is biholomorphic and hence \(D\) and \(\mathbb{C}-\mathbb{R}_{\leq0}\) are biholomorphic.

  2. Let \(f(z)=z^{2}\). We have seen that this function is holomorphic on \(\mathbb{C}\). However, as in the previous example, it is not injective since e.g. \(f(1)=f(-1)=1\). Note that \(f(z_{1})=f(z_{2})\) if and only if \(z_{1}=\pm z_{2}\), so we may be able to create a biholomorphic function by restricting \(f\) to a domain that never simultaneously contains \(z\) and \(-z\). Such a domain is the right half plane \(\mathbb{H}_{R}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>0\}\). We saw in a previous example that \(f\) maps \(\mathbb{H}_{R}\) to \(\mathbb{C}-\mathbb{R}_{\leq0}\). In fact this is a bijection, with inverse given by \[f^{-1}:\mathbb{C}-\mathbb{R}_{\leq0}\to\mathbb{H}_{R},\quad f^{-1}(z):=\exp(\frac{1}{2}\mathrm{Log}(z)).\] Thus \(f^{-1}\) is holomorphic, since it is a composition of holomorphic functions, and hence \(f\) gives a biholomorphism \(f:\mathbb{H}_{R}\xrightarrow{\sim}\mathbb{C}-\mathbb{R}_{\leq0}.\)

  3. The ‘affine’ linear maps \(z\mapsto az+b\) (for \(a\in\mathbb{C}^{\ast},b\in\mathbb{C}\)) are biholomorphic \(\mathbb{C}\xrightarrow{\sim}\mathbb{C}\).
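The inverse pairs in examples 1 and 2 can be tried out numerically. The following sketch relies on Python's `cmath.log`, which returns the principal branch of the logarithm; the function names and sample points are illustrative assumptions.

```python
import cmath

def log_principal(w):
    """Principal branch Log, used here on C minus the non-positive real axis."""
    return cmath.log(w)

def sqrt_right(w):
    """Inverse of z -> z^2 on the right half plane: exp(Log(w) / 2)."""
    return cmath.exp(0.5 * cmath.log(w))

strip_points = [0.3 + 3.0j, -2.0 - 3.1j, 1.5 + 0.0j]      # |Im z| < pi
half_plane_points = [0.1 + 5.0j, 2.0 - 0.7j, 3.0 + 0.0j]  # Re z > 0

# On the strip D, Log undoes exp.
for z in strip_points:
    print(z, log_principal(cmath.exp(z)))

# On the right half plane, exp(Log(.)/2) undoes squaring.
for z in half_plane_points:
    print(z, sqrt_right(z * z))

# Outside D the round trip fails: the imaginary part comes back
# shifted by a multiple of 2*pi.
outside = 1 + 4j
print(outside, log_principal(cmath.exp(outside)))
```

The last line shows why the restriction to \(D\) is needed: \(1+4i\) and \(1+(4-2\pi)i\) have the same image under \(\exp\).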

As expected we can compose biholomorphic maps to construct new ones.

[Automorphism groups] Let \(D\subset\mathbb{C}\) be a domain. The set of all biholomorphic maps \(f:D\xrightarrow{\sim}D\) from \(D\) to itself forms a group under composition. We call this group the automorphism group of \(\boldsymbol{D}\) and denote it by \({\rm Aut}(D)\).

We need only check the group properties:

  • The identity map \({\rm Id}:z\mapsto z\) is always in \(\mathrm{Aut}(D)\), and is clearly biholomorphic.

  • We know composition of functions is associative, so \(f\circ(g\circ h)=(f\circ g)\circ h\), for holomorphic maps.

  • \(f^{-1}\) is the inverse of \(f\) since \(f\circ f^{-1}={\rm Id}=f^{-1}\circ f\). Moreover, \(f^{-1}\) is in \(\mathrm{Aut}(D)\) since it is itself holomorphic by assumption and has holomorphic inverse \((f^{-1})^{-1}=f\).

  • \(\mathrm{Aut}(D)\) is closed: \((f\circ g)\) is holomorphic (by the chain rule) and its inverse \((f\circ g)^{-1}:=g^{-1}\circ f^{-1}\) is also the composition of two holomorphic functions, so holomorphic.

Note that example \(1.\) shows conformal maps are not necessarily biholomorphic. On the other hand, it turns out (see Sheet 5) that all biholomorphic maps are conformal. [Whilst the converse is not true in general, conformal maps turn out to be ‘locally’ biholomorphic.]

4 Möbius transformations

4.1 Definition and first properties of Möbius transformations

In this chapter we study a very special class of functions that play a truly fundamental role in mathematics, and have many beautiful properties. Recall that \(\mathrm{GL}_{2}(\mathbb{C})\) is the collection of matrices \[\begin{aligned} \mathrm{GL}_{2}(\mathbb{C}) & =\left\{ \left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\::\:a,b,c,d\in\mathbb{C},\:ad-bc\neq0\right\} \\ & =\{2\times 2\text{ complex matrices with nonzero determinant}\}.\end{aligned}\] Also recall the extended complex plane \(\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}\) from Chapter 1.

[Möbius transformations] Given any matrix \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in\mathrm{GL}_{2}(\mathbb{C})\) we can define a function

\[M_{T}:\mathbb{C}\to\hat{\mathbb{C}}\] by the formula \[M_{T}(z)=\frac{az+b}{cz+d}\] if \(cz+d\neq0\), and if \(cz+d=0\), then we set \(M_{T}(z)=\infty\). The function \(M_{T}\) is called a Möbius transformation.

  • [Advanced] The reason for excluding matrices for which \(\det T=0\) is because if \(\det T=0\) and at least one of \(c\) and \(d\) is non-zero, so that the definition makes sense, then \(T\) has rank 1, and this implies that \((a,b)=\lambda(c,d)\) for some \(\lambda\in\mathbb{C}\). But this would give that for \(z\) with \(cz+d\neq0\), we would have \(M_{T}(z)=\lambda\), so the transformation would be constant. This is not a very interesting function to think about!

  • For \(T\in\mathrm{GL}_{2}(\mathbb{C})\), by taking complex square roots, we can find a number \(k\in\mathbb{C}^{\ast}\) such that \(k^{2}=\det T\). Then \[M_{T}(z)=\frac{az+b}{cz+d}=\frac{\frac{a}{k}z+\frac{b}{k}}{\frac{c}{k}z+\frac{d}{k}}=M_{\frac{1}{k}T}(z)\] and \[\det\left(\tfrac{1}{k}T\right)=\frac{1}{k^{2}}\det T=1.\] Thus, we may scale any \(T\in\mathrm{GL}_{2}(\mathbb{C})\) to obtain \(T'=\frac{1}{k}T\) such that \(\det T'=1\), and \(M_{T}=M_{T'}\).

It is slightly annoying that at the moment, the domain of a Möbius transformation is not the same as its codomain. This would prevent us from composing two Möbius transformations. However, it is possible to extend the definition to describe where \(\infty\in\hat{\mathbb{C}}\) gets mapped to: we declare that

\[M_{T}(\infty)=\begin{cases} \frac{a}{c} & \text{if }c\neq0\\ \infty & \text{if }c=0 \end{cases}.\] Now we have \(M_{T}:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\).

  1. \(f(z)=z^{-1}\) is a Möbius transformation corresponding to the matrix \(\left(\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right)\). Note that \(|z|<1\) and \(z\neq0\) if and only if \(|f(z)|=|z^{-1}|=|z|^{-1}>1\), so \(f\) maps the punctured unit ball \(B_{1}(0)-\{0\}\) onto the outside of the closed unit ball. We have \(f(0)=\infty\) and \(f(\infty)=0\), so \(f\) interchanges these two points.

  2. The Cayley map \(f(z)=\frac{z-i}{z+i}\) corresponds to the matrix \(\left(\begin{array}{cc} 1 & -i\\ 1 & i \end{array}\right).\) We have \[f(z)\in B_{1}(0)\iff|f(z)|<1\iff|z+i|>|z-i|\iff z\in\mathbb{H}=\{z\in\mathbb{C}:\mathrm{Im}(z)>0\}.\] Hence \(f\) maps the upper half plane to the open unit ball centered at \(0\). Moreover, \(f(\infty)=1\) and \(f(-i)=\infty\).

The set of Möbius transformations forms a group under composition. Furthermore,

  1. \(M_{T_{1}}\circ M_{T_{2}}=M_{T_{1}T_{2}}.\)

  2. \((M_{T})^{-1}=M_{T^{-1}}.\)

  3. \(M_{T}=\mathrm{Id}\quad\iff\quad T=t\left(\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right)\quad(t\in\mathbb{C}^{\ast}).\)

See Sheet 6.

Recall from Linear Algebra that \(\mathrm{GL}_{2}(\mathbb{C})\) forms a group under matrix multiplication. Lemma [lem:Mobius-are-group] says more than that the Möbius transformations form a group. It says that the mapping \[T\mapsto M_{T}\] is a group homomorphism between \(\mathrm{GL}_{2}(\mathbb{C})\) and the Möbius transformations! In other words, composing Möbius transformations is basically just multiplying matrices!
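The homomorphism property \(M_{T_{1}}\circ M_{T_{2}}=M_{T_{1}T_{2}}\) is easy to test numerically. The sketch below is only an illustration under stated assumptions: matrices are stored as nested tuples, the point \(\infty\) is represented by the float `INF`, and the helper names `mobius` and `matmul` are hypothetical.

```python
INF = float("inf")  # stand-in for the point at infinity in C-hat

def mobius(T, z):
    """Evaluate M_T at z in C-hat, using the conventions of the definition."""
    (a, b), (c, d) = T
    if z == INF:
        return a / c if c != 0 else INF
    denom = c * z + d
    return INF if denom == 0 else (a * z + b) / denom

def matmul(T1, T2):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = T1
    (p, q), (r, s) = T2
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

T1 = ((1, -1j), (1, 1j))   # the Cayley map
T2 = ((0, 1), (1, 0))      # the inversion z -> 1/z

points = [0, INF, 2 + 1j, -0.5 + 3j]
composed = [mobius(T1, mobius(T2, z)) for z in points]
via_product = [mobius(matmul(T1, T2), z) for z in points]
```

The two lists agree entry by entry (up to rounding), including at the points \(0\) and \(\infty\), where the intermediate value passes through \(\infty\).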

Let \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in\mathrm{GL}_{2}(\mathbb{C})\). If \(c=0\), the Möbius transformation \(M_{T}\) gives a biholomorphic map \[M_{T}:\mathbb{C}\xrightarrow{\sim}\mathbb{C}.\] If \(c\neq0\), then \(M_{T}\) gives a biholomorphic map \[M_{T}:\mathbb{C}-\left\{ \frac{-d}{c}\right\} \xrightarrow{\sim}\mathbb{C}-\left\{ \frac{a}{c}\right\} .\]

If \(c=0\) then \(a,d\neq0\) since \(\det T\neq0\). Hence \[M_{T}(z)=\frac{az+b}{d}=\frac{a}{d}z+\frac{b}{d}\] is just an affine linear map, which is holomorphic. It is a bijection with inverse given by \((M_{T})^{-1}(z)=\frac{d}{a}z-\frac{b}{a}\), which is holomorphic by the same token. So \(M_{T}\) gives a biholomorphic map \(\mathbb{C}\xrightarrow{\sim}\mathbb{C}\).

If \(c\neq0\) let’s check that \(M_{T}\) is complex differentiable at every \(z\neq\frac{-d}{c}\). We can calculate

\[M_{T}'(z)=\frac{a(cz+d)-c(az+b)}{(cz+d)^{2}}=\frac{\det(T)}{(cz+d)^{2}}.\] Therefore the derivative exists at all points of \(\mathbb{C}-\left\{ \frac{-d}{c}\right\}\) and hence \(M_{T}\) is holomorphic there. It is bijective with the inverse given by the inverse Möbius transformation, which is holomorphic by the same argument. Hence \(M_{T}\) is a biholomorphism from \(\mathbb{C}-\left\{ \frac{-d}{c}\right\} \xrightarrow{\sim}\mathbb{C}-\left\{ \frac{a}{c}\right\}\).

A Möbius transformation \(M_{T}\) is conformal at all \(z\in\mathbb{C}\) with \(M_{T}(z)\neq\infty\).

We know that biholomorphic maps are conformal, and \(M_{T}\) is a biholomorphism on \(\mathbb{C}\) minus the points that map to \(\infty\). Alternatively, one can check that in both cases of the proof of Lemma [lem:Mobius-biholo], the derivative never vanishes.

Any Möbius transformation is a bijection from \(\hat{\mathbb{C}}\) to \(\hat{\mathbb{C}}\).

By Lemma [lem:Mobius-are-group] \(M_{T}\) has an inverse \(M_{T^{-1}}\), so that \(M_{T}\circ M_{T^{-1}}=M_{T^{-1}}\circ M_{T}=\mathrm{Id}\) as maps from \(\hat{\mathbb{C}}\) to \(\hat{\mathbb{C}}\), so \(M_{T}\) is a bijection.

4.2 Fixed points, the cross-ratio, and the three points Theorem

Let \(T\in\mathrm{GL}_{2}(\mathbb{C})\) and \(M_T\) be a Möbius transformation. Then a point \(z\) is a fixed point of \(M_T\) if \(M_T(z) = z\).

Let \(T\in\mathrm{GL}_{2}(\mathbb{C})\). If \(M_{T}:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\) is not the identity map, then \(M_{T}\) has at most \(2\) fixed points in \(\hat{\mathbb{C}}\). In other words, if a Möbius transformation has three fixed points in \(\hat{\mathbb{C}}\), then it is the identity.

We split into two cases.

First suppose \(M_{T}(\infty)=\infty\). From the definition, this can only happen if \(c=0\), so \(M_{T}\) preserves \(\mathbb{C}\), and for \(z\in\mathbb{C}\), \(M_{T}(z)=\frac{a}{d}z+\frac{b}{d}\) with \(a,d\neq0\) (because \(\det T\neq0\)). Such an affine linear map has at most one fixed point in \(\mathbb{C}\), so together with \(\infty\) there are at most two fixed points in \(\hat{\mathbb{C}}\). Indeed:

  • If \(a=d\) then \(b\neq0\) since we assume \(M_{T}\) is not the identity. Then \(M_{T}\) is a translation by \(b/d\) which has no fixed points.

  • If \(a\neq d\) then \(M_{T}\) has a unique fixed point in \(\mathbb{C}\) given by \(z_{0}=b/(d-a)\).

Now suppose \(M_{T}(\infty)\neq\infty\). Then any fixed points of \(M_{T}\) are in \(\mathbb{C}\). Suppose \(z_{0}\in\mathbb{C}\) is such that \(M_{T}(z_{0})=z_{0}\). We have \[M_{\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)}(z_{0})=z_{0}\quad\iff\frac{az_{0}+b}{cz_{0}+d}=z_{0}\iff\quad cz_{0}^{2}+(d-a)z_{0}-b=0.\] There are at most two roots of this quadratic equation, so at most two fixed points of \(M_{T}\) in \(\hat{\mathbb{C}}\).
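The two cases of this proof translate directly into a small routine. The following is a sketch under assumptions of our own: matrices are nested tuples, \(\infty\) is represented by the float `INF`, the function name `fixed_points` is hypothetical, and scalar multiples of the identity are rejected because they fix every point.

```python
import cmath

INF = float("inf")  # stand-in for the point at infinity

def fixed_points(T):
    """Fixed points in C-hat of M_T, for M_T not the identity."""
    (a, b), (c, d) = T
    if c == 0:
        if a == d and b == 0:
            raise ValueError("M_T is the identity: every point is fixed")
        points = [INF]                    # M_T(inf) = inf when c = 0
        if a != d:
            points.append(b / (d - a))    # unique fixed point in C
        return points
    # c != 0: solve c z^2 + (d - a) z - b = 0 with the quadratic formula.
    root = cmath.sqrt((d - a) ** 2 + 4 * b * c)
    solutions = {(a - d + root) / (2 * c), (a - d - root) / (2 * c)}
    return sorted(solutions, key=lambda w: (w.real, w.imag))

inversion_fixed = fixed_points(((0, 1), (1, 0)))     # z -> 1/z
translation_fixed = fixed_points(((1, 1), (0, 1)))   # z -> z + 1
parabolic_fixed = fixed_points(((1, 0), (1, 1)))     # z -> z/(z + 1)
```

For the inversion this returns the two points \(\pm1\), for the translation only \(\infty\), and for \(z\mapsto z/(z+1)\) the single (double) root \(0\).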

We introduce an important quantity called the cross-ratio.

Given four distinct points \(z_{0},z_{1},z_{2},z_{3}\in\mathbb{C}\), the cross-ratio of these points is denoted \((z_{0},z_{1};z_{2},z_{3})\) and defined by

\[(z_{0},z_{1};z_{2},z_{3}):=\frac{(z_{0}-z_{2})(z_{1}-z_{3})}{(z_{0}-z_{3})(z_{1}-z_{2})}.\] We can extend the definition to the case that one of the points is \(\infty\) by removing all differences involving that point, for example,

\[(\infty,z_{1};z_{2},z_{3}):=\frac{(z_{1}-z_{3})}{(z_{1}-z_{2})}.\]

We can use the cross ratio to prove that a Möbius transformation is uniquely determined by how it acts upon any three given points in \(\hat{\mathbb{C}}\).

[Three points Theorem] Let \(\{z_{1},z_{2},z_{3}\}\) and \(\{w_{1},w_{2},w_{3}\}\) be two sets of three ordered distinct points in \(\hat{\mathbb{C}}\). Then there exists a unique Möbius Transformation \(f\) such that \(f(z_{i})=w_{i}\) for \(i=1,2,3\).

Existence: Consider the functions \[F(z):=(z,w_{1};w_{2},w_{3}),\quad G(z):=(z,z_{1};z_{2},z_{3}).\] These are Möbius transformations with the properties that \(F(w_{1})=1,F(w_{2})=0,\:F(w_{3})=\infty\), and \(G(z_{1})=1,G(z_{2})=0,G(z_{3})=\infty\). Therefore \(F^{-1}\circ G\) maps each \(z_{i}\) to \(w_{i}\).

Uniqueness: Assume that there are two such maps, say \(f\) and \(\tilde{f}\). Then the Möbius transformation \(H:=f^{-1}\circ\tilde{f}\) satisfies \(H(z_{i})=f^{-1}(\tilde{f}(z_{i}))=f^{-1}(w_{i})=z_{i}.\) This shows \(H\) has three fixed points, \(z_{1},z_{2},z_{3}\), and by Lemma [lem:at-most-2-fixed-points] it must therefore be trivial. Thus, \(f=\tilde{f}\).

One has the following beautiful and fundamental fact about Möbius transformations:

[Möbius transformations preserve cross-ratio] Möbius transformations preserve the cross ratio: if \(z_{0},z_{1},z_{2},z_{3}\) are four distinct points in \(\hat{\mathbb{C}}\), and \(f\) is a Möbius transformation, then \[(f(z_{0}),f(z_{1});f(z_{2}),f(z_{3}))=(z_{0},z_{1};z_{2},z_{3}).\tag{$\ast$}\]

Let \(w_{i}=f(z_{i})\). Let \(F\) and \(G\) be the functions defined in the proof of Theorem [thm:3points]. Recall that \(F^{-1}\circ G\) mapped each \(z_{i}\) to \(w_{i}\). So does \(f\). Since there is a unique Möbius transformation with this property, we must have \(f=F^{-1}\circ G\). Rearranging, \(F\circ f=G.\) But this is just the identity stated after applying both functions to \(z_{0}:\) \[(f(z_{0}),f(z_{1});f(z_{2}),f(z_{3}))=F\circ f(z_{0})=G(z_{0})=(z_{0},z_{1};z_{2},z_{3}).\]
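As a quick numerical illustration of \((\ast)\), the sketch below compares the cross-ratio of four points before and after applying the Cayley map. The four sample points are arbitrary finite choices (none of them is the pole \(-i\)), and the function names are hypothetical.

```python
def cross_ratio(z0, z1, z2, z3):
    """(z0, z1; z2, z3) for four distinct finite points."""
    return ((z0 - z2) * (z1 - z3)) / ((z0 - z3) * (z1 - z2))

def cayley(z):
    """The Cayley map z -> (z - i)/(z + i)."""
    return (z - 1j) / (z + 1j)

points = (2 + 1j, -1 + 3j, 0.5j, 4 - 2j)
before = cross_ratio(*points)
after = cross_ratio(*(cayley(z) for z in points))
```

The values `before` and `after` should coincide up to rounding error.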

Finding a Möbius transformation from three points (example)

Find the unique Möbius transformation \(f:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\) mapping the points \(\{1,-1,i\}\) to the points \(\{0,\infty,1\}\). Our method is simply to rearrange \((\ast)\) with \(z_{0}=z\), \((z_{1},z_{2},z_{3})=(1,-1,i)\) and \((w_{1},w_{2},w_{3})=(0,\infty,1)\), where \(w_{i}=f(z_{i})\). Since \(w_{2}=\infty\), notice that \[\lim_{|w|\to\infty}\frac{f(z)-w}{w_{1}-w}\:=\:\lim_{|w|\to\infty}\frac{\frac{f(z)}{w}-1}{\frac{w_{1}}{w}-1}\:=\:\frac{-1}{-1}\:=\:1,\] so \((\ast)\) reduces to \[\begin{aligned} 1\cdot\frac{w_{1}-w_{3}}{f(z)-w_{3}}\quad &=\quad\left(\frac{z-z_{2}}{z-z_{3}}\right)\left(\frac{z_{1}-z_{3}}{z_{1}-z_{2}}\right)\\ \iff \frac{0-1}{f(z)-1}\quad & =\quad\left(\frac{z-(-1)}{z-i}\right)\left(\frac{1-i}{1-(-1)}\right)\\ \iff f(z)-1\quad & =\quad\frac{-2(z-i)}{(z+1)(1-i)}\\ \iff f(z)\quad & =\quad\frac{(-1-i)z+(1+i)}{(1-i)z+(1-i)}\\ \iff f(z)\quad & =\quad\left(\frac{-1-i}{1-i}\right)\left(\frac{z-1}{z+1}\right)\\ \iff f(z)\quad & =\quad\frac{-iz+i}{z+1}.\end{aligned}\]

Möbius transformations methods

This leads us to a general strategy to find a Möbius transformation from how it acts on three points: Simply notice that since the cross ratio is preserved, \[\left(\frac{f(z)-w_{2}}{f(z)-w_{3}}\right)\left(\frac{w_{1}-w_{3}}{w_{1}-w_{2}}\right)\quad=\quad\left(\frac{z-z_{2}}{z-z_{3}}\right)\left(\frac{z_{1}-z_{3}}{z_{1}-z_{2}}\right),\] and to find \(f\) we need only rearrange this equation.
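The strategy can also be carried out at the level of matrices, following the existence proof of the three points Theorem: build the matrices of \(G(z)=(z,z_{1};z_{2},z_{3})\) and \(F(z)=(z,w_{1};w_{2},w_{3})\) and form \(F^{-1}\circ G\). The code below is a sketch, not a definitive implementation; the names `to_standard`, `adjugate`, `three_point_matrix` and the use of the float `INF` for \(\infty\) are assumptions made for illustration.

```python
INF = float("inf")  # stand-in for the point at infinity

def to_standard(p1, p2, p3):
    """Matrix of z -> (z, p1; p2, p3), sending p1, p2, p3 to 1, 0, INF."""
    if p1 == INF:
        return ((1, -p2), (1, -p3))
    if p2 == INF:
        return ((0, p1 - p3), (1, -p3))
    if p3 == INF:
        return ((1, -p2), (0, p1 - p2))
    return ((p1 - p3, -p2 * (p1 - p3)), (p1 - p2, -p3 * (p1 - p2)))

def matmul(A, B):
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

def adjugate(T):
    """A scalar multiple of the inverse, so it gives the inverse Mobius map."""
    (a, b), (c, d) = T
    return ((d, -b), (-c, a))

def three_point_matrix(zs, ws):
    """A matrix T with M_T(zs[i]) = ws[i], built as F^{-1} o G."""
    return matmul(adjugate(to_standard(*ws)), to_standard(*zs))

def mobius(T, z):
    (a, b), (c, d) = T
    if z == INF:
        return a / c if c != 0 else INF
    denom = c * z + d
    return INF if denom == 0 else (a * z + b) / denom

# The worked example: send (1, -1, i) to (0, infinity, 1).
T = three_point_matrix((1, -1, 1j), (0, INF, 1))
```

The resulting matrix is a scalar multiple of \(\left(\begin{array}{cc} -i & i\\ 1 & 1 \end{array}\right)\), so `mobius(T, z)` agrees with \(f(z)=\frac{-iz+i}{z+1}\) from the example.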

There is also a general strategy to find the image of a region \(D\) under a Möbius transformation \(M_T\):

  1. Find the image \(M_{T}(\partial D)\) of the boundary \(\partial D\).

  2. Find the image \(M_{T}(z_{0})\) of a point \(z_{0}\in D\) in the interior.

  3. The region \(D'\) bounded by \(M_{T}(\partial D)\) and containing \(M_{T}(z_{0})\) is precisely the image of \(D\) under \(M_{T}\), and \[M_{T}:D\:\xrightarrow{\sim}\:D'=M_{T}(D).\]

4.3 Circles and lines

The following property of Möbius transformations is very useful when deciding how they map different regions of the complex plane.

Möbius transformations map circles and lines in \(\hat{\mathbb{C}}\) to circles and lines in \(\hat{\mathbb{C}}\).

[Technical] We consider any line to pass through infinity, so that the above makes sense. By circles in \(\hat{\mathbb{C}}\) we mean simply circles in \(\mathbb{C}\).

In order to talk about what happens to circles and lines under Möbius transformations we need to know the equations of circles and lines. Of course, we already know the equation of a circle or line, but it turns out these can both be described by the same type of equation using complex numbers.

A circle of centre \(\alpha\) and radius \(r\) is given by the equation \[|z-\alpha|^{2}=r^{2}\] which can be rewritten as

\[r^{2}=(z-\alpha)(\bar{z}-\bar{\alpha})=z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\alpha\bar{\alpha}.\] Introducing a new parameter \(\beta=\alpha\bar{\alpha}-r^{2}\in\mathbb{R}\) we have the equation

\[z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0.\] This is the general equation of a circle, given by parameters \(\alpha\in\mathbb{C}\) and \(\beta\in\mathbb{R}\) such that \(|\alpha|^{2}-\beta=r^{2}>0\).

What about lines? Well a line can always be written as a bisector:

\[\ell=\{z\in\mathbb{C}\::\:|z-w_{1}|=|z-w_{2}|\}\] with \(w_{1}\neq w_{2}\in\mathbb{C}\). This means

\[(z-w_{1})(\bar{z}-\bar{w_{1}})=(z-w_{2})(\bar{z}-\bar{w_{2}})\] or

\[z\bar{z}-w_{1}\bar{z}-\bar{w_{1}}z+w_{1}\bar{w_{1}}=z\bar{z}-w_{2}\bar{z}-\bar{w_{2}}z+w_{2}\bar{w_{2}}.\] Rearranging this, and writing \(\alpha=w_{1}-w_{2}\neq0\) and \(\beta=w_{1}\bar{w_{1}}-w_{2}\bar{w_{2}}\in\mathbb{R}\), we have

\[-\alpha\bar{z}-\bar{\alpha}z+\beta=0.\] This is the general equation of a line in \(\mathbb{C}\), given by parameters \(\alpha\in\mathbb{C}\) and \(\beta\in\mathbb{R}\) such that \(\alpha\neq0\).

Therefore we have proved

[Equation of circles and lines in \(\mathbb{C}\)] Given \(\gamma,\beta\in\mathbb{R}\) and \(\alpha\in\mathbb{C}\), the equation \[\gamma z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0\] describes a circle if \(\gamma=1\) and \((|\alpha|^{2}-\beta)>0\), and a line if \(\gamma=0\) and \(\alpha\neq0\). Conversely, any circle or line can be described by an equation of this form.

[Proof of Proposition [prop:circlines]] Consider an arbitrary Möbius transformation \(M_{T}\) where \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right).\) We can assume by previous remarks that \(\det T=1\). If \(c=0\), then \(M_{T}\) is affine linear and is easily seen to preserve circles and lines (since rotations, dilations, and translations obviously do). So we can assume \(c\neq0\).

Note that \[\begin{aligned} M_{T}(z) & =\frac{az+b}{cz+d}=\frac{caz+cb}{c(cz+d)}=\frac{a(cz+d)}{c(cz+d)}+\frac{cb-ad}{c(cz+d)}=\frac{a}{c}-\frac{1}{c(cz+d)}=\frac{a}{c}-\left(\frac{1}{c^{2}}\right)\left(\frac{1}{z+\frac{d}{c}}\right).\end{aligned}\] Thus \(M_{T}\) is just some linear maps composed with the function \(f(z)=1/z\), called an inversion. Since linear maps preserve circles and lines, it is therefore enough to consider the action of \(f(z)=1/z\) on lines and circles. Moreover, we have \(f^{-1}(z)=1/z=f(z)\) so it is enough to consider one direction.

Let \(X\) be a circle or line. By Lemma [lem:circles-lines], \(X\) is described by an equation \(\gamma z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0\) with \(\gamma,\beta\in\mathbb{R}\). We claim that \(f(X)\) is described by a similar equation. Indeed, if \(z\in f(X)\) and \(z\neq0\), then \(f(z)=f^{-1}(z)\in X\) and this happens if and only if \[\gamma\left(\frac{1}{z}\right)\overline{\left(\frac{1}{z}\right)}-\alpha\overline{\left(\frac{1}{z}\right)}-\bar{\alpha}\left(\frac{1}{z}\right)+\beta=0\iff\beta z\bar{z}-\alpha z-\bar{\alpha}\bar{z}+\gamma=0.\]

Finally, let’s check this is the equation of a circle or a line.

First consider the case \(\beta=0\). If we started with a line we have \(\alpha\neq0\). If we started with a circle we know \(|\alpha|^{2}=|\alpha|^{2}-\beta>0\) so we have \(\alpha\neq0\) again. Therefore, the new equation describes a line.

Next, consider the case \(\beta\neq0\). Then we can divide by \(\beta\) to get \(z\bar{z}-\frac{\alpha}{\beta}z-\overline{\left(\frac{\alpha}{\beta}\right)}\bar{z}+\frac{\gamma}{\beta}=0\). This looks like the equation of a circle, but we have to check \[\left|\frac{\alpha}{\beta}\right|^{2}-\frac{\gamma}{\beta}>0.\tag{$\dagger$}\] If we began with a line, then \(\gamma=0\) and \(\alpha\neq0\) so this is clearly true. If we began with a circle, then \(\gamma=1\). If \(\beta<0\) then \((\dagger)\) is obvious since \(\left|\frac{\alpha}{\beta}\right|^{2}\geq0\) and \(-\gamma/\beta>0\). If \(\beta>0\) then \((\dagger)\) follows from \(|\alpha|^{2}-\beta>0\) after dividing by \(\beta^{2}\).

We use the term circline to refer to an object that is either a circle or a line. Hence the previous Proposition could have been stated ‘Möbius transformations preserve circlines’. The fact that Möbius transformations preserve circles and lines is very powerful when combined with:

Any three distinct non-collinear points \(z_{1},z_{2},z_{3}\in\mathbb{C}\) uniquely determine a circle in \(\mathbb{C}\) passing through those points. Any two distinct points uniquely determine a line passing through those points.

So to find out where a circle is mapped under a Möbius transformation, one simply needs to check where three points on the circle are mapped!

Finding the image of regions under Möbius transformations (example)

Find the image of the unit disc \(\mathbb{D}:=B_{1}(0)=\{w\in\mathbb{C}:|w|<1\}\) under the Möbius transformation corresponding to the matrix \(T=\left(\begin{array}{cc} 2+2i & -2-6i\\ 1 & -1-2i \end{array}\right)\). To determine the image we look at what happens to the boundary: we first find the image of the unit circle centred at the origin. Consider what happens to the four points on the unit circle, \(1,i,-1,-i\), under the map \(M_{T}(z)=\frac{(2+2i)z-(2+6i)}{z-(1+2i)}\). Under \(M_{T}\): \[\begin{aligned} 1 & \mapsto\frac{(2+2i)-(2+6i)}{1-(1+2i)}=\frac{-4i}{-2i}=2;\\ i & \mapsto\frac{(2+2i)i-(2+6i)}{i-(1+2i)}=\frac{-4-4i}{-1-i}=4;\\ -1 & \mapsto\frac{-(2+2i)-(2+6i)}{-1-(1+2i)}=\frac{-4-8i}{-2-2i}=3+i.\end{aligned}\] By Proposition [prop:circlines] the unit circle must be mapped to a circle, and the three calculations above tell us that the image of the unit circle must be the circle centred at \(z=3\) of radius \(1\). Note, the action of \(M_{T}\) is not something so simple as a rotation; for example we have \[-i\mapsto\frac{-(2+2i)i-(2+6i)}{-i-(1+2i)}=\frac{-8i}{-1-3i}=\frac{12+4i}{5},\quad\text{which is near the image of }-1\text{ !!}\]

What about the interior of this circle? Let’s pick a point in the unit disc and see where it is taken. The point \(z=0\) is an obvious choice. We have \[M_{T}(0)=\frac{(2+2i)0-(2+6i)}{0-(1+2i)}=\frac{2+6i}{1+2i}=\frac{(2+6i)(1-2i)}{5}=\frac{14+2i}{5},\] which lies inside the circle in question. Thus, by continuity \(M_{T}\) maps \(\mathbb{D}\) to the interior of the circle centred at \(z=3\) of radius \(1\) (it cannot map another point \(z_{0}\in\mathbb{D}\) to somewhere outside this circle, as the image of the path from \(0\) to \(z_{0}\) would then have to cross the boundary circle; but \(M_{T}\) is a bijection, so only points of the unit circle are mapped to the boundary circle). Continuity (and the existence of an inverse) tell us the map must be onto and so \[M_{T}:\mathbb{D}\:\xrightarrow{\sim}\:\{z\in\mathbb{C}:|z-3|<1\}.\]
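The conclusion of this example can be spot-checked numerically. The sketch below samples a few points on the unit circle (the angles are arbitrary choices) and measures the distance of their images from \(3\); the function name `M` is just shorthand for \(M_{T}\).

```python
import cmath

def M(z):
    """The Mobius transformation M_T of the example."""
    return ((2 + 2j) * z - (2 + 6j)) / (z - (1 + 2j))

# Points on the unit circle at a few arbitrary angles.
boundary = [cmath.exp(1j * t) for t in (0.0, 0.7, 1.9, 3.1, 4.4, 5.8)]
distances = [abs(M(z) - 3) for z in boundary]

# Distance of the image of the centre 0 from the new centre 3.
centre_distance = abs(M(0) - 3)
```

Every entry of `distances` should equal \(1\) up to rounding, and `centre_distance` should be less than \(1\), consistent with \(M_{T}(\mathbb{D})=\{z:|z-3|<1\}\).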

4.4 The Riemann sphere, revisited

Since we saw earlier that we can identify \(\hat{\mathbb{C}}\) with the Riemann sphere \(S^{2}\) by stereographic projection from the north pole, we can now think of Möbius transformations as bijections from \(S^{2}\) to \(S^{2}\)! In fact, our definitions were carefully chosen so that Möbius transformations give continuous bijections from \(S^{2}\) to \(S^{2}\).

This is a very fruitful perspective that can unify our picture of Möbius transformations. I encourage you to try to think about Möbius transformations in this way whenever you can!

It turns out that the definition of holomorphicity can be extended to functions from \(S^{2}\) to \(S^{2}\). In fact, Möbius transformations give biholomorphic maps from \(S^{2}\) to \(S^{2}\). Even better: the Möbius transformations give all the biholomorphic maps from \(S^{2}\) to \(S^{2}\).

What do circles and lines in \(\hat{\mathbb{C}}\) correspond to in the Riemann sphere? The answer is just circles! We can now add some more to our earlier table of correspondences:

\[\begin{aligned} \underline{\text{In } S^{2}} & &\underline{\text{In }\hat{\mathbb{C}}} \nonumber\\ N & \longleftrightarrow & \infty \nonumber\\ S & \longleftrightarrow & 0 \nonumber\\ & \text{Geometrically:}\nonumber\\ \text{Circle not through }N & \longleftrightarrow & \text{Circle}\nonumber\\ \text{Circle through }N & \longleftrightarrow & \text{Line}\nonumber\end{aligned}\] This also justifies our earlier convention that we think of all lines as passing through infinity.

[Advanced] One other nice fact is that stereographic projection itself is conformal! This means two tangent vectors to the sphere at the same point are mapped to two tangent vectors in \(\mathbb{C}\) (at the same point) with the same angle.

Consider the map \(f:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\), \(f(z)=i\frac{z-i}{z+i}\). Using stereographic projection, we can think of this as a transformation \(\hat{f}\) of the sphere \(S^{2}\). What is it?

We have that \(\hat{f}(N)\) should correspond to \(f(\infty)=i\) , \(\hat{f}(0,1,0)\) should correspond to \(f(i)=0\) and \(\hat{f}(0,0,-1)\) should correspond to \(f(0)=\frac{1}{i}=-i\). So \(\hat{f}\) maps \(N,(0,1,0),(0,0,-1)\) to \((0,1,0),(0,0,-1),(0,-1,0)\). We might guess that \(\hat{f}\) is simply a rotation of 90 degrees about the \(x\)-axis taking the ‘back’ hemisphere to the bottom/south hemisphere, and this turns out to be correct!

To see why this is true, let’s cheat slightly and assume that the rotation that we’ve guessed corresponds to a Möbius transformation. Then the Möbius transformation is determined by what it does to three points, and we’ve already checked that our guess agrees with the map \(f\) at three points \(\infty,i\) and \(0\).
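The guess can also be tested directly at further points. The sketch below assumes the standard formulas for stereographic projection from \(N=(0,0,1)\) on the unit sphere, chosen to match the correspondences used above (\(0\leftrightarrow(0,0,-1)\), \(i\leftrightarrow(0,1,0)\)); the rotation about the \(x\)-axis is taken to be \((X,Y,Z)\mapsto(X,Z,-Y)\), which sends \(N\) to \((0,1,0)\). All function names are hypothetical.

```python
def to_sphere(z):
    """Inverse stereographic projection of a finite point z onto the unit sphere."""
    z = complex(z)
    n = abs(z) ** 2
    return (2 * z.real / (n + 1), 2 * z.imag / (n + 1), (n - 1) / (n + 1))

def to_plane(p):
    """Stereographic projection from N = (0, 0, 1); p must not be N."""
    X, Y, Z = p
    return complex(X, Y) / (1 - Z)

def rotate_x(p):
    """Rotation by 90 degrees about the x-axis taking N to (0, 1, 0)."""
    X, Y, Z = p
    return (X, Z, -Y)

def f(z):
    return 1j * (z - 1j) / (z + 1j)

samples = [1, 2, 0.3 - 0.4j, -2 + 5j]   # none is the pole -i
via_sphere = [to_plane(rotate_x(to_sphere(z))) for z in samples]
direct = [f(z) for z in samples]
```

The lists `via_sphere` and `direct` should agree up to rounding, which supports (but of course does not prove) the claim that \(\hat{f}\) is this rotation.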

4.5 Möbius transformations preserving the upper half plane or unit disc

Notation: For a domain \(D\subset\mathbb{C}\), denote by \({\rm Mob}(D)\) the set of all Möbius transformations \(f\) such that \(f(D)=D\), i.e. \(f\) maps \(D\) to \(D\).

[H2H] Every Möbius transformation mapping \(\mathbb{H}\) to \(\mathbb{H}\) is of the form \(M_{T}\) with \(T\) in the group \[{\rm SL}_{2}(\mathbb{R}):=\{T\in{\rm Mat}_{2}(\mathbb{R}):\det T=1\}.\] Conversely, every such Möbius transformation maps \(\mathbb{H}\) to \(\mathbb{H}\), and hence gives a biholomorphism from \(\mathbb{H}\) to \(\mathbb{H}\).

In more concise terms, we have

\[f\in\mathrm{Mob}(\mathbb{H})\iff f=M_{T}\text{ with }T\in\mathrm{SL}_{2}(\mathbb{R}).\]

This gives us a group homomorphism \(\mathrm{SL}_{2}(\mathbb{R})\to\mathrm{Mob}(\mathbb{H})\) mapping \(T\mapsto M_{T}\) and hence also a group homomorphism \(\mathrm{SL}_{2}(\mathbb{R})\to\mathrm{Aut}(\mathbb{H})\). In fact, it turns out that all biholomorphic maps from \(\mathbb{H}\) to \(\mathbb{H}\) are of the above form.

Any Möbius transformation \(f:\mathbb{H}\to\mathbb{H}\) must map the boundary of \(\mathbb{H}\) to the boundary of \(\mathbb{H}\). Since this boundary is precisely the real line, \(f:\mathbb{R}\cup\{\infty\}\to\mathbb{R}\cup\{\infty\}\). In particular, it must take the ordered set \(\{1,0,\infty\}\) to \(\{x_{1},x_{2},x_{3}\}\) for some \(x_{i}\in\mathbb{R}\cup\{\infty\}\). Consider the cross-ratio; assuming \(x_{i}\neq\infty\) we have \[\begin{aligned} & \left(\frac{f(z)-x_{2}}{f(z)-x_{3}}\right)\left(\frac{x_{1}-x_{3}}{x_{1}-x_{2}}\right)\quad & =\quad\left(\frac{z-0}{z-\infty}\right)\left(\frac{1-\infty}{1-0}\right)\quad=\quad z\nonumber\\ \iff & (f(z)-x_{2})(x_{1}-x_{3})\quad & =\quad z(f(z)-x_{3})(x_{1}-x_{2})\nonumber\\ \iff & f(z) & =\quad\frac{x_{3}(x_{1}-x_{2})z+x_{2}(x_{3}-x_{1})}{(x_{1}-x_{2})z+(x_{3}-x_{1})}.\nonumber\end{aligned}\] Thus, all coefficients are real, so \(T\in{\rm GL}_{2}(\mathbb{R})\).

Furthermore, if \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in{\rm GL}_{2}(\mathbb{R})\) and \(z=x+iy\), then \[\begin{aligned} \mathrm{Im}\left(M_{T}(z)\right)=\mathrm{Im}\left(\frac{az+b}{cz+d}\right) & =\mathrm{Im}\left(\frac{(az+b)(c\bar{z}+d)}{|cz+d|^{2}}\right)\\ & =\mathrm{Im}\left(\frac{adz+bc\bar{z}}{|cz+d|^{2}}\right)\\ & =\frac{(ad-bc)y}{|cz+d|^{2}}\quad=\quad\frac{y\det T}{|cz+d|^{2}}.\end{aligned}\] Since \(z\in\mathbb{H}\iff y>0\), for \(T\in{\rm GL}_{2}(\mathbb{R})\) we get \[M_{T}(\mathbb{H})\subseteq\mathbb{H}\quad\iff\quad\det T>0,\] and so we can replace \(T\) by a real matrix of determinant 1 by scaling \(T\) by the positive real number \(1/\sqrt{\det T}\).
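The imaginary-part formula is easy to confirm on examples. In the sketch below the two real matrices and the test point are arbitrary illustrative choices, and the helper names are hypothetical.

```python
def mobius(T, z):
    (a, b), (c, d) = T
    return (a * z + b) / (c * z + d)

def imag_by_formula(T, z):
    """Im(M_T(z)) = y det(T) / |cz + d|^2 for a real matrix T."""
    (a, b), (c, d) = T
    return (a * d - b * c) * z.imag / abs(c * z + d) ** 2

z = 0.5 + 2j                  # a point of the upper half plane
T_pos = ((2, 1), (1, 1))      # det = 1
T_neg = ((1, 2), (1, 1))      # det = -1

gap_pos = abs(mobius(T_pos, z).imag - imag_by_formula(T_pos, z))
gap_neg = abs(mobius(T_neg, z).imag - imag_by_formula(T_neg, z))
```

Both gaps should be at rounding level, and the sign of \(\mathrm{Im}(M_{T}(z))\) matches the sign of \(\det T\): the first matrix keeps \(z\) in \(\mathbb{H}\), the second sends it to the lower half plane.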

[D2D] Every Möbius transformation from the unit disk \(\mathbb{D}\) to itself is of the form \(M_{T}\) with \(T\) in the set \[{\rm SU}(1,1):=\left\{ T=\begin{pmatrix}\alpha & \beta\\ \bar{\beta} & \bar{\alpha} \end{pmatrix}:\:\alpha,\beta\in\mathbb{C},\:\det T=|\alpha|^{2}-|\beta|^{2}=1\right\} .\] Conversely, every such Möbius transformation maps \(\mathbb{D}\) to \(\mathbb{D}\) and hence gives a biholomorphic automorphism of \(\mathbb{D}\).

In more concise terms, we have

\[f\in\mathrm{Mob}(\mathbb{D})\iff f=M_{T}\text{ with }T\in\mathrm{SU}(1,1).\]

  • We obtain a group homomorphism \[{\rm SU}(1,1)\to\mathrm{Mob}(\mathbb{D})\qquad T\mapsto M_{T}\] and hence also a group homomorphism \(\mathrm{SU}(1,1)\to\mathrm{Aut}(\mathbb{D})\).

  • [Advanced] The group \({\rm SU}(1,1)\) is not the Special Unitary Group \(\mathrm{SU}(2)\). It is actually the set of matrices \(T\) of determinant \(1\) which preserve the Hermitian form \(\left\langle z,w\right\rangle =\overline{z_{1}}w_{1}-\overline{z_{2}}w_{2}\) on \(\mathbb{C}^{2}\); so \(\left\langle Tz,Tw\right\rangle \:=\:\left\langle z,w\right\rangle\) for \(T\in{\rm SU}(1,1)\) and \(z,w\in\mathbb{C}^{2}\).

(\(\Rightarrow\)): Let \(M_{T}:\mathbb{D}\to\mathbb{D}\) be a Möbius transformation and consider the following picture: \[\begin{aligned} \mathbb{H}& \stackrel{f}{\rightarrow}& \mathbb{H}\\ \stackrel{M_C}{\downarrow}& &\stackrel{M_C}{\downarrow} \\ \mathbb{D}&\stackrel{M_T}{\rightarrow} &\mathbb{D}\end{aligned}\] where \(M_{C}\) is the Cayley Map, so \(C=\left(\begin{array}{cc} 1 & -i\\ 1 & i \end{array}\right)\).

We have that \(f:=M_{C^{-1}}\circ M_{T}\circ M_{C}\) is a Möbius transformation from \(\mathbb{H}\) to \(\mathbb{H}\). By Prop [prop:H2h] (H2H) we have \(f=M_{S}\) with \(S\in{\rm SL}_{2}(\mathbb{R})\), and by Lemma [lem:Mobius-are-group] we have \(S=C^{-1}TC\). Let \(S=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\) with \(ad-bc=1\) and \(a,b,c,d\in\mathbb{R}\). Then you can calculate that \[T\quad=\quad CSC^{-1}\quad=\quad\frac{1}{2}\begin{pmatrix}(a+d)+i(b-c) & (a-d)-i(c+b)\\ (a-d)+i(c+b) & (a+d)-i(b-c) \end{pmatrix}.\] Clearly this is of the required form. Finally, we have \[\det T=\det(CSC^{-1})=(\det C)^{-1}\cdot\det S\cdot\det C=\det S=1\] as required.

(\(\Leftarrow\)): Conversely, if \(T\in{\rm SU}(1,1)\) then the same calculation in reverse shows that the matrix \(S:=C^{-1}TC\) is in \({\rm SL}_{2}(\mathbb{R})\). Thus \(M_{S}:\mathbb{H}\to\mathbb{H}\) is a Möbius transformation by Prop [prop:H2h] (H2H), and by Lemma [lem:Mobius-are-group] the map \(M_{T}=M_{C}\circ M_{S}\circ M_{C}^{-1}\) is a Möbius transformation \(\mathbb{D}\to\mathbb{D}\).
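The conjugation \(T=CSC^{-1}\) used in this proof can be checked on a concrete matrix. The sketch below takes the sample matrix \(S=\left(\begin{array}{cc} 2 & 1\\ 1 & 1 \end{array}\right)\in{\rm SL}_{2}(\mathbb{R})\), an arbitrary choice, and verifies that the result has the shape required by \({\rm SU}(1,1)\); the helper `matmul` is hypothetical.

```python
def matmul(A, B):
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

C = ((1, -1j), (1, 1j))                 # matrix of the Cayley map
C_inv = ((0.5, 0.5), (0.5j, -0.5j))     # (1/(2i)) * ((i, i), (-1, 1))

S = ((2, 1), (1, 1))                    # real entries, determinant 1
T = matmul(matmul(C, S), C_inv)

(alpha, beta), (gamma, delta) = T
det_T = alpha * delta - beta * gamma
```

With \(a=2\), \(b=c=d=1\), the formula in the proof predicts \(\alpha=\tfrac{3}{2}\) and \(\beta=\tfrac{1}{2}-i\), with second row \((\bar{\beta},\bar{\alpha})\) and \(|\alpha|^{2}-|\beta|^{2}=1\).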

We can actually say much more about what they look like:

[D2D*] \(\quad\)

  1. Every Möbius transformation \(f\) from the unit disk \(\mathbb{D}\) to itself can be written \[f(z)\:=\:e^{i\theta}\:\frac{z-z_{0}}{\overline{z_{0}}\,z-1},\] for some angle \(\theta\) and \(z_{0}\in\mathbb{D}\) the unique point such that \(f(z_{0})=0\).

  2. All Möbius transformations of the unit disk to itself for which \(f(0)=0\) are rotations about \(0\).

The map \(g(z):=\tfrac{z-z_{0}}{\overline{z_{0}}\,z-1}\) in the above swaps \(0\) and \(z_{0}\) and is an involution (see Sheet 6); that is, \(g\circ g=\mathrm{Id}\). Furthermore, the map \(z\mapsto e^{i\theta}z\) is a rotation. So all Möbius transformations of the unit disk are given by an involution followed by a rotation.

  1. By Proposition D2D we have \[f(z)\quad=\quad\frac{az+b}{\bar{b}z+\bar{a}}\quad=\quad\left(-\frac{a}{\bar{a}}\right)\:\left(\frac{z-\left(-\frac{b}{a}\right)}{\left(-\frac{\bar{b}}{\bar{a}}\right)z-1}\right),\] so \(z_{0}=-b/a\). Moreover, since \(\left|-\frac{a}{\bar{a}}\right|=1\), we must have \(-\frac{a}{\bar{a}}=e^{i\theta}\) for some \(\theta\in(-\pi,\pi]\). All that remains is to check that \(z_{0}\in\mathbb{D}\). Since \(|a|^{2}-|b|^{2}=1\), we have \[|z_{0}|^{2}-1\quad=\quad\left|-\frac{b}{a}\right|^{2}-1\quad=\quad\frac{|b|^{2}}{|a|^{2}}-1\quad=\quad-\frac{1}{|a|^{2}}\quad<\quad0,\] and so \(|z_{0}|<1\) as required.

  2. We have \[f(0)=0\quad\iff\quad e^{i\theta}\left(\frac{-z_{0}}{-1}\right)=0\quad\iff\quad z_{0}=0\quad\iff\quad f(z)=-e^{i\theta}z,\] and so \(f\) is a rotation.

Finding automorphisms (example)

Find a Möbius transformation \(f\) from the closed unit disc onto the closed unit disc taking \(\frac{i}{2}\) to \(0\) and \(-i\) to \(1\). Since \(\frac{i}{2}\mapsto0\), by Corollary D2D* with \(z_{0}=\frac{i}{2}\) we have \[f(z)=e^{i\theta}\left(\frac{z-\frac{i}{2}}{-\frac{i}{2}z-1}\right)=e^{i\theta}\left(\frac{2z-i}{-iz-2}\right)\] for some \(\theta\). Since \(-i\mapsto1\), we have \[1=e^{i\theta}\left(\frac{-2i-i}{-1-2}\right)\quad\iff\quad1=e^{i\theta}i\quad\iff\quad e^{i\theta}=-i.\] Thus \[f(z)=-i\left(\frac{2z-i}{-iz-2}\right)=i\left(\frac{2z-i}{iz+2}\right)=\frac{2z-i}{z-2i}\quad(\text{multiplying top and bottom by }-i).\]
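The answer can be verified numerically: the map should send \(\tfrac{i}{2}\) to \(0\), \(-i\) to \(1\), the unit circle to itself, and interior points to interior points. The sample points in this sketch are arbitrary choices.

```python
import cmath

def f(z):
    """The map found in the example."""
    return (2 * z - 1j) / (z - 2j)

# A few points on the unit circle and a few points inside the unit disc.
circle = [cmath.exp(1j * t) for t in (0.0, 1.0, 2.5, 4.0, 5.5)]
moduli_on_circle = [abs(f(z)) for z in circle]
moduli_inside = [abs(f(z)) for z in (0, 0.5, -0.3 + 0.4j, 0.9j)]
```

All entries of `moduli_on_circle` should equal \(1\) up to rounding, and all entries of `moduli_inside` should be strictly less than \(1\).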

Finding the image of geometric shapes under Möbius transformation (example)

Let \(F\) be the geometric figure made up of a line segment from \(0\) to \(-1\), a clockwise circular arc (tracing out the unit circle) from \(-1\) to \(i\), then a line segment from \(i\) to \(0\). Find the image of \(F\) under the Cayley Map \(M_{C}(z)=\frac{z-i}{z+i}\).

Each section is a segment of a circle or a line, so must be taken to a segment of a circle or a line by Proposition [prop:circlines]. First, let us check where the three ‘vertices’ go. We have \[0\mapsto\frac{-i}{i}=-1;\quad-1\mapsto\frac{-1-i}{-1+i}=\frac{(1+i)^{2}}{2}=i;\quad i\mapsto\frac{i-i}{i+i}=0.\] Let’s look at where the line segment from \(0\) to \(i\) goes: \[\frac{i}{2}\mapsto\frac{\frac{i}{2}-i}{\frac{i}{2}+i}\,=\,\frac{-i}{3i}\,=\,-\frac{1}{3},\] so the line segment from \(0\) to \(i\) must map to the circle/line from \(-1\) to \(0\), passing through \(-\frac{1}{3}\); that is, it is the line segment from \(-1\) to \(0\) on the real axis.

Now, if we wanted we could do the same for the other two sections: pick a point on each remaining line/circular arc and see where it maps to, thus determining whether each image is a line segment or a circular arc. However, there is a much quicker method: we know Möbius transformations are conformal and so preserve angles and orientation, so tracing round the shape we must have the same angles in the image of the figure. So, travelling from \(M_{C}(0)=-1\), when we reach \(M_{C}(i)=0\) we must turn anticlockwise through an angle of \(\pi/2\) and head towards \(M_{C}(-1)=i\). Thus, the next section must simply be the straight line from \(0\) to \(i\). Similarly, the final section must be a circular arc from \(i\) back to \(-1\). Thus, the image is just \(F\). Actually, letting \(D=\{z\in\mathbb{D}:\pi/2<\mathrm{Arg}(z)<\pi\}\) be the interior of \(F\), one can check that \(D\) maps to \(D\), and so \(M_{C}\) (and therefore \(M_{C^{-1}}\)) is actually in \(\mathrm{Aut}(D)\).

4.6 Finding biholomorphic maps between domains

  1. Find a biholomorphism from \(D:=\{z\in\mathbb{D}:\mathrm{Im}(z)<0\}\) to \(\mathbb{H}\). We build the map in various stages from simpler known maps.

    • Step 1: We know the Cayley transform \(M_{C}\) is a map from \(\mathbb{H}\) to \(\mathbb{D}\), so its inverse \(M_{C^{-1}}\) maps \(\mathbb{D}\) to \(\mathbb{H}\). This is a good thing to start with as it has the right image. To find the image of \(D\) under \(M_{C^{-1}}\) first consider how it acts on two segments of the boundary:

      • The line segment from \(-1\) to \(1\) (through \(0\)). We have \(M_{C^{-1}}(z)=\frac{iz+i}{-z+1}\) so \[M_{C^{-1}}(-1)=\frac{-i+i}{1+1}=0,\quad M_{C^{-1}}(1)=\frac{i+i}{-1+1}=\infty,\quad M_{C^{-1}}(0)=\frac{0+i}{0+1}=i.\] Thus, the line segment from \(-1\) to \(1\) (through \(0\)) is taken to the line segment from \(0\) to \(\infty\) (through \(i\)); so, the nonnegative imaginary axis. [You could also have just checked that the pole “\(-d/c\)” \(=1\) is on the line, so the line must be taken to another line.]

      • The circular arc from \(-1\) to \(1\) (through \(-i\)). We have \[M_{C^{-1}}(-i)=\frac{-i^{2}+i}{i+1}=\frac{1+i}{1+i}=1.\] Thus, the circular arc from \(-1\) to \(1\) (through \(-i\)) is taken to the line segment from \(0\) to \(\infty\) (through \(1\)); so, the nonnegative real axis. [Instead, we could just have used conformality to deduce that this was the image: the angle and its orientation at \(z=-1\) must be preserved, so the nonnegative real axis had to be the image.]

      Combined, this tells us that the image of \(D\) under \(M_{C^{-1}}\) is the first quadrant \(\Omega=\{w\in\mathbb{C}:0<\mathrm{Arg}(w)<\pi/2\}\) - by conformality the interior must stay on the ‘same side’ of each line segment. [You could instead explicitly check what happens to an element in \(D\), say \(z=-i/2\).]

    • Step 2: We now need a biholomorphic map from \(\Omega\) to \(\mathbb{H}\). We already know of one from earlier: the map \(g:z\mapsto z^{2}:\Omega\xrightarrow{\sim}\mathbb{H}\). Since compositions of biholomorphic maps are biholomorphic, this gives us the map we want; let \(f:=g\circ M_{C^{-1}}\) then \[f:D\:\xrightarrow{\sim}\:\mathbb{H}.\]

    We can write down the map \(f\) explicitly by composing the formulae of the functions: \[f(z)\quad=\quad g\circ M_{C^{-1}}(z)\quad=\quad g\left(\frac{iz+i}{-z+1}\right)\quad=\quad\left(\frac{iz+i}{-z+1}\right)^{2}.\]

  2. Find the image of \(\mathbb{D}\setminus\mathbb{R}_{\leq0}:=\{z\in\mathbb{D}:-\pi<\mathrm{Arg}z<\pi\}\) under the map \(h(z)=z^{1/2}\). Thus, find a biholomorphic map from \(\mathbb{D}\setminus\mathbb{R}_{\leq0}\) to the unit disc \(\mathbb{D}\).

    We have (using the principal branch) \[z=re^{i\theta}\:(\theta\in(-\pi,\pi),0<r<1)\:\iff\:z^{1/2}=\sqrt{r}e^{i\theta/2}\:\left(\frac{\theta}{2}\in(-\frac{\pi}{2},\frac{\pi}{2}),0<\sqrt{r}<1\right).\] We know the complex power function is biholomorphic on \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\), so \[h:\:\mathbb{D}\setminus\mathbb{R}_{\leq0}\:\:\xrightarrow{\sim}\:\:\{z\in\mathbb{D}:-\pi/2<\mathrm{Arg}z<\pi/2\}\] maps \(\mathbb{D}\setminus\mathbb{R}_{\leq0}\) biholomorphically to the (open) right half of the unit disc.

    We also know rotations are biholomorphic on \(\mathbb{C}\) and so \(r(z)=e^{-i\pi/2}\,z=-iz\) maps the right half of the unit disc biholomorphically to the lower half of the unit disc.

    Using the previous example \(f\) from \(1.\) we can now write down a biholomorphic map from \(D:=\{z\in\mathbb{D}:\mathrm{Im}(z)<0\}\) to \(\mathbb{H}\), and then map \(\mathbb{H}\) (via the Cayley Map) to \(\mathbb{D}\). Namely, since \(M_{C}(z)=\frac{z-i}{z+i}:\mathbb{H}\xrightarrow{\sim}\mathbb{D}\) we have \(\tilde{f}:\mathbb{D}\setminus\mathbb{R}_{\leq0}\:\xrightarrow{\sim}\:\mathbb{D}\), where \[\tilde{f}(z):\:=\:(M_{C}\circ f\circ r\circ h)(z)\:=\:\frac{\left(\frac{i(-iz^{1/2})+i}{-(-iz^{1/2})+1}\right)^{2}-i}{\left(\frac{i(-iz^{1/2})+i}{-(-iz^{1/2})+1}\right)^{2}+i}=\frac{(z^{1/2}+i)^{2}-i(iz^{1/2}+1)^{2}}{(z^{1/2}+i)^{2}+i(iz^{1/2}+1)^{2}},\] which you can simplify further if you wish.

  3. Translations are biholomorphic on \(\mathbb{C}\) so for example \[f(z)=z-i:\{w\in\mathbb{C}:\mathrm{Im}(w)-1>\mathrm{Re}(w)\}\:\:\xrightarrow{\sim}\:\:\{w\in\mathbb{C}:\mathrm{Im}(w)>\mathrm{Re}(w)\}.\] Since \[g(z)=e^{-i\frac{\pi}{4}}z=\frac{1-i}{\sqrt{2}}z:\quad\{w\in\mathbb{C}:\mathrm{Im}(w)>\mathrm{Re}(w)\}\:\xrightarrow{\sim}\:\mathbb{H}\] we have that \(\tilde{f}:=g\circ f\) satisfies \[\tilde{f}:\{w\in\mathbb{C}:\mathrm{Im}(w)-1>\mathrm{Re}(w)\}\:\xrightarrow{\sim}\:\mathbb{H}.\] Explicitly, \(\tilde{f}(z)=\left(\frac{1-i}{\sqrt{2}}\right)(z-i)\).
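The three constructions in this section can be spot-checked numerically. This is a rough sketch under stated assumptions: the names `f1`, `f2`, `f2_closed_form`, `f3` and the sample grids are introduced here for illustration, and `cmath.sqrt` is used as the principal branch of \(z^{1/2}\). Sampling only tests that each map sends its domain into the claimed target; it does not test bijectivity.

```python
import cmath
import math


def cayley(z: complex) -> complex:
    """M_C(z) = (z - i) / (z + i), mapping the upper half-plane to the unit disc."""
    return (z - 1j) / (z + 1j)


def cayley_inverse(z: complex) -> complex:
    """Inverse Cayley map (iz + i) / (-z + 1), mapping the unit disc to the upper half-plane."""
    return (1j * z + 1j) / (-z + 1)


def f1(z: complex) -> complex:
    """Example 1: lower half of the unit disc -> upper half-plane."""
    return cayley_inverse(z) ** 2


def f2(z: complex) -> complex:
    """Example 2: slit disc -> unit disc, as the composition M_C o f1 o r o h."""
    h = cmath.sqrt(z)   # principal branch of z^(1/2)
    rotated = -1j * h   # rotation by -pi/2
    return cayley(f1(rotated))


def f2_closed_form(z: complex) -> complex:
    """Closed-form expression for Example 2 in terms of s = z^(1/2)."""
    s = cmath.sqrt(z)
    a, b = (s + 1j) ** 2, 1j * (1j * s + 1) ** 2
    return (a - b) / (a + b)


def f3(z: complex) -> complex:
    """Example 3: translate by -i, then rotate by -pi/4."""
    return cmath.exp(-1j * math.pi / 4) * (z - 1j)


radii = (0.1, 0.5, 0.95)

# Example 1: points with Im(z) < 0 in the disc should land in the upper half-plane.
lower_half_disc = [r * cmath.exp(1j * p) for r in radii for p in (-3.0, -2.0, -1.0, -0.1)]
example1_ok = all(f1(z).imag > 0 for z in lower_half_disc)

# Example 2: points of the slit disc should land inside the unit disc,
# and the composition should agree with the closed-form expression.
slit_disc = [r * cmath.exp(1j * p) for r in radii for p in (-3.0, -1.5, 0.0, 1.5, 3.0)]
example2_ok = all(abs(f2(z)) < 1 for z in slit_disc)
closed_form_ok = all(abs(f2(z) - f2_closed_form(z)) < 1e-9 for z in slit_disc)

# Example 3: points with Im(w) - 1 > Re(w) should land in the upper half-plane.
shifted_half_plane = [complex(x, x + 1 + d) for x in (-3.0, 0.0, 2.0) for d in (0.1, 1.0, 5.0)]
example3_ok = all(f3(z).imag > 0 for z in shifted_half_plane)

print(example1_ok, example2_ok, closed_form_ok, example3_ok)
```

Building each map as a composition of small named functions mirrors the strategy of the section: every stage can be tested on its own before the pieces are combined.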


  1. These notes are essentially the notes of Sabine Bögli from 2021–2022, which in turn were essentially those of Michael Magee from 2018-2019, in which it says: Chapters 1-5 are an evolution of notes of Stephen Harrap, which were in turn based on original notes of Jens Funke. Chapters 6 onwards are based on notes of Thanasis Bouganis.↩︎

  2. The multiplication of complex numbers is commutative and associative. The fact that such a multiplication exists in 2 dimensions is truly remarkable: there is no such multiplication on \(\mathbb{R}^{3}\), not even if we relax the condition that it is commutative! The search for an associative multiplication on \(\mathbb{R}^{3}\) led Hamilton to discover (a non-commutative) one on \(\mathbb{R}^{4}\) instead, which was his famous discovery of the quaternions.↩︎

  3. To be precise, we would like the solution to \(e^{z}=w\) to vary as a continuous function of \(w\); see later for the precise definition of continuous function.↩︎

  4. The trick is to use the stereographic projection maps to transfer your function to \(\mathbb{C}\), and use the definition of holomorphicity there.↩︎