Complex Analysis II (Michaelmas)

Katie Gittins

2022-2023

1 The Complex Plane and Riemann Sphere

1.1 Complex numbers

A complex number \(z\) is a quantity of the form \(z=x+iy\), where \(x,y\) are real numbers and \(i\) is the imaginary unit. We denote by \(\mathbb{C}\) the set of all complex numbers.

Algebra

We can add, subtract and multiply complex numbers: If \(z_{1}=x_{1}+iy_{1}\) and \(z_{2}=x_{2}+iy_{2}\) then \[z_{1}\pm z_{2}:=(x_{1}\pm x_{2})+i(y_{1}\pm y_{2}),\] \[z_{1}z_{2}:=(x_{1}x_{2}-y_{1}y_{2})+i(x_{1}y_{2}+x_{2}y_{1}).\] Notice that addition simply corresponds to adding the individual components. In general we denote by \(\mathrm{Re}(z)=x\) the real part of \(z\), and by \(\mathrm{Im}(z)=y\) the imaginary part of \(z\). By the definition of multiplication we have \(i^{2}=-1\), and using this we see that multiplication corresponds to ‘multiplying out the brackets’: \((x_{1}+iy_{1})(x_{2}+iy_{2})=(x_{1}x_{2}+i^{2}y_{1}y_{2})+i(x_{1}y_{2}+x_{2}y_{1}).\)

We can also divide complex numbers. For \(z_{2}\neq0\) (here we use the shorthand \(0=0+0i\)) we have \[\frac{z_{1}}{z_{2}}=\frac{x_{1}+iy_{1}}{x_{2}+iy_{2}}=\frac{(x_{1}+iy_{1})(x_{2}-iy_{2})}{(x_{2}+iy_{2})(x_{2}-iy_{2})}=\frac{x_{1}x_{2}+y_{1}y_{2}}{x_{2}^{2}+y_{2}^{2}}+i\frac{x_{2}y_{1}-x_{1}y_{2}}{x_{2}^{2}+y_{2}^{2}}\quad\in\mathbb{C}.\] The quantity we used to make the denominator real is important. In general, for \(z=x+iy\) we call \(\bar{z}:=x-iy\) the complex conjugate of \(z\).
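The division formula is easy to check numerically. A short Python sketch (the values \(z_1,z_2\) are arbitrary test inputs, not from the notes):

```python
# Divide z1 by z2 via the conjugate trick: multiply numerator and
# denominator by the conjugate of z2 so the denominator becomes real.
z1, z2 = 3 + 4j, 1 - 2j
x1, y1 = z1.real, z1.imag
x2, y2 = z2.real, z2.imag

denom = x2**2 + y2**2  # = |z2|^2, a positive real number
quotient = complex((x1*x2 + y1*y2) / denom, (x2*y1 - x1*y2) / denom)

# Agrees with Python's built-in complex division:
assert abs(quotient - z1 / z2) < 1e-12
```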

We immediately have a multiplicative inverse \[z^{-1}:=\frac{1}{z}=\frac{x}{x^{2}+y^{2}}-i\frac{y}{x^{2}+y^{2}}.\] WARNING: While most of the nice properties of \(\mathbb{R}\) hold in \(\mathbb{C}\), we do not have notions of \(\leq\), \(<\), \(\geq\) or \(>\); the set \(\mathbb{C}\) is not ‘ordered’ and expressions like \(z_{1}<z_{2}\) have no meaning.

How do we visualise the complex numbers?

There are various ‘models’ for the complex numbers. The most commonly used/most intuitive is to think of \(\mathbb{C}\) as a copy of \(\mathbb{R}^{2}\) equipped with a map \[\mathbb{R}^{2}\times\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}:\left((x_{1},y_{1}),(x_{2},y_{2})\right)\mapsto(x_{1}x_{2}-y_{1}y_{2},x_{1}y_{2}+x_{2}y_{1}).\] So, a copy of \(\mathbb{R}^{2}\) with a way of multiplying (and dividing) vectors! Indeed there is an obvious bijection \(f:\mathbb{R}^{2}\rightarrow\mathbb{C}\) given by \(f((x,y))=x+iy\). As a result, we often draw complex numbers on the usual \((x,y)\)-plane: such a picture is called an Argand diagram.

On \(\mathbb{R}^{2}\) there is a natural notion of size, and we use it in \(\mathbb{C}\): we call the quantity \(|z|:=\sqrt{x^{2}+y^{2}}\) the modulus or absolute value of \(z\) \((=x+iy)\).

[Important Properties of Complex numbers]  

  1. \(z_{1}z_{2}=0\quad\iff z_{1}=0\) or \(z_{2}=0,\)

  2. \(|z|=\sqrt{z\bar{z}},\)

  3. \(\mathrm{Re}(z)=\frac{z+\bar{z}}{2}\) and \(\mathrm{Im}(z)=\frac{z-\bar{z}}{2i},\)

  4. \(z^{-1}=\frac{\bar{z}}{|z|^{2}}.\)

Property \(1.\) is very important, and makes \(\mathbb{C}\) an integral domain (see Algebra II).

Polar form/coordinates of a complex number

Now that we have a notion of distance, we can, as in \(\mathbb{R}^{2}\), implement a change of variables \((x,y)\rightarrow(r,\theta)\). Let \(r=|z|\) and let \(\theta\) denote the anticlockwise angle measured from the real axis. (Angles measured clockwise will be considered negative.) We call \(\theta\) the argument of \(z\) (for \(z\neq0\)) and write \(\arg(z)=\theta\). We then have the following polar coordinates for \(z\): \[z=r(\cos\theta+i\sin\theta),\] which we write in shorthand as \(z=re^{i\theta}\).

For example, \(i=e^{i\pi/2}\) and \(1+i=\sqrt{2}e^{i\pi/4}\). We have \(|-1/\sqrt{2}-i\sqrt{3}/\sqrt{2}|=\sqrt{2}\), so \[-1/\sqrt{2}-i\sqrt{3}/\sqrt{2}=\sqrt{2}(-1/2-i\sqrt{3}/2)=\sqrt{2}e^{-i2\pi/3}.\] Note that \(\arg(z)\) is only defined up to multiples of \(2\pi\); for example \(i=e^{i\pi/2}=e^{i5\pi/2}=e^{-i3\pi/2}\). Strictly speaking \(\arg(i)=\pi/2+2\pi k\), for any \(k\in\mathbb{Z}\) (and so \(\arg\) is a one-to-many function!). As a result, we need to be careful; we choose a fixed interval in which to express the argument: the principal value of \(\arg(z)\) is the value in the interval \((-\pi,\pi]\) and will be denoted \(\mathrm{Arg}(z)\). So \(\mathrm{Arg}(i)=\pi/2\) and \(\mathrm{Arg}(-1)=\pi\) for example.
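The last example can be checked with Python's `cmath` module (a sketch; `cmath.polar` returns the pair \((r,\theta)\) with \(\theta=\mathrm{Arg}(z)\in(-\pi,\pi]\)):

```python
import cmath
import math

z = -1/math.sqrt(2) - 1j*math.sqrt(3)/math.sqrt(2)
r, theta = cmath.polar(z)  # modulus and principal argument

assert abs(r - math.sqrt(2)) < 1e-12          # |z| = sqrt(2)
assert abs(theta + 2*math.pi/3) < 1e-12       # Arg(z) = -2*pi/3
assert abs(cmath.rect(r, theta) - z) < 1e-12  # z = r e^{i theta}
```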

[Properties of argument] We have the following properties of the argument:

  1. \(\arg(z_{1}z_{2})=\arg(z_{1})+\arg(z_{2})\bmod2\pi\)

  2. \(\arg(1/z)=-\arg(z)\bmod2\pi\)

  3. \(\arg(\bar{z})=-\arg(z)\bmod2\pi\).

When we say two real numbers are equal \(\bmod2\pi\) we mean they differ by an integer multiple of \(2\pi\).
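These properties can be verified numerically modulo \(2\pi\); a Python sketch (the helper `arg_eq_mod_2pi` and the test points are our own choices):

```python
import cmath
import math

def arg_eq_mod_2pi(a, b, tol=1e-12):
    """True if a and b differ by an integer multiple of 2*pi."""
    return abs((a - b + math.pi) % (2*math.pi) - math.pi) < tol

z1, z2 = cmath.rect(2, 2.5), cmath.rect(3, 1.5)  # arbitrary nonzero points

# cmath.phase returns the principal argument Arg(z) in (-pi, pi]
assert arg_eq_mod_2pi(cmath.phase(z1*z2), cmath.phase(z1) + cmath.phase(z2))
assert arg_eq_mod_2pi(cmath.phase(1/z1), -cmath.phase(z1))
assert arg_eq_mod_2pi(cmath.phase(z1.conjugate()), -cmath.phase(z1))
```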

It is nice to have a geometric picture of what the algebraic operations on complex numbers mean.

Geometrically, multiplication in \(\mathbb{C}\) is given by a dilated rotation; i.e., if \(z_{1}=r_{1}e^{i\theta_{1}}\) and \(z_{2}=r_{2}e^{i\theta_{2}}\) then \[z_{1}z_{2}=r_{1}r_{2}\left(\cos(\theta_{1}+\theta_{2})+i\sin(\theta_{1}+\theta_{2})\right)=r_{1}r_{2}e^{i(\theta_{1}+\theta_{2})}.\] In particular, multiplying by \(z_{2}\) constitutes an (anticlockwise) rotation of \(z_{1}\) through the angle \(\theta_{2}\) (in radians), followed by a dilation with factor \(r_{2}\). Furthermore, addition represents a translation and conjugation represents a reflection in the real axis. Taking the real or imaginary part of a complex number \(z\) represents a projection of \(z\) onto the real or imaginary axis respectively.

By the standard angle addition formulae we have \[\begin{aligned} z_{1}z_{2} & =r_{1}r_{2}\left(\cos\theta_{1}+i\sin\theta_{1}\right)\left(\cos\theta_{2}+i\sin\theta_{2}\right)\\ & =r_{1}r_{2}\left((\cos\theta_{1}\cos\theta_{2}-\sin\theta_{1}\sin\theta_{2})+i(\sin\theta_{1}\cos\theta_{2}+\sin\theta_{2}\cos\theta_{1})\right)\\ & =r_{1}r_{2}\left(\cos(\theta_{1}+\theta_{2})+i\sin(\theta_{1}+\theta_{2})\right).\end{aligned}\] The geometric interpretations of addition, conjugation, and real/imaginary parts are fairly obvious.

  1. \(|z_{1}z_{2}|=|z_{1}|\,|z_{2}|,\)

  2. De Moivre’s formula: \((\cos\theta+i\sin\theta)^{n}=\cos(n\theta)+i\sin(n\theta)\).
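Both properties can be spot-checked numerically; a Python sketch with arbitrary choices of angle, exponent and test points:

```python
import math

theta, n = 0.7, 5  # arbitrary angle (radians) and integer power

# De Moivre: (cos t + i sin t)^n = cos(nt) + i sin(nt)
lhs = complex(math.cos(theta), math.sin(theta))**n
rhs = complex(math.cos(n*theta), math.sin(n*theta))
assert abs(lhs - rhs) < 1e-12

# Multiplicativity of the modulus: |z1 z2| = |z1| |z2|
assert abs(abs((3+4j)*(1-2j)) - abs(3+4j)*abs(1-2j)) < 1e-12
```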

The modulus also has the following important properties.

  1. (Triangle inequality) \(|z_{1}+z_{2}|\leq|z_{1}|+|z_{2}|\)

  2. \(|z|\geq0\text{ and }|z|=0\iff z=0\)

  3. \(\max(|\mathrm{Re}(z)|,|\mathrm{Im}(z)|)\leq|z|\leq|\mathrm{Re}(z)|+|\mathrm{Im}(z)|\)

The first two properties above along with Corollary [cor:norm-mult] make the modulus a norm on \(\mathbb{C}\) (see later, Definition [def:norm]).

Simple complex functions and geometry (examples)

  • We can also use functions to define regions in the complex plane. Consider the set of points \(z\) which satisfy the inequality \(|z-i|<|z+i|\). This is precisely the points in \(\mathbb{C}\) whose distance to \(i\) is strictly smaller than their distance to \(-i\). Thus, the inequality represents the upper half plane \(\mathbb{H}:=\{z\in\mathbb{C}:\mathrm{Im}(z)>0\}\).

  • Note that the equation \(|z-i|=1\) represents a circle centred at \(i\) of radius \(1\).

1.2 Exponential and trigonometric functions

[Complex exponential] We define the complex exponential function \(\exp:~\mathbb{C}\rightarrow\mathbb{C}\) by \[\exp(z):=e^{x}(\cos y+i\sin y).\qquad\qquad(z=x+iy)\] As shorthand we write \(\exp(z)=e^{z}\).

We will see later that \(e^{z}=\sum_{n=0}^{\infty}\tfrac{z^{n}}{n!}\) as in real analysis. We could have started with this as the definition instead.

We have the following properties of the complex exponential function:

  1. \(e^{z}\ne0\quad\) for all \(z\in\mathbb{C}\),

  2. \(e^{z_{1}+z_{2}}=e^{z_{1}}e^{z_{2}}\),

  3. \(e^{z}=1\quad\Longleftrightarrow\quad z=2\pi ik\) for some \(k\in\mathbb{Z}\),

  4. \(e^{-z}=1/e^{z}\),

  5. \(|e^{z}|=e^{\mathrm{Re}(z)}\).

Most are straightforward to check. \(3.\) is very important: \(\exp(z)=1\iff e^{x}\cos y=1\) and \(e^{x}\sin y=~0\). Since \(e^{x}>0\), the latter is equivalent to \(\sin y=0\) and so \(y=n\pi\) for some \(n\in\mathbb{Z}\). Thus we have \(\exp(z)=1\iff e^{x}\cos(n\pi)=1\iff e^{x}(-1)^{n}=1\iff n\) is even and \(e^{x}=1\iff x=0\) and \(y=2k\pi\) (\(k\in\mathbb{Z}\)).

We have \(\exp(2\pi i)=1\) and \(\exp(\pi i)=-1\). The latter is Euler’s identity.

The complex exponential function is \(2\pi i\)-periodic; that is, \(\exp(z+2k\pi i)=\exp(z)\) for any \(k\in\mathbb{Z}\).

The above implies \(\exp\) is determined entirely by the values it takes in any horizontal strip of width \(2\pi\) in the complex plane. Note also that \(\exp\) is ‘unbounded’, since by \(5.\) the modulus \(|\exp(z)|\) gets arbitrarily large as \(\mathrm{Re}(z)\) increases.
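Properties \(2.\), \(5.\) and the \(2\pi i\)-periodicity are easy to confirm numerically with Python's `cmath.exp` (a sketch; the point \(z\) is an arbitrary choice):

```python
import cmath
import math

z = 1.3 - 0.4j  # arbitrary test point

assert abs(cmath.exp(z + 2j*math.pi) - cmath.exp(z)) < 1e-12  # 2*pi*i-periodic
assert abs(abs(cmath.exp(z)) - math.exp(z.real)) < 1e-12      # |e^z| = e^{Re z}
assert abs(cmath.exp(1j*math.pi) + 1) < 1e-12                 # e^{i pi} = -1
```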

[Trigonometric functions] All as functions from \(\mathbb{C}\rightarrow\mathbb{C}\), we define \[\begin{aligned} \sin(z) & :=\frac{1}{2i}(e^{iz}-e^{-iz})\qquad & \cos(z):=\frac{1}{2}(e^{iz}+e^{-iz})\\ \sinh(z) & :=\frac{1}{2}(e^{z}-e^{-z})\qquad & \cosh(z):=\frac{1}{2}(e^{z}+e^{-z})\end{aligned}\] (For \(z=x\) real these coincide with the real functions. We will eventually get power series expressions for them from the one for the exponential function.)

All the usual angle addition formulae and identities relating the functions hold: e.g., \(\cos^{2}(z)+\sin^{2}(z)=1\). Additionally, notice that we have \(\cosh(iz)=\cos(z)\) and \(\cos(iz)=\cosh(z)\), while \(\sinh(iz)=i\sin(z)\) and \(\sin(iz)=i\sinh(z)\). All four functions are unbounded.
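A numerical sanity check of these identities at an arbitrary complex point, using `cmath`:

```python
import cmath

z = 0.8 + 1.1j  # arbitrary test point

assert abs(cmath.cos(z)**2 + cmath.sin(z)**2 - 1) < 1e-12  # cos^2 + sin^2 = 1
assert abs(cmath.cos(1j*z) - cmath.cosh(z)) < 1e-12        # cos(iz) = cosh(z)
assert abs(cmath.sin(1j*z) - 1j*cmath.sinh(z)) < 1e-12     # sin(iz) = i sinh(z)
assert abs(cmath.cosh(1j*z) - cmath.cos(z)) < 1e-12        # cosh(iz) = cos(z)
```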

1.3 Logarithms and complex powers

We use the notation \(\mathbb{C}^{*}=\mathbb{C}-\{0\}\), i.e. the set of nonzero complex numbers.

[Inverting the exponential function] For every \(w\in\mathbb{C}^{\ast}\), the equation \[\begin{align} e^{z} & =w\label{eq:exp}\end{align}\] has a solution \(z\). Furthermore, if we write \(w=|w|e^{i\phi}\) with \(\phi=\mathrm{Arg}(w)\), then all solutions to \(\eqref{eq:exp}\) are given by \[\begin{equation} z=\log|w|+i(\phi+2k\pi)\quad\text{ for }k\in\mathbb{Z}.\label{eq:expsolution} \end{equation}\] Here, \(\log|w|\) is the usual natural logarithm of the real number \(|w|\). Note that there are infinitely many solutions.

If \(z\) is of the form as in \(\eqref{eq:expsolution}\) for some given \(k\in\mathbb{Z}\), then \[e^{z}=e^{\log|w|+i(\phi+2k\pi)}=e^{\log|w|}e^{i(\phi+2k\pi)}=e^{\log|w|}e^{i(\phi)}=|w|e^{i(\phi)}=w,\] by Proposition [prop:exponential-properties], Part \(2.\) and Corollary [cor:Euler]. Thus, \(z\) is a solution.

To see all solutions are of the given form, first write \(z=x+iy\) and assume \(e^{z}=w\). Since \(e^{x}e^{iy}=e^{z}=w=|w|e^{i\phi}\), we have \(|e^{z}|=e^{x}=|w|\). Thus \(x=\log|w|\). Moreover, dividing both sides by \(|w|\) we have \(e^{iy}=e^{i\phi}\) and so \(e^{i(y-\phi)}=1\). It follows from Proposition [prop:exponential-properties], Part \(3.\) that all solutions are given by \(i(y-\phi)=2k\pi i\) for some \(k\in\mathbb{Z}\); in other words, \(y=\phi+2k\pi\).
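The family of solutions can be generated and checked directly; a Python sketch (the target \(w\) is an arbitrary nonzero choice):

```python
import cmath
import math

w = -2 + 2j           # arbitrary nonzero target
phi = cmath.phase(w)  # Arg(w), in (-pi, pi]

# Every z = log|w| + i(phi + 2k*pi) satisfies e^z = w:
for k in range(-3, 4):
    z = math.log(abs(w)) + 1j*(phi + 2*k*math.pi)
    assert abs(cmath.exp(z) - w) < 1e-9
```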

We now come to an important topic called branch cuts.

First we give a high-level description of why branch cuts are necessary. We have just described, for fixed \(w_{0}\neq0\), exactly what the possible numbers \(z\) are such that \(e^{z}=w_{0}\). These values of \(z\) could all reasonably be called \(\log(w_{0})\). Can we make this choice of solution to \(e^{z}=w_{0}\) vary nicely as we move \(w_{0}\) a little bit? Of course, we will run into trouble at \(0\), since \(e^{z}=0\) has no solutions. Bearing this in mind, could we at least define a ‘\(\log\)’ function that is ‘continuous’ on \(\mathbb{C}\setminus \{0\}\)? Since we don’t know the definition of continuous yet, let us just ask that the function should not jump abruptly when we move from a point to a nearby one.

The answer is no, and let’s see why by trying to come up with one.

Let’s suppose \(w_{0}=1\) and we pick a solution to \(e^{z}=w_{0}\). The obvious one is \(z=0\) so let’s pick that. Now let \(w\) be close to \(1\). If we’ve found a solution to \(e^{z}=w\) then we know from Lemma [lem:Invertingexp] that it must be of the form

\[z=\log|w|+i(\mathrm{Arg}(w)+2\pi k)\] for some \(k\in\mathbb{Z}\). Since \(w\) is close to \(1\), we know \(\log|w|\) is close to \(0\) and \(\mathrm{Arg}(w)\) is close to \(0\). Now, if \(k\) is not zero, then \(z\) will not be close to \(0\) (any integer that is not zero has absolute value at least one!). So \(k\) must be zero for \(z\) to be close to \(0\). We just argued that if we move \(w\) a little, and if we want \(\log\) to be ‘continuous’, we must choose the argument of \(w\) ‘continuously’.

Now let’s take our idea a little further. If we move \(w\) on a path beginning at \(1\), following the unit circle anticlockwise, until we reach \(1\) again, what happens to our solutions to \(e^{z}=w\) if we are choosing them continuously as above? In other words, what happens to the argument of \(w\)? Since we are moving anticlockwise and we are choosing the argument continuously, it is increasing as we go around the unit circle. So just before we complete the circle, the argument of \(w\) is just below \(2\pi\). This is a big problem, since it means there are values of \(w\) just below \(1\) on the unit circle where we have been forced to set \(\log(w)\) very close to \(2\pi i\). On the other hand, we began by assuming \(\log(1)=0\). So the way we have tried to do things, our \(\log\) function is going to have a jump discontinuity below 1.

No matter what way we try to define \(\log\) continuously on \(\mathbb{C}\setminus \{0\}\), we will run into a similar problem. On the other hand, if \(R_{\theta}\) is any ray of the form

\[R_{\theta}=\{re^{i\theta}\::\:r\in\mathbb{R},\:r\geq0\}\subset\mathbb{C}\] then it is possible to define a continuous function \(\log(z)\) on \(\mathbb{C}\setminus R_{\theta}\). (One can think of cutting out this ray as cutting out the points where \(\log\) will have a jump discontinuity).

[Complex logarithm functions]

For any two real numbers \(\theta_{1}<\theta_{2}\) with \(\theta_{2}-\theta_{1}=2\pi\), let \(\arg\) be the choice of argument function with values in \((\theta_{1},\theta_{2}]\). Then the function

\[\log(z):=\log|z|+i\arg(z)\] is called a branch of logarithm. It has a jump discontinuity along the ray \(R_{\theta_{1}}=R_{\theta_{2}}\). This ray is called a branch cut.

If we choose \(\arg(z)=\mathrm{Arg}(z)\in(-\pi,\pi]\), then we obtain a branch of logarithm called the principal branch of log. We write \(\mathrm{Log}\) for this principal branch: it is given by the formula \[\mathrm{Log}(z):=\log|z|+i\mathrm{Arg}(z).\] The principal branch of logarithm has a ‘jump discontinuity’ along the ray given by the non-positive real axis \(\mathbb{R}_{\leq0}\).

  • Any time one talks about a function called \(\log\), one has to declare which branch of log we use. This is normally done simply by stating the interval \((\theta_{1},\theta_{2}]\) where \(\arg(z)\) lives.

  • As soon as we define continuous functions, we will easily be able to see that the branch of log corresponding to \(\arg(z)\in(\theta_{1},\theta_{2}]\) is continuous on \(\mathbb{C}\setminus R_{\theta_{1}}\).

  • The principal branch, \(\mathrm{Log}\), agrees with the natural logarithm \(\log\) on the real line; that is, for \(x>0\) we have \(\mathrm{Log}x=\log x\). For this reason we will always use the principal branch unless otherwise stated.
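Python's `cmath.log` computes exactly this principal branch, so the formula can be checked directly, including the jump across the negative real axis (a sketch):

```python
import cmath
import math

# Log(z) = log|z| + i Arg(z), with Arg(z) in (-pi, pi]:
z = 1 - 1j
assert abs(cmath.log(z) - (math.log(math.sqrt(2)) - 1j*math.pi/4)) < 1e-12

# Jump discontinuity across the branch cut: approaching -1 from above
# gives imaginary part near +pi, from below near -pi.
above = cmath.log(-1 + 1e-12j)
below = cmath.log(-1 - 1e-12j)
assert above.imag - below.imag > 6  # a jump of (almost) 2*pi
```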

[Properties of logarithms] We have the following properties when using any given branch of logarithm:

  1. \(e^{\log z}=z\) for any \(z\in\mathbb{C}\setminus \{0\}\), but,

  2. in general, \(\log(zw)\neq\log z+\log w\), and

  3. in general, \(\log(e^{z})\neq z\).

We would now like to define functions giving powers of complex numbers. We already know from the world of real numbers that to define a function giving for example, a square root of a positive real number, we have to make a choice of whether to take the positive or negative root. To take a root of a complex number, we have to make a similar choice, but we have more freedom.

[Complex powers] For \(w\in\mathbb{C}\) fixed, by choosing any branch of \(\log\) we can define a branch of the function \(z\mapsto z^{w}\) by the expression \[z^{w}:=\exp({w\log z}).\] For example, if \(w=1/n\) and we use the principal branch we get \[z^{1/n}=e^{(\log|z|+i\mathrm{Arg}(z))/n}=|z|^{1/n}e^{i\mathrm{Arg}(z)/n}.\]

Warning: different branches of log can give different power functions! So we must always specify which branch of \(\log\) we are using.

Now that we have defined complex powers, we should check that our exponential function matches up with the concept of ‘raising \(e\) to the power \(z\)’ for a suitable choice of \(\log e\). The natural choice of \(\log e\) is \(1\). Then, \(e\) raised to the power \(z\) should agree with computing \(\exp(z\log e)=\exp(z)\) as we expected.

Finding values of log and complex powers (examples)

\((a)\) Using the principal branch of \(\log\), we find \(\log(1-i)\) and \((1-i)^{1/2}\). We have \(|1-i|=\sqrt{2}\) and \(\mathrm{Arg}(1-i)=-\pi/4\). Thus, \(1-i=\sqrt{2}e^{-i\pi/4}\). Therefore, \(\mathrm{Log}(1-i)=\log|1-i|+i\mathrm{Arg}(1-i)=\log\sqrt{2}-i\pi/4\), and \[(1-i)^{1/2}=\exp\left(\frac{1}{2}\,\mathrm{Log}(1-i)\right)=\exp\left(\frac{1}{2}\,\left(\log\sqrt{2}-i\frac{\pi}{4}\right)\right)=\exp\left(\log\sqrt[4]{2}-i\frac{\pi}{8}\right)=\sqrt[4]{2}e^{-i\frac{\pi}{8}}.\] \((b)\) Using the principal branch and the previous example \[(1-i)^{i}=\exp(i\mathrm{Log}(1-i))=\exp\left(\frac{\pi}{4}+i\log\sqrt{2}\right)=e^{\pi/4}e^{i\log\sqrt{2}}.\] \((c)\) Again, using the principal branch \[2^{1/2}=\exp\left(\frac{1}{2}\log2\right)=\exp(\log\sqrt{2})=\sqrt{2}.\] What about the other root? It comes from using a different branch: if we let \(\log\) be the branch of logarithm corresponding to \(\arg(z)\in(\pi,3\pi]\) then \(\log(z)=\log|z|+i(\mathrm{Arg}(z)+2\pi)\), so we have \(\log(2)=\log2+i2\pi\) and \[2^{1/2}=\exp\left(\frac{1}{2}(\log2+i2\pi)\right)=\exp(\log\sqrt{2}+i\pi)=\sqrt{2}e^{i\pi}=-\sqrt{2}.\]
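These computations can be reproduced with a small Python sketch. The helpers `log_branch` and `cpow` are our own constructions, not from the notes; `log_branch` picks the branch of log with \(\arg(z)\in(\theta_{1},\theta_{1}+2\pi]\):

```python
import cmath
import math

def log_branch(z, theta1):
    """Branch of log with arg(z) in (theta1, theta1 + 2*pi]."""
    arg = cmath.phase(z)  # principal value, in (-pi, pi]
    while arg <= theta1:
        arg += 2*math.pi
    while arg > theta1 + 2*math.pi:
        arg -= 2*math.pi
    return math.log(abs(z)) + 1j*arg

def cpow(z, w, theta1=-math.pi):
    """z^w := exp(w log z) for the chosen branch (default: principal)."""
    return cmath.exp(w * log_branch(z, theta1))

# Example (a): the principal branch gives (1-i)^{1/2} = 2^{1/4} e^{-i pi/8}
assert abs(cpow(1 - 1j, 0.5) - 2**0.25*cmath.exp(-1j*math.pi/8)) < 1e-12

# Example (c): the branch with arg in (pi, 3*pi] gives the other root of 2
assert abs(cpow(2, 0.5, theta1=math.pi) - (-math.sqrt(2))) < 1e-12
```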

All \(n\)th roots can be found this way (see Sheet 1 Q23). In particular, for \(z\in\mathbb{C}^*\) all \(n\)th roots of \(z\) are of the form \[z^{1/n}=|z|^{1/n}\exp\left(i\frac{\mathrm{Arg}(z)}{n}+\frac{2k\pi i}{n}\right)\quad\text{ for }k=0,\ldots,n-1.\]
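A sketch of this recipe in Python, checking that it produces all three cube roots of \(-8\) (the function name `nth_roots` is our own):

```python
import cmath
import math

def nth_roots(z, n):
    """All n distinct nth roots of a nonzero complex number z."""
    r, phi = cmath.polar(z)  # phi = Arg(z) in (-pi, pi]
    return [r**(1/n) * cmath.exp(1j*(phi + 2*k*math.pi)/n) for k in range(n)]

roots = nth_roots(-8, 3)
assert len(roots) == 3
for w in roots:
    assert abs(w**3 - (-8)) < 1e-9
# The roots are pairwise distinct:
assert all(abs(a - b) > 1e-6 for i, a in enumerate(roots) for b in roots[:i])
```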

Visualizing complex functions

The ‘graph’ of a complex-valued function \(f:\mathbb{C}\to\mathbb{C}\) is 4-dimensional, so difficult to visualise - we certainly can’t draw it. We can employ other techniques to get a grasp on complex functions:

  • We can graph the real-valued function \(|f|:\mathbb{C}\to\mathbb{R}\). For example, consider the complex function \(f(z)=\cos z\). When \(z=x\) is purely real, we have that \(|f(z)|=|\cos(x)|\) is obviously just the modulus of the real cosine function. But for \(z=iy\) purely imaginary we have \(|f(z)|=|\cos(iy)|=|\cosh y|=\cosh y.\) So in the imaginary direction \(f\) simply looks like cosh!

  • It is often useful to visualise complex functions by considering how they map regions of the complex plane. Consider the image of the ‘right half-plane’ \(\mathbb{H}_{R}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>0\}\) under the map \(f(z)=z^{2}\). Note that \(\mathbb{H}_{R}=\{z\in\mathbb{C}:-\pi/2<\mathrm{Arg}(z)<\pi/2\}\). If \(z=re^{i\theta}\in\mathbb{H}_{R}\) then \(z^{2}=zz=r^{2}e^{i2\theta}\) has argument \(2\theta\in(-\pi,\pi)\). Thus, \(f\) maps \(\mathbb{H}_{R}\) to \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\), where \(\mathbb{R}_{\leq0}\) denotes the non-positive real axis (the negative real axis together with \(0\)). The map is onto, since for every \(w=se^{i\phi}\in\mathbb{C}\setminus\mathbb{R}_{\leq0}\) (so \(\phi\in(-\pi,\pi)\)) we can find \(z\in\mathbb{H}_{R}\) such that \(f(z)=w\); namely we can choose \(z=\sqrt{s}e^{i\phi/2}\in\mathbb{H}_{R}\).

    Similarly, the left half plane \(\mathbb{H}_{L}=\{z\in\mathbb{C}:\mathrm{Re}(z)<0\}\) is mapped to \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\). Moreover, \(f\) maps both the strictly positive imaginary axis \(i\mathbb{R}_{>0}\) and the strictly negative imaginary axis \(i\mathbb{R}_{<0}\) (given by \(i\mathbb{R}_{>0}=\{iy\in\mathbb{C}:y>0\}\) and \(i\mathbb{R}_{<0}=\{iy\in\mathbb{C}:y<0\}\) respectively) to the strictly negative real axis \(\mathbb{R}_{<0}\).

    Adding the observation that \(f(0)=0\), we have that \(f(z)=z^{2}\) in essence maps \(\mathbb{C}\) to two copies of itself (except for the origin, which is only attained once in the image - remember this different behaviour at the origin later!)

  • Branches of log. Let \(\log\) be the branch of logarithm corresponding to \(\arg(z)\in(\theta_{1},\theta_{2}]\). Then \(\log\) maps \(\mathbb{C}\setminus R_{\theta_{1}}\) to the infinite horizontal strip \[\{z\in\mathbb{C}\::\:\theta_{1}<\mathrm{Im}(z)\leq\theta_{2}\}.\] Infinite rays emanating from \(0\) map to horizontal lines, and circles centred at zero, minus their intersection with \(R_{\theta_{1}}\), map to vertical line segments.

1.4 The Riemann Sphere and extended complex plane

It is very useful at various points of the course to extend the complex plane by adding a point ‘at infinity’. To do this, we create a new object called ‘infinity’, denoted \(\infty\), and consider the set

\[\hat{\mathbb{C}}:=\mathbb{C}\cup\{\infty\}.\] At the moment, we have accomplished nothing really. What will be useful later is that we can think of the point \(\infty\) as being glued ‘nicely’ onto \(\mathbb{C}\). The correct way to do this is by introducing the Riemann sphere.

The Riemann sphere

Consider the unit sphere \(S^{2}:=\{(x,y,s)\in\mathbb{R}^{3}:x^{2}+y^{2}+s^{2}=1\}\) in \(\mathbb{R}^{3}\) and consider a copy of \(\mathbb{C}\) embedded in \(\mathbb{R}^{3}\) by identifying \(\mathbb{C}=\mathbb{R}^{2}\) with the \((x,y)\)-plane. Explicitly, a point \(x+iy\in\mathbb{C}\) corresponds to the point \((x,y,0)\in\mathbb{R}^{3}\).

Let \(N=(0,0,1)\in S^{2}\) denote the ‘north pole’. For any point \(v\in S^{2}\setminus\{N\}\), there is a unique straight line \(L_{N,v}\) passing through \(N\) and \(v\). Since \(v\neq N\), this line is not parallel to the \((x,y)\)-plane. Hence it intersects the \((x,y)\)-plane in a unique point \((x,y,0)\). This corresponds to the point \(x+iy\in\mathbb{C}\). We have defined a map \(P:S^{2}\setminus\{N\}\to\mathbb{C}\) by \(P(v)=x+iy\) in the notation of the preceding discussion. The map \(P\) is called the stereographic projection (from the north pole).

What is the formula for stereographic projection? Let \((x,y,s)\in S^{2}\setminus\{N\}\). Note that \(s\neq1\). The equation of the line passing through the point \((x,y,s)\in S^{2}\) and the North Pole \(N=(0,0,1)\in S^{2}\) is given by \[\gamma(t)\:=\:N+\left(\begin{pmatrix}x\\ y\\ s \end{pmatrix}-N\right)\,t\:=\:\begin{pmatrix}0\\ 0\\ 1 \end{pmatrix}+\begin{pmatrix}x\\ y\\ s-1 \end{pmatrix}\,t,\quad\quad\quad(t\in\mathbb{R}).\] This clearly intersects the plane when \(t=\frac{1}{1-s}\). Thus \[P:(x,y,s)\mapsto\frac{x}{1-s}+\frac{iy}{1-s}.\]

It is possible to find an inverse to \(P\): given any point \(z\in\mathbb{C}\), draw the straight line passing through \(N\) and \(z\) (thinking of \(\mathbb{C}\) as the \((x,y)\)-plane as before). This straight line passes through \(S^{2}\) in exactly one point. Hence \(P\) is a bijection that identifies \(S^{2}\setminus\{N\}\) with \(\mathbb{C}\).

Now we have identified \(\mathbb{C}\) with \(S^{2}\setminus\{N\}\) via the map \(P\), it gives us a natural way to view the added point \(\infty\) of \(\hat{\mathbb{C}}\). It should correspond to adding back in the north pole to \(S^{2}\setminus\{N\}\). In other words, we should think of \(\hat{\mathbb{C}}\) simply as the entire sphere \(S^{2}\)!

In fact, one can show that we have the following correspondences/mappings: \[\begin{align} \underline{\text{In } S^{2}} & & \underline{\text{In } \hat{\mathbb{C}}}\label{eq:s2-table1}\\ N & \longleftrightarrow & \infty\nonumber \\ S & \longleftrightarrow & 0\nonumber \\ \text{Equator } & \longleftrightarrow & \text{Unit circle }\{z\in\mathbb{C}:|z|=1\}\nonumber \\ \text{(open) Southern hemisphere} & \longleftrightarrow & \mathbb{D}:=\{z\in\mathbb{C}:|z|<1\}\nonumber \\ \text{(open) Northern hemisphere} & \longleftrightarrow & \hat{\mathbb{C}}\setminus\overline{B}_{1}(0)=\hat{\mathbb{C}}\setminus\{z\in\mathbb{C}:|z|\leq1\}\nonumber \\ & \text{Algebraically:}\nonumber \\ (x,y,s) & \longleftrightarrow & \frac{x+iy}{1-s}\quad\:\:\text{(Stereo. Proj.)}\nonumber \\ \frac{1}{|z|^{2}+1}\left(2\mathrm{Re}(z),2\mathrm{Im}(z),|z|^{2}-1\right) & \longleftrightarrow & z\quad\quad\quad\quad\text{(Inverse Stereo. Proj.)}\nonumber\end{align}\]

Note that we could have used the south pole \(S=(0,0,-1)\), rather than \(N\), to define the projection. In that case we would have the correspondence \((x,y,s)\mapsto\frac{x+iy}{1+s}\) (and you can check that the map \(f(z)=1/\bar{z}\) takes \(\frac{x+iy}{1-s}\) to \(\frac{x+iy}{1+s}\)).
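Both projection formulas, and the relation \(f(z)=1/\bar z\) between them, can be checked numerically; a Python sketch with an arbitrary point (the helper names are our own):

```python
def stereo_north(x, y, s):
    """Stereographic projection from N: a point of S^2 \\ {N} -> C."""
    return complex(x, y) / (1 - s)

def stereo_inv(z):
    """Inverse projection: z in C -> the corresponding point of S^2 \\ {N}."""
    d = abs(z)**2 + 1
    return (2*z.real/d, 2*z.imag/d, (abs(z)**2 - 1)/d)

z = 0.3 - 1.7j  # arbitrary point in C
x, y, s = stereo_inv(z)

assert abs(x*x + y*y + s*s - 1) < 1e-12        # lands on the unit sphere
assert abs(stereo_north(x, y, s) - z) < 1e-12  # round trip recovers z
# The projection from the south pole is 1/conj(z) applied to the northern one:
assert abs(complex(x, y)/(1 + s) - 1/z.conjugate()) < 1e-12
```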

The Riemann sphere is the unit sphere \(S^{2}\subset\mathbb{R}^{3}\) along with the stereographic projections from the north and south pole.

Later in your studies you might learn that the Riemann sphere is a special example of a Riemann surface. The purpose of considering the two stereographic projection maps as part of the definition is that any point in \(S^{2}\) is in the domain of one of the projections, so informally speaking, the maps allow us to think of a region nearby to any point in \(S^{2}\) as a region inside \(\mathbb{C}\).

2 Metric Spaces

2.1 Metric spaces

Now that we have another way of thinking about \(\hat{\mathbb{C}}\) - as a sphere in \(\mathbb{R}^{3}\) - it looks like there might be (at least!) two natural ways to measure the distance between two points \(z\) and \(w\) in the extended complex plane (or indeed in \(\mathbb{C}\)).

In both cases, the set is the same, but there is a different notion of distance. A metric space is a set together with a ‘distance’ function that satisfies certain axioms.

[Metric spaces] A metric space is a set \(X\) together with a function \(d:X\times X\to\mathbb{R}_{\geq0}\) such that for all \(x,y,z\in X\)

  • (D1) Positivity. \(d(x,y)\geq0\) and \(d(x,y)=0\Longleftrightarrow x=y\);

  • (D2) Symmetry. \(d(x,y)=d(y,x)\);

  • (D3) Triangle inequality. \(d(x,y)\leq d(x,z)+d(z,y).\)

The function \(d\) is called a metric and we will often denote a metric space by \((X,d)\).

Examples of Metrics

  1. The metric induced by the modulus function \(|\,.\,|\) on \(\mathbb{R}\) or \(\mathbb{C}\). We can define a distance function \(d\) on \(\mathbb{R}\times\mathbb{R}\) or \(\mathbb{C}\times\mathbb{C}\) by the formula \(d(x,y)=|x-y|\). This metric satisfies (D1)-(D3) by the properties \(1.\) and \(2.\) of the modulus we gave just after Corollary [cor:norm-mult].

  2. The Euclidean norm on \(\mathbb{R}^n\) or \(\mathbb{C}^n\). For vectors \(\boldsymbol{x}\) in \(\mathbb{R}^{n}\) (or \(\mathbb{C}^{n}\)), the function \[d(\boldsymbol{x},\boldsymbol{y}):=\|\boldsymbol{x}-\boldsymbol{y}\|_{2}=\sqrt{\sum_{i=1}^{n}|x_{i}-y_{i}|^{2}}\] is a metric. It is easy to check properties \((D1)-(D3)\). Note that this norm comes from an inner product. For example, for \(n=2\), the real Euclidean norm on \(\mathbb{R}^{2}\) comes from the usual dot product \(\|\boldsymbol{x}\|_{2}=\sqrt{\boldsymbol{x}\cdot\boldsymbol{x}}\) and the complex Euclidean norm on \(\mathbb{C}^{2}\) comes from the inner product \(\left<z,w\right>=z_{1}\bar{w}_{1}+z_{2}\bar{w}_{2}\); that is \(\|z\|_{2}=\sqrt{\left<z,z\right>}\).

    • (a) Metrics induced from inner products in vector spaces

      More generally, given any finite dimensional real vector space \(V\) with a (positive definite) inner product \(\left<\,.\,\right>\), then \[d(v,w):\,=\,\|v-w\|\,=\,\sqrt{\left<v-w,v-w\right>}\quad\quad(v,w\in V)\] is a metric. Properties (D1) and (D2) are obvious, property (D3) follows from Cauchy-Schwarz: \(|\left<v,w\right>|\leq\|v\|\cdot\|w\|\) - see sheet 2.

    • (b) Metrics induced from norms in vector spaces

      Even more generally, so long as a vector space has a ‘nice’ notion of the ‘size’ of each vector, we can define a metric in the obvious way. Such a notion is in generality referred to as a norm:

      [Norms and normed vector spaces] Given any real or complex vector space \(V\), a function \(\|\,.\,\|:V\to\mathbb{R}_{\geq0}\) is a norm if it satisfies (for \(v,w\in V\))

      • (N1) \(\|v\|\geq0\) and \(\|v\|=0\quad\Longleftrightarrow\quad v=0;\)

      • (N2) \(\|\lambda v\|=|\lambda|\cdot\|v\|\quad\) for \(\lambda\in\mathbb{R}\) or \(\mathbb{C}\);

      • (N3) \(\|v+w\|\leq\|v\|+\|w\|\) (the triangle inequality).

      Note that (N3) implies \(\|v-w\|\geq\left|\,\|v\|-\|w\|\,\right|\) (the reverse triangle inequality). A vector space equipped with a norm is called a normed vector space. The metric given by \(d(v,w):=\|v-w\|\) then always defines a metric (it is easy to check properties (D1)-(D3)). In particular, since the modulus function on \(\mathbb{C}\) is a norm, the metric we get from the modulus function comes from a norm.

  3. \(\ell_p\)-norm on \(\mathbb{R}^n\) or \(\mathbb{C}^n \, (p \geq 1)\)

    The above suggests that a vector space could be home to many different norms (so many different metrics). But, not all norms arise from an inner product as in \((2a)\); for example, for vectors \(\boldsymbol{x}\) in \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\), the function \[\|\boldsymbol{x}\|_{p}:=\sqrt[p]{\sum_{i=1}^{n}|x_{i}|^{p}}\] defines a norm for every \(p\geq1\), called the \(\boldsymbol{\ell_{p}}\)-norm. But, for \(p\ne2\) (the Euclidean norm) it does not arise from an inner product. When \(p=1\), the \(\ell_{1}\)-norm is simply given by the sum of the size of the components \(\|\boldsymbol{x}\|_{1}=\sum_{i=1}^{n}|x_{i}|\) and is sometimes referred to as the Taxicab norm.

  4. \(\ell_\infty\)-norm on \(\mathbb{R}^n\) (or \(\mathbb{C}^n\))

    The function \[\|\boldsymbol{x}\|_{\infty}:=\max_{i=1,\ldots,n}{|x_{i}|}\] also defines a norm, called the \(\boldsymbol{\ell_{\infty}}\)-norm (or the sup-norm), thus it also defines a metric. It is the limit of the \(\ell_{p}\)-norms as \(p\to\infty\), which explains the notation.

  5. Riemannian (chordal) metric on \(\hat{\mathbb{C}}\)

    Let \(f:\hat{\mathbb{C}}\to S^{2}\) be the (inverse of the) stereographic projection. Then the function \[d(z,w):=\|f(z)-f(w)\|_{2}\quad\quad(z,w\in\hat{\mathbb{C}}),\] where \(\|\,.\,\|_{2}\) is Euclidean norm in \(\mathbb{R}^{3}\) (so, the \(\ell_{2}\)-norm), is a metric on \(\hat{\mathbb{C}}\). It is called the Riemannian metric (or chordal metric). Note that with respect to this metric, the distance from \(0\) to \(i\) is the same as the distance from \(i\) to \(\infty\), for example!

  6. Discrete metric

    Let \(X\) be any non-empty set. Then (for \(x,y\in X\)) the function \[d(x,y):=\begin{cases} 0\ & \text{ if }\quad x=y,\\ 1\ & \text{ if }\quad x\neq y, \end{cases}\] defines a metric, called the discrete metric. It is easy to check (D1)-(D3). In this case, \((X,d)\) is called a discrete metric space.

  7. Function spaces

    There are many of these, such as the space \(X=C([a,b])\) of continuous functions on an interval \([a,b]\). The function \[\|f\|:=\max_{x\in[a,b]}|f(x)|\] defines a norm, and thus a metric (see Analysis III for more examples).

  8. Subspace metric

    Any non-empty subset \(Y\subset X\) of a metric space \(X\) is itself a metric space with respect to the same metric (this is easy to check). The metric restricted to the set \(Y\) is then called the subspace metric. [There is actually much more than meets the eye with this metric - see sheets 2 and 3.]
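Two of the metrics above lend themselves to a short numerical sketch in Python: the \(\ell_{p}\)-norms and the chordal metric on \(\hat{\mathbb{C}}\). The helper names are our own, and `None` is used ad hoc to represent the point at infinity:

```python
import math

def lp_norm(x, p):
    """l_p norm of a vector of numbers; p = math.inf gives the sup-norm."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi)**p for xi in x) ** (1/p)

def stereo_inv(z):
    """Inverse stereographic projection; None stands for infinity (-> N)."""
    if z is None:
        return (0.0, 0.0, 1.0)
    d = abs(z)**2 + 1
    return (2*z.real/d, 2*z.imag/d, (abs(z)**2 - 1)/d)

def chordal(z, w):
    """Chordal metric: Euclidean distance between the points on the sphere."""
    return lp_norm([a - b for a, b in zip(stereo_inv(z), stereo_inv(w))], 2)

v = [3, -4]
assert lp_norm(v, 1) == 7               # taxicab norm
assert lp_norm(v, 2) == 5               # Euclidean norm
assert lp_norm(v, math.inf) == 4        # sup-norm
assert abs(lp_norm(v, 200) - 4) < 0.1   # l_p approaches l_inf as p grows

# On the Riemann sphere, 0 and i are as far apart as i and infinity:
assert abs(chordal(0j, 1j) - chordal(1j, None)) < 1e-12
```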

2.2 Open and closed sets

Since we have a general notion of distance in any metric space \(X\), we can define balls in the space. These will be the key to understanding the topology of the space.

[Balls in a metric space] Let \((X,d)\) be a metric space, \(x\in X\) and let \(r>0\) be a real number. Then:

  • The open ball \(\boldsymbol{B_{r}(x)}\) of radius \(\boldsymbol{r}\) centred at \(\boldsymbol{x}\) is \[B_{r}(x):=\{y\in X:\:d(x,y)<r\}.\]

  • The closed ball \(\boldsymbol{\bar{B}_{r}(x)}\) of radius \(\boldsymbol{r}\) centred at \(\boldsymbol{x}\) is \[\bar{B}_{r}(x):=\{y\in X:\:d(x,y)\leq r\}.\]

Visualizing balls (examples)

  1. Let \(X=\mathbb{C}\) and \(d(z,w)=|z-w|\), then \(B_{1}(0)=\mathbb{D}=\{z\::\:|z|<1\}\) as before. More generally \(B_{r}(z_{0})\) is the usual ball of radius \(r\) around \(z_{0}\), not including its boundary circle. \(\bar{B}_{r}(z_{0})\) is the ball of radius \(r\) around \(z_{0}\), including its boundary circle. This is the most important example from the point of view of Complex Analysis.

  2. Let us consider the unit ball \(B_{1}(\mathbf{0})\) in \(\mathbb{R}^{2}\) with respect to the \(\ell_{p}\)-norms, for \(p=1,2\) and \(\infty\).

    For \(p=2\) the unit ball \(B_{1}(\mathbf{0})\) is the usual Euclidean ball - so the inside of the unit circle centred at the origin. For \(p=\infty\), the equation \(\max\{|x|,|y|\}<1\) (for \((x,y)\in\mathbb{R}^{2}\)) clearly defines the interior of a square with vertices \((1,1),(-1,1),(1,-1)\) and \((-1,-1)\).

    For the \(\ell_{1}\)-norm a little care is needed. We are interested in the points \((x,y)\in\mathbb{R}^{2}\) with \(|x|+|y|<1\). In the 1st quadrant this means \(y<1-x\), in the 2nd it means \(y<1+x\), in the 3rd we have \(y>-1-x\) and in the 4th it is \(y>x-1\). Thus, the unit ball is the interior of a diamond with vertices \((1,0),(0,1),(-1,0)\) and \((0,-1)\).
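The three pictures (disc, square, diamond) can be sanity-checked pointwise. A small sketch, where `in_ball` is a hypothetical helper testing membership of the unit ball \(B_{1}(\mathbf{0})\) for \(p=1,2,\infty\):

```python
# Membership tests for the l1, l2 and l-infinity unit balls in R^2,
# matching the diamond / disc / square description above (illustrative).
import math

def in_ball(p, point):
    x, y = point
    if p == 1:
        return abs(x) + abs(y) < 1        # diamond
    if p == 2:
        return math.hypot(x, y) < 1       # disc
    if p == math.inf:
        return max(abs(x), abs(y)) < 1    # square
    raise ValueError("only p = 1, 2, inf handled here")

q = (0.7, 0.7)
assert not in_ball(1, q)       # outside the diamond: 0.7 + 0.7 = 1.4 >= 1
assert in_ball(2, q)           # inside the disc: sqrt(0.98) < 1
assert in_ball(math.inf, q)    # inside the square: max(0.7, 0.7) < 1
```

The point \((0.7,0.7)\) lies inside the square and the disc but outside the diamond, matching the inclusion of the diamond in the disc in the square.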

[Open/closed sets in a metric space] Let \((X,d)\) be a metric space. Then:

  • A subset \(U\subseteq X\) is open (in \(\boldsymbol{X}\)) if for every point \(x\in U\) there exists \(\epsilon>0\) such that \(B_{\epsilon}(x)\subset U\).

  • A subset \(U\subseteq X\) is closed (in \(\boldsymbol{X}\)) if its complement \(X\setminus U\) is open.

Sets in a metric space can be open and closed at the same time! For example, the empty set \(\emptyset\) and the whole metric space \(X\) are both open and closed. Such sets are referred to as ‘clopen’.

[Open balls are open] In a metric space, the open ball \(B_{r}(x)\) is open!

Let \(y\in B_{r}(x)\) with \(d(x,y)=s\) (and so \(s<r\)). We need to show there exists \(\epsilon>0\) such that \(B_{\epsilon}(y)\subseteq B_{r}(x)\). Simply take \(\epsilon=r-s>0\). Then for every \(z\in B_{\epsilon}(y)\) we have \[d(x,z)\:\stackrel{(D3)}{\leq}\:d(x,y)+d(y,z)\:<\:s+\epsilon\:=\:r.\] Thus, \(z\in B_{r}(x)\) as required.
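The choice \(\epsilon=r-s\) in the proof can be illustrated numerically in \(\mathbb{C}\) with the standard metric (the random sampling below is purely illustrative; the triangle inequality guarantees every assertion passes):

```python
# For y in B_r(x) with s = d(x, y) < r, every point of B_{r-s}(y)
# lies in B_r(x); here X = C with d(z, w) = |z - w|.
import math
import random

random.seed(0)
x, r = 0 + 0j, 2.0
y = 1 + 0.5j                   # d(x, y) = |y - x| ~ 1.118 < 2
s = abs(y - x)
eps = r - s                    # the radius chosen in the proof
for _ in range(1000):
    # sample a random point z of the open ball B_eps(y)
    rho = eps * random.random()
    theta = 2 * math.pi * random.random()
    z = y + rho * complex(math.cos(theta), math.sin(theta))
    assert abs(z - x) < r      # d(x, z) <= d(x, y) + d(y, z) < s + eps = r
```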

It can also be shown that in a metric space the closed ball \(\overline{B}_{r}(x)\) is closed (see Sheet 2).

Open sets (examples/warnings)

  1. All of the previously encountered subsets \(\mathbb{H},\mathbb{D},\mathbb{C}^{\ast}\) and \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\) of the complex plane are open (see sheet 2). The 1st quadrant \(\Omega_{1}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>~0,\mathrm{Im}(z)>0\}\) is open. To see this, for \(z\in\Omega_{1}\) consider the ball \(B_{r}(z)\) where \(r=\min(\mathrm{Re}(z),\mathrm{Im}(z))/2\).

  2. Let \(X\) be a discrete metric space, so \(d(x,y):=\begin{cases} 0\ & \text{ if }\quad x=y.\\ 1\ & \text{ if }\quad x\neq y. \end{cases}\quad\) Then, for \(x\in X\) and \(r>0\) we have \[B_{r}(x):=\begin{cases} \{x\}\ & \text{ if }\quad r\leq1.\\ X\ & \text{ if }\quad r>1. \end{cases}\] Therefore, (by Lemma [lem:open-balls-are-open]) every singleton \(\{x\}\) is an open set with respect to the discrete metric. Moreover, the complement \(X\setminus\{x\}\) is also open, since for any \(y\in X\setminus\{x\}\) (that is, any \(y\neq x\) in \(X\)) and any \(r<1\) the open ball \(B_{r}(y)=\{y\}\) is contained in \(X\setminus\{x\}\). Thus, all balls are clopen with respect to the discrete metric! In fact, any subset \(Y\subseteq X\) of a discrete metric space is clopen!

  3. Sets don’t have to be either open or closed. For example, \([0,1)\) is neither open nor closed in \(\mathbb{R}\) (with respect to the standard metric \(|\,.\,|\)) - simply check the point \(x=0\) in \([0,1)\) and the point \(x=1\) in the complement \((-\infty,0)\cup[1,\infty)\). However, recall that any subset of a metric space is itself a metric space (w.r.t the same metric) - the subspace metric. Thus the pair \(([0,1),|\,.\,|)\) is a metric space - but then (by the remark after Definition [def:open-and-closed-sets]) the set \([0,1)\) is open!

    Key: Open and closed sets are really relative notions, depending on the ambient space (as well as the metric).

    Notation: When we say a subset of \(\mathbb{R}\) or \(\mathbb{R}^{n}\) or \(\mathbb{C}\) is open/closed, we will mean with respect to the standard norms \(|\,.\,|\) and \(\|\,.\,\|_{n}\) and \(|\,.\,|\) respectively. Most sets we encounter do not simply look like open/closed balls, so it will be useful to have rules for union and intersection:

[Unions and intersections of open sets] Let \((X,d)\) be a metric space. Then:

  1. Arbitrary unions of open sets are open; that is \[\bigcup_{i\in I}\,U_{i}\quad\text{is open,}\quad\text{ for any (possibly infinite) collection of open sets }U_{i}.\]

  2. Finite intersections of open sets are open; that is \[\bigcap_{i=1}^{n}\,U_{i}\quad\text{is open,}\quad\text{ for any finite collection of open sets }U_{i}.\]

  1. Let \(x\in\bigcup_{i\in I}U_{i}\). Then, by definition, it must be contained in the set \(U_{j}\) for some \(j\in I\). Since \(U_{j}\) is open there must exist a ball \(B_{\epsilon}(x)\) centred at \(x\) lying in \(U_{j}\). But then \(B_{\epsilon}(x)\,\subseteq\,U_{j}\,\subseteq\,\bigcup_{i\in I}U_{i}\) as required.

  2. Let \(x\in\bigcap_{i=1}^{n}U_{i}\). By definition \(x\in U_{i}\) for every \(i=1,\ldots,n\). But, since they are all open, for every \(U_{i}\) there must exist \(r_{i}>0\) such that \(B_{r_{i}}(x)\subset U_{i}\). Now simply take \(\epsilon={\displaystyle \min_{i=1,\ldots,n}(r_{i})}\). Then for every \(i\) we have \(B_{\epsilon}(x)\subseteq B_{r_{i}}(x)\) and so \[B_{\epsilon}(x)\,\subseteq\,\bigcap_{i=1}^{n}B_{r_{i}}(x)\,\subseteq\,\bigcap_{i=1}^{n}U_{i}.\]

[Unions and intersections of closed sets] Let \((X,d)\) be a metric space. Then:

  1. Finite unions of closed sets are closed.

  2. Arbitrary intersections of closed sets are closed.

De Morgan’s laws together with Lemma [lem:metrics-give-topology].

We have the following remarks about open and closed sets.

  • An infinite intersection of open sets is not necessarily open (see Sheet 2). Similarly, an infinite union of closed sets is not necessarily closed: e.g., the union of closed intervals in \(\mathbb{R}\), \[\bigcup_{i=1}^{\infty}\left[\frac{1}{i},\:1-\frac{1}{i}\right]\,=\,(0,1)\quad\text{ is open in }\mathbb{R}.\]

  • The next generalisation of a metric space you will encounter (see Topology III) is called a topological space \(\mathcal{T}\). There, the only stipulations are the existence of open sets such that \[(i)\:\emptyset\text{ and }\mathcal{T}\text{ are open};\quad\quad(ii)\:\text{Lemma [lem:metrics-give-topology] holds.}\] We have a hierarchy: \[\text{Inner-product space}\,\Longrightarrow\,\text{Normed space}\,\Longrightarrow\,\text{Metric space}\,\Longrightarrow\,\text{Topological space}.\]

  • Why have we been looking at examples in \(\mathbb{R}^{2}\) rather than \(\mathbb{C}\)? It turns out that both these spaces are ‘topologically equivalent’, that is, they have the same open sets - this is obvious since the complex modulus is essentially just the Euclidean norm on \(\mathbb{R}^{2}\).

As we have seen, some sets are neither open nor closed. It will be useful to ask what the largest possible open set is inside a given set. Similarly, what is the smallest closed set containing a given set?

[Interior points, closure, boundary, exterior] Let \(A\) be a subset of a metric space \((X,d)\).

  • The interior \(A^{0}\) of \(A\) is defined by \[A^{0}:=\{x\in A:\,\text{there exists an open set }U\subseteq A\text{ such that }x\in U\}.\]

  • The closure \({\bar{A}}\) of \(A\) consists of the points all of whose open neighbourhoods meet \(A\): \[\bar{A}:=\{x\in X:\,U\cap A\neq\emptyset\text{ for every open set }U\text{ with }x\in U\}.\]

  • The boundary \({\partial A}\) of \(A\) is the closure without the interior: \[\partial A:=\bar{A}\setminus A^{0}\quad\quad\left[=X\setminus(A^{0}\cup(X\setminus A)^{0})\right].\]

  • The exterior \({A^{e}}\) of \(A\) is the complement of the closure: \[A^{e}:=X\setminus\bar{A}\quad\quad\left[=X\setminus(A^{0}\cup\partial A)=(X\setminus A)^{0}\right].\]

  • Don’t confuse closure with conjugation! Closure concerns sets in any metric space, conjugation concerns points in \(\mathbb{C}\).

  • Clearly the interior and exterior are open and clearly the boundary is closed. The closure is also closed (see sheet 2) - in fact, it is often defined more simply as \(\bar{A}:=X\setminus(X\setminus A)^{0}\), from which the closedness is obvious - it is the complement of an open set.

  • The boundary matches our naive notion. Broadly speaking, the interior of a set consists of all the points that are not on its ‘edge’, and to form the closure of a set you simply add all the missing edge points. Indeed, we have the following additional properties of a subset \(A\subset X\) (see sheet 2): \[\begin{aligned} (a) & \:A\text{ is open } & \iff & \:\:\partial A\cap A=\emptyset & \iff & \:A=A^{0}; & \text{In fact } & \quad A^{0}={\displaystyle \bigcup_{\substack{U\subseteq A\\ U\text{ open} } }U;}\\ (b) & \:A\text{ is closed } & \iff & \quad\partial A\subseteq A & \iff & \:A=\bar{A}; & \text{In fact } & \quad\bar{A}={\displaystyle \bigcap_{\substack{A\subseteq F\\ F\text{ closed} } }F;\ }\end{aligned}\] That is, the interior \(A^{0}\) is the largest open set contained in \(A\) and the closure \(\bar{A}\) is the smallest closed set containing \(A\). Convince yourself that all the definitions reflect your intuitive notions, say, for the plane!

  • In \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\) for simple sets we only have to replace strict inequality with equality (or vice versa) to obtain the closure (or interior). For example, for \(A = \{ z \in \mathbb{C} : 1 < \vert z \vert \leq 3\}\), we have \(A^{0} = \{ z \in \mathbb{C} : 1 < \vert z \vert < 3\}\), \(\bar{A} = \{ z \in \mathbb{C} : 1 \leq \vert z \vert \leq 3\}\), and \(\partial A = \{ z \in \mathbb{C} : \vert z \vert =1 \} \cup \{ z \in \mathbb{C} : \vert z \vert =3\}\).

    Similarly \[\overline{\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|<1\}}=\{z\in\mathbb{C}:\,1\leq\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|\leq1\}\] and \[\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)\leq3,|\mathrm{Im}(z)|<1\}^{0}=\{z\in\mathbb{C}:\,1<\mathrm{Re}(z)<3,|\mathrm{Im}(z)|<1\}.\] In fact, \(\overline{B_{r}(x)}=\bar{B}_{r}(x)\) for any ball in \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\).

    However, this is not true in every metric space (see sheet 2) - there are metric spaces for which \(\overline{B_{r}(x)}\neq\bar{B}_{r}(x)\); that is, the smallest closed set containing the open ball \(B_{r}(x)\) is not the closed ball \(\bar{B}_{r}(x)\)! [Hint: what if the open ball is already closed!?]
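The annulus example above can be checked pointwise. A small sketch classifying points of \(A=\{z\in\mathbb{C}:1<|z|\leq3\}\) as interior, boundary or exterior (the function name is hypothetical):

```python
# Pointwise classification for the annulus A = {z : 1 < |z| <= 3}:
# interior  = {1 < |z| < 3},
# boundary  = {|z| = 1} union {|z| = 3},
# exterior  = everything else.
def classify(z):
    r = abs(z)
    if 1 < r < 3:
        return "interior"
    if r == 1 or r == 3:
        return "boundary"
    return "exterior"

assert classify(2 + 0j) == "interior"
assert classify(1 + 0j) == "boundary"   # on |z| = 1, even though 1 is not in A
assert classify(3j) == "boundary"
assert classify(4) == "exterior"
assert classify(0.5j) == "exterior"
```

Note the circle \(|z|=1\) belongs to the boundary even though none of its points lie in \(A\) itself.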

2.3 Convergence and continuity

With our general notion of distance (a metric) comes a natural notion of convergence.

[Limits and convergence in a metric space] We say a sequence \(\{x_{n}\}\) in a metric space \((X,d)\) converges to \(x\in X\) if we have \[{\displaystyle \lim_{n\to\infty}d(x_{n},x)=0.}\] That is, \[\text{ for every }\epsilon>0\text{ there exists }N\in\mathbb{N}\text{ such that }d(x_{n},x)<\epsilon\text{ for every }n>N.\] We write “\(x_{n}\rightarrow x\) as \(n\rightarrow\infty\)”, or “\({\displaystyle \lim_{n\to\infty}x_{n}=x}\)”.

Convergent sequences (example)

As mentioned, the chordal metric on \(\hat{\mathbb{C}}\) is \(d(z,w)=\|f(z)-f(w)\|_{2}\), where \(\|\,.\,\|_{2}\) is the Euclidean norm on \(\mathbb{R}^{3}\) and \(f\) is the inverse stereographic projection given by \[f(z)=\left(\frac{2\mathrm{Re}(z)}{|z|^{2}+1},\:\frac{2\mathrm{Im}(z)}{|z|^{2}+1},\:\frac{|z|^{2}-1}{|z|^{2}+1}\right).\] Show that with respect to this metric the sequence \(\{ki\}_{k\in\mathbb{N}}\) in \(\hat{\mathbb{C}}\) converges to \(\infty\in\hat{\mathbb{C}}\).

Since \(|ki|=k\) and \(f(\infty)=(0,0,1)\) we have \[\begin{aligned} d(ki,\infty)&=\|f(ki)-f(\infty)\|_{2} = \left\Vert \left(0,\frac{2k}{k^{2}+1},\frac{k^{2}-1}{k^{2}+1}\right)-(0,0,1)\right\Vert _{2}\\ & = \left\Vert \left(0,\frac{2k}{k^{2}+1},-\frac{2}{k^{2}+1}\right)\right\Vert _{2}\\ & = \sqrt{\left(\frac{2k}{k^{2}+1}\right)^{2}+\left(\frac{-2}{k^{2}+1}\right)^{2}}\quad\longrightarrow\quad0\quad\text{ as }k\rightarrow\infty.\end{aligned}\] Thus, the sequence indeed converges to \(\infty\). This is quite an odd notion as we are used to saying sequences ‘diverge’ if they tend to infinity. The key is that convergence depends on the metric being used.
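A numerical sketch of this computation (the symbol `INF` below is a stand-in for the point \(\infty\in\hat{\mathbb{C}}\), an implementation choice, not notation from the notes):

```python
# Chordal distance on the Riemann sphere via the inverse stereographic
# projection; numerically, d(ki, infinity) -> 0 as k -> infinity.
import math

INF = object()  # marker object for the point at infinity

def proj(z):
    """Inverse stereographic projection C-hat -> S^2 in R^3."""
    if z is INF:
        return (0.0, 0.0, 1.0)  # the north pole
    m2 = abs(z) ** 2
    return (2 * z.real / (m2 + 1), 2 * z.imag / (m2 + 1), (m2 - 1) / (m2 + 1))

def d(z, w):
    return math.dist(proj(z), proj(w))

# the distances d(ki, infinity) = 2 / sqrt(k^2 + 1) decrease to 0
dists = [d(k * 1j, INF) for k in (1, 10, 100, 1000)]
assert all(a > b for a, b in zip(dists, dists[1:]))
assert dists[-1] < 0.01
# the earlier remark: d(0, i) equals d(i, infinity)
assert math.isclose(d(0, 1j), d(1j, INF))
```

The last assertion confirms the earlier remark that \(d(0,i)=d(i,\infty)\): both equal \(\sqrt{2}\), the chord length between adjacent 'quarter points' of the sphere.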

Limits in \(\mathbb{C}\) with the standard metric (Very important!).

The above definition of limit in a metric space in particular gives us the notion of limits in \(\mathbb{C}\) with the standard metric. This says that if \(z_{n}\) is a sequence of complex numbers, then \(\lim_{n\to\infty}z_{n}=z\) if and only if

‘for all \(\epsilon>0\), there exists \(N>0\) such that for all \(n\geq N\), \(|z_{n}-z|<\epsilon\)’.

Note this is the same definition as in Analysis I, but replacing the absolute value on the real line by the modulus on \(\mathbb{C}\). Importantly, by the same proofs as in Analysis I, limits in the complex plane follow the COLT rules.

Furthermore there is a very important link between convergence in the complex plane and real convergence (see Analysis I for the proof). Let \(\{z_{n}\}_{n\in\mathbb{N}}\) be a sequence of complex numbers \(z_{n}=x_{n}+iy_{n}\). Then, for any fixed \(z_{0}=x_{0}+iy_{0}\in\mathbb{C}\) we have \[\lim_{n\rightarrow\infty}z_{n}=z_{0}\quad\iff\quad\lim_{n\rightarrow\infty}x_{n}=x_{0}\:\:\text{ and }\:\:\lim_{n\rightarrow\infty}y_{n}=y_{0}.\] In other words, the sequence \(\{z_{n}\}\) converges iff the real sequences \(\{\mathrm{Re}(z_{n})\}\) and \(\{\mathrm{Im}(z_{n})\}\) converge (see Sheet 3).
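A quick illustration with the hypothetical sequence \(z_{n}=(1+1/n)+i(2-1/n^{2})\), which converges to \(1+2i\) precisely because its real and imaginary parts converge:

```python
# Convergence in C via the real and imaginary parts: the modulus of the
# error is controlled by the two real errors (triangle inequality).
def z(n):
    return complex(1 + 1 / n, 2 - 1 / n ** 2)

z0 = 1 + 2j
for n in (10, 100, 1000):
    # |z_n - z0| <= |Re(z_n) - Re(z0)| + |Im(z_n) - Im(z0)|
    assert abs(z(n) - z0) <= abs(z(n).real - z0.real) + abs(z(n).imag - z0.imag)
assert abs(z(10 ** 6) - z0) < 1e-5   # the sequence is close to 1 + 2i
```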

Let’s return to our general setting of metric spaces and prove some properties of limits.

[Limits and open sets] Let \((X,d)\) be a metric space. Then:

  1. A sequence can have at most one limit.

  2. We have \[\lim_{n\to\infty}x_{n}=x\quad\Longleftrightarrow\quad\forall\text{ open }U\text{ with }x\in U,\:\exists\,N\in\mathbb{N}\text{ such that }\:\forall n>N\quad x_{n}\in U.\] Hence the notion of a limit in a metric space can be stated in terms only of its open sets.

  1. Assume \({\displaystyle \lim_{n\to\infty}x_{n}=x}\) and \({\displaystyle \lim_{n\to\infty}x_{n}=y}\). Then we have for each \(n\), by the triangle inequality \(d(x,y)\leq d(x,x_{n})+d(x_{n},y)\), so taking the limit as \(n\to\infty\) gives \[d(x,y)\leq\lim_{n\to\infty}d(x,x_{n})+\lim_{n\to\infty}d(x_{n},y)=0+0=0,\] so \(d(x,y)=0\), hence \(x=y\) by property (D1) of metric spaces.

  2. (\(\Rightarrow\)): Assume \({\displaystyle \lim_{n\to\infty}x_{n}=x}\) and that \(U\) is open with \(x\in U\). By definition there exists \(r>0\) such that \(B_{r}(x)\subseteq U\) and an \(N\in\mathbb{N}\) such that \(d(x_{n},x)<r\) for every \(n>N\). Thus, \(x_{n}\in B_{r}(x)\subseteq U\) for every \(n>N\).

    (\(\Leftarrow\)): Let \(\epsilon>0\). We wish to find \(N\in\mathbb{N}\) such that \(d(x_{n},x)<\epsilon\) for \(n>N\). Consider the ball \(B_{\epsilon}(x)\). It is open and contains \(x\), and so there exists \(N\in\mathbb{N}\) such that \(x_{n}\in B_{\epsilon}(x)\) for every \(n>N\). This is precisely the statement that \(d(x_{n},x)<\epsilon\) for \(n>N\).

The key to completing proofs of this type is to write down the definitions in your assumptions and also write precisely what you need to prove. Usually, doing this leads very quickly to the proof.

We can now define what it means for a function between two metric spaces to be continuous (this will incorporate many of the functions we have already encountered; e.g., \(f:\mathbb{C}\to\mathbb{C};\:\mathbb{R}\to\mathbb{C};\) or \(\mathbb{C}\to\mathbb{R}\)).

[Continuity] A map \(f:(X_{1},d_{1})\to(X_{2},d_{2})\) between two metric spaces is called continuous at \(x_{0}\in X_{1}\) if \[\forall\,\epsilon>0\quad\exists\,\delta>0\quad\text{such that}\quad\forall\,x\in X_{1}\quad\text{we have}\quad d_{1}(x,x_{0})<\delta\quad\Rightarrow\quad d_{2}(f(x),f(x_{0}))<\epsilon.\] We say a function \(f\) is continuous on \(X_{1}\) if it is continuous at every point \(x_{0}\in X_{1}\). [Note that this is exactly the same as in Analysis I.]

Equivalently, one could write \[\forall\,\epsilon>0\quad\exists\,\delta>0\quad\text{such that}\quad x\in B_{\delta}(x_{0})\Rightarrow\quad f(x)\in B_{\epsilon}(f(x_{0})),\] where the first ball is in \(X_{1}\) and the second ball is in \(X_{2}\).

[Basic properties of continuous functions]

  1. Products, sums and quotients of real/complex valued continuous functions on a metric space \(X\) are continuous. E.g., if \(f:X\to\mathbb{C}\) and \(g:X\to\mathbb{C}\) are continuous, then \(f+g\), \(fg\) and \(f/g\) are continuous (where defined).

  2. Compositions of continuous functions are continuous. I.e., if \(f:X_{1}\to X_{2}\) and \(g:X_{2}\to X_{3}\) are continuous maps between metric spaces, then \(g\circ f:X_{1}\to X_{3}\) is continuous.

Almost word-for-word from Analysis I.

Examples of continuous functions on the complex plane (with the standard metric)

  • The identity function is continuous.

  • Constant functions are continuous.

  • The functions \(\mathrm{Re},\mathrm{Im}:\mathbb{C}\to\mathbb{R}\) are continuous.

  • The complex conjugation \(z\mapsto\bar{z}\) is continuous as a map from \(\mathbb{C}\to\mathbb{C}\).

  • The modulus function \(z\mapsto|z|\) is continuous as a map from \(\mathbb{C}\to\mathbb{R}\).

  • All of \(\exp,\sin,\cos,\sinh\) and \(\cosh\) are continuous on \(\mathbb{C}\), as are all polynomials.

  • If \(\arg\) is the choice of argument function with values in \((\theta_{1},\theta_{2}]\) then \(\arg\) is continuous on \(\mathbb{C}\backslash R_{\theta_{1}}\) (recall \(R_{\theta_{1}}\) is the ray with angle \(\theta_{1}\)).

  • If \(\log\) is a branch of \(\log\) corresponding to an argument function as above, then \(\log\) is continuous on \(\mathbb{C}\backslash R_{\theta_{1}}\).

As with limits, it will be useful to restate continuity in terms of open sets. First, recall that for any function \(f:X_{1}\to X_{2}\) and any set \(U\subseteq X_{2}\) we define the preimage \(f^{-1}(U)\) of \(U\) under \(f\) by \(f^{-1}(U):=\{x\in X_{1}:f(x)\in U\}\).
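Preimages make sense for arbitrary maps (no injectivity required). A small finite sketch directly from the definition (the names here are hypothetical):

```python
# Preimage of a set under a map, straight from the definition
# f^{-1}(U) = {x in the domain : f(x) in U}.
def preimage(f, domain, U):
    return {x for x in domain if f(x) in U}

X1 = {-2, -1, 0, 1, 2}

def f(x):
    return x * x            # not injective, but preimages are still defined

assert preimage(f, X1, {1}) == {-1, 1}
assert preimage(f, X1, {3}) == set()        # a preimage can be empty
# compatibility with unions: f^{-1}(A u B) = f^{-1}(A) u f^{-1}(B)
A, B = {0, 1}, {4}
assert preimage(f, X1, A | B) == preimage(f, X1, A) | preimage(f, X1, B)
```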

[Continuity via open sets] Let \(X_{1}\) and \(X_{2}\) be metric spaces. Then: \[\begin{aligned} f:X_{1}\to X_{2}\;\text{continuous} & \quad\Longleftrightarrow\quad f^{-1}(U) \text{ is open in } X_{1} \text{ for every open set } U \text{ in } X_{2}\\ & \quad\Longleftrightarrow\quad f^{-1}(F) \text{ is closed in } X_{1} \text{ for every closed set } F \text{ in } X_{2}.\end{aligned}\]

For open sets:

  • (\(\Rightarrow\)): Let \(U\) be open in \(X_{2}\), and pick \(x\in f^{-1}(U)\). Since \(U\) is open there exists \(\epsilon>0\) such that \(B_{\epsilon}(f(x))\subseteq U\) (since \(f(x)\in U\)). But \(f\) is continuous, so by definition there exists \(\delta>0\) such that if \(y\in B_{\delta}(x)\) we have \(f(y)\in B_{\epsilon}(f(x))\). Thus \(f(y)\in U\), and so \(y\in f^{-1}(U)\); since this is true for every \(y\in B_{\delta}(x)\) we have shown \(B_{\delta}(x)\subseteq f^{-1}(U)\) and so the set \(f^{-1}(U)\) is open.

  • (\(\Leftarrow\)): Let \(x\in X_{1}\) and \(\epsilon>0\). We need to find a \(\delta>0\) such that \(y\in B_{\delta}(x)\Rightarrow f(y)\in B_{\epsilon}(f(x))\). First notice that (by Lemma [lem:open-balls-are-open]) the ball \(B_{\epsilon}(f(x))\) is open. By assumption, the preimage \(f^{-1}(B_{\epsilon}(f(x)))\) of this ball is also open. The point \(x\) must be in this preimage (because the centre \(f(x)\) is certainly in \(B_{\epsilon}(f(x))\)). Then, since the preimage is open there must exist an open ball \(B_{\delta}(x)\) around \(x\) contained in \(f^{-1}(B_{\epsilon}(f(x)))\). But, this is precisely the statement that \(y\in B_{\delta}(x)\Rightarrow f(y)\in B_{\epsilon}(f(x))\) as required.

For closed sets see sheet 3.

  • From the proof we see that we can be slightly more precise than the statement of the theorem. For example, \(f:X_{1}\to X_{2}\) is continuous at \(x\in X_{1}\) if and only if, for every open set \(U\) in \(X_{2}\) containing \(f(x)\), the preimage \(f^{-1}(U)\) contains an open ball centred at \(x\).

  • Note that for the \((\Leftarrow)\) direction we only actually needed the fact that \(f^{-1}(B)\) was open for any open ball \(B\) in \(X_{2}\). It turns out that the open balls ‘generate’ all the open sets (via Lemma [lem:metrics-give-topology]) - see Topology/Analysis III!

  • Note that, by the second part of Theorem [thm:continuity-open-sets], if a function \(f:X_{1}\to X_{2}\) is continuous then \(f^{-1}(\{x\})\) is closed for any \(x\in X_{2}\).

As with limits, it turns out that continuity depends only upon the open sets in the respective metric spaces. This means we can use the continuity of known functions to prove the openness of very complicated sets.

Showing sets are open using continuity (examples)

  • Show the following set is open: \[U=\{(x,y)\in\mathbb{R}^{2}:(x^{2}+y^{2})\sin^{3}(\sqrt{x^{2}+7})>2\}.\] Well, the function \[f:\mathbb{R}^{2}\to\mathbb{R}:\:(x,y)\mapsto(x^{2}+y^{2})\sin^{3}(\sqrt{x^{2}+7})\] is continuous by Lemma [lem:combining-continuous-function], because it is the product/composition of real valued continuous functions. Moreover, \[U=\{(x,y)\in\mathbb{R}^{2}:f((x,y))>2\}\:=\:f^{-1}((2,\infty)).\] Since \((2,\infty)\) is open in \(\mathbb{R}\) (see sheet 2), the set \(U\) is the preimage of an open set under a continuous map and by Theorem [thm:continuity-open-sets] it is therefore open.

  • We can actually do a little more using the following useful properties of the preimage (from Analysis I):

    Useful properties of preimage

    • \(f^{-1}(A\cup B)=f^{-1}(A)\cup f^{-1}(B)\).

    • \(f^{-1}(A\cap B)=f^{-1}(A)\cap f^{-1}(B)\).

    • \(f^{-1}(A\setminus B)=f^{-1}(A)\setminus f^{-1}(B)\).

    Show the following set is open: \[U=\{(x,y)\in\mathbb{R}^{2}:xy>1,\:x^{2}+y^{2}>3\}.\] Both \[f(x,y)=xy\quad\text{and}\quad g(x,y)=x^{2}+y^{2}\] are continuous as functions \(\mathbb{R}^{2}\to\mathbb{R}\), and \(U=f^{-1}((1,\infty))\:\cap\:g^{-1}((3,\infty))\).

    Since \(f\) and \(g\) are continuous and both \((1,\infty)\) and \((3,\infty)\) are open in \(\mathbb{R}\), the preimages \(f^{-1}((1,\infty))\) and \(g^{-1}((3,\infty))\) are open (by Theorem [thm:continuity-open-sets]). By Lemma [lem:metrics-give-topology] we have that \(U\) is open, since it is the intersection of two open sets.
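A pointwise sanity check of this description of \(U\) as an intersection of two preimages (the sample points are chosen for illustration):

```python
# U = {(x, y) : xy > 1 and x^2 + y^2 > 3}, i.e. the intersection of the
# preimages f^{-1}((1, inf)) and g^{-1}((3, inf)).
def in_U(x, y):
    return (x * y > 1) and (x ** 2 + y ** 2 > 3)

assert in_U(2.0, 2.0)          # xy = 4 > 1 and 8 > 3: in both preimages
assert not in_U(2.0, 0.4)      # xy = 0.8 <= 1: fails the first condition
assert not in_U(1.2, 1.1)      # xy = 1.32 > 1 but 1.44 + 1.21 <= 3
```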

Showing functions are not continuous using open sets (example)

  • We can also use openness to prove a function is not continuous. Indeed, for \(f :X_1 \to X_2\), if there exists an open set \(U\) in \(X_2\) such that \(f^{-1}(U)\) is not open in \(X_1\), then \(f\) is not continuous. For example, the function \(f:\mathbb{R}^{2}\to\mathbb{R}\) defined by \[f(x,y)=\begin{cases} \frac{xy}{x^{2}+y^{2}}, & \text{ if }(x,y)\neq(0,0),\\ 0 & \text{ otherwise},\ \end{cases}\quad\] is not continuous at \((0,0)\). Why? Consider the preimage \(f^{-1}((-\epsilon,\epsilon))\). Claim: This preimage is not open for \(\epsilon\) sufficiently small.

    First note that the preimage in question contains \((0,0)\) since \(f((0,0))=0\). To show the preimage is not open it is enough to show that any open ball in \(\mathbb{R}^{2}\) centred at \((0,0)\) is not contained in \(f^{-1}((-\epsilon,\epsilon))\): Let \(\epsilon<1/4\), say, and for any \(\delta>0\) consider the ball \(B_{\delta}((0,0))\) centred at \((0,0)\). The point \((\delta/2,\delta/2)\) is in \(B_{\delta}((0,0))\) since \(\|(\frac{\delta}{2},\frac{\delta}{2})-(0,0)\|_{2}=\|(\frac{\delta}{2},\frac{\delta}{2})\|_{2}=\sqrt{\frac{\delta^{2}}{4}+\frac{\delta^{2}}{4}}=\frac{\delta}{\sqrt{2}}<\delta\). But \[f\left(\left(\frac{\delta}{2},\frac{\delta}{2}\right)\right)\quad=\quad\frac{\frac{\delta}{2}\cdot\frac{\delta}{2}}{\frac{\delta^{2}}{4}+\frac{\delta^{2}}{4}}\quad=\quad\frac{1}{2}\quad>\quad\epsilon,\] so \((\delta/2,\delta/2)\) is not in \(f^{-1}((-\epsilon,\epsilon))\). Thus, for any \(\delta>0\) the ball \(B_{\delta}((0,0))\) is not contained in \(f^{-1}((-\epsilon,\epsilon))\) and so this preimage is not open.

    Since \((-\epsilon,\epsilon)\) is open in \(\mathbb{R}\) it follows from Theorem [thm:continuity-open-sets] that \(f\) is not continuous.
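Numerically, the function takes the constant value \(1/2\) on the punctured diagonal however close to the origin, while vanishing on the axes, which is exactly why the preimage fails to be open. A sketch:

```python
# f(x, y) = xy / (x^2 + y^2) away from the origin, f(0, 0) = 0.
def f(x, y):
    return x * y / (x ** 2 + y ** 2) if (x, y) != (0, 0) else 0.0

for delta in (1.0, 1e-3, 1e-9):
    assert f(delta / 2, delta / 2) == 0.5   # value 1/2 at (delta/2, delta/2)
assert f(1e-9, 0.0) == 0.0                  # yet 0 along the real axis
assert f(0, 0) == 0.0
```

So every ball \(B_{\delta}((0,0))\) contains points mapped to \(1/2\), which lie outside \((-\epsilon,\epsilon)\) once \(\epsilon<1/2\).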

Why preimages?

Note that the use of preimages in Theorem [thm:continuity-open-sets], rather than images, is important. The same result is not true of images. E.g., the function \(f(z)=|z|\) is continuous as a function \(\mathbb{C}\to\mathbb{R}\), but it maps the open set \(\mathbb{D}\) in the complex plane onto the interval \([0,1)\), which is neither open nor closed in \(\mathbb{R}\).

Note that this \(f\) is actually a bijection from \(\mathbb{R}_{\geq 0}\) to \(\mathbb{R}_{\geq0}\), and in \(\mathbb{R}_{\geq0}\) the interval \([0,1)\) is open! So, is the problem that we need the function to be bijective? No. For example, consider the metric spaces \(X_{1}=[0,1)\cup[2,3]\) and \(X_{2}=[0,2]\) with the usual (subspace) metric coming from the absolute value on \(\mathbb{R}\). Define \[f:X_{1}\to X_{2}:\:x\mapsto\begin{cases} x, & \text{ if }x\in[0,1).\\ x-1, & \text{ if }x\in[2,3].\ \end{cases}\] It is easy to check that \(f\) is a bijection and is continuous on its domain: (Continuity is trivial for \(x\neq2\). For \(x=2\), pick \(\epsilon>0\), then for any \(0<\delta<1\) we have \(B_{\delta}(2)=[2,2+\delta)\). Note that \(f(2)=1\) and so \(B_{\epsilon}(f(2))=(1-\epsilon,1+\epsilon)\). To show \(f\) is continuous we must therefore find a \(\delta\) so that \(f(x)\in(1-\epsilon,1+\epsilon)\) if \(x\in[2,2+\delta)\). Simply pick any \(\delta<\epsilon\), for then: \[x\in[2,2+\delta)\quad\Rightarrow\quad f(x)=x-1\in[1,1+\delta)\subset(1-\delta,1+\delta)\subset(1-\epsilon,1+\epsilon),\] as required.) But, the set \([2,3]\) is open in \(X_{1}\) (see sheet 3) and its image \(f([2,3])=[1,2]\) is not open in \(X_{2}\).

Thus, we genuinely do need to use preimages. When can we use images of continuous functions to preserve properties of the sets in question? When can we find the maximum/minimum value taken by a function on a set? It turns out a key concept is that of compactness.

2.4 Compactness

[Compactness] A non-empty subset \(K\) of a metric space \(X\) is called (sequentially) compact if for any sequence \(\{x_{k}\}_{k\in\mathbb{N}}\) in \(K\) there exists a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit in \(K\).

Note that the initial sequence in the definition does not have to converge. But what if it does? What is the link between a convergent sequence and its subsequences?

If \(\{x_{k}\}_{k\in\mathbb{N}}\) is a convergent sequence in a metric space \(X\), then any subsequence converges to the same limit.

Assume \(x_{k}\to x\). Let \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) be a subsequence (here \(n_{1}<n_{2}<\cdots\)). We know that for every \(\epsilon>0\) there exists \(N\in\mathbb{N}\) such that \(x_{k}\in B_{\epsilon}(x)\) for every \(k>N\). So simply note that \(n_{k}\geq k\), whence \(x_{n_{k}}\in B_{\epsilon}(x)\) for \(k>N\). Thus, \(x_{n_{k}}\to x\).

It seems quite daunting to have to check every sequence in a set for convergent subsequences, so it will be useful to re-express compactness in terms of our basic building blocks; that is, open and closed sets. However, exploring openness doesn’t seem particularly fruitful; for example, consider the set \((0,1)\) in \(\mathbb{R}\); the sequence \(\{1/n\}\) lies in \((0,1)\), but its limit is \(0\notin(0,1)\). Moreover, any subsequence must have the same limit \(0\) by Lemma [lem:subsequences-have-same-limit], and so \((0,1)\) is not compact. Let’s try closedness.

[Closed sets and limits of sequences] We have \[\begin{aligned} F\subset X \text{ is closed}\quad\Longleftrightarrow & \quad\text{Every sequence in }F \text{ which converges in }X\text{ has its limit point in }F.\\ & \quad\text{(that is, if }x_{n}\in F \text{ and }x_{n}\to x\text{ for some }x\in X\text{, then }x\in F.)\end{aligned}\]

\(\quad\)

  • (\(\Rightarrow\)): Assume \(F\) is closed and let \(\{x_{n}\}_{n\in\mathbb{N}}\) be a sequence with \(x_{n}\in F\) that converges to \(x\in X\). We wish to show \(x\in F\). For a contradiction, assume \(x\notin F\); that is, assume \(x\in X\setminus F\). We know \(X\setminus F\) is open, so by definition there must exist an open ball \(B_{\epsilon}(x)\) centred at \(x\) and contained in \(X\setminus F\). But \(x_{n}\to x\), so there exists \(N\in\mathbb{N}\) such that \(x_{n}\in B_{\epsilon}(x)\) for \(n>N\). Thus \(x_{n}\in X\setminus F\) for \(n>N\) (that is, \(x_{n}\notin F\) for \(n>N\)), which is a contradiction.

  • (\(\Leftarrow\)): We need to show \(X\setminus F\) is open. Let \(x\in X\setminus F\). We wish to show we can always find a ball \(B_{\epsilon}(x)\) centred at \(x\) and contained in \(X\setminus F\). If for some \(n\in\mathbb{N}\) we have \(B_{1/n}(x)\subseteq X\setminus F\) we are done. Otherwise, pick an element \(x_{n}\in B_{1/n}(x)\cap F\) for each \(n\). But then \(x_{n}\to x\) with \(x_{n}\in F\), so by hypothesis the limit \(x\) lies in \(F\), contradicting \(x\in X\setminus F\).

[Relationship between compactness and closedness]

  1. Compact sets are closed.

  2. Any closed subset of a compact set is compact.

  1. If \(K\) is compact and \(\{x_{k}\}_{k\in\mathbb{N}}\) is a convergent sequence with \(x_{k}\in K\) and limit \(x\in X\), then by definition there must exist a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit \(x_{0}\in K\). But, by Lemma [lem:subsequences-have-same-limit], these limits must be the same; i.e., \(x=x_{0}\). Thus \(x\in K\) and by Proposition [prop:closed-sets-by-convergence] the set \(K\) is closed.

  2. Assume \(F\subseteq K\) is closed and \(K\) is compact. Let \(\{x_{k}\}_{k\in\mathbb{N}}\) be any sequence in \(F\). Since each \(x_{k}\in K\) there must exist a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit \(x\in K\). But, by Proposition [prop:closed-sets-by-convergence] this limit must be in \(F\). This shows \(F\) is compact.

Are all closed sets compact? No: e.g., \([0,\infty)\) is closed in \(\mathbb{R}\), but \(x_{n}=n\) has no convergent subsequence. The problem here is that \([0,\infty)\) is unbounded.

[Bounded sets] A subset \(A\subseteq X\) of a metric space \(X\) is bounded if \[\text{there exists }R>0\text{ and }x\in X\text{ such that }A\subseteq B_{R}(x).\]

[Compact sets are bounded] Let \(K\subseteq X\) be a compact subset of a metric space \(X\). Then \(K\) is bounded.

We employ a contrapositive argument. Assume \(K\) is not bounded and fix \(x\in K\). For each \(k\in\mathbb{N}\) we can therefore find a point \(x_{k}\in K\) such that \(d(x_{k},x)\geq k\) (since \(B_{k}(x)\) does not contain \(K\)). However, the sequence \(\{x_{k}\}_{k\in\mathbb{N}}\) cannot have a convergent subsequence; given any potential limit point \(x_{0}\) of \(x_{n_{k}}\) we have \[d(x_{n_{k}},x_{0})\quad\stackrel{(D3)}{\geq}\quad d(x_{n_{k}},x)-d(x,x_{0})\quad\geq\quad n_{k}-d(x,x_{0})\quad\to\infty\text{ as }k\to\infty.\]

So, we know that compact sets are closed and bounded. Is that enough? Recall the Bolzano-Weierstrass theorem from Analysis I. It states precisely that the closed and bounded intervals \([a,b]\) in \(\mathbb{R}\) are compact.

[Heine-Borel for \(\mathbb{R}^{n}\)] \[\text{A subset }K\text{ of }\mathbb{R}^{n}\text{ is compact}\qquad\Longleftrightarrow\qquad K\text{ is closed and bounded.}\]

Heine-Borel does not hold for arbitrary metric spaces.

The proof is simply induction on \(n\), starting from the base ‘\(n=1\)’ case for subsets of \(\mathbb{R}\). For this reason (and since this is a Complex Analysis course), we will prove it for \(\mathbb{C}\) and leave the proof in higher dimensions for the enthusiastic reader - the statement for \(\mathbb{C}\) corresponds to that of \(\mathbb{R}^{2}\) in the statement of Theorem [thm:Heine-Borel], since \(\mathbb{C}\) with its standard metric can be identified with \(\mathbb{R}^{2}\) with its standard metric.

[Heine-Borel for \(\mathbb{C}\)] \[\text{A subset }K\text{ of }\mathbb{C}\text{ is compact}\qquad\Longleftrightarrow\qquad K\text{ is closed and bounded.}\]

One direction follows from what we have already shown: compact sets are closed and bounded. Conversely, let \(K\subseteq\mathbb{C}\) be a closed and bounded subset of \(\mathbb{C}\). It is very easy to show that if \(K\) is bounded then there exists \(R>0\) such that \(K\subseteq B_{R}(0)\). Let \(R>0\) be such a number (so that \(|z|<R\) for every \(z\in K\)) and let \(\{z_{k}\}_{k\in\mathbb{N}}\) be a sequence in \(K\) with \(z_{k}=x_{k}+iy_{k}\). Note that \(|x_{k}|<R\) and \(|y_{k}|<R\). We wish to show there is a convergent subsequence with limit in \(K\).

The real interval \([-R,R]\) is closed and bounded, so by Heine-Borel for \(\mathbb{R}\) the interval \([-R,R]\) is compact. Notice that \(x_{k}\in(-R,R)\subset[-R,R]\). Since \(\{x_{k}\}_{k\in\mathbb{N}}\) is a sequence in \([-R,R]\) there must be a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) with limit in \([-R,R]\), say \(x_{n_{k}}\to x\in[-R,R]\).

Consider the corresponding complex subsequence \(\{z_{n_{k}}\}_{k\in\mathbb{N}}\), where \(z_{n_{k}}=x_{n_{k}}+iy_{n_{k}}\). Its imaginary part, the real sequence \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\), also lies in the compact set \([-R,R]\). So it too must have a convergent subsequence \(\{y_{n_{m_{k}}}\}_{k\in\mathbb{N}}\) (a subsequence of a subsequence!) which converges to some \(y\in[-R,R]\).

Finally, take the (sub)subsequence \(\{z_{n_{m_{k}}}\}_{k\in\mathbb{N}}\), where \(z_{n_{m_{k}}}=x_{n_{m_{k}}}+iy_{n_{m_{k}}}\). The imaginary part converges to \(y\) and, by Lemma [lem:subsequences-have-same-limit], the real part converges to \(x\). Then, by the facts we discussed about convergence of complex sequences, the subsequence \(\{z_{n_{m_{k}}}\}_{k\in\mathbb{N}}\) converges with limit \(x+iy\). Since \(K\) is closed by assumption, it follows from Proposition [prop:closed-sets-by-convergence] that this limit \(x+iy\) is in \(K\). This proves \(K\) is compact.
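The two-stage subsequence extraction in this proof can be mimicked numerically. The following Python sketch (purely illustrative; the bisection helper and the sample sequence are our own choices, not part of the notes) applies the Bolzano-Weierstrass idea of ‘keep the half-interval containing most of the terms’ to a finite sample of a bounded complex sequence, first along real parts and then along imaginary parts:

```python
import math

def bw_indices(vals, idxs, lo, hi, depth):
    """Bolzano-Weierstrass bisection on a finite sample: repeatedly keep
    the half-interval containing at least half of the surviving terms."""
    for _ in range(depth):
        mid = (lo + hi) / 2
        left = [i for i in idxs if lo <= vals[i] <= mid]
        right = [i for i in idxs if mid < vals[i] <= hi]
        if len(left) >= len(right):
            idxs, hi = left, mid
        else:
            idxs, lo = right, mid
    return idxs  # indices of a subsequence whose values cluster in [lo, hi]

# A bounded (but non-convergent) complex sequence inside the closed unit square.
N = 100000
zs = [complex(math.cos(k), math.sin(3 * k)) for k in range(N)]

# Stage 1: pass to a subsequence along which the real parts cluster.
idx1 = bw_indices([z.real for z in zs], list(range(N)), -1.0, 1.0, 8)
# Stage 2: further subsequence along which the imaginary parts cluster too.
idx2 = bw_indices([z.imag for z in zs], idx1, -1.0, 1.0, 6)

sub = [zs[i] for i in idx2]
spread = max(abs(w1 - w2) for w1 in sub[:50] for w2 in sub[:50])
```

On a finite sample one cannot exhibit a limit, but the surviving subsequence is squeezed into a small box, mirroring how the proof first pins down \(x\) and then \(y\).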

  • The complex plane \(\mathbb{C}\) is not compact with respect to the standard metric; e.g., the sequence \(\{ik\}_{k\in\mathbb{N}}\) has no convergent subsequence. (Neither is \(\mathbb{R}^{n}\).)

  • The Riemann sphere \(S^{2}\) in \(\mathbb{R}^{3}\) is compact as a subset of \(\mathbb{R}^{3}\) (with the usual Euclidean metric). Hence \(\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}\) is compact with respect to the chordal metric. (One way to prove this formally will be Theorem [thm:cont-image-of-compact-is-compact] below).

  • The orthogonal group \({\rm O}(n)\) and the unitary group \({\rm U}(n)\) are compact; \({\rm SL}_{n}(\mathbb{R})\) and \({\rm GL}_{n}(\mathbb{R})\) are not (the same holds for matrices with entries in \(\mathbb{C}\)) - see sheet 3.

Finally, before stating the connection between continuous functions and compact sets, we restate continuity in terms of convergent sequences:

A function \(f:X\to Y\) between two metric spaces is continuous at \(x\in X\) if and only if \[\lim_{n\to\infty}f(x_{n})=f(x)\quad\text{ for every convergent sequence }\{x_{n}\}_{n\in\mathbb{N}}\text{ in }X\text{ with }x_{n}\to x.\]

See sheet 3.

Recall (from Analysis I) that non-empty compact sets in \(\mathbb{R}\) have a minimal and maximal element. [This is because \(f(x)=x\) is continuous - the statement you have seen says continuous functions on compact sets attain their max/min.]

Let \(f:X\to Y\) be a map between two metric spaces. Then, \[K\subset X\text{ is compact and }f\text{ is continuous}\quad\Longrightarrow\quad\text{the image }f(K) \text{ is compact in }Y.\] In particular, for \(Y=\mathbb{R}\), any continuous real-valued function on a metric space \(X\) attains minima and maxima on compact sets.

Let \(\{y_{k}\}_{k\in\mathbb{N}}\) be a sequence in \(f(K)\), say with \(f(x_{k})=y_{k}\). We wish to show it has a convergent subsequence with limit in \(f(K)\). Since \(K\) is compact, there must be a convergent subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) (of the sequence \(\{x_{k}\}_{k\in\mathbb{N}}\)) with some limit \(x\) in \(K\). Since \(f\) is continuous, we have by Lemma [lem:continuity-by-convergent-sequences] that \(x_{n_{k}}\to x\) implies \(y_{n_{k}}=f(x_{n_{k}})\to f(x)\). Thus \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\) has limit \(f(x)\), and because \(x\in K\) we must have \(f(x)\in f(K)\).

It follows that if \(K\) is a compact subset of \(\mathbb{C}\), then \(\mathrm{Re}(z),\mathrm{Im}(z)\) and \(|z|\) all attain maximum and minimum values on \(K\). In term 2 we will prove a stronger statement under the additional assumptions that \(f\) is (complex) differentiable and \(K\) is ‘nice’; then the maximum modulus of a function occurs on the boundary of \(K\). This is called the maximum modulus theorem.
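Numerically, one can watch these maxima being attained on a compact set. The sketch below (illustrative only; the grid resolution is an arbitrary choice) samples the closed unit disk \(K=\{z:|z|\leq1\}\) and approximates the maxima of \(\mathrm{Re}(z)\), \(\mathrm{Im}(z)\) and \(|z|\):

```python
# Sample the closed unit disk K = {z : |z| <= 1} on a fine grid and
# approximate the maxima of Re(z), Im(z) and |z| over K.
n = 201  # grid points per axis; the grid includes the boundary points +-1
pts = [complex(-1 + 2 * i / (n - 1), -1 + 2 * j / (n - 1))
       for i in range(n) for j in range(n)]
K = [z for z in pts if abs(z) <= 1]

max_re = max(z.real for z in K)   # attained at z = 1
max_im = max(z.imag for z in K)   # attained at z = i
max_mod = max(abs(z) for z in K)  # attained on the boundary circle
```

All three maxima are attained at boundary points of the disk, consistent with the maximum modulus statement previewed above.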

3 Complex Differentiation

3.1 Complex differentiability

Differentiation of a complex function is defined in a similar way to that in \(\mathbb{R}\), except using complex limits.

[Complex differentiability] A function \(f:U\rightarrow\mathbb{C}\) defined on an open set \(U\) in \(\mathbb{C}\) is (complex) differentiable at \(\boldsymbol{z_{0}\in U}\) if \[\lim_{z\to z_{0}}\frac{f(z)-f(z_{0})}{z-z_{0}}\quad\text{exists}.\] We call this limit the derivative of \(\boldsymbol{f}\) at \(\boldsymbol{z_{0}}\) and write \(\boldsymbol{f'(z_{0})}\) for the limit, i.e. \[f'(z_{0})=\lim_{h\to0}\frac{f(z_{0}+h)-f(z_{0})}{h}.\]

  • In the second formulation the quantity \(h\) is a complex number (not a real number), so the limit must exist from every direction. The second is often the more useful expression of the two.

  • Note that if a function \(f\) is complex differentiable at \(z\) then it is continuous at \(z\).

Differentiating complex functions from first principles (examples)

  1. Rules for differentiating polynomials are the same. For example, consider \(f(z)=z^{2}\) on \(\mathbb{C}\). For any \(z\in\mathbb{C}\) we have \[\lim_{h\to0}\frac{(z+h)^{2}-z^{2}}{h}=\lim_{h\to0}\frac{z^{2}+2hz+h^{2}-z^{2}}{h}=\lim_{h\to0}(2z+h)=2z.\] Thus, \(f\) is differentiable on \(\mathbb{C}\) and \(f'(z)=2z\) as expected.

  2. Consider \(f(z)=\bar{z}\). For it to be differentiable we must obtain the same limit from every direction. But, considering limits from the purely real and purely imaginary directions, for every \(z\in\mathbb{C}\) we have \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{\overline{z+h}-\bar{z}}{h}=\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{h}{h}=1,\] yet \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{\overline{z+ih}-\bar{z}}{ih}=\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{-ih}{ih}=-1.\] Since \(z\) was arbitrary this shows \(f\) is not differentiable anywhere.

  3. As in the real case, sums/products/quotients of complex differentiable functions are complex differentiable where defined (e.g., all polynomials/rational functions). In particular, the product and quotient rules hold for complex derivatives.

  4. Compositions of complex differentiable functions are complex differentiable where defined. In particular, the chain rule holds for complex derivatives. The proofs of \(3.\) and \(4.\) here are almost identical to those from Analysis I, so are omitted.

  5. Generally, non-constant purely real/imaginary functions are not complex differentiable; e.g., \(\mathrm{Re}(z),\mathrm{Im}(z),|z|\) are nowhere differentiable as functions from \(\mathbb{C}\to\mathbb{C}\).
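Examples 1 and 2 above can be checked numerically by evaluating difference quotients along several directions. A Python sketch (the sample point, step size and directions are arbitrary choices): for \(f(z)=z^{2}\) the quotient is direction-independent and close to \(2z_{0}\), while for \(f(z)=\bar{z}\) it equals \(e^{-2i\theta}\) for a step in direction \(\theta\), so no limit exists.

```python
import cmath

def diff_quotient(f, z, h):
    """Complex difference quotient (f(z+h) - f(z)) / h for a complex step h."""
    return (f(z + h) - f(z)) / h

z0 = 1.3 - 0.7j
steps = [1e-6 * cmath.exp(1j * theta) for theta in (0.0, 0.9, cmath.pi / 2, 2.5)]

# f(z) = z**2: the quotient approaches 2*z0 from every direction.
sq = [diff_quotient(lambda z: z * z, z0, h) for h in steps]

# f(z) = conj(z): the quotient equals conj(h)/h = exp(-2i*theta),
# so its value depends on the direction of approach.
cj = [diff_quotient(lambda z: z.conjugate(), z0, h) for h in steps]
```

The real direction (\(\theta=0\)) gives \(1\) and the imaginary direction (\(\theta=\pi/2\)) gives \(-1\), exactly as in example 2.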

3.2 Cauchy-Riemann equations

We have a basic way of determining whether a function is complex differentiable, but for more complicated functions we wish to find some nicer conditions. We consider a complex valued function \(f\), defined for \(z=x+iy\) in some set \(U\subseteq\mathbb{C}\), as a function of the two real variables \(x\) and \(y\). We write \[f(z)=u(x,y)+iv(x,y)\] and call \(\mathrm{Re}(f)=u(x,y)\) the real part of \(f\) and \(\mathrm{Im}(f)=v(x,y)\) the imaginary part of \(f\).

For example, if \(f(z)=z^{2}\) then \(z^{2}=(x+iy)^{2}=x^{2}-y^{2}+i2xy\), so \(u(x,y)=x^{2}-y^{2}\) and \(v(x,y)=2xy\).

When considered as real functions \(\mathbb{R}^{2}\to\mathbb{R}\) there are naturally two partial derivatives associated with each of \(u\) and \(v\): \[\begin{aligned} u_{x}(x,y)=\frac{\partial}{\partial x}u(x,y):=\lim_{h\to0}\frac{u(x+h,y)-u(x,y)}{h},\quad\quad & u_{y}(x,y):=\lim_{h\to0}\frac{u(x,y+h)-u(x,y)}{h},\\ v_{x}(x,y):=\lim_{h\to0}\frac{v(x+h,y)-v(x,y)}{h},\quad\quad & v_{y}(x,y):=\lim_{h\to0}\frac{v(x,y+h)-v(x,y)}{h}.\end{aligned}\] Note that the limits above are standard real limits in \(\mathbb{R}\) (so \(h\) is real).

[Cauchy-Riemann equations] Let \(f=u+iv\) be complex differentiable at \(z_{0}\). Then the real partial derivatives \(u_{x},u_{y},v_{x},v_{y}\) exist at \(z_{0}\) and satisfy the Cauchy-Riemann equations: \[u_{x}(z_{0})=v_{y}(z_{0})\qquad\qquad u_{y}(z_{0})=-v_{x}(z_{0}).\] Furthermore, the derivative of \(f\) at \(z_{0}\) can be written as \[\begin{aligned} f'(z_{0}) & =u_{x}(z_{0})+iv_{x}(z_{0})=v_{y}(z_{0})-iu_{y}(z_{0})\\ & =u_{x}(z_{0})-iu_{y}(z_{0})=v_{y}(z_{0})+iv_{x}(z_{0}).\end{aligned}\]

Since \(f\) is complex differentiable at \(z_{0}=x_{0}+iy_{0}\), the limit in the definition must exist and agree no matter which way we approach \(z_{0}\). Choosing the purely real and purely imaginary directions we have \[\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{f(z_{0}+h)-f(z_{0})}{h}\quad=\quad f'(z_{0})\quad=\quad\lim_{\substack{h\rightarrow0\\ h\in\mathbb{R} } }\frac{f(z_{0}+ih)-f(z_{0})}{ih}.\] We may split each limit into real and imaginary parts, by standard properties of complex limits. Since, when \(h\) is real, \[\begin{aligned} f(z_{0}+h) & =f((x_{0}+h)+iy_{0})=u(x_{0}+h,y_{0})+iv(x_{0}+h,y_{0}),\quad\text{ and}\\ f(z_{0}+ih) & =f(x_{0}+i(y_{0}+h))=u(x_{0},y_{0}+h)+iv(x_{0},y_{0}+h),\end{aligned}\] the above reads \[\begin{aligned} & \quad\lim_{h\rightarrow0}\frac{u(x_{0}+h,y_{0})-u(x_{0},y_{0})}{h}+i\lim_{h\rightarrow0}\frac{v(x_{0}+h,y_{0})-v(x_{0},y_{0})}{h}\quad & =\quad f'(z_{0})\\ = & \quad\frac{1}{i}\lim_{h\rightarrow0}\frac{u(x_{0},y_{0}+h)-u(x_{0},y_{0})}{h}+\frac{i}{i}\lim_{h\rightarrow0}\frac{v(x_{0},y_{0}+h)-v(x_{0},y_{0})}{h}.\end{aligned}\] In other words, \(u_{x}(z_{0})+iv_{x}(z_{0})=f'(z_{0})=(1/i)u_{y}(z_{0})+v_{y}(z_{0})\). Since \(1/i=-i\) the statement follows by comparing real and imaginary parts.

Returning to the function \(f(z)=z^{2}\), note that the C-R equations indeed hold: We have \(u(x,y)=x^{2}-y^{2}\) and \(v(x,y)=2xy\) and so \[u_{x}=2x,\quad v_{y}=2x,\quad u_{y}=-2y,\quad v_{x}=2y.\]
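The same check can be run numerically. The Python sketch below (the helper names are ours, purely for illustration) approximates the four partial derivatives of \(u\) and \(v\) for \(f(z)=z^{2}\) by central differences at an arbitrary point, and verifies the C-R equations together with \(f'(z)=u_{x}+iv_{x}=2z\):

```python
def u(x, y):  # real part of (x + iy)**2
    return x * x - y * y

def v(x, y):  # imaginary part of (x + iy)**2
    return 2 * x * y

def partial(g, x, y, wrt, h=1e-6):
    """Central finite-difference approximation of a partial derivative."""
    if wrt == 'x':
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.8, -1.4
ux, uy = partial(u, x0, y0, 'x'), partial(u, x0, y0, 'y')
vx, vy = partial(v, x0, y0, 'x'), partial(v, x0, y0, 'y')
fprime = complex(ux, vx)  # u_x + i v_x should recover f'(z) = 2z
```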

[Warning] Note that Proposition [prop:complex-diff-implies-CR] provides a way of showing a function is not differentiable at a given point. It cannot be used to prove a function is differentiable. I.e., it is only a necessary condition, not a sufficient one.

It turns out, by considering a complex function \(f:\mathbb{C}\to\mathbb{C}\) as a real function \(f:\mathbb{R}^{2}\to\mathbb{R}^{2}:(x,y)\mapsto(u(x,y),v(x,y))\), that we have precisely \[f\text{ is complex differentiable}\quad\iff\quad f\text{ is real differentiable}+\text{ C-R equations hold;}\] so indeed we do have a converse:

Let \(f=u+iv\) be defined on an open subset \(U\) of \(\mathbb{C}\). Assume the partial derivatives \(u_{x},u_{y},v_{x},v_{y}\) exist, are continuous, and satisfy the Cauchy-Riemann equations at \(z_{0}\in U\). Then \(f\) is complex differentiable at \(z_{0}\).

Omitted. (If you’re interested to see a proof of this theorem, then you can find one in Section 5.6, page 59, of the book “Introduction to Complex Analysis” by H. A. Priestley, Second Edition, Oxford University Press, 2009.)

Determining complex differentiability via Cauchy-Riemann (examples)

  1. Let \(f(z)=\exp(z)=e^{x}\cos y+ie^{x}\sin y\). Then \[u_{x}=e^{x}\cos y,\quad v_{y}=e^{x}\cos y,\quad u_{y}=-e^{x}\sin y,\quad v_{x}=e^{x}\sin y.\] All these functions are continuous as real functions (see Calculus I/AMV II) and the C-R equations hold. Thus, by Theorem [thm:Cauchy-Riemann-implies-complex-differentiability], \(\exp\) is differentiable everywhere in \(\mathbb{C}\) and by Proposition [prop:complex-diff-implies-CR] \[\exp'(z)=u_{x}+iv_{x}=e^{x}\cos y+ie^{x}\sin y=\exp(z).\]

  2. By the chain rule, \(f(z)=e^{iz}\) is differentiable and \(f'(z)=ie^{iz}\). Since they are just linear combinations of exponentials, all of the functions \(\sin,\cos,\sinh,\cosh\) are differentiable everywhere in \(\mathbb{C}\) and you can verify \[\begin{aligned} \sin'(z)=\cos z,\quad\cos'(z)=-\sin z,\quad\sinh'(z)=\cosh z,\quad\cosh'(z)=\sinh z.\end{aligned}\] Similarly, all polynomials/rational functions are differentiable with the usual formulae: e.g., for \(a_{0},\ldots,a_{n}\in\mathbb{C}\), \[(a_{n}z^{n}+\cdots+a_{2}z^{2}+a_{1}z+a_{0})'=na_{n}z^{n-1}+\cdots+2a_{2}z+a_{1}.\] For the branch of log corresponding to arguments in \((\theta_{1},\theta_{2}]\), the function \(\log(z)\) is differentiable in \(\mathbb{C}\setminus R_{\theta_{1}}\), i.e. at all points outside the branch cut. At these points, the derivative is given by \(\log'(z)=1/z\) (see Sheet 4 Q7 where this is proved for the principal branch).

  3. \(f(z)=\sin(z)/z^{2}\) is differentiable everywhere except \(z=0\), since it is the quotient of two complex differentiable functions (and is not defined when \(z^{2}=0\)). Furthermore, \[\left[\sin(z)/z^{2}\right]'=\frac{(\cos z)(z^{2})-(\sin z)(2z)}{(z^{2})^{2}}=\frac{z\cos z-2\sin z}{z^{3}}.\]

  4. Let \(f(z)=f(x+iy)=(x^{3}+3x^{2}y-y^{3}-x^{2}-2y^{2})+i(-x^{3}+3xy^{2}-y^{3}+4xy+3y)\). Then, \[\begin{aligned} u_{x}=3x^{2}+6xy-2x,\quad\quad & v_{y}=6xy-3y^{2}+4x+3,\\ u_{y}=3x^{2}-3y^{2}-4y,\quad\quad & v_{x}=-3x^{2}+3y^{2}+4y.\end{aligned}\] Thus, the partial derivatives exist and are continuous everywhere. We have \(u_{y}=-v_{x}\) everywhere, but \(u_{x}=v_{y}\) if and only if \(3x^{2}-2x=-3y^{2}+4x+3\); that is, when \((x-1)^{2}+y^{2}=2\). So, \(f\) is differentiable only on the circle of radius \(\sqrt{2}\) centred at \(1\) in the complex plane!
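The locus found in example 4 can be double-checked numerically. In the Python sketch below (function names ours, for illustration), the partial derivatives are those computed in the text; the residuals \(u_{x}-v_{y}\) and \(u_{y}+v_{x}\) vanish at sample points of the circle \((x-1)^{2}+y^{2}=2\) but not elsewhere:

```python
import math

# Partial derivatives of u and v from example 4, exactly as computed above.
ux = lambda x, y: 3 * x**2 + 6 * x * y - 2 * x
uy = lambda x, y: 3 * x**2 - 3 * y**2 - 4 * y
vx = lambda x, y: -3 * x**2 + 3 * y**2 + 4 * y
vy = lambda x, y: 6 * x * y - 3 * y**2 + 4 * x + 3

def cr_residuals(x, y):
    """The two Cauchy-Riemann residuals u_x - v_y and u_y + v_x at (x, y)."""
    return ux(x, y) - vy(x, y), uy(x, y) + vx(x, y)

# On the circle (x-1)^2 + y^2 = 2 both residuals vanish ...
on_circle = [(1 + math.sqrt(2) * math.cos(t), math.sqrt(2) * math.sin(t))
             for t in (0.0, 1.0, 2.5, 4.0)]
circle_res = [cr_residuals(x, y) for x, y in on_circle]

# ... while at the origin the first residual is 3*((0-1)^2 + 0 - 2) = -3.
origin_res = cr_residuals(0.0, 0.0)
```

Indeed \(u_{x}-v_{y}=3\big((x-1)^{2}+y^{2}-2\big)\), so the first residual vanishes exactly on the circle, while \(u_{y}+v_{x}\) vanishes identically.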

Holomorphicity

In example \(4.\) above, the function is only differentiable on a 1-dimensional subset of \(\mathbb{C}\). This is a similar situation to a real function on \(\mathbb{R}\) being differentiable only at a single point - not a very interesting function to work with from an analytical perspective. It will be useful for us to consider functions that are differentiable on genuine \(2\)-dimensional sets in \(\mathbb{C}\), for this will allow us to (later) express the functions using Taylor series. Such functions will turn out to have some quite remarkable properties.

[Holomorphic functions] A function \(f:U\to\mathbb{C}\) defined on an open set \(U\subset\mathbb{C}\) is holomorphic on \(\boldsymbol{U}\) if it is complex differentiable at every point in \(U\).

We say \(f\) is holomorphic at \(\boldsymbol{z_{0}\in U}\) if it is holomorphic on some open ball \(B_{\epsilon}(z_{0})\); (in other words, if there exists \(\epsilon>0\) such that \(f\) is complex differentiable at every point in \(B_{\epsilon}(z_{0})\)).

  • By the previous examples, exp, trig functions, and polynomials are holomorphic on all of \(\mathbb{C}\). Logarithms and complex powers are holomorphic at points outside their branch cuts.

  • The function in the previous example \(4.\) is nowhere holomorphic despite being differentiable on a circle. To see this, simply note that any (open) ball in \(\mathbb{C}\) centred at a point \(z\) on this circle must contain a point \(w\) not on the circle. The function is not complex differentiable at \(w\), so is not holomorphic at \(z\). Thus, being holomorphic is a strictly stronger property than being complex differentiable.

3.3 Connected sets and zero derivatives

We explore what it means for the derivative to be zero for a complex function. Consider, for example, the function \(f:U\to\mathbb{C}\), where \(U=\{z\in\mathbb{C}:|z|\neq1\}\) and \[f(z)=\begin{cases} 1\ & \text{ if }\quad|z|<1,\\ 2\ & \text{ if }\quad|z|>1. \end{cases}\] The set \(U\) is open, the function \(f\) is clearly holomorphic on \(U\) and has zero derivative on \(U\). However, it takes two different values - it is not constant. The problem is that \(f\) lives on two ‘unconnected’ subsets of the complex plane. We wish to find conditions on the set \(U\) under which we can conclude \(f\) is constant (as we are used to for real functions).

First, we need some terminology:

[Paths & path-connectedness]

  1. A path or curve (from \(\boldsymbol{a\in\mathbb{C}}\) to \(\boldsymbol{b\in\mathbb{C}}\)) is a continuous function \(\gamma:[0,1]\to\mathbb{C}\) with \(\gamma(0)=a\) and \(\gamma(1)=b\). We say the path/curve is closed if \(a=b\) (in this case, the endpoints of the path join up).

  2. A path/curve is smooth if it is continuously differentiable. (Our definition of the derivative of a function from \(\mathbb{R}\to\mathbb{C}\) is the ‘obvious’ one.)

  3. We say a subset \(U\subseteq\mathbb{C}\) is path-connected if for every pair of points \(a,b\in U\) there exists a smooth path \(\gamma\) from \(a\) to \(b\) such that \(\gamma(t)\in U\) for every \(t\in[0,1]\).

  • [Advanced/Warning] These definitions are not universal to all of mathematics. Outside of complex analysis, people often use ‘smooth’ to refer to the property of having infinitely many derivatives. Also, usually people use ‘path-connected’ to mean you can connect two points simply by a continuous curve. We used a stronger definition to make our life easier.

Intuitively, a subset is path-connected if it has ‘one piece’; i.e., if you can draw a line between any two points in the set without the pencil leaving the subset (or the paper!).

[Domains] A domain \(D\) is an open, path-connected subset of \(\mathbb{C}\). Some people call domains regions.

Showing sets are domains (examples)

  1. We know both \(\mathbb{C}\) and \(B_{r}(z)\) (for any \(z\in\mathbb{C}\) and \(r>0\)) are open. They are also path-connected: Simply choose the line segment between \(a\) and \(b\) given by \(\gamma(t)=a+(b-a)t\). This is clearly a smooth path, so \(\mathbb{C}\) and \(B_{r}(z)\) are domains.

  2. The set \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\), on which the principal branches of \(\log\) and complex powers are defined, is path-connected: Choose arbitrary \(a,b\in\mathbb{C}\setminus\mathbb{R}_{\leq0}\). If \(a\) lies on the positive real axis then the line segment again works (since it avoids the negative real axis). Otherwise, we could try letting \[\gamma_{1}(t)=|a|e^{i\mathrm{Arg}(a)(1-t)},\quad\quad\gamma_{2}(t)=|a|+(b-|a|)t.\] The first path takes \(a\) to \(|a|\), avoiding \(\mathbb{R}_{\leq0}\) by tracing around the circle of radius \(|a|\) centred at the origin. The second joins \(|a|\) to \(b\) via a straight line. (You can reparametrize to make \(t\) run from \(0\) to \(1\) on the combined path if you like.) But the combined path running through \(\gamma_1\) then \(\gamma_2\) is not differentiable at the point where the paths \(\gamma_1\) and \(\gamma_2\) meet! One possibility for a differentiable path joining the points \(a\) and \(b\) would be the arc of the circle through \(a\) and \(b\) that does not cross \(\mathbb{R}_{\leq0}\).

    We know \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\) is also open, so it is a domain.

  3. The set \(\{z\in\mathbb{C}:|z|\neq1\}\) is not a domain. It is open, but it is not path-connected. Pick a point \(a\) with \(|a|<1\) and \(b\) with \(|b|>1\). We cannot draw a continuous path from \(a\) to \(b\) without crossing the circle \(|z|=1\); i.e., you cannot draw a continuous path between the points without leaving the set.
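The construction in example 2 is easy to test numerically. The Python sketch below (with illustrative sample points \(a=-1+i\) and \(b=2+\tfrac{1}{2}i\), our choice) samples the two pieces \(\gamma_{1},\gamma_{2}\) and checks that no sampled point lands on the cut \(\mathbb{R}_{\leq0}\):

```python
import cmath

a, b = -1 + 1j, 2 + 0.5j  # a is off the positive real axis; Arg(a) = 3*pi/4

def gamma1(t):
    """Circular arc from a to |a| around the origin (the radius |a| is fixed)."""
    return abs(a) * cmath.exp(1j * cmath.phase(a) * (1 - t))

def gamma2(t):
    """Straight line segment from |a| to b."""
    return abs(a) + (b - abs(a)) * t

def on_cut(z, eps=1e-12):
    """True if z lies on the branch cut R_{<=0} (up to rounding)."""
    return z.real <= 0 and abs(z.imag) < eps

path = [gamma1(k / 200) for k in range(201)] + [gamma2(k / 200) for k in range(201)]
```

The arc sweeps the argument from \(\mathrm{Arg}(a)\) down to \(0\) without ever reaching \(\pi\), so the combined path stays inside \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\).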

Previously we have seen that there is a chain rule for the composition of two complex differentiable functions. There is also the chain rule for the composition of a complex differentiable function and a smooth path:

[Chain rule] Let \(U\subset\mathbb{C}\) be an open set, \(f:U\to\mathbb{C}\) be a holomorphic function on \(U\) and \(\gamma:[0,1]\to U\) be a smooth path. Then for \(t_{0}\in[0,1]\) we have

\[(f\circ\gamma)'(t_{0})=f'(\gamma(t_{0}))\gamma'(t_{0}).\]

Similar to the proof of the chain rule in Analysis I.

Let \(f:D\to\mathbb{C}\) be holomorphic on a domain \(D\subseteq\mathbb{C}\). If \(f'(z)=0\) for every \(z\in D\) then \(f\) is constant on \(D\).

Since \(D\) is path-connected, it is enough to show \(f\) is constant on any smooth path \(\gamma\) lying in \(D\). In other words we wish to show \(f\circ\gamma\) is a constant function of \(t\).

We have \((f\circ\gamma)'(t)=f'(\gamma(t))\gamma'(t)=0\) by the chain rule. Since \(f=u+iv\), we have

\[0=(f\circ\gamma)'(t)=(u\circ\gamma)'(t)+i(v\circ\gamma)'(t),\] which means both \((u\circ\gamma)'(t)\) and \((v\circ\gamma)'(t)\) are equal to \(0\) for all \(t\). Hence \(u\) and \(v\) are constant along \(\gamma\), so \(f\) is too.

The same statement holds if “\(f'=0\)” is replaced with “\(f\) is purely real/imaginary”, or “\(f\) has constant real/imaginary part”, or “\(f\) has constant modulus” (see Sheet 4). So none of \(\mathrm{Re}(z),\mathrm{Im}(z),|z|\) or \(\mathrm{Arg}(z)\) are holomorphic anywhere (since they are non-constant real-valued functions).

3.4 The angle-preserving properties of holomorphic functions

Given a smooth path \(\gamma:[0,1]\to\mathbb{C}\), we can talk about the tangent vector to \(\gamma\) at a point \(z_{0}=\gamma(t_{0})\) on the path, namely the complex number \(\gamma'(t_{0})\).

[Conformal maps] We say a (real differentiable) map \(f:D\to\mathbb{C}\) on a domain \(D\subseteq\mathbb{C}\) is conformal at \(\boldsymbol{z_{0}}\) if it preserves the angle and orientation between any two tangent vectors at \(z_{0}\). This is exactly the same as saying that it preserves the angle and orientation between any two smooth curves passing through \(z_{0}\). We say that \(f\) is conformal if it is conformal at all points in \(D\).

It turns out that there is a close relationship between holomorphic and conformal maps.

[Holomorphic maps are conformal] A holomorphic map \(f\) with \(f'(z_{0})\ne0\) is conformal at \(z_{0}\).

Take a smooth path \(\gamma:[0,1]\to\mathbb{C}\) that passes through \(z_{0}\), and consider the tangent vector to \(\gamma\) at the point \(z_{0}=\gamma(t_{0})\) on the path. What happens to this tangent vector under the application of a holomorphic function \(f\)?

The path \(f\circ\gamma\) is the path obtained by mapping \(\gamma\) under \(f\). So to calculate what happens to the tangent vector \(\gamma'(t_{0})\) under \(f\), we should calculate the tangent vector \((f\circ\gamma)'(t_{0})\). This is done using the chain rule: \[(f\circ\gamma)'(t_{0})=f'(\gamma(t_{0}))\gamma'(t_{0})=f'(z_{0})\gamma'(t_{0}).\] This describes what happens to tangent vectors geometrically: \(f\) transforms the tangent vector \(\gamma'(t_{0})\) by multiplying it by the complex number \(f'(z_{0})\).

But we understand what multiplication by complex numbers means geometrically: if \(f'(z_{0})\neq0\) then multiplication by \(f'(z_{0})\) is dilation (by \(|f'(z_{0})|\)) followed by a rotation (by \(\mathrm{Arg}(f'(z_{0}))\)).

We now notice the following: both dilations and rotations preserve the angles and orientations between vectors! This completes the proof.
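The proof suggests a numerical experiment: approximate tangent vectors by finite differences and compare the angle between two lines through \(z_{0}\) before and after applying a holomorphic map. A Python sketch (our choices of \(f\), \(z_{0}\) and \(\alpha\) are arbitrary, subject to \(f'(z_{0})\neq0\)):

```python
import cmath

def tangent(path, t0, h=1e-6):
    """Numerical tangent vector of a path R -> C via a central difference."""
    return (path(t0 + h) - path(t0 - h)) / (2 * h)

f = lambda z: z**3          # holomorphic, f'(z) = 3z^2
z0 = 1 + 1j                 # f'(z0) = 3*(1+1j)**2 = 6j, nonzero
alpha = 0.7                 # angle between the two lines below

g1 = lambda t: z0 + t                          # line through z0, direction 1
g2 = lambda t: z0 + t * cmath.exp(1j * alpha)  # line through z0 at angle alpha

# Angle between tangent vectors at z0, before and after applying f.
before = cmath.phase(tangent(g2, 0.0) / tangent(g1, 0.0))
after = cmath.phase(tangent(lambda t: f(g2(t)), 0.0)
                    / tangent(lambda t: f(g1(t)), 0.0))
```

Both tangent vectors are multiplied by the same complex number \(f'(z_{0})\), so the angle (and its sign, i.e. the orientation) is unchanged.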

Determining where maps are conformal (examples)

  • Let \(f(z)=z^{2}\). Here, \(f'(z)=2z\) so \(f'(z)=0\iff z=0\) and so \(f\) is conformal on \(\mathbb{C}^{\ast}=\mathbb{C}\setminus\{0\}\) (since it is holomorphic everywhere). We don’t yet know whether \(f\) is conformal at the origin, but we can check: we know that \(f'(z)=2z\) so \(f'(0)=0\), and hence the action of \(f\) on tangent vectors at the origin is to multiply by \(0\), sending every tangent vector to zero. So angles between tangent vectors are not preserved at the origin (in fact \(z\mapsto z^{2}\) doubles the angles between curves there), and \(f\) is not conformal at \(0\).

  • Let \(f(z)=\frac{2}{3}z^{3}+\frac{1}{2}(1-2i)z^{2}-iz+2-1\). This function is clearly holomorphic as it is just a polynomial. We have \[f'(z)=2z^{2}+(1-2i)z-i=(2z+1)(z-i),\] which has zeros at \(-\frac{1}{2}\) and \(i\), so \(f\) is conformal on \(\mathbb{C}\setminus\{-\frac{1}{2},i\}\). [Note, we can explicitly check what \(f\) does to tangent vectors at each point in this set: For example, at \(z=i/2\) we have \(f'(i/2)=(i+1)(-i/2)=(1-i)/2=\frac{1}{\sqrt{2}}e^{-i\pi/4}.\) Thus, at \(z=i/2\) the function \(f\) rotates tangent vectors by the angle \(\mathrm{Arg}(f'(i/2))=-\pi/4\), then dilates by a factor of \(|f'(i/2)|=1/\sqrt{2}\).]

    Rather than having to check pairs of paths through \(z=-\frac{1}{2}\) and \(z=i\) respectively, to determine whether \(f\) is conformal there, it will be convenient to have a complete description in terms of holomorphicity. The converse to Lemma [lem:A-holomorphic-map-is-conformal] holds:

[Conformal maps are holomorphic] Let \(D\) be a domain. If \(f\) is conformal at \(z_{0}\in D\) then \(f\) is complex differentiable at \(z_{0}\) and \(f'(z_{0})\neq0\). Therefore, if \(f\) is conformal on \(D\), then \(f\) is holomorphic on \(D\) and \(f'(z)\ne0\) for all \(z\in D\). Thus \[f\text{ is conformal on }D\quad\iff\quad f\text{ is holomorphic with }f'(z)\neq0\text{ for all } z\in D.\]

(Non-examinable) Sketch: we know \(f=u+iv\) is real differentiable when considered as a map \(f:\left(\begin{smallmatrix}x\\ y \end{smallmatrix}\right)\mapsto\left(\begin{smallmatrix}u(x,y)\\ v(x,y) \end{smallmatrix}\right)\). The total differential of this map at a point \(z_{0}=x_{0}+iy_{0}\) (see AMV II) is the Jacobian matrix \(D_{z_{0}}=\left(\begin{smallmatrix}u_{x}(x_{0},y_{0}) & u_{y}(x_{0},y_{0})\\ v_{x}(x_{0},y_{0}) & v_{y}(x_{0},y_{0}) \end{smallmatrix}\right)\). To preserve angles it must be of the form \(c\left(\begin{smallmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{smallmatrix}\right)\) for some \(c\in\mathbb{R}_{\neq0}\). Thus \(D_{z_{0}}\) is of the form \(\left(\begin{smallmatrix}a & -b\\ b & a \end{smallmatrix}\right)\) and the C-R equations are satisfied at \(z_{0}\). Since \(f\) is real differentiable by assumption, C-R equations are actually enough to show that \(f\) is complex differentiable at \(z_{0}\) (we have not stated exactly this result before, but it holds). Since \(\det D_{z}\neq0\) we have \(f'(z_{0})=u_{x}(x_{0},y_{0})+iv_{x}(x_{0},y_{0})\neq0\).

[Advanced] The idea of this proof, in reverse, can be adapted to give a more sophisticated proof of the CR equations!
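The Jacobian computation in this sketch can itself be checked numerically for a concrete holomorphic map. Below (illustrative only, for \(f(z)=z^{2}\) at an arbitrary point), a central-difference Jacobian is verified to have the rotation-and-dilation form \(\left(\begin{smallmatrix}a & -b\\ b & a\end{smallmatrix}\right)\) with \(a+ib=f'(z_{0})\):

```python
import math

# f(z) = z^2, i.e. u = x^2 - y^2 and v = 2xy; approximate its Jacobian at z0.
u = lambda x, y: x * x - y * y
v = lambda x, y: 2 * x * y

def jacobian(x, y, h=1e-6):
    """Central-difference Jacobian [[u_x, u_y], [v_x, v_y]] at (x, y)."""
    return [[(u(x + h, y) - u(x - h, y)) / (2 * h),
             (u(x, y + h) - u(x, y - h)) / (2 * h)],
            [(v(x + h, y) - v(x - h, y)) / (2 * h),
             (v(x, y + h) - v(x, y - h)) / (2 * h)]]

x0, y0 = 0.6, -1.1
J = jacobian(x0, y0)
a, b = J[0][0], J[1][0]    # J should have the form [[a, -b], [b, a]]
scale = math.hypot(a, b)   # the dilation factor |f'(z0)|
angle = math.atan2(b, a)   # the rotation angle Arg(f'(z0))
```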

Determining where maps are conformal (examples continued)

  • Let \(f(z)=xy+iy^{2}\). Then \(u_{x}=y,v_{y}=2y,u_{y}=x,v_{x}=0\). So, the C-R equations only hold at \(y=x=0\). Hence \(f\) is not complex differentiable at any point \(z_{0}\neq0\) (as the C-R equations fail to hold there), so it is not conformal at any such \(z_{0}\) by Proposition [prop:conf-to-holo]. What happens for \(z_{0}=0\)? There, \(f'(0)=u_{x}(0,0)+iv_{x}(0,0)=0+i0=0\). So \(f\) cannot be conformal at \(0\) either by Proposition [prop:conf-to-holo]. Thus, \(f\) is nowhere conformal.

Visualising conformal maps

Here is a useful result, helpful in visualising the action of conformal maps:

Any conformal map maps orthogonal grids in the \((x,y)\)-plane to orthogonal grids.

[Important warning] We must allow our grids to be made up of arbitrary smooth curves, not necessarily straight lines (Corollary [cor:conformal-preserves-grids] is false if they are required to be made of straight lines).

How to visualise conformal maps (examples)

  • Let \(f(z)=z^{2}=x^{2}-y^{2}+i2xy\) and consider the simple grid in the \((x,y)\)-plane made up of lines parallel to the real/imaginary axes, separated by distance \(1\).

    Given a line (not passing through the origin) in such a grid, say \(x=a\,(\neq0)\), the function \(f\) takes this line to the points \((a^{2}-y^{2},2ay)\) in the \((u,v)\)-plane. If \(u=a^{2}-y^{2}\) and \(v=2ay\), then \(v^{2}=4a^{2}y^{2}\) and so \(u=a^{2}-(\frac{v}{2a})^{2}\) and the image defines a parabola.

    Similarly, the line \(y=b\,(\neq0)\) is taken to the parabola \(u=(\frac{v}{2b})^{2}-b^{2}\) in the \((u,v)\)-plane. By sketching these on the same axes we see that the parabolas indeed cross at right angles.

  • For the same example, consider the level curves \(u(x,y)=a\) and \(v(x,y)=b\) with \(a,b\neq0\). These trace out the curves \(y^{2}=x^{2}-a\) and \(y=\frac{b}{2x}\) respectively. By sketching these curves on a graph (for, say, \(u=\pm1,v=\pm1\)) we see that they are perpendicular. (They have to be, since \(f(z)\) is conformal on \(\mathbb{C}-\{0\}\) and they map to perpendicular straight lines.)
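The parabola formula for the image of the line \(x=a\) can be verified directly. A short Python check (with an arbitrary illustrative choice \(a=3/2\)): every point of the image curve satisfies \(u=a^{2}-(v/2a)^{2}\) up to floating-point error.

```python
# Image of the vertical line x = a under f(z) = z^2, compared with the
# parabola u = a^2 - (v/(2a))^2 derived above.
a = 1.5
ys = [-3 + 6 * k / 100 for k in range(101)]
errors = []
for y in ys:
    w = complex(a, y) ** 2    # image point u + iv
    u, v = w.real, w.imag
    errors.append(abs(u - (a * a - (v / (2 * a)) ** 2)))
max_err = max(errors)
```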

3.5 Biholomorphic maps

We wish to build a dictionary of conformal maps as a way of getting back and forth between domains of the complex plane. To do this, we need our maps to be invertible.

[Biholomorphic maps] Let \(D\) and \(D'\) be domains. We say that \(f:D\to D'\) is biholomorphic if \(f\) is holomorphic, a bijection, and the inverse \(f^{-1}:D'\to D\) is also holomorphic. A biholomorphic map \(f\) is called a biholomorphism. When \(f\) as above exists, we say that the domains \(D\) and \(D'\) are biholomorphic and write \(f:D\xrightarrow{\sim}D'\).

[Advanced] It is possible to prove that if \(f\) as above is holomorphic and bijective, then it is automatic that its inverse is holomorphic.

Finding biholomorphisms (examples)

  1. The function \(\exp:\mathbb{C}\to\mathbb{C}^{\ast}\) is not biholomorphic since it is not injective (e.g., \(e^{z}=e^{z+2n\pi i}\)). How can we make it biholomorphic? We simply need to restrict it to a smaller domain where it will be injective. It is not too hard to check that \[\exp(z_{1})=\exp(z_{2})\] if and only if \(\mathrm{Re}(z_{1})=\mathrm{Re}(z_{2})\) and \(\mathrm{Im}(z_{1})-\mathrm{Im}(z_{2})\in2\pi\mathbb{Z}\). So if we rule out different \(z_{1}\) and \(z_{2}\) whose imaginary parts differ by a non-zero integer multiple of \(2\pi\) then we might be ok. We can do this by setting \[D:=\{z\in\mathbb{C}:\mathrm{Im}(z)\in(-\pi,\pi)\}.\] If \(z=x+iy\in D\) then \(\exp(z)=e^{x}e^{iy}\), so the principal argument of \(\exp(z)\) is never equal to \(\pi\). On the other hand, it is not hard to check that \(\exp\) maps \(D\) onto \(\mathbb{C}-\mathbb{R}_{\leq0}\). Moreover, by our choice of \(D\), \(\exp\) is injective on \(D\). So \(\exp\) is a bijection between \(D\) and \(\mathbb{C}-\mathbb{R}_{\leq0}\). Therefore it has an inverse function, and we already know what it is: the principal branch of \(\mathrm{Log}\). The function \(\mathrm{Log}\) maps \(\mathbb{C}-\mathbb{R}_{\leq0}\) to \(D\) and is differentiable at every point of \(\mathbb{C}-\mathbb{R}_{\leq0}\), so it is holomorphic on \(\mathbb{C}-\mathbb{R}_{\leq0}\). In summary, when restricted to \(D\), \(\exp\) is biholomorphic and hence \(D\) and \(\mathbb{C}-\mathbb{R}_{\leq0}\) are biholomorphic.

  2. Let \(f(z)=z^{2}\). We have seen that this function is holomorphic on \(\mathbb{C}\). However, as in the previous example, it is not injective since e.g. \(f(1)=f(-1)=1\). Note that \(f(z_{1})=f(z_{2})\) with \(z_{1}\neq z_{2}\) if and only if \(z_{1}=\pm z_{2}\), so we may be able to create a biholomorphic function by restricting \(f\) to a domain that never simultaneously contains \(z\) and \(-z\). Such a domain is the right half plane \(\mathbb{H}_{R}:=\{z\in\mathbb{C}:\mathrm{Re}(z)>0\}\). We saw in a previous example that \(f\) maps \(\mathbb{H}_{R}\) to \(\mathbb{C}-\mathbb{R}_{\leq0}\). In fact this is a bijection, with inverse given by \[f^{-1}:\mathbb{C}-\mathbb{R}_{\leq0}\to\mathbb{H}_{R},\quad f^{-1}(z):=\exp(\frac{1}{2}\mathrm{Log}(z)).\] Thus \(f^{-1}\) is holomorphic, since it is a composition of holomorphic functions, and hence \(f\) gives a biholomorphism \(f:\mathbb{H}_{R}\xrightarrow{\sim}\mathbb{C}-\mathbb{R}_{\leq0}.\)

  3. The ‘affine’ linear maps \(z\mapsto az+b\) (for \(a\in\mathbb{C}^{\ast},b\in\mathbb{C}\)) are biholomorphic \(\mathbb{C}\xrightarrow{\sim}\mathbb{C}\).
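The inverse in example 2 can be tested numerically: in Python, `cmath.log` implements the principal branch \(\mathrm{Log}\), so \(\exp(\frac{1}{2}\mathrm{Log}(w))\) should undo \(z\mapsto z^{2}\) on the right half-plane. A sketch (the sample points are our own illustrative choices, all with positive real part):

```python
import cmath

def f(z):
    return z * z

def f_inv(w):
    """exp(Log(w)/2): cmath.log is the principal branch, defined off R_{<=0}."""
    return cmath.exp(0.5 * cmath.log(w))

samples = [2 + 3j, 0.5 - 1j, 1 + 0j, 3 - 0.25j]   # points of the right half-plane
round_trip = [f_inv(f(z)) for z in samples]
```

Each round trip returns to the original point, and \(f^{-1}\) always lands back in \(\mathbb{H}_{R}\), consistent with \(f:\mathbb{H}_{R}\xrightarrow{\sim}\mathbb{C}-\mathbb{R}_{\leq0}\).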

As expected we can compose biholomorphic maps to construct new ones.

[Automorphism groups] Let \(D\subset\mathbb{C}\) be a domain. The set of all biholomorphic maps \(f:D\xrightarrow{\sim}D\) from \(D\) to itself forms a group under composition. We call this group the automorphism group of \(\boldsymbol{D}\) and denote it by \({\rm Aut}(D)\).

We need only check the group properties:

  • The identity map \({\rm Id}:z\mapsto z\) is clearly biholomorphic, so it is always in \(\mathrm{Aut}(D)\).

  • We know composition of functions is associative, so \(f\circ(g\circ h)=(f\circ g)\circ h\), for holomorphic maps.

  • \(f^{-1}\) is the inverse of \(f\) since \(f\circ f^{-1}={\rm Id}=f^{-1}\circ f\). Moreover, \(f^{-1}\) is in \(\mathrm{Aut}(D)\) since it is itself holomorphic by assumption and has holomorphic inverse \((f^{-1})^{-1}=f\).

  • \(\mathrm{Aut}(D)\) is closed: \((f\circ g)\) is holomorphic (by the chain rule) and its inverse \((f\circ g)^{-1}:=g^{-1}\circ f^{-1}\) is also the composition of two holomorphic functions, so holomorphic.

Note that example \(1.\) shows conformal maps are not necessarily biholomorphic. On the other hand, it turns out (see Sheet 5) that all biholomorphic maps are conformal. [Whilst the converse is not true in general, conformal maps turn out to be ‘locally’ biholomorphic.]

4 Möbius transformations

4.1 Definition and first properties of Möbius transformations

In this chapter we study a very special class of functions that play a truly fundamental role in mathematics, and have many beautiful properties. Recall that \(\mathrm{GL}_{2}(\mathbb{C})\) is the collection of matrices \[\begin{aligned} \mathrm{GL}_{2}(\mathbb{C}) & =\left\{ \left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\::\:a,b,c,d\in\mathbb{C},\:ad-bc\neq0\right\} \\ & =\{2\times2\text{ complex matrices with nonzero determinant}\}.\end{aligned}\] Also recall the extended complex plane \(\hat{\mathbb{C}}=\mathbb{C}\cup\{\infty\}\) from Chapter 1.

[Möbius transformations] Given any matrix \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in\mathrm{GL}_{2}(\mathbb{C})\) we can define a function

\[M_{T}:\mathbb{C}\to\hat{\mathbb{C}}\] by the formula \[M_{T}(z)=\frac{az+b}{cz+d}\] if \(cz+d\neq0\), and if \(cz+d=0\), then we set \(M_{T}(z)=\infty\). The function \(M_{T}\) is called a Möbius transformation.

  • [Advanced] The reason for excluding matrices for which \(\det T=0\) is because if \(\det T=0\) and at least one of \(c\) and \(d\) is non-zero, so that the definition makes sense, then \(T\) has rank 1, and this implies that \((a,b)=\lambda(c,d)\) for some \(\lambda\in\mathbb{C}\). But this would give that for \(z\) with \(cz+d\neq0\), we would have \(M_{T}(z)=\lambda\), so the transformation would be constant. This is not a very interesting function to think about!

  • For \(T\in\mathrm{GL}_{2}(\mathbb{C})\), by taking complex square roots, we can find a number \(k\) such that \(k^{2}=\det T\). Then \[M_{T}(z)=\frac{az+b}{cz+d}=\frac{\frac{az}{k}+\frac{b}{k}}{\frac{cz}{k}+\frac{d}{k}}=M_{\frac{1}{k}T}(z)\] and \[\det\left(\tfrac{1}{k}T\right)=\frac{1}{k^{2}}\det T=1.\] Thus, we may scale any \(T\in\mathrm{GL}_{2}(\mathbb{C})\) to obtain \(T'=\frac{T}{k}\) such that \(\det T'=1\), and \(M_{T}=M_{T'}\).

It is slightly annoying that at the moment, the domain of a Möbius transformation is not the same as its codomain. This would prevent us from composing two Möbius transformations. However, it is possible to extend the definition to describe where \(\infty\in\hat{\mathbb{C}}\) gets mapped to: we declare that

\[M_{T}(\infty)=\begin{cases} \frac{a}{c} & \text{if }c\neq0\\ \infty & \text{if }c=0 \end{cases}.\] Now we have \(M_{T}:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\).
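[Numerical check] The full definition of \(M_{T}:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\) transcribes directly into code. The following is a minimal sketch; the `INF` sentinel standing for the point \(\infty\) is my own device, not standard notation.

```python
INF = complex('inf')   # stand-in for the point at infinity in C-hat

def mobius(T, z):
    """Evaluate M_T(z) on the extended plane, following the definition above."""
    (a, b), (c, d) = T
    assert a * d - b * c != 0          # T must lie in GL_2(C)
    if z == INF:
        return a / c if c != 0 else INF
    denom = c * z + d
    return INF if denom == 0 else (a * z + b) / denom

# f(z) = 1/z corresponds to the matrix ((0,1),(1,0)); it swaps 0 and infinity.
T = ((0, 1), (1, 0))
assert mobius(T, 0) == INF
assert mobius(T, INF) == 0
assert mobius(T, 2) == 0.5
```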

  1. \(f(z)=z^{-1}\) is a Möbius transformation corresponding to the matrix \(\left(\begin{array}{cc} 0 & 1\\ 1 & 0 \end{array}\right)\). Note that for \(z\neq0\), \(|z|<1\) if and only if \(|f(z)|=|z^{-1}|=|z|^{-1}>1\), so \(f\) maps the punctured unit ball \(B_{1}(0)-\{0\}\) onto the outside of the closed unit ball. We have \(f(0)=\infty\) and \(f(\infty)=0\), so \(f\) interchanges these two points.

  2. The Cayley map \(f(z)=\frac{z-i}{z+i}\) corresponds to the matrix \(\left(\begin{array}{cc} 1 & -i\\ 1 & i \end{array}\right).\) We have \[f(z)\in B_{1}(0)\iff|f(z)|<1\iff|z+i|>|z-i|\iff z\in\mathbb{H}=\{z\in\mathbb{C}:\mathrm{Im}(z)>0\}.\] Hence \(f\) maps the upper half plane to the open unit ball centered at \(0\). Moreover, \(f(\infty)=1\) and \(f(-i)=\infty\).

The set of Möbius transformations forms a group under composition. Furthermore,

  1. \(M_{T_{1}}\circ M_{T_{2}}=M_{T_{1}T_{2}}.\)

  2. \((M_{T})^{-1}=M_{T^{-1}}.\)

  3. \(M_{T}=\mathrm{Id}\quad\iff\quad T=t\left(\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right)\quad(t\in\mathbb{C}^{\ast}).\)

See Sheet 6.

Recall from Linear Algebra that \(\mathrm{GL}_{2}(\mathbb{C})\) forms a group under matrix multiplication. Lemma [lem:Mobius-are-group] says more than that the Möbius transformations form a group. It says that the mapping \[T\mapsto M_{T}\] is a group homomorphism between \(\mathrm{GL}_{2}(\mathbb{C})\) and the Möbius transformations! In other words, composing Möbius transformations is basically just multiplying matrices!
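[Numerical check] Parts 1 and 2 of the Lemma are easy to illustrate numerically: composing two Möbius transformations agrees with multiplying their matrices, and the adjugate matrix (the inverse up to a scalar) gives the inverse transformation. A sketch with arbitrarily chosen matrices:

```python
def mat_mul(S, T):
    """2x2 complex matrix product, written out entrywise."""
    (a, b), (c, d) = S
    (e, f), (g, h) = T
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def mobius(T, z):
    """Evaluate M_T at a point z away from the pole of M_T."""
    (a, b), (c, d) = T
    return (a * z + b) / (c * z + d)

T1 = ((2 + 1j, -1), (1, 3j))         # arbitrary members of GL_2(C)
T2 = ((1, 2 - 1j), (0.5j, 4))
z = 0.3 + 0.8j                       # a test point away from the poles

# Part 1: composing transformations = multiplying matrices.
assert abs(mobius(T1, mobius(T2, z)) - mobius(mat_mul(T1, T2), z)) < 1e-9

# Part 2: M_{T^{-1}} undoes M_T (adjugate of T1 = inverse up to a scalar,
# and scalar multiples of a matrix give the same Mobius transformation).
T1_inv = ((3j, 1), (-1, 2 + 1j))
assert abs(mobius(T1_inv, mobius(T1, z)) - z) < 1e-9
```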

Let \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in\mathrm{GL}_{2}(\mathbb{C})\). If \(c=0\), the Möbius transformation \(M_{T}\) gives a biholomorphic map \[M_{T}:\mathbb{C}\xrightarrow{\sim}\mathbb{C}.\] If \(c\neq0\), then \(M_{T}\) gives a biholomorphic map \[M_{T}:\mathbb{C}-\left\{ \frac{-d}{c}\right\} \xrightarrow{\sim}\mathbb{C}-\left\{ \frac{a}{c}\right\} .\]

If \(c=0\) then \(a,d\neq0\) since \(\det T\neq0\). Hence \[M_{T}(z)=\frac{az+b}{d}=\frac{a}{d}z+\frac{b}{d}\] is just an affine linear map, which is holomorphic. It is a bijection with inverse given by \((M_{T})^{-1}(z)=\frac{d}{a}z-\frac{b}{a}\), which is holomorphic by the same token. So \(M_{T}\) gives a biholomorphic map \(\mathbb{C}\xrightarrow{\sim}\mathbb{C}\).

If \(c\neq0\) let’s check that \(M_{T}\) is complex differentiable at every \(z\neq\frac{-d}{c}\). We can calculate

\[M_{T}'(z)=\frac{a(cz+d)-c(az+b)}{(cz+d)^{2}}=\frac{\det(T)}{(cz+d)^{2}}.\] Therefore the derivative exists at all points of \(\mathbb{C}-\left\{ \frac{-d}{c}\right\}\) and hence \(M_{T}\) is holomorphic there. It is bijective with the inverse given by the inverse Möbius transformation, which is holomorphic by the same argument. Hence \(M_{T}\) is a biholomorphism from \(\mathbb{C}-\left\{ \frac{-d}{c}\right\} \xrightarrow{\sim}\mathbb{C}-\left\{ \frac{a}{c}\right\}\).

A Möbius transformation \(M_{T}\) is conformal at all \(z\in\mathbb{C}\) with \(M_{T}(z)\neq\infty\).

We know that biholomorphic maps are conformal, and \(M_{T}\) is a biholomorphism on \(\mathbb{C}\) minus the points that map to \(\infty\). Alternatively, one can check that in both cases of the proof of Lemma [lem:Mobius-biholo], the derivative never vanishes.

Any Möbius transformation is a bijection from \(\hat{\mathbb{C}}\) to \(\hat{\mathbb{C}}\).

By Lemma [lem:Mobius-are-group], \(M_{T}\) has an inverse \(M_{T^{-1}}\), so that \(M_{T}\circ M_{T^{-1}}=\mathrm{Id}=M_{T^{-1}}\circ M_{T}\) as maps from \(\hat{\mathbb{C}}\) to \(\hat{\mathbb{C}}\); hence \(M_{T}\) is a bijection.

4.2 Fixed points, the cross-ratio, and the three points Theorem

Let \(T\in\mathrm{GL}_{2}(\mathbb{C})\) and \(M_T\) be a Möbius transformation. Then a point \(z\) is a fixed point of \(M_T\) if \(M_T(z) = z\).

Let \(T\in\mathrm{GL}_{2}(\mathbb{C})\). If \(M_{T}:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\) is not the identity map, then \(M_{T}\) has at most \(2\) fixed points in \(\hat{\mathbb{C}}\). In other words, if a Möbius transformation has three fixed points in \(\hat{\mathbb{C}}\), then it is the identity.

We split into two cases.

First suppose \(M_{T}(\infty)=\infty\). From the definition, this can only happen if \(c=0\), so \(M_{T}\) preserves \(\mathbb{C}\), and for \(z\in\mathbb{C}\), \(M_{T}(z)=\frac{a}{d}z+\frac{b}{d}\) with \(a,d\neq0\) (because \(\det T\neq0\)). Such an affine linear map has at most one fixed point as follows:

  • If \(a=d\) then \(b\neq0\) since we assume \(M_{T}\) is not the identity. Then \(M_{T}\) is a translation by \(b/d\) which has no fixed points.

  • If \(a\neq d\) then \(M_{T}\) has a unique fixed point in \(\mathbb{C}\) given by \(z_{0}=b/(d-a)\).

Now suppose \(M_{T}(\infty)\neq\infty\). Then any fixed points of \(M_{T}\) are in \(\mathbb{C}\). Suppose \(z_{0}\in\mathbb{C}\) is such that \(M_{T}(z_{0})=z_{0}\). We have \[M_{\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)}(z_{0})=z_{0}\quad\iff\frac{az_{0}+b}{cz_{0}+d}=z_{0}\iff\quad cz_{0}^{2}+(d-a)z_{0}-b=0.\] There are at most two roots of this quadratic equation, so at most two fixed points of \(M_{T}\) in \(\hat{\mathbb{C}}\).

We introduce an important quantity called the cross-ratio.

Given four distinct points \(z_{0},z_{1},z_{2},z_{3}\in\mathbb{C}\), the cross-ratio of these points is denoted \((z_{0},z_{1};z_{2},z_{3})\) and defined by

\[(z_{0},z_{1};z_{2},z_{3}):=\frac{(z_{0}-z_{2})(z_{1}-z_{3})}{(z_{0}-z_{3})(z_{1}-z_{2})}.\] We can extend the definition to the case that one of the points is \(\infty\) by removing all differences involving that point, for example,

\[(\infty,z_{1};z_{2},z_{3}):=\frac{(z_{1}-z_{3})}{(z_{1}-z_{2})}.\]
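[Numerical check] The definition, including the convention for \(\infty\), transcribes directly into a short function (the `INF` sentinel is again my own device):

```python
INF = complex('inf')                  # stand-in for the point at infinity

def cross_ratio(z0, z1, z2, z3):
    """(z0, z1; z2, z3), removing differences that involve infinity."""
    num, den = 1, 1
    for p, q in [(z0, z2), (z1, z3)]:          # numerator factors
        if p != INF and q != INF:
            num *= p - q
    for p, q in [(z0, z3), (z1, z2)]:          # denominator factors
        if p != INF and q != INF:
            den *= p - q
    return num / den

# Finite case, straight from the formula:
assert cross_ratio(0, 1, 2, 3) == ((0 - 2) * (1 - 3)) / ((0 - 3) * (1 - 2))
# One point at infinity: (inf, z1; z2, z3) = (z1 - z3)/(z1 - z2).
assert cross_ratio(INF, 2, 3, 5) == (2 - 5) / (2 - 3)
```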

We can use the cross ratio to prove that a Möbius transformation is uniquely determined by how it acts upon any three given points in \(\hat{\mathbb{C}}\).

[Three points Theorem] Let \((z_{1},z_{2},z_{3})\) and \((w_{1},w_{2},w_{3})\) be two ordered triples of distinct points in \(\hat{\mathbb{C}}\). Then there exists a unique Möbius transformation \(f\) such that \(f(z_{i})=w_{i}\) for \(i=1,2,3\).

Existence: Consider the functions \[F(z):=(z,w_{1};w_{2},w_{3}),\quad G(z):=(z,z_{1};z_{2},z_{3}).\] These are Möbius transformations with the properties that \(F(w_{1})=1,F(w_{2})=0,\:F(w_{3})=\infty\), and \(G(z_{1})=1,G(z_{2})=0,G(z_{3})=\infty\). Therefore \(F^{-1}\circ G\) maps each \(z_{i}\) to \(w_{i}\).

Uniqueness: Assume that there are two such maps, say \(f\) and \(\tilde{f}\). Then the Möbius transformation \(H:=f^{-1}\circ\tilde{f}\) satisfies \(H(z_{i})=f^{-1}(\tilde{f}(z_{i}))=f^{-1}(w_{i})=z_{i}.\) This shows \(H\) has three fixed points, \(z_{1},z_{2},z_{3}\), and by Lemma [lem:at-most-2-fixed-points] it must therefore be trivial. Thus, \(f=\tilde{f}\).

One has the following beautiful and fundamental fact about Möbius transformations:

[Möbius transformations preserve cross-ratio] Möbius transformations preserve the cross ratio: if \(z_{0},z_{1},z_{2},z_{3}\) are four distinct points in \(\hat{\mathbb{C}}\), and \(f\) is a Möbius transformation, then \[(f(z_{0}),f(z_{1});f(z_{2}),f(z_{3}))=(z_{0},z_{1};z_{2},z_{3}).\tag{$\ast$}\]

Let \(w_{i}=f(z_{i})\). Let \(F\) and \(G\) be the functions defined in the proof of Theorem [thm:3points]. Recall that \(F^{-1}\circ G\) mapped each \(z_{i}\) to \(w_{i}\). So does \(f\). Since there is a unique Möbius transformation with this property, we must have \(f=F^{-1}\circ G\). Rearranging, \(F\circ f=G.\) But this is just the identity stated after applying both functions to \(z_{0}:\) \[(f(z_{0}),f(z_{1});f(z_{2}),f(z_{3}))=F\circ f(z_{0})=G(z_{0})=(z_{0},z_{1};z_{2},z_{3}).\]

Finding a Möbius transformation from three points (example)

Find the unique Möbius transformation \(f:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\) mapping the points \(\{1,-1,i\}\) to the points \(\{0,\infty,1\}\). Our method is simply to rearrange \((\ast)\): Notice that \[\lim_{|w|\to\infty}\frac{f(z)-w}{w_{1}-w}\:=\:\lim_{|w|\to\infty}\frac{\frac{f(z)}{w}-1}{\frac{w_{1}}{w}-1}\:=\:\frac{-1}{-1}\:=\:1,\] so \((\ast)\) reduces to \[\begin{aligned} 1\cdot\frac{w_{1}-w_{3}}{f(z)-w_{3}}\quad &=\quad\left(\frac{z-z_{2}}{z-z_{3}}\right)\left(\frac{z_{1}-z_{3}}{z_{1}-z_{2}}\right)\\ \iff \frac{0-1}{f(z)-1}\quad & =\quad\left(\frac{z-(-1)}{z-i}\right)\left(\frac{1-i}{1-(-1)}\right)\\ \iff f(z)-1\quad & =\quad\frac{-2(z-i)}{(z+1)(1-i)}\\ \iff f(z)\quad & =\quad\frac{(-1-i)z+(1+i)}{(1-i)z+(1-i)}\\ \iff f(z)\quad & =\quad\left(\frac{-1-i}{1-i}\right)\left(\frac{z-1}{z+1}\right)\\ \iff f(z)\quad & =\quad\frac{-iz+i}{z+1}.\end{aligned}\]

Möbius transformations methods

This leads us to a general strategy to find a Möbius transformation from how it acts on three points: Simply notice that since the cross ratio is preserved, \[\left(\frac{f(z)-w_{2}}{f(z)-w_{3}}\right)\left(\frac{w_{1}-w_{3}}{w_{1}-w_{2}}\right)\quad=\quad\left(\frac{z-z_{2}}{z-z_{3}}\right)\left(\frac{z_{1}-z_{3}}{z_{1}-z_{2}}\right),\] and to find \(f\) we need only rearrange this equation.
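[Numerical check] This strategy can be automated by working with the matrices of \(F\) and \(G\) from the proof of the Three points Theorem: \(f=F^{-1}\circ G\) is a single matrix product. A sketch for finite points; the source and target triples are arbitrary illustrative choices.

```python
def cr_matrix(p1, p2, p3):
    """Matrix of z -> (z, p1; p2, p3) for finite distinct p1, p2, p3:
    ((p1 - p3)z - p2(p1 - p3)) / ((p1 - p2)z - p3(p1 - p2))."""
    return ((p1 - p3, -p2 * (p1 - p3)), (p1 - p2, -p3 * (p1 - p2)))

def mat_mul(S, T):
    (a, b), (c, d) = S
    (e, f), (g, h) = T
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def mobius(T, z):
    (a, b), (c, d) = T
    return (a * z + b) / (c * z + d)

z_pts = (1, -1, 1j)          # send these...
w_pts = (2, 0, -1)           # ...to these (an arbitrary example)
F, G = cr_matrix(*w_pts), cr_matrix(*z_pts)
F_inv = ((F[1][1], -F[0][1]), (-F[1][0], F[0][0]))   # adjugate: inverse up to scale
T = mat_mul(F_inv, G)                                # matrix of f = F^{-1} o G
for z, w in zip(z_pts, w_pts):
    assert abs(mobius(T, z) - w) < 1e-12             # f(z_i) = w_i
```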

There is also a general strategy to find the image of a region \(D\) under a Möbius transformation \(M_T\):

  1. Find the image \(M_{T}(\partial D)\) of the boundary \(\partial D\).

  2. Find the image \(M_{T}(z_{0})\) of a point \(z_{0}\in D\) in the interior.

  3. The region \(D'\) bounded by \(M_{T}(\partial D)\) and containing \(M_{T}(z_{0})\) is precisely the image of \(D\) under \(M_{T}\), and \[M_{T}:D\:\xrightarrow{\sim}\:D'=M_{T}(D).\]

4.3 Circles and lines

The following property of Möbius transformations is very useful when deciding how they map different regions of the complex plane.

Möbius transformations map circles and lines in \(\hat{\mathbb{C}}\) to circles and lines in \(\hat{\mathbb{C}}\).

[Technical] We consider any line to pass through infinity, so that the above makes sense. By circles in \(\hat{\mathbb{C}}\) we mean simply circles in \(\mathbb{C}\).

In order to talk about what happens to circles and lines under Möbius transformations we need to know the equations of circles and lines. Of course, we already know the equation of a circle or line, but it turns out these can both be described by the same type of equation using complex numbers.

A circle of centre \(\alpha\) and radius \(r\) is given by the equation \[|z-\alpha|^{2}=r^{2}\] which can be rewritten as

\[r^{2}=(z-\alpha)(\bar{z}-\bar{\alpha})=z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\alpha\bar{\alpha}.\] Introducing a new parameter \(\beta=\alpha\bar{\alpha}-r^{2}\in\mathbb{R}\) we have the equation

\[z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0.\] This is the general equation of a circle, given by parameters \(\alpha\in\mathbb{C}\) and \(\beta\in\mathbb{R}\) such that \(|\alpha|^{2}-\beta=r^{2}>0\).

What about lines? Well a line can always be written as a bisector:

\[\ell=\{z\in\mathbb{C}\::\:|z-w_{1}|=|z-w_{2}|\}\] with \(w_{1}\neq w_{2}\in\mathbb{C}\). This means

\[(z-w_{1})(\bar{z}-\bar{w_{1}})=(z-w_{2})(\bar{z}-\bar{w_{2}})\] or

\[z\bar{z}-w_{1}\bar{z}-\bar{w_{1}}z+w_{1}\bar{w_{1}}=z\bar{z}-w_{2}\bar{z}-\bar{w_{2}}z+w_{2}\bar{w_{2}}.\] Rearranging this, and writing \(\alpha=w_{1}-w_{2}\neq0\) and \(\beta=w_{1}\bar{w_{1}}-w_{2}\bar{w_{2}}\in\mathbb{R}\), we have

\[-\alpha\bar{z}-\bar{\alpha}z+\beta=0.\] This is the general equation of a line in \(\mathbb{C}\), given by parameters \(\alpha\in\mathbb{C}\) and \(\beta\in\mathbb{R}\) such that \(\alpha\neq0\).

Therefore we have proved

[Equation of circles and lines in \(\mathbb{C}\)] Given \(\gamma,\beta\in\mathbb{R}\) and \(\alpha\in\mathbb{C}\), the equation \[\gamma z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0\] describes a circle if \(\gamma=1\) and \((|\alpha|^{2}-\beta)>0\), and a line if \(\gamma=0\) and \(\alpha\neq0\). Conversely, any circle or line can be described by an equation of this form.

[Proof of Proposition [prop:circlines]] Consider an arbitrary Möbius transformation \(M_{T}\) where \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right).\) We can assume by previous remarks that \(\det T=1\). If \(c=0\), then \(M_{T}\) is affine linear and is easily seen to preserve circles and lines (since rotations, dilations, and translations obviously do). So we can assume \(c\neq0\).

Note that \[\begin{aligned} M_{T}(z) & =\frac{az+b}{cz+d}=\frac{caz+cb}{c(cz+d)}=\frac{a(cz+d)}{c(cz+d)}+\frac{cb-ad}{c(cz+d)}=\frac{a}{c}-\frac{1}{c(cz+d)}=\frac{a}{c}-\left(\frac{1}{c^{2}}\right)\left(\frac{1}{z+\frac{d}{c}}\right).\end{aligned}\] Thus \(M_{T}\) is just some linear maps composed with the function \(f(z)=1/z\), called an inversion. Since linear maps preserve circles and lines, it is therefore enough to consider the action of \(f(z)=1/z\) on lines and circles. Moreover, we have \(f^{-1}(z)=1/z=f(z)\) so it is enough to consider one direction.

Let \(X\) be a circle or line. By Lemma [lem:circles-lines], \(X\) is described by an equation \(\gamma z\bar{z}-\alpha\bar{z}-\bar{\alpha}z+\beta=0\) with \(\gamma,\beta\in\mathbb{R}\). We claim that \(f(X)\) is described by a similar equation. Indeed, if \(z\in f(X)\) and \(z\neq0\), then \(f(z)=f^{-1}(z)\in X\) and this happens if and only if \[\gamma\left(\frac{1}{z}\right)\overline{\left(\frac{1}{z}\right)}-\alpha\overline{\left(\frac{1}{z}\right)}-\bar{\alpha}\left(\frac{1}{z}\right)+\beta=0\iff\beta z\bar{z}-\alpha z-\bar{\alpha}\bar{z}+\gamma=0.\]

Finally, let’s check this is the equation of a circle or a line.

First consider the case \(\beta=0\). If we started with a line we have \(\alpha\neq0\). If we started with a circle we know \(|\alpha|^{2}=|\alpha|^{2}-\beta>0\) so we have \(\alpha\neq0\) again. Therefore, the new equation describes a line.

Next, consider the case \(\beta\neq0\). Then we can divide by \(\beta\) to get \(z\bar{z}-\frac{\alpha}{\beta}z-\overline{\left(\frac{\alpha}{\beta}\right)}\bar{z}+\frac{\gamma}{\beta}=0\). This looks like the equation of a circle, but we have to check \[\begin{align} \left|\frac{\alpha}{\beta}\right|^{2}-\frac{\gamma}{\beta} & >0.\label{eq:circle-ineq}\end{align}\] If we began with a line, then \(\gamma=0\) and \(\alpha\neq0\), so this is clearly true. If we began with a circle, then \(\gamma=1\). If \(\beta<0\) then \(\eqref{eq:circle-ineq}\) is obvious, since \(\left|\frac{\alpha}{\beta}\right|^{2}\geq0\) and \(-\frac{\gamma}{\beta}>0\). If \(\beta>0\) then \(\eqref{eq:circle-ineq}\) follows from dividing \(|\alpha|^{2}-\beta>0\) by \(\beta^{2}\).

We use the term circline to refer to an object that is either a circle or a line. Hence the previous Proposition could have been stated ‘Möbius transformations preserve circlines’. The fact that Möbius transformations preserve circles and lines is very powerful when combined with:

Any three distinct non-collinear points \(z_{1},z_{2},z_{3}\in\mathbb{C}\) uniquely determine a circle in \(\mathbb{C}\) passing through those points. Any two distinct points uniquely determine a line passing through those points.

So to find out where a circle is mapped under a Möbius transformation, one simply needs to check where three points on the circle are mapped!

Finding the image of regions under Möbius transformations (example)

Find the image of the unit disc \(\mathbb{D}:=B_{1}(0)=\{w\in\mathbb{C}:|w|<1\}\) under the Möbius transformation corresponding to the matrix \(T=\left(\begin{array}{cc} 2+2i & -2-6i\\ 1 & -1-2i \end{array}\right)\). To determine the image we look at what happens to the boundary: we first find the image of the unit circle centred at the origin. Consider what happens to the four points on the unit circle, \(1,i,-1,-i\), under the map \(M_{T}(z)=\frac{(2+2i)z-(2+6i)}{z-(1+2i)}\). Under \(M_{T}\): \[\begin{aligned} 1 & \mapsto\frac{(2+2i)-(2+6i)}{1-(1+2i)}=\frac{-4i}{-2i}=2;\\ i & \mapsto\frac{(2+2i)i-(2+6i)}{i-(1+2i)}=\frac{-4-4i}{-1-i}=4;\\ -1 & \mapsto\frac{-(2+2i)-(2+6i)}{-1-(1+2i)}=\frac{-4-8i}{-2-2i}=3+i.\end{aligned}\] By Proposition [prop:circlines] the unit circle must be mapped to a circle or a line; since the three image points \(2\), \(4\), \(3+i\) are not collinear, the image of the unit circle must be the circle centred at \(z=3\) of radius \(1\). Note, the action of \(M_{T}\) is not something so simple as a rotation; for example we have \[-i\mapsto\frac{-(2+2i)i-(2+6i)}{-i-(1+2i)}=\frac{-8i}{-1-3i}=\frac{12+4i}{5},\quad\text{which is near the image of }-1\text{!}\]

What about the interior of this circle? Let’s pick a point in the unit disc and see where it is taken. The point \(z=0\) is an obvious choice. We have \[M_{T}(0)=\frac{(2+2i)0-(2+6i)}{0-(1+2i)}=\frac{2+6i}{1+2i}=\frac{(2+6i)(1-2i)}{5}=\frac{14+2i}{5},\] which lies inside the circle in question. Thus, by continuity \(M_{T}\) maps \(\mathbb{D}\) to the interior of the circle centred at \(z=3\) of radius \(1\) (it cannot map another point \(z_{0}\in\mathbb{D}\) to somewhere outside this circle, as the image of a path from \(0\) to \(z_{0}\) would have to cross the boundary circle, which is the image of the unit circle, and \(M_{T}\) is a bijection). Continuity (and the existence of an inverse) tell us the map must be onto and so \[M_{T}:\mathbb{D}\:\xrightarrow{\sim}\:\{z\in\mathbb{C}:|z-3|<1\}.\]
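[Numerical check] A quick sanity sketch of this example, sampling points on the unit circle:

```python
import cmath

def M(z):
    """The transformation from the example."""
    return ((2 + 2j) * z - (2 + 6j)) / (z - (1 + 2j))

# The unit circle lands on the circle |w - 3| = 1...
for k in range(12):
    z = cmath.exp(2j * cmath.pi * k / 12)
    assert abs(abs(M(z) - 3) - 1) < 1e-9
# ...and the interior point 0 lands inside it, at (14 + 2i)/5.
assert abs(M(0) - (14 + 2j) / 5) < 1e-12
assert abs(M(0) - 3) < 1
```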

4.4 The Riemann sphere, revisited

Since we saw earlier that we can identify \(\hat{\mathbb{C}}\) with the Riemann sphere \(S^{2}\) by stereographic projection from the north pole, we can now think of Möbius transformations as bijections from \(S^{2}\) to \(S^{2}\)! In fact, our definitions were carefully chosen so that Möbius transformations give continuous bijections from \(S^{2}\) to \(S^{2}\).

This is a very fruitful perspective that can unify our picture of Möbius transformations. I encourage you to try to think about Möbius transformations in this way whenever you can!

It turns out that the definition of holomorphicity can be extended4 to functions from \(S^{2}\) to \(S^{2}\). In fact, Möbius transformations give biholomorphic maps from \(S^{2}\) to \(S^{2}\). Even better: the Möbius transformations give all the biholomorphic maps from \(S^{2}\) to \(S^{2}\).

What do circles and lines in \(\hat{\mathbb{C}}\) correspond to in the Riemann sphere? The answer is just circles! We can now add some more to our table of correspondences in \(\eqref{eq:s2-table1}\):

\[\begin{aligned} \underline{\text{In } S^{2}} & &\underline{\text{In }\hat{\mathbb{C}}} \nonumber\\ N & \longleftrightarrow & \infty \nonumber\\ S & \longleftrightarrow & 0 \nonumber\\ & \text{Geometrically:}\nonumber\\ \text{Circle not through }N & \longleftrightarrow & \text{Circle}\nonumber\\ \text{Circle through }N & \longleftrightarrow & \text{Line}\nonumber\end{aligned}\] This also justifies our earlier convention that we think of all lines as passing through infinity.

[Advanced] One other nice fact is that stereographic projection itself is conformal! This means two tangent vectors to the sphere at the same point are mapped to two tangent vectors in \(\mathbb{C}\) (at the same point) with the same angle.

Consider the map \(f:\hat{\mathbb{C}}\to\hat{\mathbb{C}}\), \(f(z)=i\frac{z-i}{z+i}\). Using stereographic projection, we can think of this as a transformation \(\hat{f}\) of the sphere \(S^{2}\). What is it?

We have that \(\hat{f}(N)\) should correspond to \(f(\infty)=i\) , \(\hat{f}(0,1,0)\) should correspond to \(f(i)=0\) and \(\hat{f}(0,0,-1)\) should correspond to \(f(0)=\frac{1}{i}=-i\). So \(\hat{f}\) maps \(N,(0,1,0),(0,0,-1)\) to \((0,1,0),(0,0,-1),(0,-1,0)\). We might guess that \(\hat{f}\) is simply a rotation of 90 degrees about the \(x\)-axis taking the ‘back’ hemisphere to the bottom/south hemisphere, and this turns out to be correct!

To see why this is true, let’s cheat slightly and assume that the rotation that we’ve guessed corresponds to a Möbius transformation. Then the Möbius transformation is determined by what it does to three points, and we’ve already checked that our guess agrees with the map \(f\) at three points \(\infty,i\) and \(0\).

4.5 Möbius transformations preserving the upper half plane or unit disc

Notation: For a domain \(D\subset\mathbb{C}\), denote by \({\rm Mob}(D)\) the set of all Möbius transformations \(f\) such that \(f(D)=D\), i.e. \(f\) maps \(D\) to \(D\).

[H2H] Every Möbius transformation mapping \(\mathbb{H}\) to \(\mathbb{H}\) is of the form \(M_{T}\) with \(T\) in the group \[{\rm SL}_{2}(\mathbb{R}):=\{T\in{\rm Mat}_{2}(\mathbb{R}):\det T=1\}.\] Conversely, every such Möbius transformation maps \(\mathbb{H}\) to \(\mathbb{H}\), and hence gives a biholomorphism from \(\mathbb{H}\) to \(\mathbb{H}\).

In more concise terms, we have

\[f\in\mathrm{Mob}(\mathbb{H})\iff f=M_{T}\text{ with }T\in\mathrm{SL}_{2}(\mathbb{R}).\]

This gives us a group homomorphism \(\mathrm{SL}_{2}(\mathbb{R})\to\mathrm{Mob}(\mathbb{H})\) mapping \(T\mapsto M_{T}\) and hence also a group homomorphism \(\mathrm{SL}_{2}(\mathbb{R})\to\mathrm{Aut}(\mathbb{H})\).

Any Möbius transformation \(f:\mathbb{H}\to\mathbb{H}\) must map the boundary of \(\mathbb{H}\) to the boundary of \(\mathbb{H}\). Since this boundary is precisely the real line, \(f:\mathbb{R}\cup\{\infty\}\to\mathbb{R}\cup\{\infty\}\). In particular, it must take the ordered triple \((1,0,\infty)\) to \((x_{1},x_{2},x_{3})\) for some \(x_{i}\in\mathbb{R}\cup\{\infty\}\). Consider the cross-ratio; assuming each \(x_{i}\neq\infty\), and using the convention that differences involving \(\infty\) are removed, we have \[\begin{aligned} & \left(\frac{f(z)-x_{2}}{f(z)-x_{3}}\right)\left(\frac{x_{1}-x_{3}}{x_{1}-x_{2}}\right)\quad & =\quad(z,1;0,\infty)\quad=\quad\frac{z-0}{1-0}\quad=\quad z\nonumber\\ \iff & (f(z)-x_{2})(x_{1}-x_{3})\quad & =\quad z(f(z)-x_{3})(x_{1}-x_{2})\nonumber\\ \iff & f(z) & =\quad\frac{x_{3}(x_{1}-x_{2})z+x_{2}(x_{3}-x_{1})}{(x_{1}-x_{2})z+(x_{3}-x_{1})}.\nonumber\end{aligned}\] Thus, all coefficients are real, so \(T\in{\rm GL}_{2}(\mathbb{R})\).

Furthermore, if \(T=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\in{\rm GL}_{2}(\mathbb{R})\) and \(z=x+iy\), then \[\begin{aligned} \mathrm{Im}\left(M_{T}(z)\right)=\mathrm{Im}\left(\frac{az+b}{cz+d}\right) & = & \mathrm{Im}\left(\frac{(az+b)(c\bar{z}+d)}{|cz+d|^{2}}\right)\nonumber\\ & = & \mathrm{Im}\left(\frac{adz+bc\bar{z}}{|cz+d|^{2}}\right)\nonumber\\ & = & \frac{(ad-bc)y}{|cz+d|^{2}}\quad=\quad\frac{y\det T}{|cz+d|^{2}}.\nonumber\end{aligned}\] We have \(z\in\mathbb{H}\iff y>0\) so \[M_{T}(z)\in\mathbb{H}\quad\iff\quad T\in{\rm GL}_{2}(\mathbb{R}),\:\det T>0\] and so we can replace \(T\) by a real matrix of determinant 1 by scaling \(T\) by a real number.
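[Numerical check] The key identity in this proof, \(\mathrm{Im}(M_{T}(z))=y\det T/|cz+d|^{2}\), is easy to spot-check numerically (the matrix entries and test point are arbitrary choices):

```python
# Spot-check: Im(M_T(z)) = y det(T) / |cz + d|^2 for a real matrix T.
a, b, c, d = 2.0, 1.0, 3.0, 2.0          # det T = 1, so T is in SL_2(R)
z = 0.4 + 1.3j                           # a point of H (positive imaginary part)
w = (a * z + b) / (c * z + d)            # M_T(z)
expected = z.imag * (a * d - b * c) / abs(c * z + d) ** 2
assert abs(w.imag - expected) < 1e-12
assert w.imag > 0                        # so M_T maps H into H, as claimed
```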

[D2D] Every Möbius transformation from the unit disk \(\mathbb{D}\) to itself is of the form \(M_{T}\) with \(T\) in the set \[{\rm SU}(1,1):=\left\{ T=\begin{pmatrix}\alpha & \beta\\ \bar{\beta} & \bar{\alpha} \end{pmatrix}:\:\alpha,\beta\in\mathbb{C},\:\det T=|\alpha|^{2}-|\beta|^{2}=1\right\} .\] Conversely, every such Möbius transformation maps \(\mathbb{D}\) to \(\mathbb{D}\) and hence gives a biholomorphic automorphism of \(\mathbb{D}\).

In more concise terms, we have

\[f\in\mathrm{Mob}(\mathbb{D})\iff f=M_{T}\text{ with }T\in\mathrm{SU}(1,1).\]

  • We obtain a group homomorphism \[{\rm SU}(1,1)\to\mathrm{Mob}(\mathbb{D})\qquad T\mapsto M_{T}\] and hence also a group homomorphism \(\mathrm{SU}(1,1)\to\mathrm{Aut}(\mathbb{D})\).

  • [Advanced] The group \({\rm SU}(1,1)\) is not the Special Unitary Group \(\mathrm{SU}(2)\). It is actually the set of matrices \(T\) which preserve the Hermitian form \(\left<z,w\right>=\overline{z_{1}}w_{1}-\overline{z_{2}}w_{2}\) on \(\mathbb{C}^{2}\); so \(\left<Tz,Tw\right>\:=\:\left<z,w\right>\) for \(T\in{\rm SU}(1,1)\) and \(z,w\in\mathbb{C}^{2}\).

(\(\Rightarrow\)): Let \(M_{T}:\mathbb{D}\to\mathbb{D}\) be a Möbius transformation and consider the following picture: \[\begin{aligned} \mathbb{H}& \stackrel{f}{\rightarrow}& \mathbb{H}\\ \stackrel{M_C}{\downarrow}& &\stackrel{M_C}{\downarrow} \\ \mathbb{D}&\stackrel{M_T}{\rightarrow} &\mathbb{D}\end{aligned}\] where \(M_{C}\) is the Cayley map, so \(C=\left(\begin{array}{cc} 1 & -i\\ 1 & i \end{array}\right)\).

We have that \(f:=M_{C^{-1}}\circ M_{T}\circ M_{C}\) is a Möbius transformation from \(\mathbb{H}\) to \(\mathbb{H}\). By Prop [prop:H2h] (H2H) we have \(f=M_{S}\) with \(S\in{\rm SL}_{2}(\mathbb{R})\), and by Lemma [lem:Mobius-are-group] we have \(S=C^{-1}TC\). Let \(S=\left(\begin{array}{cc} a & b\\ c & d \end{array}\right)\) with \(ad-bc=1\) and \(a,b,c,d\in\mathbb{R}\). Then you can calculate that \[T\quad=\quad CSC^{-1}\quad=\quad\frac{1}{2}\begin{pmatrix}(a+d)+i(b-c) & (a-d)-i(c+b)\\ (a-d)+i(c+b) & (a+d)-i(b-c) \end{pmatrix}.\] Clearly this is of the required form. Finally, we have \[\det T=\det(CSC^{-1})=\det C\cdot\det S\cdot(\det C)^{-1}=\det S=1\] as required.

(\(\Leftarrow\)): Conversely, if \(T\in{\rm SU}(1,1)\) then the same calculation in reverse shows that the matrix \(S:=C^{-1}TC\) is in \({\rm SL}_{2}(\mathbb{R})\). Thus \(M_{S}:\mathbb{H}\to\mathbb{H}\) is a Möbius transformation by Prop [prop:H2h] (H2H), and by Lemma [lem:Mobius-are-group] the map \(M_{T}=M_{C}\circ M_{S}\circ M_{C^{-1}}\) is a Möbius transformation from \(\mathbb{D}\) to \(\mathbb{D}\).
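[Numerical check] The conjugation \(T=CSC^{-1}\) in the proof can be checked numerically: for a real \(S\) of determinant \(1\), the product lands in \({\rm SU}(1,1)\). A sketch with an arbitrary choice of \(S\):

```python
def mat_mul(S, T):
    (a, b), (c, d) = S
    (e, f), (g, h) = T
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

C = ((1, -1j), (1, 1j))                  # the Cayley matrix
C_inv = ((0.5, 0.5), (0.5j, -0.5j))      # C^{-1} = (1/2i) * ((i, i), (-1, 1))
S = ((2.0, 1.0), (3.0, 2.0))             # in SL_2(R): det S = 1

T = mat_mul(mat_mul(C, S), C_inv)        # T = C S C^{-1}
(A, B), (B2, A2) = T
assert abs(A2 - A.conjugate()) < 1e-12   # bottom-right = conj(top-left)
assert abs(B2 - B.conjugate()) < 1e-12   # bottom-left  = conj(top-right)
assert abs((A * A2 - B * B2) - 1) < 1e-12  # det T = |alpha|^2 - |beta|^2 = 1
```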

We can actually say much more about what they look like:

[D2D*] \(\quad\)

  1. Every Möbius transformation \(f\) from the unit disk \(\mathbb{D}\) to itself can be written \[f(z)\:=\:e^{i\theta}\:\frac{z-z_{0}}{\overline{z_{0}}\,z-1},\] for some angle \(\theta\) and \(z_{0}\in\mathbb{D}\) the unique point such that \(f(z_{0})=0\).

  2. All Möbius transformations of the unit disk to itself for which \(f(0)=0\) are rotations about \(0\).

The map \(g(z):=\tfrac{z-z_{0}}{\overline{z_{0}}\,z-1}\) in the above swaps \(0\) and \(z_{0}\) and is an “involution” (see Sheet 6); that is, \(g\circ g=\mathrm{Id}\). Furthermore, the map \(z\mapsto e^{i\theta}z\) is a rotation. So all Möbius transformations of the unit disk to itself are given by an involution followed by a rotation.

  1. By Proposition D2D we have \[f(z)\quad=\quad\frac{az+b}{\bar{b}z+\bar{a}}\quad=\quad\left(-\frac{a}{\bar{a}}\right)\:\left(\frac{z-\left(-\frac{b}{a}\right)}{\left(-\frac{\bar{b}}{\bar{a}}\right)z-1}\right),\] so \(z_{0}=-b/a\). Moreover, since \(\left|-\frac{a}{\bar{a}}\right|=1\), we must have \(-\frac{a}{\bar{a}}=e^{i\theta}\) for some \(\theta\in(-\pi,\pi]\). All that remains is to check that \(z_{0}\in\mathbb{D}\). Since \(|a|^{2}-|b|^{2}=1\), we have \[|z_{0}|^{2}-1\quad=\quad\left|-\frac{b}{a}\right|^{2}-1\quad=\quad\frac{|b|^{2}}{|a|^{2}}-1\quad=\quad-\frac{1}{|a|^{2}}\quad<\quad0,\] and so \(|z_{0}|<1\) as required.

  2. We have \[f(0)=0\quad\iff\quad e^{i\theta}\left(\frac{-z_{0}}{-1}\right)=0\quad\iff\quad z_{0}=0\quad\iff\quad f(z)=-e^{i\theta}z,\] and so \(f\) is a rotation.

Finding automorphisms (example)

Find a Möbius transformation \(f\) from the closed unit disc onto the closed unit disc taking \(\frac{i}{2}\) to \(0\) and \(-i\) to \(1\). Since \(\frac{i}{2}\mapsto0\), by Corollary D2D* with \(z_{0}=\frac{i}{2}\) we have \[f(z)=e^{i\theta}\left(\frac{z-\frac{i}{2}}{-\frac{i}{2}z-1}\right)=e^{i\theta}\left(\frac{2z-i}{-iz-2}\right)\] for some \(\theta\). Since \(-i\mapsto1\), we have \[1=e^{i\theta}\left(\frac{-2i-i}{-1-2}\right)\quad\iff\quad1=e^{i\theta}i\quad\iff\quad e^{i\theta}=-i.\] Thus \[f(z)=-i\left(\frac{2z-i}{-iz-2}\right)=i\left(\frac{2z-i}{iz+2}\right)=\frac{2z-i}{z-2i}\quad(\text{multiplying top and bottom by }-i).\]
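[Numerical check] A quick sanity sketch of the map just found, checking the two prescribed points and that the unit circle is preserved:

```python
import cmath

def f(z):
    """The automorphism found above: f(z) = (2z - i)/(z - 2i)."""
    return (2 * z - 1j) / (z - 2j)

assert abs(f(0.5j)) < 1e-12              # i/2 goes to 0
assert abs(f(-1j) - 1) < 1e-12           # -i goes to 1
for k in range(8):                       # the unit circle stays the unit circle
    z = cmath.exp(2j * cmath.pi * k / 8)
    assert abs(abs(f(z)) - 1) < 1e-9
```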

Finding the image of geometric shapes under Möbius transformation (example)

Let \(F\) be the geometric figure made up of a line segment from \(0\) to \(-1\), a clockwise circular arc (tracing out the unit circle) from \(-1\) to \(i\), then a line segment from \(i\) to \(0\). Find the image of \(F\) under the Cayley Map \(M_{C}(z)=\frac{z-i}{z+i}\).

Each section is a segment of a circle or a line, so must be taken to a segment of a circle or a line by Proposition [prop:circlines]. First, let us check where the three ‘vertices’ go. We have \[0\mapsto\frac{-i}{i}=-1;\quad-1\mapsto\frac{-1-i}{-1+i}=\frac{(1+i)^{2}}{2}=i;\quad i\mapsto\frac{i-i}{i+i}=0.\] Let’s look at where the line segment from \(0\) to \(i\) goes: \[\frac{i}{2}\mapsto\frac{\frac{i}{2}-i}{\frac{i}{2}+i}\,=\,\frac{-i/2}{3i/2}\,=\,-\frac{1}{3},\] so the line segment from \(0\) to \(i\) must map to the circle/line from \(-1\) to \(0\), passing through \(-\frac{1}{3}\); that is, it is the line segment from \(-1\) to \(0\) on the real axis.

Now, if we wanted we could do the same for the other two sections; pick a point on each remaining line/circular arc and see where it maps to, thus determining whether each image is a line segment or a circular arc. However, there is a much quicker method: We know Möbius transformations are conformal and so preserve angles and orientation - so tracing round the shape we must see the same angles in the image of the figure. So, travelling from \(M_{C}(0)=-1\), when we reach \(M_{C}(i)=0\) we must turn anticlockwise through an angle of \(\pi/2\) and head towards \(M_{C}(-1)=i\). Thus, the next section must simply be the straight line from \(0\) to \(i\). Similarly, the final section must be a circular arc from \(i\) back to \(-1\). Thus, the image is just \(F\). Actually, letting \(D=\{z\in\mathbb{D}:\pi/2<\mathrm{Arg}(z)<\pi\}\) be the interior of \(F\), one can check that \(D\) maps to \(D\), and so \(M_{C}\) (and therefore \(M_{C^{-1}}\)) is actually in \(\mathrm{Aut}(D)\).

4.6 Finding biholomorphic maps between domains

  1. Find a biholomorphism from \(D:=\{z\in\mathbb{D}:\mathrm{Im}(z)<0\}\) to \(\mathbb{H}\). We build the map in various stages from simpler known maps.

    \(\quad\)

    • Step 1: We know the Cayley transform \(M_{C}\) is a map from \(\mathbb{H}\) to \(\mathbb{D}\), so its inverse \(M_{C^{-1}}\) maps \(\mathbb{D}\) to \(\mathbb{H}\). This is a good thing to start with as it has the right image. To find the image of \(D\) under \(M_{C^{-1}}\) first consider how it acts on two segments of the boundary:

      • The line segment from \(-1\) to \(1\) (through \(0\)). We have \(M_{C^{-1}}(z)=\frac{iz+i}{-z+1}\) so \[M_{C^{-1}}(-1)=\frac{-i+i}{1+1}=0,\quad M_{C^{-1}}(1)=\frac{i+i}{-1+1}=\infty,\quad M_{C^{-1}}(0)=\frac{0+i}{0+1}=i.\] Thus, the line segment from \(-1\) to \(1\) (through \(0\)) is taken to the line segment from \(0\) to \(\infty\) (through \(i\)); so, the nonnegative imaginary axis. [You could also have just checked that \(-d/c=-1\) is on the line, so it must be taken to another line.]

      • The circular arc from \(-1\) to \(1\) (through \(-i\)). We have \[M_{C^{-1}}(-i)=\frac{-i^{2}+i}{i+1}=1.\] Thus, the circular arc from \(-1\) to \(1\) (through \(-i\)) is taken to the line segment from \(0\) to \(\infty\) (through \(1\)); so, the nonnegative real axis. [Instead, we could just have used conformality to deduce that this was the image - the angle and its orientation at \(z=-1\) must be preserved, so the positive real axis had to be the image.]

      Combined, this tells us that the image of \(D\) under \(M_{C^{-1}}\) is the first quadrant \(\Omega=\{w\in\mathbb{C}:0<\mathrm{Arg}(w)<\pi/2\}\) - by conformality the interior must stay on the ‘same side’ of each line segment. [You could instead explicitly check what happens to an element in \(D\), say \(z=-i/2\).]

    • Step 2: We now need a biholomorphic map from \(\Omega\) to \(\mathbb{H}\). We already know of one from earlier: the map \(g:z\mapsto z^{2}:\Omega\xrightarrow{\sim}\mathbb{H}\). Since compositions of biholomorphic maps are biholomorphic, this gives us the map we want; let \(f:=g\circ M_{C^{-1}}\) then \[f:D\:\xrightarrow{\sim}\:\mathbb{H}.\]

    We can write down the map \(f\) explicitly by composing the formulae of the functions: \[f(z)\quad=\quad g\circ M_{C^{-1}}(z)\quad=\quad g\left(\frac{iz+i}{-z+1}\right)\quad=\quad\left(\frac{iz+i}{-z+1}\right)^{2}.\]

  2. Find the image of \(\mathbb{D}\setminus\mathbb{R}_{\leq0}:=\{z\in\mathbb{D}:-\pi<\mathrm{Arg}z<\pi\}\) under the map \(h(z)=z^{1/2}\). Thus, find a biholomorphic map from \(\mathbb{D}\setminus\mathbb{R}_{\leq0}\) to the unit disc \(\mathbb{D}\).

    We have (using the principal branch) \[z=re^{i\theta}\:(\theta\in(-\pi,\pi),0<r<1)\:\iff\:z^{1/2}=\sqrt{r}e^{i\theta/2}\:\left(\frac{\theta}{2}\in(-\frac{\pi}{2},\frac{\pi}{2}),0<\sqrt{r}<1\right).\] We know the complex power function is biholomorphic on \(\mathbb{C}-\mathbb{R}_{\leq0}\), so \[h:\:\mathbb{D}\setminus\mathbb{R}_{\leq0}\:\:\xrightarrow{\sim}\:\:\{z\in\mathbb{D}:-\pi/2<\mathrm{Arg}z<\pi/2\}\] maps \(\mathbb{D}\setminus\mathbb{R}_{\leq0}\) biholomorphically to the (open) right half of the unit disc.

    We also know rotations are biholomorphic on \(\mathbb{C}\) and so \(r(z)=e^{-i\pi/2}\,z=-iz\) maps the right half of the unit disc biholomorphically to the lower half of the unit disc.

    Using the previous example \(f\) from \(1.\) we can now write down a biholomorphic map from \(D:=\{z\in\mathbb{D}:\mathrm{Im}(z)<0\}\) to \(\mathbb{H}\), and then map \(\mathbb{H}\) (via the Cayley Map) to \(\mathbb{D}\). Namely, since \(M_{C}(z)=\frac{z-i}{z+i}:\mathbb{H}\xrightarrow{\sim}\mathbb{D}\) we have \(\tilde{f}:\mathbb{D}\setminus\mathbb{R}_{\leq0}\:\xrightarrow{\sim}\:\mathbb{D}\), where \[\tilde{f}(z):\:=\:(M_{C}\circ f\circ r\circ h)(z)\:=\:\frac{\left(\frac{i(-iz^{1/2})+i}{-(-iz^{1/2})+1}\right)^{2}-i}{\left(\frac{i(-iz^{1/2})+i}{-(-iz^{1/2})+1}\right)^{2} +i}=\frac{(z^{1/2}+i)^{2}-i(iz^{1/2}+1)^{2}}{(z^{1/2}+i)^{2}+i(iz^{1/2}+1)^{2}},\] which you can simplify further if you wish.

  3. Translations are biholomorphic on \(\mathbb{C}\) so for example \[f(z)=z-i:\{w\in\mathbb{C}:\mathrm{Im}(w)-1>\mathrm{Re}(w)\}\:\:\xrightarrow{\sim}\:\:\{w\in\mathbb{C}:\mathrm{Im}(w)>\mathrm{Re}(w)\}.\] Since \[g(z)=e^{-i\frac{\pi}{4}}z=\frac{1-i}{\sqrt{2}}z:\quad\{w\in\mathbb{C}:\mathrm{Im}(w)>\mathrm{Re}(w)\}\:\xrightarrow{\sim}\:\mathbb{H}\] we have that \(\tilde{f}:=g\circ f\) satisfies \[\tilde{f}:\{w\in\mathbb{C}:\mathrm{Im}(w)-1>\mathrm{Re}(w)\}\:\xrightarrow{\sim}\:\mathbb{H}.\] Explicitly, \(\tilde{f}(z)=\left(\frac{1-i}{\sqrt{2}}\right)(z-i)\).
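The maps built in example \(1.\) can be sanity-checked numerically; the following Python sketch (illustrative only, with arbitrarily chosen random sample points) verifies that \(f=g\circ M_{C^{-1}}\) sends points of the lower half-disc \(D\) through the first quadrant into \(\mathbb{H}\):

```python
import random

def f(z):
    # f = g o M_{C^{-1}}: lower half-disc -> first quadrant -> H
    w = (1j*z + 1j) / (-z + 1)   # M_{C^{-1}}, Step 1
    return w * w                 # g(w) = w**2, Step 2

random.seed(1)
for _ in range(200):
    z = complex(random.uniform(-1, 1), random.uniform(-1, 0))
    if abs(z) < 1 and z.imag < 0:         # keep only points of D
        q = (1j*z + 1j) / (-z + 1)
        assert q.real > 0 and q.imag > 0  # Step 1: first quadrant
        assert f(z).imag > 0              # Step 2: upper half-plane H
```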

5 Notions of Convergence in complex analysis and power series

5.1 Pointwise and uniform convergence

Now we have good knowledge of basic continuous/holomorphic functions and their properties, we would like to be able to construct new continuous functions from them by taking limits.

Given a sequence \(\{f_{n}\}_{n\in\mathbb{N}}\) of functions \(f_{n}\) on a subset of a metric space (say, on a region in \(\mathbb{C}\)), we want the limit function \(f(x):=\lim_{n\to\infty}f_{n}(x)\) (if it exists) to behave ‘nicely’; i.e., we want it to carry over useful properties of the functions \(f_{n}\). One important case is that of an infinite series of functions \[\sum_{n=0}^{\infty}f_{n}:=\lim_{N\to\infty}\sum_{n=0}^{N}f_{n};\] for example, a power series \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-z_{0})^{n}}\) in \(\mathbb{C}\). First we need a concrete notion of what it means for a sequence of functions between two metric spaces to converge. Our initial naive idea is the following version:

[Pointwise convergence] Let \((X,d_{X})\) and \((Y,d_{Y})\) be two metric spaces. A sequence of functions \(\{f_{n}\}_{n\in\mathbb{N}}:X\to Y\) converges pointwise (on \(X\)) to \(f\) if for every \(x\in X\) the limit \(f(x):=\lim_{n\to\infty}f_{n}(x)\) exists in \(Y\). In other words, we have \[\underline{\forall\:x\in X\text{ and }\forall\:\epsilon>0},\text{ there exists }N\in\mathbb{N}\quad\text{such that}\quad\forall\:n>N,\quad d_{Y}(f_{n}(x),f(x))<\epsilon.\] Note that \(N\) depends on \(x\in X\).

Note that for any given \(x\in X\) we have that \(f_{n}(x)\) is just a sequence of points in \(Y\). This definition is precisely what we get by applying Definition [def:limits-in-metric-space] to the sequence \(f_{n}(x)\) at every point \(x\in X\) independently.

The problem with pointwise convergence (examples)

  1. Consider the sequence of functions \(f_{n}(x)=x^{n}\) on \([0,1]\). It is easy to see that the sequence is pointwise convergent on \([0,1]\) with limit \[f(x)=\begin{cases} 0, & \quad\text{if }x<1.\\ 1, & \quad\text{if }x=1. \end{cases}\] This is a sequence of continuous, differentiable functions on a compact set, but the limit function is not continuous!

  2. We encounter this issue again in the complex plane. Let \(f_{n}:\mathbb{C}\to\mathbb{C}:z\mapsto z^{n}\). We split into three cases.

    • Pick \(z\in\mathbb{D}\) and \(\epsilon>0\). Then, we can certainly find \(N\in\mathbb{N}\) such that \(|z|^{N}<\epsilon\) (the case \(z=0\) is trivial; otherwise take any \(N>\log\epsilon/\log|z|\)). Thus, for every \(n>N\) we have \[|f_{n}(z)-0|=|z|^{n}\leq|z|^{N}<\epsilon,\] and so \(\lim_{n\to\infty}f_{n}(z)=0\) for every \(z\in\mathbb{D}\).

    • When \(|z|=1\), the point \(f_{n}(z)=z^{n}\) rotates around the unit circle \(\partial\mathbb{D}\) by \(\mathrm{Arg}(z)\) with every iteration. For any \(z\neq1\) this sequence clearly doesn’t converge, but for \(z=1\) we have \(\lim_{n\to\infty}f_{n}(z)=\lim_{n\to\infty}1=1\).

    • When \(|z|>1\), the value \(|z|^{n}\) is unbounded, so the limit does not exist.

    To conclude, the sequence \(f_{n}(z)\) is not pointwise convergent on \(\mathbb{C}\), however, notice that it is pointwise convergent on \(\mathbb{D}\cup\{1\}\) with limit function \[f(z)=\begin{cases} 0, & \quad\text{if }z\in\mathbb{D}.\\ 1, & \quad\text{if }z=1.\ \end{cases}\] Again, this is not continuous.

  3. Was the problem that the region where the function converges wasn’t open? From the previous examples it seems like this could be the case (it was fine on the interior of balls). No! For example, the sequence of continuous functions \(f_{n}(x)=\arctan(nx)\) converges pointwise on all of \(\mathbb{R}\) to the discontinuous function \[f(x)=\begin{cases} \pi/2, & \quad\text{if }x>0.\\ 0, & \quad\text{if }x=0.\\ -\pi/2, & \quad\text{if }x<0. \end{cases}\]
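The discontinuous pointwise limits in examples \(1.\) and \(3.\) are easy to observe numerically (an illustrative Python sketch; the sample points are arbitrary):

```python
import math

# f_n(x) = x**n on [0, 1]: the limit is 0 for x < 1 but 1 at x = 1.
def f_n(x, n):
    return x ** n

print(f_n(0.99, 1000))   # tiny: the pointwise limit at x = 0.99 is 0
print(f_n(1.0, 1000))    # exactly 1 for every n

# arctan(nx) converges pointwise to a step function, even though R is open.
print(math.atan(10**6 * 0.01))    # already very close to pi/2
print(math.atan(10**6 * -0.01))   # already very close to -pi/2
```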

Key: Pointwise convergence does not preserve continuity. We need a better notion.

[Uniform convergence] We say a sequence of functions \(\{f_{n}\}_{n\in\mathbb{N}}:X\to Y\) converges uniformly (on \(X\)) to (the limit function) \(f\) if we have \[\underline{\forall\,\epsilon>0},\quad\exists\,N\in\mathbb{N}\quad\text{such that}\quad\forall\,n>N\quad d_{Y}(f_{n}(x),f(x))<\epsilon,\quad\underline{\forall\,x\in X}.\] Note that \(N\) here does not depend on the specific choice of \(x\in X\) - the same \(N\) works for all of them!

Uniform convergence trivially implies pointwise convergence (to the same limit).

The following is a generalization of a theorem from Analysis I.

[Uniform limits of continuous functions are continuous] Let \((X,d_{X})\) and \((Y,d_{Y})\) be two metric spaces and let \(\{f_{n}\}_{n\in\mathbb{N}}:X\to Y\) be a sequence of continuous functions that converges uniformly to \(f\) on \(X\). Then \(f\) is continuous on \(X\).

We will show \(f\) is continuous at any given \(x_{0}\in X\). Let \(\epsilon>0\). Since \(f_{n}\to f\) uniformly we have that \[\begin{equation} \exists\,N\in\mathbb{N}\quad\text{ such that }\quad\forall\:n>N\quad d_{Y}(f(x),f_{n}(x))<\frac{\epsilon}{3}\quad \forall\:x\in X.\label{eqn1} \end{equation}\] Pick some fixed \(n>N\). Since \(f_{n}\) is continuous at \(x_{0}\), there exists \(\delta>0\) such that \[\begin{equation} \forall\:x\text{ with }d_{X}(x,x_{0})<\delta,\quad\text{ we have }\quad d_{Y}(f_{n}(x),f_{n}(x_{0}))<\frac{\epsilon}{3}.\label{eqn2} \end{equation}\] Then, for any \(x\) with \(d_{X}(x,x_{0})<\delta\) we have \[\begin{aligned} d_{Y}(f(x),f(x_{0}))\:\stackrel{(D3)}{\leq}\: & d_{Y}(f(x),f_{n}(x))\:+\:d_{Y}(f_{n}(x),f(x_{0}))\\ \:\stackrel{(D3)}{\leq}\: & d_{Y}(f(x),f_{n}(x))\:+\:d_{Y}(f_{n}(x),f_{n}(x_{0}))\:+\:d_{Y}(f_{n}(x_{0}),f(x_{0}))\\ \:\stackrel{(\ref{eqn1})}{<}\: & \frac{\epsilon}{3}\:+\:d_{Y}(f_{n}(x),f_{n}(x_{0}))\:+\:\frac{\epsilon}{3}\\ \:\stackrel{(\ref{eqn2})}{<}\: & \frac{\epsilon}{3}\:+\:\frac{\epsilon}{3}\:+\:\frac{\epsilon}{3}\quad=\quad\epsilon.\end{aligned}\] Thus \(f\) is continuous at \(x_{0}\).

We would like to develop criteria to determine when sequences of complex valued functions converge uniformly. Here is the first, if we already know the sequence converges pointwise:

[Test for uniform convergence] Let \(f_{n}:X\to\mathbb{C}\) be a sequence of functions converging pointwise to a limit function \(f\).

  1. If \(|f_{n}(x)-f(x)|\leq s_{n}\) for every \(x\in X\), where \(\{s_{n}\}_{n\in\mathbb{N}}\) is some sequence in \(\mathbb{R}_{>0}\) (independent of \(x\)) with \(\lim_{n\to\infty}s_{n}=0\), then \(f_{n}\) converge uniformly to \(f\) on \(X\).

  2. If there exists a sequence \(x_{n}\in X\) such that \(|f_{n}(x_{n})-f(x_{n})|\geq c\) for some positive constant \(c\), then \(f_{n}\) does not converge uniformly to \(f\) on \(X\).

Very easy.

  1. Let \(\epsilon>0\). Since \(s_{n}\to0\) there exists \(N\in\mathbb{N}\) with \(s_{n}<\epsilon\) for all \(n>N\). Thus, for every \(n>N\) and every \(x\in X\) we have \(|f_{n}(x)-f(x)|\leq s_{n}<\epsilon\).

  2. The statement obviously implies the negation of uniform convergence; it states that there exists \(\epsilon\,(=c)\) such that for all sufficiently large \(n\in\mathbb{N}\) there is a point \(x\,(=x_{n})\in X\) for which \(|f_{n}(x)-f(x)|\geq\epsilon\).

Checking for uniform convergence (example)

Consider the sequences of functions \(f_{n}(z)=e^{z}+\frac{1}{n}\) and \(g_{n}(z)=e^{z}+\frac{z}{n}\). It is easy to see that both converge pointwise to the exponential function \(f(z)=e^{z}\): for every fixed \(z\in\mathbb{C}\) we have \(\lim_{n\to\infty}(e^{z}+\frac{1}{n})=e^{z}=\lim_{n\to\infty}(e^{z}+\frac{z}{n})\).

Notice that \(f_{n}\to f\) uniformly because for every \(z\in\mathbb{C}\) we have \[|f_{n}(z)-f(z)|=\left|\left(e^{z}+\frac{1}{n}\right)-e^{z}\right|=\frac{1}{n};\] so we may take \(s_{n}=1/n\) in Lemma [lem:uniform-convergence-test] part \(1.\).

However \(g_{n}\not\to f\) uniformly - the giveaway is that in writing \[|g_{n}(z)-f(z)|=\left|\left(e^{z}+\frac{z}{n}\right)-e^{z}\right|=\frac{|z|}{n};\] the difference depends on \(|z|\), and \(|z|\) is unbounded in \(\mathbb{C}\). So, we simply notice that upon taking \(z_{n}=n\) that \(|g_{n}(z_{n})-f(z_{n})|=1\) and by Lemma [lem:uniform-convergence-test] part \(2.\) there is no uniform convergence.

However, for \(\rho>0\) the sequence \(g_{n}\) does converge uniformly in any ball \(B_{\rho}(0)\) in the complex plane. For then \(|g_{n}(z)-f(z)|=|z|/n<s_{n}:=\rho/n\).
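The deviation \(|g_{n}(z)-f(z)|=|z|/n\) makes both phenomena visible numerically (illustrative Python sketch; the radius \(2\) is an arbitrary choice):

```python
# |g_n(z) - f(z)| = |z|/n: bounded by rho/n on the ball |z| <= rho,
# but equal to 1 along the witness sequence z_n = n.
def deviation(z, n):
    return abs(z) / n   # the exact value of |g_n(z) - f(z)|

for n in (10, 100, 1000):
    assert deviation(n, n) == 1.0     # witnesses non-uniformity on C
    assert deviation(2, n) <= 2 / n   # worst case on |z| <= 2 shrinks
print([deviation(2, n) for n in (10, 100, 1000)])   # tends to 0
```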
As mentioned, we are mainly interested in the convergence of infinite series.

[Weierstrass M-test] Let \(f_{n}:X\to\mathbb{C}\) be a sequence of functions such that \[|f_{n}(x)|\leq M_{n} \text{ for all }x\in X \quad \text{ and }\quad\sum_{n=1}^{\infty}M_{n}<\infty.\] Then \[\sum_{n=1}^{\infty}f_{n}\text{ converges uniformly on }X \text{ to some limit function }f:X\to\mathbb{C}.\]

Excluded - see similar proof from Analysis I. Instead, we will prove a stronger result later.

Determining the convergence of series via the M-test (example)

Show that \[\sum_{n=1}^{\infty}\frac{|2z|^{3n}}{3^{2n}\:n^{2}}\] converges uniformly on \(\overline{\mathbb{D}}\) and let \(f(z)\) be its limit function. Is \(f(z)\) continuous on \(\overline{\mathbb{D}}\)?

Note that when \(|z|\leq1\) we have \[\frac{|2z|^{3n}}{3^{2n}\:n^{2}}\:\:\leq\:\:\frac{2^{3n}}{3^{2n}\:n^{2}}\:\:=\:\:\left(\frac{8}{9}\right)^{n}\:\left(\frac{1}{n^{2}}\right)\:\:<\:\:\frac{1}{n^{2}}.\] We know \(\sum_{n\in\mathbb{N}}1/n^{2}\) converges, so taking \(M_{n}=1/n^{2}\) the Weierstrass M-test implies the series converges uniformly on \(\overline{\mathbb{D}}\) to some limit function \(f\). Furthermore, since every term of the series is continuous, so is \(f\) (by Theorem [thm:uniform-limit-cont]).
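The domination by \(M_{n}=1/n^{2}\) can be checked numerically (an illustrative Python sketch with arbitrarily chosen sample points of \(\overline{\mathbb{D}}\)):

```python
import math

# term(z, n) = |2z|^(3n) / (3^(2n) n^2), dominated by 1/n^2 when |z| <= 1
def term(z, n):
    return abs(2 * z) ** (3 * n) / (3 ** (2 * n) * n ** 2)

for z in (1, -1, 1j, 0.3 + 0.9j):       # sample points of the closed disc
    for n in range(1, 25):
        assert term(z, n) <= 1 / n ** 2  # the M-test bound holds

partial = sum(term(1, n) for n in range(1, 500))
assert partial < math.pi ** 2 / 6        # dominated by sum of 1/n^2
print(partial)
```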
Have we found our ideal definition of convergence? Indeed, we record one other crucial property of uniform convergence from Analysis I:

Assume a sequence of functions \(f_{n}:[a,b]\to~\mathbb{R}\) converge uniformly on an interval \([a,b]\) to some function \(f\), and that \(\{f_{n}\}\) are all continuous. Then \[\lim_{n\to\infty}\int_{a}^{c}f_{n}(x)\,dx=\int_{a}^{c}f(x)\,dx,\quad\quad\text{for all }c\in[a,b].\]

If the convergence is not uniform all bets are off! We will heavily use this in term 2 for contour integrals.

Everything looks good, but there is a slight issue to think about.

The problem with uniform convergence (an important example)

Let us return to the sequence of functions \(f_{n}(z)=z^{n}\). We saw that it converges pointwise on the (open) unit disc \(\mathbb{D}\) to the function \(f(z)=0\). Is this convergence uniform?

Let’s see if we can find a sequence such that \(|f_{n}(z_{n})-f(z_{n})|=c\) for some \(c>0\). For simplicity, let’s just try to find a sequence of positive real numbers: We want \[c=|f_{n}(z_{n})-f(z_{n})|=|(z_{n})^{n}-0|=(z_{n})^{n},\] so simply take \(z_{n}=c^{1/n}\). We need this sequence to be in \(\mathbb{D}\) so let’s set \(c=1/2\). Then with \(z_{n}=(1/2)^{1/n}\in\mathbb{D}\), by construction we have \[|f_{n}(z_{n})-f(z_{n})|=\left|\left(\left(\frac{1}{2}\right)^{1/n}\right)^{n}-0\right|=\left(\frac{1}{2}\right)^{n/n}=\frac{1}{2}.\] It follows from Lemma [lem:uniform-convergence-test] part \(2.\) that the convergence is not uniform. But notice that the limit function \(f(z)=0\) is trivially continuous on all of \(\mathbb{D}\) - our notion cannot even conclude that a constant limit function is continuous in this very basic example - we have in some sense been too restrictive!
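The witness sequence is easy to check numerically (illustrative Python sketch):

```python
# z_n = (1/2)**(1/n) lies in D, yet f_n(z_n) = z_n**n = 1/2 for every n,
# so sup over D of |f_n - 0| never drops below 1/2: no uniform convergence.
for n in (1, 5, 50, 500):
    z_n = 0.5 ** (1.0 / n)
    assert z_n < 1                       # the witness stays inside D
    assert abs(z_n ** n - 0.5) < 1e-9    # but f_n(z_n) stays at 1/2
print([0.5 ** (1.0 / n) for n in (1, 5, 50, 500)])   # creeping towards 1
```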
Key: Uniform convergence is actually too restrictive in terms of preservation of continuity.

5.2 Locally uniform convergence

[Locally uniform convergence] Let \(\{f_{n}\}_{n\in\mathbb{N}}\) be a sequence of functions in a metric space \(X\). We say \(\{f_{n}\}\) converges locally uniformly (on \(X\)) to (the limit function) \(f\), if for every \(x\in X\) there exists an open set \(U\subset X\) containing \(x\) on which \(\{f_{n}\}_{n\in\mathbb{N}}\) converges uniformly to \(f\).

Let \(\{f_{n}\}_{n\in\mathbb{N}}\) be a sequence of continuous functions, which converges locally uniformly on \(X\) to a limit function \(f\). Then \(f\) is continuous on \(X\).

Locally uniform convergence gives for any \(x\in X\) that the sequence converges uniformly on some open \(U\) containing \(x\). Hence \(f\) is continuous on \(U\) by Theorem [thm:uniform-limit-cont]. Hence \(f\) is continuous at \(x\). Since \(x\) was arbitrary, \(f\) is continuous.

This is in many ways the ‘right’ notion for convergence of a sequence of functions. It was championed by Weierstrass. In term 2, we will see that the limit of a locally uniformly convergent sequence of holomorphic functions is again holomorphic!!

Locally uniform but not uniform convergence (example)

Again, consider \(f_{n}(z)=z^{n}\). We know this sequence converges pointwise to \(f=0\) on the (open) unit disc \(\mathbb{D}\), but not uniformly. However, the convergence is locally uniform.

For \(w\in\mathbb{D}\), we can find \(r<1\) such that \(w\in B_{r}(0)\) (this will be our open set \(U\)). Then for all \(z\in B_{r}(0)\), we have \(|z^{n}|<r^{n}\) and \(\lim_{n\to\infty}r^{n}=0\). Hence by Lemma [lem:uniform-convergence-test] part \(1.\) (with \(s_{n}:=r^{n}\)), we have uniform convergence in \(B_{r}(0)\).

Note the ‘counter example’ \((1/2)^{1/n}\) from above fails since \(\lim_{n\to\infty}(1/2)^{1/n}=1\), so at some point the sequence \((1/2)^{1/n}\) leaves the ball \(B_{r}(0)\).
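Numerically, the uniform bound on a smaller ball is immediate (illustrative sketch; the radius \(r=0.9\) is an arbitrary choice):

```python
# On B_r(0) with r < 1 we have sup |z**n| = r**n -> 0, giving uniform
# convergence on the smaller ball even though it fails on all of D.
r = 0.9
sups = [r ** n for n in (10, 100, 1000)]
print(sups)   # rapidly decreasing

assert sups[0] > sups[1] > sups[2]
assert sups[2] < 1e-40   # 0.9**1000 is astronomically small
```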

Proving uniform convergence on every ball \(B_{r}(0)\) in \(\mathbb{D}\) (for \(0<r<1\)) is enough to prove that the limit function \(f\) is continuous on all of \(\mathbb{D}\). In fact, to show a limit function \(f\) is continuous on a set \(X\), it turns out to be enough to prove uniform convergence on all compact sets in \(X\):

[Local M-test] Let \(X\) be a metric space and let \(f_{n}:X\to\mathbb{C}\) be a sequence of continuous functions such that for any \(y\in X\), there is an open \(U\subset X\) containing \(y\) and constants \(M_{n}>0\) with \(\sum_{n=1}^{\infty}M_{n}<\infty\) and \(|f_{n}(x)|\leq M_{n}\) for all \(x\in U\). Then \(\sum_{n=1}^{\infty}f_{n}\) converges locally uniformly to a continuous function on \(X\).

[Hard but worthwhile!] If we prove \(\sum_{n=1}^{\infty}f_{n}\) converges locally uniformly on \(X\), then it converges to a continuous function on \(X\) by Theorem [thm:locally-uniform-limit-of-cont]. So we need to show that if \(F_{k}=\sum_{n=1}^{k}f_{n}\), then the sequence \(F_{k}\) converges locally uniformly on \(X\), i.e. we want to prove:

There is a function \(F:X\to\mathbb{C}\) such that for any \(y\in X\) there is an open \(U\) with \(y\in U\subset X\) and with \(F_{n}\) converging uniformly to \(F\) on \(U\).

Step 1. The sequence is pointwise convergent. Given \(y\in X\), the hypothesis implies that there are constants \(M_{n}>0\) such that \(|f_{n}(y)|\leq M_{n}\) and \(\sum_{n=1}^{\infty}M_{n}\) converges. Note that \[\begin{aligned} |F_{k}(y)| & =\left|\sum_{n=1}^{k}f_{n}(y)\right|\leq\sum_{n=1}^{k}|f_{n}(y)|\leq\sum_{n=1}^{k}M_{n}\\\end{aligned}\] Since as \(k\to\infty,\) the RHS converges, it must be bounded, say by \(L\), so for all \(k\), \(|F_{k}(y)|\leq L\) and hence the sequence \(F_{k}(y)\) is bounded. Hence it is contained in some large closed ball, which is compact by Heine-Borel, and hence there is a subsequence \(F_{k_{j}}(y)\) that converges to some value \(F(y)\). We want to prove the whole sequence also converges to \(F(y)\).

Also note that for \(k_{j}\geq k\) \[|F_{k_{j}}(y)-F_{k}(y)|=\left|\sum_{n=k+1}^{k_{j}}f_{n}(y)\right|\leq\sum_{n=k+1}^{k_{j}}|f_{n}(y)|\leq\sum_{n=k+1}^{k_{j}}M_{n}.\] Taking the limit as \(j\to\infty\), both the LHS and RHS converge, and we get \[|F(y)-F_{k}(y)|\leq\sum_{n=k+1}^{\infty}M_{n}.\] Now taking the limit as \(k\to\infty\) of both sides, the RHS goes to zero (any sequence that converges must have tails that go to 0) and we obtain \[\lim_{k\to\infty}|F(y)-F_{k}(y)|=0.\] So \(F_{k}(y)\to F(y)\) as \(k\to\infty\). Repeating this for all \(y\), we get that \(F_{k}\to F\) pointwise on \(X\). (I.e. we didn’t need the subsequences.)

Step 2. Locally uniform convergence. Given \(y\) as in the statement we want to prove, let \(U\) be the open set provided by hypothesis of the Theorem, and \(M_{n}\) be the provided constants for this \(U\).

By the same estimate as we used before, we have for all \(x\) in \(U\) (this is the key point!) and for all \(\ell>k\)

\[|F_{\ell}(x)-F_{k}(x)|\leq\sum_{n=k+1}^{\ell}M_{n}.\] Taking the limit as \(\ell\to\infty\) we obtain \(|F(x)-F_{k}(x)|\leq\sum_{n=k+1}^{\infty}M_{n}\) for all \(x\in U\). Now since \(\lim_{k\to\infty}\sum_{n=k+1}^{\infty}M_{n}=0\), the test for uniform convergence (Lemma [lem:uniform-convergence-test] part \(1.\)) tells us that \(F_{k}\to F\) uniformly on \(U\), as required.

5.3 Complex power series

A (complex) power series is an expression of the form \[\sum_{n=0}^{\infty}a_{n}(z-c)^{n}\] with \(a_{n}\) and \(c\) complex numbers. Recall that from Analysis I we have:

There are three cases:

  1. \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}\quad}\) converges only for \(z=c\). (\(R=0\))

  2. There exists \(R>0\) (radius of convergence) such that

    • \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}\quad}\) converges absolutely for \(|z-c|<R\).
      \(\quad\quad\quad\) (We call \(B_{R}(c)\) the disc of convergence.)

    • \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}\quad}\) diverges for \(|z-c|>R\).

    (Anything can happen on the circle \(|z-c|=R\)).

  3. \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}\quad}\) converges absolutely for all \(z\in\mathbb{C}\). (\(R=\infty\))

See Analysis I.

[Recap of Ratio and Root tests] The radius of convergence is usually determined using the ratio test or the root test (see Analysis I) - it is the value of \(|z-c|\) for which the limit value in the test equals \(1\).

We have the following key result:

A power series \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}}\) with radius of convergence \(0<R\leq\infty\) converges uniformly on any ball \(B_{r}(c)\) with \(0<r<R\). This implies the power series is locally uniformly convergent on its disc of convergence.

The second statement follows from the first since every \(w\in B_{R}(c)\) is contained in some \(B_{r}(c)\) with \(r<R\).

So we will prove the convergence is uniform on \(B_{r}(c)\) given \(r<R\). Consider the point \(z_{0}=c+r\). We have \(|z_{0}-c|=r<R\), so \(z_{0}\in B_{R}(c)\) and the series \({\displaystyle \sum_{n=0}^{\infty}a_{n}\,(z_{0}-c)^{n}=\sum_{n=0}^{\infty}a_{n}\,r^{n}}\) converges absolutely by Theorem [thm:recap-power-series]; in other words \({\displaystyle \sum_{n=0}^{\infty}|a_{n}|\,r^{n}<\infty}\). So, with \(M_{n}=|a_{n}|\,r^{n}\) it follows that \(|a_{n}(z-c)^{n}|\leq M_{n}\) for all \(z\in B_{r}(c)\) and \({\displaystyle \sum_{n=0}^{\infty}M_{n}<\infty}\). Thus, the conditions of the M-test (Theorem [thm:M-test]) are satisfied and the series \({\displaystyle \sum_{n=0}^{\infty}a_{n}\,(z-c)^{n}}\) converges uniformly on \(B_{r}(c)\).

Power series need not converge uniformly on the entire disc of convergence \(B_{R}(c)\). The good news is that this is not a problem: we only need locally uniform convergence to conclude continuity!

Showing a power series converges (example)

For example, consider the series \(\sum_{n=0}^{\infty}\frac{z^{n}}{n!}\). Applying the ratio test, we have \[L\:=\:\lim_{n\to\infty}\frac{|z^{n+1}|/(n+1)!}{|z^{n}|/n!}\:=\:\lim_{n\to\infty}\frac{|z|}{n+1}\:=\:0\:<\:1,\] for every \(z\in\mathbb{C}\); so \(R=\infty\) and the series converges absolutely in \(\mathbb{C}\). In fact, by Theorem [thm:A-power-series-converges-loc-un] it converges locally uniformly in the whole complex plane to a continuous function. This series looks familiar, and actually agrees with the exponential function \(e^{z}\) we’ve defined. We’ll be able to prove this later in the course.
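We can watch the partial sums approach the exponential numerically (an illustrative Python sketch, comparing against the built-in `cmath.exp`):

```python
import cmath

def exp_partial(z, N):
    # partial sum of the power series: sum_{n=0}^{N} z**n / n!
    total, term = 0j, 1 + 0j
    for n in range(N + 1):
        total += term
        term *= z / (n + 1)   # next term is z**(n+1)/(n+1)!
    return total

for z in (1, 1j, -2 + 3j):   # arbitrary sample points
    print(exp_partial(z, 40), cmath.exp(z))   # the pairs agree closely
```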

Differentiation/integration of power series

Now we have an idea of what it means for a power series to converge, and where it converges, we would like to know when a power series represents a holomorphic function. In other words, we want to know when we can differentiate (and integrate) power series. We have the following generalization of a result from Analysis I for real power series.

[Term by term differentiation or integration preserves the radius of convergence] Let \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}}\) be a power series with radius of convergence \(0<R\leq\infty\). Then the formal derivatives and anti-derivatives \[\sum_{n=1}^{\infty}na_{n}(z-c)^{n-1}\qquad\text{and}\qquad\sum_{n=0}^{\infty}\frac{a_{n}}{n+1}(z-c)^{n+1}\] have the same radius of convergence \(R\).

By ‘formal’ we mean that these series are obtained by differentiating/integrating term by term, without yet claiming that they represent the actual derivative and anti-derivative; the proposition says that both of these guesses define convergent series in the disc \(B_{R}(c)\).

So, we know the series with the expected formulae for derivative and anti-derivative both converge, but do they genuinely represent the derivative and anti-derivative of the original series?

[Power series can be differentiated term by term in their disc of convergence] Let \({\displaystyle \sum_{n=0}^{\infty}a_{n}(z-c)^{n}}\) be a power series in \(\mathbb{C}\), with radius of convergence \(0<R\leq\infty\), and let \(f:B_{R}(c)\to\mathbb{C}\) be the resulting limit function. Then \(f\) is holomorphic on \(B_{R}(c)\) with \[\begin{align} f'(z) & =\sum_{n=1}^{\infty}na_{n}(z-c)^{n-1}\label{eq:deriv-of-power-series}\end{align}\] for \(z\in B_{R}(c)\).

For simplicity, assume \(c=0\) (the case of general \(c\) is essentially the same).

We wish to show for each \(w\in B_{R}(c)\) that \({\displaystyle \lim_{z\to w}\frac{f(z)-f(w)}{z-w}}\) exists and agrees with the expression in \(\eqref{eq:deriv-of-power-series}\). Since convergence of \(f(z)\) is absolute in the disc of convergence (meaning we can reorder sums), we have \[f(z)-f(w)\:=\:\sum_{n=1}^{\infty}a_{n}\,(z^{n}-w^{n})\:=\:\sum_{n=1}^{\infty}a_{n}\,(z-w)\,q_{n}(z),\] where \(q_{n}(z)={\displaystyle \sum_{k=0}^{n-1}w^{k}\,z^{n-1-k}}\). So, for \(z\neq w\) \[\frac{f(z)-f(w)}{z-w}\:=\:\sum_{n=1}^{\infty}a_{n}\,\,q_{n}(z)\:=:\:h(z).\] Note the series makes sense even at \(w\), so we view \(h\) as being defined there too.

We claim that the series defining \(h(z)\) converges to a continuous function on \(B_{R}(0)\). We’ll prove this using the local M-test. Given \(z_{0}\in B_{R}(0)\), choose \(r<R\) such that \(w,z_{0}\in B_{r}(0)\). We need constants \(M_{n}\) for this set \(B_{r}(0)\) that control the terms \(a_{n}q_{n}(z)\) defining \(h\). We have for \(z\in B_{r}(0)\) \[\begin{aligned} |\:a_{n}\,q_{n}(z)|\:=\:\left|a_{n}\,\sum_{k=0}^{n-1}w^{k}\,z^{n-1-k}\right| & \quad\leq\quad|a_{n}|\,\sum_{k=0}^{n-1}|w|^{k}\,|z|^{n-1-k}\\ & \quad\leq\quad|a_{n}|\,\sum_{k=0}^{n-1}r^{k}\,r^{n-1-k}\quad=\quad n\,|a_{n}|\,r^{n-1}=:M_{n}.\end{aligned}\] We have \(\sum_{n=1}^{\infty}M_{n}=\sum_{n=1}^{\infty}n|a_{n}|r^{n-1}\) which converges, since by Proposition [prop:DIFF-INT-RADIUS] the series \({\displaystyle \sum_{n=1}^{\infty}na_{n}z^{n-1}}\) has radius of convergence \(R\), and so converges absolutely on \(B_{R}(0)\), in particular at the point \(r\). It follows from the Local M-test that the series defining \(h\) converges locally uniformly to a continuous function on \(B_{R}(0)\). Hence

\[\lim_{z\to w}\frac{f(z)-f(w)}{z-w}=\lim_{z\to w}h(z)=h(w)=\sum_{n=1}^{\infty}a_{n}\,\,q_{n}(w)=\:\sum_{n=1}^{\infty}a_{n}\,\sum_{k=0}^{n-1}w^{k}\,w^{n-1-k}\:=\:\sum_{n=1}^{\infty}n\,a_{n}\,\,w^{n-1}\] as required.

A power series \(f\) as in Theorem [thm:diff-int] with positive radius of convergence \(R\) can be differentiated infinitely many times and \(f^{(k)}(z):=\sum_{n=k}^{\infty}k!\binom{n}{k}a_{n}(z-c)^{n-k}\) for \(z\in B_{R}(c)\). This implies \(f^{(k)}(c)=k!\,a_{k}\).

[Power series can be integrated term by term in their disc of convergence] A power series \(f\) as in Theorem [thm:diff-int] with positive radius of convergence has a holomorphic antiderivative \(F:B_{R}(c)\to\mathbb{C}\), that is, \(F'(z)=f(z)\), and \(F\) is given by \(F(z):=\sum_{n=0}^{\infty}\frac{a_{n}}{n+1}(z-c)^{n+1}\) for \(z\in B_{R}(c)\).
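Theorem [thm:diff-int] can be illustrated numerically on the geometric series \(\sum_{n=0}^{\infty}z^{n}=\frac{1}{1-z}\): term-by-term differentiation should produce \(\frac{1}{(1-z)^{2}}\). A Python sketch (illustrative only, with an arbitrary sample point):

```python
# Partial sums of the term-by-term derivative sum_{n>=1} n z**(n-1),
# compared with the true derivative 1/(1-z)**2 of 1/(1-z), inside |z| < 1.
def deriv_partial(z, N):
    return sum(n * z ** (n - 1) for n in range(1, N + 1))

z = 0.3 + 0.4j              # |z| = 0.5, inside the disc of convergence
print(deriv_partial(z, 200))
print(1 / (1 - z) ** 2)     # the two values agree closely
```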

Convergence of familiar power series (examples)

  1. The expected power series for \(\sin(z)\), \(\cos(z)\), \(\sinh(z)\) and \(\cosh(z)\) all converge locally uniformly to continuous functions on \(\mathbb{C}\), and the derivatives/anti-derivatives match those expected. We will see later that the series genuinely do represent the functions in question.

    More generally, when can a holomorphic function be represented via a power series? What about \(\log(z)\)? It is only defined (and is holomorphic) on \(\mathbb{C}\setminus\mathbb{R}_{\leq0}\) (and is not continuous if extended to the negative real line), so certainly cannot be defined by a power series on \(\mathbb{D}\), say. Maybe it works if we change the point about which the series is expanded (i.e., change the constant \(c\) from zero). Indeed, one can show that \({\displaystyle \sum_{n=1}^{\infty}(-1)^{n+1}\frac{z^{n}}{n}}\) converges on \(\mathbb{D}\). We expect this to be the power series converging to \(\log(1+z)\), but what was so special about taking a power series about the point \(c=1\)? It turns out, nothing. We will find out that the holomorphic functions are precisely the functions that can be locally represented by power series at every point in their domain. As a result, we will see by Theorem [thm:diff-int] that every holomorphic function is infinitely many times complex differentiable!

  2. Consider the geometric series \({\displaystyle \sum_{n=0}^{\infty}z^{n}}\). By the ratio test this sum converges when \(|z|<1\), so the radius of convergence is \(R=1\) and the series converges to a continuous function on its disc of convergence \(\mathbb{D}\). To find the continuous function the series converges to we need to consider the pointwise limit. Notice that \[\sum_{n=0}^{\infty}z^{n}\:=\:\lim_{N\to\infty}\left(\sum_{n=0}^{N}z^{n}\right)\:=\:\lim_{N\to\infty}\left(\frac{1-z^{N+1}}{1-z}\right)\:=\:\frac{1}{1-z},\quad\quad(\text{for }|z|<1).\] This limit function is defined and is continuous on all of \(\mathbb{C}\setminus\{1\}\). So, in some sense the convergence of the series in the complex plane is limited to the unit disc \(\mathbb{D}\) because it can’t pass the ‘pole’ at \(z=1\).

  3. We may determine the convergence of new power series by substitution. For example, notice that \(|z|<1\iff|z^{2}|<1\iff|-z^{2}|<1\), so by the substitution \(z\mapsto z^{2}\) we have \({\displaystyle \sum_{n=0}^{\infty}z^{2n}}\) converges locally uniformly on \(\mathbb{D}\) to \(\frac{1}{1-z^{2}}\), and similarly, by the substitution \(z\mapsto-z^{2}\) we have \({\displaystyle \sum_{n=0}^{\infty}(-1)^{n}z^{2n}}\) converges locally uniformly to \(\frac{1}{1+z^{2}}\).

    These examples give us real insight into the reasons for convergence of the corresponding real power series. Consider the graph of the real function \(y=1/(1-x^{2})\). It appears obvious why its real interval of convergence is the unit interval - there are asymptotes at \(x=\pm1\) that we can’t ‘get past’ continuously. But on the other hand \(y=\frac{1}{1+x^{2}}\) is a nice smooth looking graph everywhere on the real line - so why on earth is its interval of convergence also restricted to the unit interval? We can now see the answer - the interval of convergence of the real power series is restricted by the disc of convergence of the corresponding complex power series! The issue is the poles in the complex plane at \(z=\pm i\) that we couldn’t see when considering only the real version of the function. So hidden inside the real power series of a nice continuous real function is actually some meaningful and significant complex analysis.

    Key: Complex analysis can give us new information about real functions!
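This phenomenon is easy to watch numerically. The following sketch (the helper name is ours, not from the notes) computes partial sums of \(\sum_{n\geq0}(-1)^{n}x^{2n}\): inside the unit interval they converge to \(1/(1+x^{2})\), while just outside they blow up, even though the limit function itself is perfectly smooth there.

```python
# Numerical sketch (not from the notes): partial sums of the real series
# sum_{n=0}^{N} (-1)^n x^(2n), whose limit is 1/(1 + x^2) only for |x| < 1.

def partial_sum(x, N):
    """N-th partial sum of the series sum_{n=0}^{N} (-1)^n x^(2n)."""
    return sum((-1) ** n * x ** (2 * n) for n in range(N + 1))

# Inside the interval of convergence the error is tiny.
err_inside = abs(partial_sum(0.5, 50) - 1 / (1 + 0.5 ** 2))

# Outside it the partial sums blow up, even though 1/(1 + x^2) is
# perfectly smooth at x = 1.5: the complex poles at z = i and z = -i
# are to blame.
blow_up = abs(partial_sum(1.5, 50))
```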

6 Complex integration over contours

6.1 Definition of contour integrals

Our first aim is to give a meaning to the following \[\int_{\gamma}f(z)\,dz,\] where \(f:U\rightarrow\mathbb{C}\) is a complex function on an open set \(U\), and \(\gamma\) is a curve in \(U\). This will be called the integral of \(f\) along the curve \(\gamma\).
Our first step is to consider complex-valued functions of a real variable. That is, \[f:[a,b]\rightarrow\mathbb{C}\] where \([a,b]\subset\mathbb{R}\). We note that such a function can be written as

\[f(t)=u(t)+iv(t),\,\,\forall t\in[a,b],\] where \(u,v:[a,b]\to\mathbb{R}\) are given by \(u=\mathrm{Re}(f)\) and \(v=\mathrm{Im}(f)\). Moreover we note that \(f\) is continuous if and only if both \(u\) and \(v\) are continuous. We then define \[\int_{a}^{b}f(t)\,dt:=\int_{a}^{b}u(t)\,dt+i\int_{a}^{b}v(t)\,dt\in\mathbb{C}.\] We note that \(\mathrm{Re}\left(\int_{a}^{b}f(t)\,dt\right)=\int_{a}^{b}u(t)\,dt\) and \(\mathrm{Im}\left(\int_{a}^{b}f(t)\,dt\right)=\int_{a}^{b}v(t)\,dt\).

We consider the function \(f(t)=t+it\). That is, \(u(t)=v(t)=t\). We compute \[\begin{aligned} \int_{0}^{1}f(t)\,dt & =\int_{0}^{1}(t+it)\,dt=\int_{0}^{1}t\,dt+i\int_{0}^{1}t\,dt\\ & =\left[\frac{t^{2}}{2}\right]_{0}^{1}+i\left[\frac{t^{2}}{2}\right]_{0}^{1}=\frac{1}{2}+i\frac{1}{2}.\end{aligned}\]
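Integrals of this type are easy to check numerically. The sketch below (the helper name is ours) approximates \(\int_{a}^{b}f(t)\,dt\) with a midpoint Riemann sum; Python's complex arithmetic keeps track of the real and imaginary parts separately, exactly as in the definition above.

```python
def complex_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral over [a, b] of a
    complex-valued function f of a real variable t."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

# The worked example: f(t) = t + i t on [0, 1]; exact value 1/2 + i/2.
val = complex_integral(lambda t: t + 1j * t, 0.0, 1.0)
```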

Before going on we check our definition of integrating complex valued functions of a real parameter is complex linear (as it should be!).

 [lem:linearity]

  1. Let \(f_{1}\) and \(f_{2}\) be continuous functions from \([a,b]\) to \(\mathbb{C}\). Then \(\int_{a}^{b}(f_{1}(t)+f_{2}(t))dt=\int_{a}^{b}f_{1}(t)dt+\int_{a}^{b}f_{2}(t)dt.\)

  2. For any complex number \(c\in\mathbb{C}\), and continuous function \(f:[a,b]\to\mathbb{C}\), \(\int_{a}^{b}cf(t)dt=c\int_{a}^{b}f(t)dt\).

(Simple, included for completeness). For part 1, write \(f_{j}=u_{j}+iv_{j}\) where \(u_{j}\) and \(v_{j}\) are the real and imaginary parts of \(f_{j}\). Then using the definition, and the known linearity of real integrals,

\[\begin{aligned} \int_{a}^{b}(f_{1}(t)+f_{2}(t))dt & =\int_{a}^{b}(u_{1}(t)+iv_{1}(t)+u_{2}(t)+iv_{2}(t))dt\\ & =\int_{a}^{b}(u_{1}(t)+u_{2}(t)+i(v_{1}(t)+v_{2}(t)))dt\\ & =\int_{a}^{b}(u_{1}(t)+u_{2}(t))dt+i\int_{a}^{b}(v_{1}(t)+v_{2}(t))dt\\ & =\int_{a}^{b}u_{1}(t)dt+\int_{a}^{b}u_{2}(t)dt+i\left(\int_{a}^{b}v_{1}(t)dt+\int_{a}^{b}v_{2}(t)dt\right)\\ & =\int_{a}^{b}u_{1}(t)dt+i\int_{a}^{b}v_{1}(t)dt+\int_{a}^{b}u_{2}(t)dt+i\int_{a}^{b}v_{2}(t)dt\\ & =\int_{a}^{b}f_{1}(t)dt+\int_{a}^{b}f_{2}(t)dt.\end{aligned}\]

For part 2 we write \(f=u+iv\) where \(u\) and \(v\) are real, and write \(c=x+iy\). Then using linearity of real integrals as before,

\[\begin{aligned} \int_{a}^{b}cf(t)dt & =\int_{a}^{b}(x+iy)(u(t)+iv(t))dt\\ & =\int_{a}^{b}(xu(t)-yv(t)+i(yu(t)+xv(t)))dt\\ & =\int_{a}^{b}(xu(t)-yv(t))dt+i\int_{a}^{b}(yu(t)+xv(t))dt\\ & =x\int_{a}^{b}u(t)dt-y\int_{a}^{b}v(t)dt+i\left(y\int_{a}^{b}u(t)dt+x\int_{a}^{b}v(t)dt\right)\\ & =(x+iy)\left(\int_{a}^{b}u(t)dt+i\int_{a}^{b}v(t)dt\right)\\ & =c\int_{a}^{b}f(t)dt.\end{aligned}\]

Recall from Definition [def:curves] that a smooth curve in \(\mathbb{C}\) is a continuously differentiable function \(\gamma:[0,1]\to\mathbb{C}\). In fact we can be more general and consider curves \(\gamma:[a,b]\to\mathbb{C}\). For short, we will say that such curves are \(C^1\).

Let us write \(\gamma(t)=u(t)+iv(t)\) with \(u,v:[a,b]\rightarrow\mathbb{R}\). Then the derivative \(\gamma'\) is defined as \[\gamma'(t):=u'(t)+iv'(t).\] One needs to be careful about the meaning of the smoothness condition at the end-points: there we demand that the one-sided derivative exists and that \(\gamma'\) is continuous from that side. That is, at \(t=b\) we require that \(\gamma'(b):=\lim_{h\rightarrow0^{-}}\frac{u(b+h)-u(b)}{h}+i\lim_{h\rightarrow0^{-}}\frac{v(b+h)-v(b)}{h}\) exists and \(\lim_{t\rightarrow b^{-}}\gamma'(t)=\gamma'(b)\).

Examples.

  1. An example of a \(C^{1}\) curve is \[\gamma:[0,2\pi]\rightarrow\mathbb{C},\,\,\,\gamma(\theta)=re^{i\theta},\] where \(r>0\). (Here we use the letter \(\theta\) to denote the parameter since it is more often used to denote angles.) We note that this is nothing other than the circle with centre the origin and radius \(r\). Moreover we note that as the parameter \(\theta\) runs from \(0\) to \(2\pi\) we traverse the curve in an anti-clockwise direction. It is easy to see that this is \(C^{1}\) since \[\gamma'(\theta)=(re^{i\theta})'=(r\cos(\theta)+ir\sin(\theta))'=-r\sin(\theta)+ir\cos(\theta)=rie^{i\theta}.\]

  2. Another example is given by \[\gamma:[0,2\pi]\rightarrow\mathbb{C},\,\,\,\gamma(\theta)=re^{-i\theta},\] where \(r>0\). We note that the image of this curve is exactly the same as above, namely the circle centred at the origin with radius \(r\). However, here we traverse the curve in the clockwise direction!

  3. We now define the curve \[\gamma:[0,2]\rightarrow\mathbb{C},\] \[\gamma(t):=\begin{cases} t+it, & 0\leq t\leq1,\\ t+i, & 1\leq t\leq2. \end{cases}\] It is easy to see that \(\gamma\) consists of a straight line connecting the origin to the point \(1+i\), followed by another line connecting the point \(1+i\) to the point \(2+i\). Note that this curve is not \(C^{1}\) since the derivative at the point \(t=1\) does not exist (from the left it is \(1+i\) and from the right it is \(1\)).

We are now ready to give the following definition.

Let \(U\subset\mathbb{C}\) be an open set, and let \(f:U\rightarrow\mathbb{C}\) be a continuous function. Let \(\gamma:[a,b]\rightarrow U\subset\mathbb{C}\) be a \(C^{1}\)-curve. Then we define the integral of \(f\) along the curve \(\gamma\) by \[\int_{\gamma}f(z)\,dz:=\int_{a}^{b}f(\gamma(t))\gamma'(t)\,dt.\]

Note that the function \(f\) is a function of a complex variable \(z\). Moreover, \(f(\gamma(t))\) makes sense since \(\gamma([a,b])\subset U\). Furthermore, the condition that \(\gamma\) is \(C^{1}\) guarantees the existence of \(\gamma'(t)\). Finally, the function \(g(t):=f(\gamma(t))\gamma'(t)\) is a function of the real variable \(t\), so we have already defined \(\int_{a}^{b}g(t)\,dt\) above.
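To make the definition concrete, here is a small numerical sketch (the helper name is ours). It approximates \(\int_{\gamma}f(z)\,dz=\int_{a}^{b}f(\gamma(t))\gamma'(t)\,dt\) for \(f(z)=z\) along the straight line \(\gamma(t)=t(1+i)\) from \(0\) to \(1+i\); the exact value is \((1+i)^{2}/2=i\), as a direct calculation shows.

```python
def contour_integral(f, gamma, dgamma, a, b, n=20_000):
    """Midpoint-rule approximation of int_gamma f(z) dz, where gamma
    and dgamma are the curve and its derivative on [a, b]."""
    h = (b - a) / n
    return sum(f(gamma(a + (k + 0.5) * h)) * dgamma(a + (k + 0.5) * h)
               for k in range(n)) * h

# f(z) = z along the segment gamma(t) = t(1+i), t in [0, 1].
# Exact value: (1+i)^2 / 2 = i.
val = contour_integral(lambda z: z,
                       lambda t: t * (1 + 1j),
                       lambda t: 1 + 1j,
                       0.0, 1.0)
```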

Basic Properties. We now note the following basic properties. Assume \(f\) is continuous.

  1. We have \[\int_{\gamma}(f_{1}(z)+f_{2}(z))\,dz=\int_{\gamma}f_{1}(z)\,dz+\int_{\gamma}f_{2}(z)\,dz\]

    Indeed we have by using Lemma [lem:linearity], \[\begin{aligned} \int_{a}^{b}(f_{1}(\gamma(t))+f_{2}(\gamma(t)))\gamma'(t)\,dt & =\int_{a}^{b}f_{1}(\gamma(t))\gamma'(t)\,dt+\int_{a}^{b}f_{2}(\gamma(t))\gamma'(t)\,dt\\ & =\int_{\gamma}f_{1}(z)\,dz+\int_{\gamma}f_{2}(z)\,dz.\end{aligned}\]

  2. For a \(c\in\mathbb{C}\) we have \[\int_{\gamma}cf(z)\,dz=c\int_{\gamma}f(z)\,dz.\] Indeed, \[\int_{\gamma}cf(z)\,dz:=\int_{a}^{b}cf(\gamma(t))\gamma'(t)\,dt=c\int_{a}^{b}f(\gamma(t))\gamma'(t)\,dt=c\int_{\gamma}f(z)\,dz\] where we used Lemma [lem:linearity] for the middle equality.

  3. Given \(\gamma:[a,b]\rightarrow\mathbb{C}\) we define the curve \((-\gamma):[-b,-a]\rightarrow\mathbb{C}\), by \((-\gamma)(t):=\gamma(-t)\). Then we have \[\int_{-\gamma}f(z)\,dz=-\int_{\gamma}f(z)\,dz.\] See Problem Sheet 9, Question 2.

 

  1. We compute the integral \(\int_{\gamma}dz\), where \(\gamma(\theta)=re^{i\theta}\) with \(0\leq\theta\leq2\pi\), with \(r>0\). \[\int_{\gamma}\,dz=\int_{0}^{2\pi}(re^{i\theta})'\,d\theta=\int_{0}^{2\pi}rie^{i\theta}\,d\theta=ri\int_{0}^{2\pi}e^{i\theta}d\theta=\] \[ri\left(\int_{0}^{2\pi}\cos(\theta)\,d\theta+i\int_{0}^{2\pi}\sin(\theta)\,d\theta\right)=ri(0+0)=0.\]

  2. We consider the curve \[\gamma:[-\pi/2,\pi/2]\rightarrow\mathbb{C},\,\,\,\,\gamma(\theta)=2e^{i\theta}\] and we compute the integral \(\int_{\gamma}\bar{z}\,dz.\) We have \[\int_{\gamma}\bar{z}\,dz=\int_{-\pi/2}^{\pi/2}\overline{(2e^{i\theta})}(2e^{i\theta})'\,d\theta=\int_{-\pi/2}^{\pi/2}2e^{-i\theta}2ie^{i\theta}\,d\theta=4i\int_{-\pi/2}^{\pi/2}d\theta=4\pi i.\]

In the last example we may parametrise the same curve differently. For example, we may consider \(\delta:[0,1]\rightarrow\mathbb{C}\) where \(\delta(t)=2e^{\pi i(t-1/2)}\). It is easily seen that both \(\gamma\) and \(\delta\) parametrise the very same curve, namely the semi-circle joining \(-2i\) and \(2i\) and lying entirely in the right half-plane \(\mathrm{Re}(z)\geq0\). We may then ask whether \(\int_{\gamma}f(z)\,dz=\int_{\delta}f(z)\,dz.\) It turns out that this is indeed the case, as the following lemma shows.

[Reparametrisation of curves] Let \(U\subset\mathbb{C}\) be an open set, \(f:U\to\mathbb{C}\) be continuous, and let \(\gamma:[a,b]\rightarrow U\) be a \(C^{1}\) curve. If \(\phi:[a',b']\rightarrow[a,b]\) with \(\phi(a')=a\) and \(\phi(b')=b\) is continuously differentiable and we define \[\delta:[a',b']\rightarrow\mathbb{C},\,\,\,\delta:=\gamma\circ\phi\,\,\,(\text{composition}),\] then we have \[\int_{\gamma}f(z)\,dz=\int_{\delta}f(z)\,dz.\]

We have \[\int_{\delta}f(z)\,dz=\int_{a'}^{b'}f(\delta(t))\delta'(t)\,dt\] \[=\int_{a'}^{b'}f(\gamma(\phi(t)))(\gamma(\phi(t)))'\,dt\] \[=\int_{a'}^{b'}f(\gamma(\phi(t)))\gamma'(\phi(t))\phi'(t)\,dt.\] We change the variable \(s:=\phi(t)\) so that \(ds=\phi'(t)dt\), and we use the fact that \(\phi(a')=a\) and \(\phi(b')=b\). That is, the integral above is equal to \[\int_{a}^{b}f(\gamma(s))\gamma'(s)\,ds=\int_{\gamma}f(z)\,dz.\]

In the last example above we have \(\delta=\gamma\circ\phi\) with \(\phi(t)=\pi(t-\frac{1}{2})\). The next step is to consider more general curves, and not only \(C^{1}\) curves.
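The two parametrisations of the semi-circle can be compared numerically (a sketch with our own helper; both integrals of \(\bar{z}\) should come out as \(4\pi i\), matching the example above).

```python
import cmath
import math

def contour_integral(f, gamma, dgamma, a, b, n=20_000):
    """Midpoint-rule approximation of int_gamma f(z) dz."""
    h = (b - a) / n
    return sum(f(gamma(a + (k + 0.5) * h)) * dgamma(a + (k + 0.5) * h)
               for k in range(n)) * h

f = lambda z: z.conjugate()

# gamma(theta) = 2 e^{i theta} on [-pi/2, pi/2].
I1 = contour_integral(f,
                      lambda t: 2 * cmath.exp(1j * t),
                      lambda t: 2j * cmath.exp(1j * t),
                      -math.pi / 2, math.pi / 2)

# delta(t) = 2 e^{pi i (t - 1/2)} on [0, 1], i.e. delta(t) = gamma(phi(t))
# with phi(t) = pi (t - 1/2).
I2 = contour_integral(f,
                      lambda t: 2 * cmath.exp(math.pi * 1j * (t - 0.5)),
                      lambda t: 2 * math.pi * 1j * cmath.exp(math.pi * 1j * (t - 0.5)),
                      0.0, 1.0)
```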

[Contours] Let \(\gamma:[a,b]\rightarrow\mathbb{C}\) be a curve, and suppose that there exist \(a=a_{0}<a_{1}<a_{2}<\ldots<a_{n-1}<a_{n}=b\) such that the curves \(\gamma_{i}:[a_{i-1},a_{i}]\rightarrow\mathbb{C}\), \(i=1,2,\ldots,n\) defined by \(\gamma_{i}(t):=\gamma(t)\) for \(t\in[a_{i-1},a_{i}]\) are \(C^{1}\) curves. Then we say that \(\gamma\) is a piecewise \(C^{1}\)-curve, or contour.

For a contour \(\gamma\) as above, we then define \[\int_{\gamma}f(z)\,dz=\sum_{i=1}^{n}\int_{\gamma_{i}}f(z)\,dz.\]

This type of integral is called a contour integral. It is the main object of study of this chapter and plays a central role in Complex Analysis.

[Technical] Here one should check that the above is well-defined. That is, we may find a different partition \(a=a'_{0}<a'_{1}<a'_{2}<\ldots<a'_{m-1}<a'_{m}=b\), and curves \(\gamma'_{j}\), establishing that \(\gamma\) is a contour. But then it is easy to show that \[\sum_{i=1}^{n}\int_{\gamma_{i}}f(z)\,dz=\sum_{j=1}^{m}\int_{\gamma'_{j}}f(z)\,dz.\]

We will need one more operation on curves: joining them. That is, if \(\gamma:[a,b]\rightarrow\mathbb{C}\) and \(\delta:[c,d]\rightarrow\mathbb{C}\) are two contours with \(\gamma(b)=\delta(c)\), then we define the contour \(\gamma\cup\delta:[a,b+d-c]\rightarrow\mathbb{C}\) by \[(\gamma\cup\delta)(t):=\begin{cases} \gamma(t), & a\leq t\leq b,\\ \delta(t+c-b), & b\leq t\leq b+d-c. \end{cases}\] Then it is easy to see that \[\int_{\gamma \cup \delta}f(z)\,dz=\int_{\gamma}f(z)\,dz+\int_{\delta}f(z)\,dz.\]
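For instance (a numerical sketch, helper name ours), take the two \(C^{1}\) pieces of the earlier piecewise example, \(\gamma_{1}(t)=t+it\) on \([0,1]\) and \(\gamma_{2}(t)=t+i\) on \([1,2]\), and integrate \(f(z)=z^{2}\) over each. The sum of the two piece integrals is the integral over the joined contour, and a direct calculation gives \((2+i)^{3}/3=(2+11i)/3\).

```python
def contour_integral(f, gamma, dgamma, a, b, n=20_000):
    """Midpoint-rule approximation of int_gamma f(z) dz."""
    h = (b - a) / n
    return sum(f(gamma(a + (k + 0.5) * h)) * dgamma(a + (k + 0.5) * h)
               for k in range(n)) * h

f = lambda z: z ** 2

# gamma_1(t) = t + i t on [0, 1];  gamma_2(t) = t + i on [1, 2].
I1 = contour_integral(f, lambda t: t + 1j * t, lambda t: 1 + 1j, 0.0, 1.0)
I2 = contour_integral(f, lambda t: t + 1j, lambda t: 1 + 0j, 1.0, 2.0)

# Integral over the joined contour is the sum of the piece integrals.
total = I1 + I2
```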

6.2 The Fundamental Theorem of Calculus

The Fundamental Theorem of Calculus in Analysis I tells us that we can calculate real integrals by finding an antiderivative of the quantity being integrated. There is a version of this theorem for contour integrals and, as you might expect, it is very useful in practice.

[Complex Fundamental Theorem of Calculus (FTC)] [thm:FTC]Let \(U\subset\mathbb{C}\) be an open set and let \(F:U\rightarrow\mathbb{C}\) be holomorphic with continuous derivative \(f\). Then for any contour \(\gamma:[a,b]\rightarrow U\) we have \[\int_{\gamma}f(z)\,dz=F(\gamma(b))-F(\gamma(a)).\] In particular if \(\gamma\) is closed, that is \(\gamma(a)=\gamma(b)\), then we have that \[\int_{\gamma}f(z)\,dz=0.\]

First we consider the case of \(\gamma\) being a \(C^{1}\) curve. Write \(F=u+iv\). We have

\[\int_{\gamma}F'(z)\,dz=\int_{a}^{b}F'(\gamma(t))\gamma'(t)\,dt\] \[\begin{aligned} =\int_{a}^{b}(F(\gamma(t)))'\,dt & =\int_{a}^{b}(u(\gamma(t)))'\,dt+i\int_{a}^{b}(v(\gamma(t)))'\,dt\\ & =\left[u(\gamma(t))\right]_{a}^{b}+i\left[v(\gamma(t))\right]_{a}^{b}=F(\gamma(b))-F(\gamma(a)),\end{aligned}\] where \((F(\gamma(t)))'=\frac{dF(\gamma(t))}{dt}\) by the chain rule, and for the second-to-last equality we used the Fundamental Theorem of Calculus from real analysis.

We can now extend the proof to any contour. Indeed all we need to observe is that (with notation as above) \[\int_{\gamma}F'(z)\,dz=\sum_{i=1}^{n}\int_{\gamma_{i}}F'(z)\,dz=\sum_{i=1}^{n}\left[F(\gamma(t))\right]_{a_{i-1}}^{a_{i}}=F(\gamma(a_{n}))-F(\gamma(a_{0}))=F(\gamma(b))-F(\gamma(a)).\]

 

  • [Looking ahead] Later we will prove that if a function \(F\) is holomorphic, then its derivative \(f\) is automatically continuous. So that assumption can be dropped from the FTC.

  • Note that Theorem [thm:FTC] states that if two curves \(\gamma_{1}:[a,b]\rightarrow\mathbb{C}\) and \(\gamma_{2}:[a',b']\rightarrow\mathbb{C}\) have the same endpoints, that is \(\gamma_{1}(a)=\gamma_{2}(a')\) and \(\gamma_{1}(b)=\gamma_{2}(b')\) then \[\int_{\gamma_{1}}f(z)\,dz=\int_{\gamma_{2}}f(z)\,dz\] under the same hypothesis on \(F\) as in Theorem [thm:FTC].

(Important) We now compute the integral \[\int_{|z|=r}z^{n}\,dz:=\int_{\gamma}z^{n}\,dz,\,\,\,\,n\in\mathbb{Z}\] where \(\gamma(\theta)=re^{i\theta}\), \(0\leq\theta\leq2\pi\). We first consider the case \(n\neq-1\). Then we have \[\int_{|z|=r}z^{n}\,dz=\int_{\gamma}z^{n}\,dz=\int_{\gamma}\left(\frac{z^{n+1}}{n+1}\right)'\,dz=\left[\frac{z^{n+1}}{n+1}\right]_{r}^{r}=0,\] that is, we apply the theorem with \(F(z)=\frac{z^{n+1}}{n+1}\) and take \(U=\mathbb{C}\setminus\{0\}\), which clearly contains the curve \(\gamma\).

However we cannot do the same with \(n=-1\). Indeed, a natural candidate for \(F(z)\) would be a branch of the logarithm. But we have to make sure that the open set \(U\) where this is defined includes the curve \(\gamma\). For example, the principal branch of the logarithm will not do, since we remove the negative real axis, and the contour \(\gamma\) passes through it. In fact, if we turn to the actual definition of the integral we see that \[\int_{\gamma}\frac{1}{z}\,dz=\int_{0}^{2\pi}r^{-1}e^{-i\theta}rie^{i\theta}\,d\theta=i\int_{0}^{2\pi}d\theta=2\pi i\neq0.\] Note that the integral is not zero. We can therefore conclude that there is no open set \(U\) containing \(\gamma\) on which a branch of \(\log(z)\) is well-defined: if there were, the theorem above would force the integral to be zero, which it is not!
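Both computations can be confirmed numerically (a sketch using our own helper name): the integral of \(z^{m}\) around the unit circle vanishes for every integer \(m\neq-1\), and equals \(2\pi i\) for \(m=-1\).

```python
import cmath
import math

def circle_integral(f, r=1.0, n=20_000):
    """Midpoint-rule approximation of int_{|z|=r} f(z) dz, with the
    parametrisation gamma(theta) = r e^{i theta}, 0 <= theta <= 2 pi."""
    h = 2 * math.pi / n
    total = 0.0 + 0.0j
    for k in range(n):
        t = (k + 0.5) * h
        z = r * cmath.exp(1j * t)
        total += f(z) * (1j * z) * h   # gamma'(theta) = i r e^{i theta}
    return total

# z^m around the unit circle: 0 for m != -1, and 2 pi i for m = -1.
vals = {m: circle_integral(lambda z, m=m: z ** m)
        for m in (-3, -2, -1, 0, 1, 2)}
```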

Our next step is to address the following question:

Problem:

Given a function \(f:U\rightarrow\mathbb{C}\), provide sufficient conditions such that there exists a holomorphic function \(F:U\rightarrow\mathbb{C}\) with derivative \(F'=f\). Such an \(F\) is usually called an antiderivative or primitive of \(f\). In other words, is there a converse to the second part of the Fundamental Theorem of Calculus?

For this we need to be able to estimate the modulus of a contour integral, that is, bound \(\left|\int_{\gamma}f(z)\,dz\right|\) from above. We start with a definition.

[Length of a contour] Let \(\gamma:[a,b]\rightarrow\mathbb{C}\) be a contour. We define the length of \(\gamma\) by \[L(\gamma):=\int_{a}^{b}|\gamma'(t)|\,dt.\]

We now prove

[The Estimation Lemma] [lem:length-estimate]Let \(f:U\rightarrow\mathbb{C}\) be continuous and \(\gamma:[a,b]\rightarrow U\) be a contour. Then \[\left|\int_{\gamma}f(z)\,dz\right|\leq L(\gamma)\sup_{\gamma}|f|,\] where \[\sup_{\gamma}|f|:=\sup\{|f(z)|\,:\,z\in\gamma\}.\]

We start by proving the following: given a continuous function \(g:[a,b]\rightarrow\mathbb{C}\), we have that \[\begin{equation} \left|\int_{a}^{b}g(t)\,dt\right|\leq\int_{a}^{b}|g(t)|\,dt.\label{eq:triangle-cx-int} \end{equation}\] Indeed, if we write \(\int_{a}^{b}g(t)\,dt=re^{i\theta}\), with \(r\geq0\), then we have that \[\left|\int_{a}^{b}g(t)\,dt\right|=r=\mathrm{Re}(r)=\mathrm{Re}\left(e^{-i\theta}\int_{a}^{b}g(t)\,dt\right)\] \[=\mathrm{Re}\left(\int_{a}^{b}e^{-i\theta}g(t)\,dt\right)=\int_{a}^{b}\mathrm{Re}\left(e^{-i\theta}g(t)\right)\,dt\] \[\leq\int_{a}^{b}\left|e^{-i\theta}g(t)\right|\,dt=\int_{a}^{b}\left|g(t)\right|\,dt.\] Using \(\eqref{eq:triangle-cx-int}\) we can now conclude the proof of the lemma. Indeed, \[\left|\int_{\gamma}f(z)\,dz\right|=\left|\int_{a}^{b}f(\gamma(t))\gamma'(t)\,dt\right|\leq\int_{a}^{b}\left|f(\gamma(t))\gamma'(t)\right|\,dt,\] where we used the above observation for \(g(t)=f(\gamma(t))\gamma'(t)\). But then we have that \[\int_{a}^{b}\left|f(\gamma(t))\gamma'(t)\right|\,dt\leq\sup_{\gamma}|f|\int_{a}^{b}\left|\gamma'(t)\right|\,dt=\sup_{\gamma}|f|\,L(\gamma).\]

We consider \(\gamma:[0,\pi/2]\rightarrow\mathbb{C}\) given by \(\gamma(\theta)=2e^{i\theta}\), and we find an upper bound for \[\left|\int_{\gamma}\frac{z+4}{z^{3}-1}\,dz\right|.\] We have \(L(\gamma)=\int_{0}^{\pi/2}|(2e^{i\theta})'|\,d\theta=\pi\), and for \(z\in\gamma\), that is \(|z|=2\), we have \[\left|\frac{z+4}{z^{3}-1}\right|=\frac{|z+4|}{|z^{3}-1|}\leq\frac{|z|+4}{||z|^{3}-1|}=\frac{6}{7}.\]

Here we used the triangle inequality for the numerator and the reverse triangle inequality for the denominator. Then, by Proposition [lem:length-estimate], \[\left|\int_{\gamma}\frac{z+4}{z^{3}-1}\,dz\right|\leq\frac{6\pi}{7}.\]
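We can sanity-check the bound numerically (a sketch, helper name ours): computing the integral directly along the quarter circle, its modulus should indeed come out below \(6\pi/7\approx2.69\).

```python
import cmath
import math

def contour_integral(f, gamma, dgamma, a, b, n=20_000):
    """Midpoint-rule approximation of int_gamma f(z) dz."""
    h = (b - a) / n
    return sum(f(gamma(a + (k + 0.5) * h)) * dgamma(a + (k + 0.5) * h)
               for k in range(n)) * h

f = lambda z: (z + 4) / (z ** 3 - 1)

# gamma(theta) = 2 e^{i theta} on [0, pi/2].
val = contour_integral(f,
                       lambda t: 2 * cmath.exp(1j * t),
                       lambda t: 2j * cmath.exp(1j * t),
                       0.0, math.pi / 2)

# The Estimation Lemma bound: L(gamma) * sup|f| = pi * 6/7.
bound = 6 * math.pi / 7
```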

We are now ready to answer the problem stated above. The next theorem gives a converse to the second part of the FTC.

[Converse to FTC] [thm:holo-antiderivative]Let \(f:D\rightarrow\mathbb{C}\) be continuous on a domain \(D\). If \(\int_{\gamma}f(z)\,dz=0\) for all closed contours \(\gamma\) in \(D\), then there exists a holomorphic \(F:D\rightarrow\mathbb{C}\) such that \[F'(z)=f(z).\]

We fix a point \(a_{0}\in D\). Then for any other point \(w\in D\), since \(D\) is a domain (open and connected) there is always a contour in \(D\) connecting \(a_{0}\) and \(w\), and we write \(\gamma_{w}\) for such a contour. Then we define the following function \[F(w):=\int_{\gamma_{w}}f(z)\,dz.\]

Step 1. \(F(w)\) doesn’t depend on the contour we use. We first check that \(F(w)\) does not depend on the choice of contour connecting \(a_{0}\) and \(w\). To see this, if \(\tilde{\gamma}_{w}\) is another contour from \(a_{0}\) to \(w\), then we can consider the closed contour \(C=\gamma_{w}\cup(-\tilde{\gamma}_{w})\) obtained by following \(\gamma_{w}\) and then \(\tilde{\gamma}_{w}\) in the reverse direction. Then

\[0=\int_{C}f(z)\,dz=\int_{\gamma_{w}\cup(-\tilde{\gamma}_{w})}f(z)\,dz=\int_{\gamma_{w}}f(z)\,dz-\int_{\tilde{\gamma}_{w}}f(z)\,dz\] where the first equality is from the hypothesis that integrals of \(f\) over closed contours are zero. This gives

\[\int_{\gamma_{w}}f(z)\,dz=\int_{\tilde{\gamma}_{w}}f(z)\,dz\] showing that \(F(w)\) does not depend on the choice of \(\gamma_{w}\).

Step 2. \(F\) is holomorphic with derivative \(f\). We now claim that \(F\) is holomorphic and moreover its derivative is \(f\), that is, for every \(w\in D\) we have that \(F'(w)\) exists and \(F'(w)=f(w)\), i.e.,

\[\lim_{h\rightarrow0}\frac{F(w+h)-F(w)}{h}=f(w)\,\,\,\,\,w\in D.\]

For any given \(w\in D\) we fix an \(r>0\) such that \(B_{r}(w)\subset D\). This ball exists since \(D\) is open. Then for any \(h\in\mathbb{C}\) with \(|h|<r\) we consider the straight line \(\delta_{h}\) that connects the point \(w\) to \(w+h\). A parametrisation of such a line is given by \[\delta_{h}:[0,1]\rightarrow D,\,\,\,\,t\mapsto w+th.\] We now consider the contour \(\gamma_{w}\cup\delta_{h}\). This is a contour from \(a_{0}\) to \(w+h\), contained in \(D\), so \[F(w+h)=\int_{\gamma_{w} \cup \delta_{h}}f(z)\,dz=\int_{\gamma_{w}}f(z)\,dz+\int_{\delta_{h}}f(z)\,dz=F(w)+\int_{\delta_{h}}f(z)\,dz.\] It is easily seen that \(\int_{\delta_{h}}f(w)\, dz=hf(w)\) and hence we may rewrite the above equation as \[F(w+h)=F(w)+hf(w)+\int_{\delta_{h}}(f(z)-f(w))\,dz,\] or, for \(h\neq0\), \[\frac{F(w+h)-F(w)}{h}-f(w)=\frac{1}{h}\int_{\delta_{h}}(f(z)-f(w))\,dz.\] That is, \[\left|\frac{F(w+h)-F(w)}{h}-f(w)\right|=\left|\frac{1}{h}\int_{\delta_{h}}(f(z)-f(w))\,dz\right|=\frac{1}{|h|}\left|\int_{\delta_{h}}(f(z)-f(w))\,dz\right|\] and by Lemma [lem:length-estimate] (note here it is essential that \(f\) is continuous) the above quantity is bounded by \[\frac{1}{|h|}L(\delta_{h})\sup_{z\in\delta_{h}}|f(z)-f(w)|.\] But a simple calculation shows that \(L(\delta_{h})=|h|\) and hence we have that \[\left|\frac{F(w+h)-F(w)}{h}-f(w)\right|\leq\sup_{z\in\delta_{h}}|f(z)-f(w)|.\] Now we take the limit \(h\rightarrow0\), and we observe that the curve \(\delta_{h}\) shrinks to the point \(w\). (More rigorously, every \(z\in\delta_{h}\) has \(|z-w|\leq|h|\).) Since \(f\) is continuous, we get that \(\sup_{z\in\delta_{h}}|f(z)-f(w)|\rightarrow0\) as \(h\to0\). Hence we have that \(\left|\frac{F(w+h)-F(w)}{h}-f(w)\right|\rightarrow0\) as \(h\rightarrow0\). That is,

\[\lim_{h\rightarrow0}\frac{F(w+h)-F(w)}{h}=f(w).\] Since \(w\) was any point in \(D\) we have established the theorem.

The above theorem established the existence of a holomorphic antiderivative of \(f\). However, there may be more than one such \(F\); after all, we made quite a few choices in the construction of \(F\) above. For example, we could have picked a different \(a_{0}\). However, if there is another \(\tilde{F}\) such that \(\tilde{F}'(w)=f(w)\) for all \(w\in D\), then we would have \(F'(w)=\tilde{F}'(w)\), or equivalently \((F-\tilde{F})'(w)=0\). But we know from Theorem [thm:function-with-zero-derivative-constant-on-domain] that if a function defined on a domain has zero derivative then the function is constant. That is, there is some \(c\in\mathbb{C}\) such that \(\tilde{F}=F+c\).


  1. These notes are essentially the notes of Sabine Bögli from 2021–2022, which in turn were essentially those of Michael Magee from 2018-2019, in which it says: Chapters 1-5 are an evolution of notes of Stephen Harrap, which were in turn based on original notes of Jens Funke. Chapters 6 onwards are based on notes of Thanasis Bouganis.↩︎

  2. The multiplication of complex numbers is commutative and associative. The fact that such a multiplication exists in 2 dimensions is truly remarkable: there is no such multiplication on \(\mathbb{R}^{3}\), not even if we relax the condition that it is commutative! The search for an associative multiplication on \(\mathbb{R}^{3}\) led Hamilton to discover a (non-commutative) one on \(\mathbb{R}^{4}\) instead, which was his famous discovery of the quaternions.↩︎

  3. To be precise, we would like the solution to \(e^{z}=w\) to vary as a continuous function of \(w\); see later for the precise definition of continuous function.↩︎

  4. The trick is to use the stereographic projection maps to transfer your function to \(\mathbb{C}\), and use the definition of holomorphicity there.↩︎