5  Coordinates and differentiability

5.1 Orthogonal curvilinear coordinates

We have seen that volume integrals can be evaluated by changing to a new coordinate system \(\xb(u,v,w)\). More broadly, it can be useful to define a basis for the \((u,v,w)\) coordinate system and express vector fields in this basis.

Recall that the tangent vectors in the new coordinates are given by \(\displaystyle\ddy{\xb}{u}, \ddy{\xb}{v}, \ddy{\xb}{w}\).

Each tangent vector is tangent to a curve on which the other two coordinates are constant.

Tangent vectors and coordinate surfaces.

The coordinates \((u,v,w)\) are called orthogonal curvilinear if the three tangent vectors are mutually orthogonal at every point. We can then define an orthonormal basis \(\{\eb_u, \eb_v, \eb_w\}\) by normalising: \[ \eb_u = \frac{1}{h_u}\ddy{\xb}{u}, \quad \eb_v = \frac{1}{h_v}\ddy{\xb}{v}, \quad \eb_w = \frac{1}{h_w}\ddy{\xb}{w}, \] where \[ h_u = \left|\ddy{\xb}{u}\right|, \quad h_v = \left|\ddy{\xb}{v}\right|, \quad h_w = \left|\ddy{\xb}{w}\right| \] are called the scale factors.

The scale factors tell you how the \((u,v,w)\) coordinate system is stretched in space compared to Cartesian: an infinitesimal cuboid with coordinate increments \(du\), \(dv\), \(dw\) has actual edge lengths \(h_u\,du\), \(h_v\,dv\) and \(h_w\,dw\):

Infinitesimal cube.

For Cartesian coordinates \((x,y,z)\), we have \(\xb(x,y,z) = x\eb_1 + y\eb_2 + z\eb_3\), so \[ \ddy{\xb}{x} = \eb_1, \quad \ddy{\xb}{y}=\eb_2, \quad \ddy{\xb}{z}=\eb_3. \] Clearly any pair of these tangent vectors are orthogonal. The scale factors are \[ h_x = \left|\ddy{\xb}{x}\right| = 1, \quad h_y=\left|\ddy{\xb}{y}\right| = 1, \quad h_z=\left|\ddy{\xb}{z}\right| = 1. \] (So the basis \(\{\eb_1, \eb_2, \eb_3\}\) is already orthonormal.)

For cylindrical coordinates \((r,\theta,z)\), we have \(\xb(r,\theta,z) = r\cos\theta\eb_1 + r\sin\theta\eb_2 + z\eb_3\), for \(r\geq 0\) and \(\theta\in[0,2\pi]\), so \[\begin{align*} &\ddy{\xb}{r} = \cos\theta\eb_1 + \sin\theta\eb_2,\\ &\ddy{\xb}{\theta} = -r\sin\theta\eb_1 + r\cos\theta\eb_2,\\ &\ddy{\xb}{z} = \eb_3. \end{align*}\] Check that these are orthogonal: \[\begin{align*} &\ddy{\xb}{r}\cdot\ddy{\xb}{\theta} = -r\cos\theta\sin\theta + r\sin\theta\cos\theta=0,\\ &\ddy{\xb}{r}\cdot\ddy{\xb}{z} = 0,\\ &\ddy{\xb}{\theta}\cdot\ddy{\xb}{z} = 0. \end{align*}\] The scale factors are \[\begin{align*} &h_r = \sqrt{\cos^2\theta + \sin^2\theta} = 1,\\ &h_\theta = \sqrt{r^2\cos^2\theta + r^2\sin^2\theta} = r,\\ &h_z = 1, \end{align*}\] so the unit vectors are \[\begin{align*} \eb_r &= \frac{1}{h_r}\ddy{\xb}{r} = \cos\theta\eb_1 + \sin\theta\eb_2,\\ \eb_\theta &= \frac{1}{h_\theta}\ddy{\xb}{\theta} = -\sin\theta\eb_1 + \cos\theta\eb_2,\\ \eb_z &= \frac{1}{h_z}\ddy{\xb}{z} = \eb_3. \end{align*}\]
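The orthogonality and scale-factor calculations above are easy to check symbolically. Here is a minimal sketch, assuming SymPy is available (the variable names are illustrative only):

```python
# Minimal symbolic check of the cylindrical-coordinate example (assumes SymPy).
import sympy as sp

r, theta, z = sp.symbols('r theta z', positive=True)
xb = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])   # position vector x(r, theta, z)

# Tangent vectors dx/dr, dx/dtheta, dx/dz
t_r, t_th, t_z = xb.diff(r), xb.diff(theta), xb.diff(z)

# Mutual orthogonality of the tangent vectors
print(sp.simplify(t_r.dot(t_th)), t_r.dot(t_z), t_th.dot(t_z))        # expect 0 0 0

# Scale factors h_r, h_theta, h_z
print(sp.simplify(t_r.norm()), sp.simplify(t_th.norm()), t_z.norm())  # expect 1 r 1
```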

Unit vectors in cylindrical coordinates.

Although they are mutually orthogonal at every point, the basis vectors of a general orthogonal curvilinear coordinate system are not constant: their directions vary from point to point, unlike the Cartesian basis \(\{\eb_1,\eb_2,\eb_3\}\).

Any vector field can be expressed in the new basis.

Consider the vector field \(\fb(\xb) = -y\eb_1 + x\eb_2 = -r\sin\theta\,\eb_1 + r\cos\theta\,\eb_2\), which we would like to write in the cylindrical basis. From the previous example, we have that \[\begin{align*} \eb_r &= \frac{1}{h_r}\ddy{\xb}{r} = \cos\theta\eb_1 + \sin\theta\eb_2,\\ \eb_\theta &= \frac{1}{h_\theta}\ddy{\xb}{\theta} = -\sin\theta\eb_1 + \cos\theta\eb_2,\\ \eb_z &= \frac{1}{h_z}\ddy{\xb}{z} = \eb_3. \end{align*}\] However, for this problem we need to express \(\{\eb_1,\eb_2,\eb_3\}\) in terms of \(\{\eb_r,\eb_\theta,\eb_z\}\). Solving simultaneously gives \[\begin{align*} \eb_1 &= \cos\theta\eb_r - \sin\theta\eb_\theta,\\ \eb_2 &= \sin\theta\eb_r + \cos\theta\eb_\theta,\\ \eb_3 &= \eb_z, \end{align*}\] so \[ \fb(\xb) = -r\sin\theta(\cos\theta\eb_r - \sin\theta\eb_\theta) + r\cos\theta(\sin\theta\eb_r + \cos\theta\eb_\theta) = r\eb_\theta. \]
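Since the new basis is orthonormal, the components can also be read off directly as \(f_u = \fb\cdot\eb_u\), and so on. For this field, \[ \fb\cdot\eb_r = -r\sin\theta\cos\theta + r\cos\theta\sin\theta = 0, \qquad \fb\cdot\eb_\theta = r\sin^2\theta + r\cos^2\theta = r, \qquad \fb\cdot\eb_z = 0, \] which confirms \(\fb = r\eb_\theta\) without having to invert the basis change.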

We can derive expressions for the various differential operators.

Proposition 5.1 If \((u,v,w)\) are orthogonal curvilinear coordinates, with \(\phi(\xb)\) a scalar field and \(\fb(\xb) = f_u\eb_u + f_v\eb_v + f_w\eb_w\) a vector field (so that \(f_u = \fb\cdot\eb_u\), and so on), then

\((i)\;\;\) \(\displaystyle \nabla \phi = \frac{1}{h_u}\ddy{\phi}{u}\eb_u + \frac{1}{h_v}\ddy{\phi}{v}\eb_v + \frac{1}{h_w}\ddy{\phi}{w}\eb_w\),

\((ii)\;\;\) \(\displaystyle \nabla\cdot\fb = \frac{1}{h_uh_vh_w}\left[\ddy{}{u}(h_vh_wf_u) + \ddy{}{v}(h_uh_wf_v) + \ddy{}{w}(h_uh_vf_w) \right]\),

\((iii)\;\;\) \(\displaystyle \nabla\times\fb = \frac{1}{h_uh_vh_w}\begin{vmatrix}h_u\eb_u & h_v\eb_v & h_w\eb_w\\ \displaystyle\ddy{}{u} & \displaystyle\ddy{}{v} & \displaystyle\ddy{}{w}\\ h_uf_u & h_vf_v & h_wf_w\end{vmatrix}\),

\((iv)\;\;\) \(\displaystyle \nabla^2 \phi = \frac{1}{h_uh_vh_w}\left[\ddy{}{u}\left(\frac{h_vh_w}{h_u}\ddy{\phi}{u}\right) + \ddy{}{v}\left(\frac{h_uh_w}{h_v}\ddy{\phi}{v}\right) + \ddy{}{w}\left(\frac{h_uh_v}{h_w}\ddy{\phi}{w}\right)\right].\)

Proof. \((i)\) Gradient. The \(u\)-component of \(\nabla\phi\) is \[ \eb_u\cdot\nabla \phi = \left(\frac{1}{h_u}\ddy{\xb}{u}\right)\cdot\nabla \phi = \frac{1}{h_u}\ddy{x_j}{u}\ddy{\phi}{x_j} = \frac{1}{h_u}\ddy{\phi}{u}, \] where the last step used the chain rule. The other components are similar.

\((ii)\) Divergence. This is derived the same way as our original Cartesian expression – consider an infinitesimal cube aligned with \(\eb_u\), \(\eb_v\), \(\eb_w\).

Infinitesimal cube for deriving divergence.

The volume of this cube is \[\begin{align*} |V| &= \int_0^\delta\int_0^\delta\int_0^\delta\left|\ddy{\xb}{u}\cdot\left(\ddy{\xb}{v}\times\ddy{\xb}{w}\right)\right|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\Big|\eb_u\cdot(\eb_v\times\eb_w)\Big|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\Big|\eb_u\cdot(\pm\eb_u)\Big|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\,du\,dv\,dw\\ &= \delta^3 h_uh_vh_w \quad \textrm{[for $h_u, h_v, h_w$ evaluated at some point in the cube]}. \end{align*}\] Therefore \[ \nabla\cdot\fb = \lim_{|V|\to 0}\frac{1}{|V|}\oint_S\fb\cdot\dS = \lim_{\delta\to 0}\frac{1}{h_uh_vh_w\delta^3}\sum_{i=1}^6\int_{S_i}\fb\cdot\dS. \] The integral over the right-hand face gives \[ \int_{S_1}\fb\cdot\dS = \int_0^\delta\int_0^\delta\fb\cdot\eb_u \left|\ddy{\xb}{v}\times\ddy{\xb}{w}\right|\,dv\,dw = \int_0^\delta\int_0^\delta f_u h_vh_w\,dv\,dw=\delta^2h_vh_wf_u, \] evaluated at some point \(\xb_1\) on \(S_1\). Overall, we get \[\begin{align*} \nabla\cdot\fb &= \lim_{\delta\to 0}\frac{1}{h_uh_vh_w}\left( \frac{h_vh_wf_u\Big|_{\xb_1} - h_vh_wf_u\Big|_{\xb_2}}{\delta}\right.\\ &\qquad \left. + \frac{h_uh_wf_v\Big|_{\xb_3} - h_uh_wf_v\Big|_{\xb_4}}{\delta} + \frac{h_uh_vf_w\Big|_{\xb_5} - h_uh_vf_w\Big|_{\xb_6}}{\delta}\right),\\ \end{align*}\] which gives the required expression.

\((iii)\) Curl. We can derive the \(\eb_w\) component of \(\nabla\times\fb\) by considering a square in the \(uv\)-plane:

Infinitesimal loop for deriving curl.

The area inside the loop is \[\begin{align*} |A| &= \int_0^\delta\int_0^\delta\left|\ddy{\xb}{u}\times\ddy{\xb}{v}\right|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\Big|\eb_u\times\eb_v\Big|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\Big|\pm\eb_w\Big|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\,du\,dv\\ &= \delta^2 h_uh_v \quad \textrm{[for $h_u,h_v$ evaluated at some point inside $C$]}. \end{align*}\]

So from the definition of curl, \[ \eb_w\cdot(\nabla\times\fb) = \lim_{|A|\to 0}\frac{1}{|A|}\oint_C\fb\cdot\,d\xb = \lim_{\delta\to 0}\frac{1}{\delta^2h_uh_v}\sum_{i=1}^4\int_{C_i}\fb\cdot\,d\xb. \] Evaluating the line integral along \(C_1\) gives \[\begin{align*} \int_{C_1}\fb\cdot\,d\xb &= \int_0^\delta\fb\cdot\ddy{\xb}{u}\,du\\ &= \int_0^\delta\fb\cdot(h_u\eb_u)\,du\\ &= \int_0^\delta h_u f_u\,du\\ &= \delta h_uf_u \quad \textrm{[evaluated at some point $\xb_1$ on $C_1$].} \end{align*}\] Overall, we get \[\begin{align*} \eb_w\cdot(\nabla\times\fb) &= \lim_{\delta\to 0}\frac{1}{h_uh_v}\left(\frac{h_uf_u\Big|_{\xb_1} - h_uf_u\Big|_{\xb_2}}{\delta} + \frac{h_vf_v\Big|_{\xb_3} - h_vf_v\Big|_{\xb_4}}{\delta} \right)\\ &= \frac{1}{h_uh_v}\left[\ddy{}{u}(h_vf_v) - \ddy{}{v}(h_uf_u)\right]. \end{align*}\] The \(\eb_u\) and \(\eb_v\) components are obtained analogously.

\((iv)\) Laplacian. [See Tutorial Sheet.]


As an example, take the field \(\fb = r\eb_\theta\) (that is, \(\fb(\xb) = -y\eb_1 + x\eb_2\) from before), so \(f_r = f_z = 0\) and \(f_\theta = r\). From earlier we have \(h_r=1\), \(h_\theta=r\), \(h_z = 1\), so from Proposition 5.1 \((iii)\) we have \[ \nabla\times\fb = \frac{1}{h_rh_\theta h_z}\begin{vmatrix} h_r\eb_r & h_\theta\eb_\theta & h_z\eb_z\\ \displaystyle\ddy{}{r} & \displaystyle\ddy{}{\theta} & \displaystyle\ddy{}{z}\\ h_rf_r & h_\theta f_\theta & h_z f_z \end{vmatrix} = \frac{1}{r}\begin{vmatrix} \eb_r & r\eb_\theta & \eb_z\\ \displaystyle\ddy{}{r} & \displaystyle\ddy{}{\theta} & \displaystyle\ddy{}{z}\\ 0 & r^2 & 0 \end{vmatrix} = \frac{1}{r}\ddy{}{r}(r^2)\eb_z = 2\eb_z. \] This agrees with what we found in Section 4.1 (with \(\omega=1\)).
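Substituting the same scale factors \(h_r=1\), \(h_\theta=r\), \(h_z=1\) into the other parts of Proposition 5.1 gives the standard cylindrical forms: \[ \nabla\phi = \ddy{\phi}{r}\eb_r + \frac{1}{r}\ddy{\phi}{\theta}\eb_\theta + \ddy{\phi}{z}\eb_z, \qquad \nabla\cdot\fb = \frac{1}{r}\ddy{}{r}(rf_r) + \frac{1}{r}\ddy{f_\theta}{\theta} + \ddy{f_z}{z}, \] \[ \nabla^2\phi = \frac{1}{r}\ddy{}{r}\left(r\ddy{\phi}{r}\right) + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\theta^2} + \frac{\partial^2\phi}{\partial z^2}. \] Setting all scale factors equal to \(1\) recovers the familiar Cartesian expressions.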

5.2 Differentiability

A function \(\fb:\Real^n\to\Real^m\) is differentiable at \(\ab\in\Real^n\) if there exists an \(m\times n\) matrix \(D\fb(\ab)\) called the Jacobian matrix (or derivative) such that \[ \lim_{|\hb|\to 0}\frac{\Big(\fb(\ab+\hb) - \fb(\ab) \Big) - [D\fb(\ab)]\hb}{|\hb|} = \bfzero. \]

Think of this as saying that \[ \fb(\ab+\hb) = \fb(\ab) + [D\fb(\ab)]\hb + o(|\hb|), \] where the remainder \(o(|\hb|)\) vanishes faster than \(|\hb|\). In other words, \(\fb\) is well-approximated by a linear function near to \(\ab\).

To see the form of \(D\fb(\ab)\), start with some special cases.

\((i)\;\;n=m=1\). Then \(\fb\) is a single-variable function \(f(x)\). If \(Df(a)\in\Real\) exists, it must satisfy \[ \lim_{h\to 0}\frac{\Big(f(a+h)-f(a)\Big) - hDf(a)}{h} = 0\quad \implies Df(a)=\lim_{h\to 0}\frac{f(a+h)-f(a)}{h} = f'(a). \]

\((ii)\;\;m=1\). Then \(\fb\) is a scalar field \(f(\xb)\), so \(Df(\ab)\) is a \(1\times n\) matrix (a row vector). If it exists, it must satisfy \[ [Df(\ab)]\hb = (v_1 \ldots v_n)\begin{pmatrix}h_1\\\vdots\\h_n\end{pmatrix} = v_1h_1 + v_2h_2 + \ldots + v_nh_n. \] To find the first component, \(v_1\), let \(\hb=h\eb_1\) so that \[ \lim_{h\to 0}\frac{\Big(f(\ab+h\eb_1) - f(\ab)\Big) - v_1h}{h} = 0 \quad \implies v_1 = \lim_{h\to 0}\frac{f(\ab+h\eb_1) - f(\ab)}{h} = \left.\ddy{f}{x_1}\right|_{\ab}. \] It follows that when \(f\) is a scalar field, the Jacobian matrix reduces to \[ Df(\ab) = \nabla f\Big|_{\ab} = \left(\left.\ddy{f}{x_1}\right|_{\ab} \ldots \left.\ddy{f}{x_n}\right|_{\ab} \right). \]
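For example (an arbitrary illustrative field), if \(f(x,y) = x^2y\) then \[ Df(x,y) = \begin{pmatrix}\displaystyle\ddy{f}{x} & \displaystyle\ddy{f}{y}\end{pmatrix} = \begin{pmatrix}2xy & x^2\end{pmatrix}, \] which is just \(\nabla f\) written as a row vector.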

For a general function \(\fb:\Real^n\to\Real^m\), the Jacobian matrix has the form \[ D\fb(\ab) = \begin{pmatrix} \displaystyle\left.\ddy{f_1}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_1}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_1}{x_n}\right|_{\ab}\\ \displaystyle\left.\ddy{f_2}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_2}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_2}{x_n}\right|_{\ab}\\ \vdots & \vdots & \ddots & \vdots\\ \displaystyle\left.\ddy{f_m}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_m}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_m}{x_n}\right|_{\ab} \end{pmatrix}. \]
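To make the linear-approximation interpretation concrete, here is a short numerical sketch (assuming NumPy and SymPy; the map \(\fb(x,y)=(x^2-y,\,xy)\) is an illustrative choice, not one from the notes):

```python
# Sketch: compute a Jacobian matrix symbolically and check the linear
# approximation f(a+h) ~ f(a) + [Df(a)]h numerically (assumes NumPy + SymPy).
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Matrix([x**2 - y, x*y])     # illustrative map f: R^2 -> R^2
Df = f.jacobian([x, y])            # [[2x, -1], [y, x]]

a1, a2 = 1.0, 2.0                  # the point a
h1, h2 = 1e-4, -2e-4               # a small displacement h

f_a  = np.array(f.subs({x: a1, y: a2}), dtype=float).ravel()
Df_a = np.array(Df.subs({x: a1, y: a2}), dtype=float)
f_ah = np.array(f.subs({x: a1 + h1, y: a2 + h2}), dtype=float).ravel()

h = np.array([h1, h2])
# The remainder divided by |h| should be small, and tends to 0 as |h| -> 0.
print(np.linalg.norm(f_ah - f_a - Df_a @ h) / np.linalg.norm(h))
```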

A vector field \(\fb(\xb)\) on \(\Real^3\) corresponds to \(n=m=3\). The divergence \(\nabla\cdot\fb\) is the trace of \(D\fb(\ab)\), while the curl \(\nabla\times\fb\) is built from differences of its off-diagonal entries.

Warning: except for \(n=m=1\), the fact that the partial derivatives exist does not imply that a scalar field \(f(\xb)\) is differentiable.

From the definition of partial differentiation, we have \[ \left.\ddy{f}{x}\right|_{(0,0)} = \lim_{h\to 0}\frac{f(h,0)-f(0,0)}{h} = 0, \qquad \left.\ddy{f}{y}\right|_{(0,0)} = \lim_{h\to 0}\frac{f(0,h)-f(0,0)}{h} = 0. \] So both partial derivatives exist at \((0,0)\), and \(Df(0,0) = (0,0)\). Here is a plot of \(z=f(x,y)\):

But suppose we take \(\hb\) to be some other direction, say \(\hb=h(\eb_1+\eb_2)\) with \(h>0\). Then \[ \lim_{|\hb|\to 0}\frac{\Big(f(\bfzero + \hb) - f(\bfzero)\Big)- \bfzero\cdot\hb}{|\hb|} = \lim_{h\to 0}\frac{f(\hb)}{\sqrt{2}\,h} = \lim_{h\to 0}\frac{h^{2/3}}{\sqrt{2}\,h} = \infty, \] so the limit in the definition is not \(\bfzero\). Hence \(f\) is not differentiable at \((0,0)\), even though the partial derivatives exist there.

A function \(\fb : \Real^n\to\Real^m\) is continuously differentiable (or a \(C^1\) function) at \(\ab\in\Real^n\) if all of its partial derivatives exist and are continuous at \(\ab\).

Theorem 5.1 If a function \(\fb:\Real^n\to\Real^m\) is continuously differentiable in a neighbourhood of \(\ab\in\Real^n\), then \(\fb\) is differentiable at \(\ab\).

Proof. Omitted.


We have \[ \textrm{continuous partials} \implies \textrm{differentiable} \implies \textrm{partials exist}, \] but neither of the converse statements is true.

(i) Consider the function \(f(x,y)=y|x-2|\). It is continuously differentiable wherever \(\displaystyle\ddy{f}{x}, \ddy{f}{y}\) exist and are continuous.

For \(x>2\), we have \(f(x,y)=y(x-2)\) so \(\displaystyle\ddy{f}{x}=y, \ddy{f}{y}=x-2.\)

For \(x<2\), we have \(f(x,y)=-y(x-2)\) so \(\displaystyle\ddy{f}{x}=-y, \ddy{f}{y}=-x+2.\)

For \(x=2\), we need to be more careful. We have \[ \left.\ddy{f}{x}\right|_{x=2} = \lim_{h\to 0}\frac{f(2+h,y) - f(2,y)}{h} = \lim_{h\to 0}\frac{y|h|}{h}. \] For \(y\neq 0\), the limits for \(h<0\) and \(h>0\) are different, so the limit does not exist and this partial derivative does not exist for \(x=2\), \(y\neq 0\). If \(y=0\) then the limit exists and equals \(0\). However, \(\displaystyle\ddy{f}{x}\) is not continuous at \((2,0)\): it fails to exist at every other point of the line \(x=2\), so it is not even defined on a neighbourhood of \((2,0)\).

Therefore, \(f\) is continuously differentiable everywhere except the line \(x=2\).

(ii) By Theorem 5.1, we immediately know that \(f\) is differentiable everywhere except on the line \(x=2\). For \(x=2\) and \(y\neq 0\), it cannot be differentiable because \(\nabla f\) does not exist. But it might be differentiable at \((2,0)\). Note that \[ \left.\ddy{f}{y}\right|_{(2,0)} = \lim_{h\to 0}\frac{f(2,h)-f(2,0)}{h} = 0, \quad \textrm{so} \quad Df(2,0) = \bfzero. \] Writing \(\hb=h_1\eb_1 + h_2\eb_2\), we have \[ \lim_{|\hb|\to 0}\frac{\Big(f(2\eb_1+\hb) - f(2\eb_1)\Big)-\bfzero\cdot\hb}{|\hb|} = \lim_{|\hb|\to 0}\frac{h_2|h_1|}{|\hb|}. \] But \[ \left|\frac{h_2|h_1|}{|\hb|}\right| \leq \left|\frac{|\hb|^2}{|\hb|}\right| = |\hb| \quad \implies \lim_{|\hb|\to 0}\frac{h_2|h_1|}{|\hb|} = 0. \] Therefore \(f\) is also differentiable at \((2,0)\).

5.3 Inverse functions

A function \(\fb:\Real^n\to\Real^n\) has an inverse if there exists a function \(\fb^{-1}:\Real^n\to\Real^n\) such that \[ \fb^{-1}\big(\fb(\xb)\big) = \xb \quad \textrm{and} \quad \fb\big(\fb^{-1}(\ub)\big)=\ub \quad \textrm{for all $\xb, \ub$}. \]

Note that we need \(m=n\) here.

For a linear function \(\fb(\xb)=A\xb\), where \(A\) is an \(n\times n\) matrix, the inverse exists if and only if \(A\) is invertible, in which case \(\fb^{-1}(\xb)=A^{-1}\xb\).

If \(\fb(\xb)\) is nonlinear, then finding a global inverse is hard. However, if \(\fb\) is differentiable, then \[ \fb(\ab+\hb) = \fb(\ab) + [D\fb(\ab)]\hb + o(|\hb|), \] so \(\fb\) looks like a linear function near to \(\ab\), with matrix \(D\fb(\ab)\). This suggests that we can find a (differentiable) local inverse precisely when \(D\fb(\ab)\) is invertible.

Theorem 5.2 (Inverse Function Theorem) A continuously differentiable function \(\fb:\Real^n\to\Real^n\) has a local differentiable inverse near \(\ab\) if \(\det[D\fb(\ab)]\neq 0\).

Proof. Omitted (need to justify that the higher-order terms behave themselves!). Note that the derivative of \(\fb^{-1}\) will be the inverse matrix \([D\fb(\ab)]^{-1}\).


For \(n=1\), Theorem 5.2 says that a function \(f(x)\) has a differentiable inverse at \(a\) if \(f'(a)\neq 0.\) This makes sense graphically if you recall that \(y=f^{-1}(x)\) is the reflection of \(y=f(x)\) in the line \(y=x\).

Reflection in y=x.

A continuous inverse function \(f^{-1}(x)\) exists everywhere in this example, but \(f^{-1}\) is not differentiable at the point where \(f'(x)=0\).

You may also recall from Calculus I that, wherever \(f'\big(f^{-1}(x)\big)\neq 0\), we have \(\displaystyle\frac{\mathrm{d}}{\mathrm{d}x}f^{-1}(x) = \frac{1}{f'\big(f^{-1}(x)\big)}\).
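For instance, with \(f(x)=\mathrm{e}^x\) we have \(f^{-1}(x)=\ln x\), and indeed \[ \frac{\mathrm{d}}{\mathrm{d}x}f^{-1}(x) = \frac{1}{x} = \frac{1}{\mathrm{e}^{\ln x}} = \frac{1}{f'\big(f^{-1}(x)\big)}. \]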

Consider the map \(\fb(r,\theta) = (r\cos\theta,\, r\sin\theta)\). Note that this is just the polar coordinate parametrisation, \(\xb(r,\theta)\), mapping a semi-infinite region in the \(r\theta\)-plane to the \(xy\)-plane:

The polar coordinate mapping.

The Jacobian matrix is \[ D\fb(r,\theta) = \begin{pmatrix} \displaystyle\ddy{f_1}{r} & \displaystyle\ddy{f_1}{\theta}\\ \displaystyle\ddy{f_2}{r} & \displaystyle\ddy{f_2}{\theta} \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{pmatrix}. \] To apply Theorem 5.2, calculate the determinant: \[ \det[D\fb(r,\theta)] = \begin{vmatrix} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{vmatrix} = r. \] This is non-zero for \(r\neq 0\), so Theorem 5.2 guarantees that there is a differentiable inverse mapping \((x,y)\to(r,\theta)\) everywhere except at the origin. Graphically, that is the point where the grid is singular and \(\theta\) is not defined.

Note that Theorem 5.2 doesn’t give us a formula for \(\fb^{-1}\). However, we do know its derivative: \[ D\fb^{-1}(x,y) = [D\fb(r,\theta)]^{-1} = \begin{pmatrix} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{pmatrix}^{-1} = \frac{1}{r}\begin{pmatrix} r\cos\theta & r\sin\theta\\ -\sin\theta & \cos\theta \end{pmatrix}. \] Therefore \[ \ddy{r}{x} = \cos\theta = \frac{x}{r}, \quad \ddy{r}{y} = \sin\theta = \frac{y}{r}, \quad \ddy{\theta}{x} = -\frac{\sin\theta}{r} = -\frac{y}{r^2}, \quad \ddy{\theta}{y} = \frac{\cos\theta}{r} = \frac{x}{r^2}. \] In fact, the inverse function is \(\displaystyle \fb^{-1}(x,y) = (r,\theta) = \left(\sqrt{x^2+y^2}, \arctan\left(\frac{y}{x}\right)\right)\) (valid as written for \(x>0\); in the other quadrants the appropriate branch of the inverse tangent must be taken).

[You can check that the derivatives of this function agree with those we just calculated.]
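Here is a quick symbolic check of those derivatives, a sketch assuming SymPy (\(\mathrm{atan2}\) takes care of choosing the correct angle branch):

```python
# Sketch: differentiate the explicit inverse (r, theta) = (sqrt(x^2+y^2), atan2(y, x))
# and compare with the matrix [Df(r, theta)]^{-1} computed above (assumes SymPy).
import sympy as sp

x, y = sp.symbols('x y', positive=True)
r = sp.sqrt(x**2 + y**2)
theta = sp.atan2(y, x)

Dfinv = sp.Matrix([r, theta]).jacobian([x, y])
print(sp.simplify(Dfinv))
# expect [[x/r, y/r], [-y/r**2, x/r**2]] with r = sqrt(x**2 + y**2)
```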

A function \(\fb:\Real^n\to\Real^n\) is called orientation preserving if \(\det[D\fb(\xb)]>0\), or orientation reversing if \(\det[D\fb(\xb)]<0\).

We calculated \(\det[D\fb(r,\theta)]=r>0\), so \(\fb\) is orientation-preserving. What this means is that, for example, a square in the \(r\theta\)-plane maintains the order of its vertices after the mapping:

Illustration that mapping is orientation-preserving.

Let \(U,V\) be two subsets of \(\Real^n\). A differentiable function \(\fb:U\to V\) is called a diffeomorphism if (i) it is bijective and (ii) its inverse function \(\fb^{-1}:V\to U\) is also differentiable. The domains \(U,V\) are said to be diffeomorphic.

Schematic of a diffeomorphism.

Think of a diffeomorphism as a “non-folding distortion” of a grid on \(U\). In fluid mechanics, for example, this represents the flow of a fluid – reversing the motion moves all of the fluid particles back to where they started.

Here is a diffeomorphism of the unit disk (to itself) given by \[ \fb(r,\theta)= \Big(r^2 \cos\Big[\theta + 2\pi r^3(1-r)\Big], r^2\sin\Big[\theta + 2\pi r^3(1-r)\Big]\Big). \]

A diffeomorphism of the unit disk.
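Note that on the boundary \(r=1\) the angular twist vanishes, since \(2\pi r^3(1-r)=0\) there, and \(r^2=1\), so every boundary point \((\cos\theta,\sin\theta)\) is mapped to itself.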

The set of all diffeomorphisms from a domain \(U\) to itself forms a group, denoted \(\mathrm{Diff}(U)\). Those that leave points on the boundary unchanged (as in the unit disk example just above) form a subgroup.

5.4 Implicit functions

Suppose a curve is given implicitly by \(f(x,y)=0\). When can we solve for \(y=g(x)\)?

If we assume that \(y=g(x)\), then \[\begin{align*} \frac{\mathrm{d}}{\mathrm{d}x}f\big(x,g(x)\big) = 0 \quad &\iff \ddy{f}{x} + g'(x)\ddy{f}{y} = 0 \quad \textrm{[by the Chain Rule]}\\ &\iff g'(x) = -\displaystyle\ddy{f}{x}\bigg/\displaystyle\ddy{f}{y}. \end{align*}\] So there is a problem when \(\displaystyle\ddy{f}{y}=0\).

Let \(f(x,y)=x^2+y^2-1\). Then \(\displaystyle\ddy{f}{y} = 2y\). So we can describe the circle by a single-valued, differentiable function \(y=g(x)\) only for \(y\neq 0\).

Circle.

The derivative of the implicit function is \[ g'(x) = -\ddy{f}{x}\bigg/\ddy{f}{y} = -\frac{2x}{2y} = -\frac{x}{y}. \]

[For \(y>0\) we have \(g(x)=\sqrt{1-x^2}\), whereas in \(y<0\) we have \(g(x)=-\sqrt{1-x^2}\) (and you can check that \(g'(x)=-x/y\) in either case). But there is no differentiable function \(g(x)\) that works around either \((-1,0)\) or \((1,0)\).]

This idea generalises.

Theorem 5.3 (Implicit Function Theorem) Given a continuously differentiable function \(\fb:\Real^{n+\textcolor{red}{m}}\to\Real^\textcolor{red}{m}\) with \(\xb\in\Real^n\) and \(\yb\in\Real^\textcolor{red}{m}\), solutions to \(\fb(\xb,\yb)=\bfzero\) near a point \((\xb,\yb)=\ab\) satisfying \(\fb(\ab)=\bfzero\) can be realised as an implicit function \[ \yb = \gb(\xb) \quad \textrm{if} \quad \det[D_{\yb}\fb(\ab)] \neq 0. \] Moreover, this local solution is unique and differentiable with \(D_{\xb}\gb(\ab) = -[D_\yb\fb(\ab)]^{-1}D_{\xb}\fb(\ab).\)

Note: these “partial” Jacobian matrices contain only the columns corresponding to \(\xb\) or \(\yb\) respectively – for example, \[ D_{\yb}\fb(\ab) = \begin{pmatrix} \displaystyle\left.\ddy{f_{1}}{y_1}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_{1}}{y_m}\right|_{\ab}\\ \vdots & \ddots & \vdots\\ \displaystyle\left.\ddy{f_{m}}{y_1}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_{m}}{y_m}\right|_{\ab} \end{pmatrix}. \]

Although it tells you the derivative \(D_{\xb}\gb(\ab)\), Theorem 5.3 does not tell you the value of the function \(\gb(\ab)\) itself.

Proof. Omitted.


Consider the unit sphere \(x^2+y^2+z^2=1\): when can it be described by a function \(z=g(x,y)\)? Here we have \[ \xb=(x,y), \quad \yb=(z), \quad f(\xb,\yb) = (x^2 + y^2 + z^2-1). \] So \(n=2\) and \(m=1\). Therefore \(D_{\yb}f(\ab)\) is a \(1\times 1\) matrix (a scalar!) given by \[ D_{\yb}f(\ab) = \left.\ddy{f}{z}\right|_{\ab} = 2z \quad \implies \det{D_{\yb}f(\ab)} = 2z. \] So by Theorem 5.3, we can write the equation for the sphere as \(z=g(x,y)\) provided \(z\neq 0\). (When \(z=0\), Theorem 5.3 is inconclusive.)

For \(z\neq 0\), Theorem 5.3 also gives us the derivative of \(g(x,y)\), which is a \(1\times 2\) matrix \[ D_{\xb}g(\ab) = -[D_{\yb}f(\ab)]^{-1}D_{\xb}f(\ab) \quad \implies \begin{pmatrix}\displaystyle\ddy{g}{x} & \displaystyle\ddy{g}{y}\end{pmatrix} = -\frac{1}{2z}\begin{pmatrix} 2x & 2y \end{pmatrix} = \begin{pmatrix}\displaystyle-\frac{x}{z} & \displaystyle-\frac{y}{z}\end{pmatrix}. \]
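Indeed, for \(z>0\) we can solve explicitly: \(g(x,y)=\sqrt{1-x^2-y^2}\), and differentiating directly gives \[ \ddy{g}{x} = \frac{-x}{\sqrt{1-x^2-y^2}} = -\frac{x}{z}, \qquad \ddy{g}{y} = -\frac{y}{z}, \] in agreement with the theorem (and similarly with \(g(x,y)=-\sqrt{1-x^2-y^2}\) for \(z<0\)).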

[If we put \(y=0\) then this reduces to the circle example with \(x^2+z^2=1\). ]

Consider the pair of equations \(uv^2 + v^2w^3 + u^5w^4 = 1\) and \(u^2w + u^2v^3 + v^4w^5 = -1\) near the point \((u,v,w)=(1,1,-1)\), which satisfies both equations. Since there are two equations and three unknowns, this represents some curve in \((u,v,w)\) space. Let \[ \xb = (w), \quad \yb=\begin{pmatrix}u\\ v\end{pmatrix}, \quad \fb(\xb,\yb) = \begin{pmatrix}uv^2 + v^2w^3 + u^5w^4-1\\ u^2w+u^2v^3+v^4w^5+1\end{pmatrix}, \] so \(n=1\) and \(m=2\). Then \(D_{\yb}\fb(\ab)\) is the \(2\times 2\) matrix \[ D_{\yb}\fb(\ab) = \begin{pmatrix} \displaystyle\ddy{f_1}{u} & \displaystyle\ddy{f_1}{v}\\ \displaystyle\ddy{f_2}{u} & \displaystyle\ddy{f_2}{v} \end{pmatrix}_{\ab} = \begin{pmatrix} v^2 + 5u^4w^4 & 2uv + 2vw^3\\ 2uw +2uv^3 & 3u^2v^2 + 4v^3w^5 \end{pmatrix}_{(1,1,-1)} = \begin{pmatrix} 6 & 0\\ 0 & -1 \end{pmatrix}. \] This has determinant \(\displaystyle\det[D_{\yb}\fb(\ab)] = -6 \neq 0\), so by Theorem 5.3 we can write \(\yb=\gb(\xb)\), meaning \(u=g_1(w)\) and \(v=g_2(w)\).

Although it doesn’t give us a full expression for \(\gb(\xb)\) itself, Theorem 5.3 does let us find its derivative at \(\ab\). Note that \[ D_{\xb}\fb(\ab) = \begin{pmatrix} \displaystyle\ddy{f_1}{w}\\ \displaystyle\ddy{f_2}{w}\\ \end{pmatrix}_{\ab} = \begin{pmatrix} 3v^2w^2 + 4u^5w^3\\ u^2 + 5v^4w^4 \end{pmatrix}_{(1,1,-1)} = \begin{pmatrix} -1\\ 6 \end{pmatrix}, \] so \[ D_{\xb}\gb(\ab) = -\begin{pmatrix} 6 & 0\\ 0 & -1 \end{pmatrix}^{-1}\begin{pmatrix}-1\\ 6\end{pmatrix} = \frac{1}{6}\begin{pmatrix} -1 & 0\\ 0 & 6 \end{pmatrix}\begin{pmatrix}-1\\ 6\end{pmatrix} = \begin{pmatrix} \frac16\\ 6 \end{pmatrix}. \] This means that for \((u,v,w)\) near to \((1,1,-1)\) we have \[ \begin{pmatrix} u\\ v \end{pmatrix} \approx\begin{pmatrix} g_1(-1) + \frac16\big(w-(-1)\big)\\ g_2(-1) + 6\big(w-(-1)\big) \end{pmatrix} = \begin{pmatrix} 1 + \frac16(w+1)\\ 1 + 6(w+1) \end{pmatrix}. \]
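We can sanity-check this linearisation numerically. Here is a sketch assuming SymPy's nonlinear solver nsolve; the step \(w+1=0.01\) is an arbitrary illustrative choice:

```python
# Sketch: solve the two equations numerically for (u, v) at w slightly away from -1
# and compare with the linearisation u ~ 1 + (w+1)/6, v ~ 1 + 6(w+1). Assumes SymPy.
import sympy as sp

u, v, w = sp.symbols('u v w')
f1 = u*v**2 + v**2*w**3 + u**5*w**4 - 1
f2 = u**2*w + u**2*v**3 + v**4*w**5 + 1

w0 = -1 + 0.01                       # a point near w = -1
sol = sp.nsolve((f1.subs(w, w0), f2.subs(w, w0)), (u, v), (1, 1))
print(sol)                           # numerical (u, v) on the curve
print(1 + 0.01/6, 1 + 6*0.01)        # linearised prediction: ~(1.00167, 1.06)
```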

The idea of linearisation – approximating a nonlinear function by a linear function using its derivative – will be invaluable in future courses, e.g. Mathematical Biology III or Dynamical Systems III (among others).