$$ \def\ab{\boldsymbol{a}} \def\bb{\boldsymbol{b}} \def\cb{\boldsymbol{c}} \def\db{\boldsymbol{d}} \def\eb{\boldsymbol{e}} \def\fb{\boldsymbol{f}} \def\gb{\boldsymbol{g}} \def\hb{\boldsymbol{h}} \def\kb{\boldsymbol{k}} \def\nb{\boldsymbol{n}} \def\tb{\boldsymbol{t}} \def\ub{\boldsymbol{u}} \def\vb{\boldsymbol{v}} \def\xb{\boldsymbol{x}} \def\yb{\boldsymbol{y}} \def\Ab{\boldsymbol{A}} \def\Bb{\boldsymbol{B}} \def\Cb{\boldsymbol{C}} \def\Eb{\boldsymbol{E}} \def\Fb{\boldsymbol{F}} \def\Jb{\boldsymbol{J}} \def\Lb{\boldsymbol{L}} \def\Rb{\boldsymbol{R}} \def\Ub{\boldsymbol{U}} \def\xib{\boldsymbol{\xi}} \def\evx{\boldsymbol{e}_x} \def\evy{\boldsymbol{e}_y} \def\evz{\boldsymbol{e}_z} \def\evr{\boldsymbol{e}_r} \def\evt{\boldsymbol{e}_\theta} \def\evp{\boldsymbol{e}_r} \def\evf{\boldsymbol{e}_\phi} \def\evb{\boldsymbol{e}_\parallel} \def\omb{\boldsymbol{\omega}} \def\dA{\;d\Ab} \def\dS{\;d\boldsymbol{S}} \def\dV{\;dV} \def\dl{\mathrm{d}\boldsymbol{l}} \def\bfzero{\boldsymbol{0}} \def\Rey{\mathrm{Re}} \def\Real{\mathbb{R}} \newcommand{\dds}[1]{\frac{d{#1}}{ds}} \newcommand{\ddy}[2]{\frac{\partial{#1}}{\partial{#2}}} \newcommand{\ddt}[1]{\frac{d{#1}}{dt}} \newcommand{\DDt}[1]{\frac{\mathrm{D}{#1}}{\mathrm{D}t}} $$
5.1 Orthogonal curvilinear coordinates
We have seen that volume integrals can be evaluated by changing to a new coordinate system \(\xb(u,v,w)\). More broadly, it can be useful to define a basis for the \((u,v,w)\) coordinate system and express vector fields in this basis.
Recall that the tangent vectors in the new coordinates are given by \(\displaystyle\ddy{\xb}{u}, \ddy{\xb}{v}, \ddy{\xb}{w}\).
Each tangent vector is tangent to a curve on which the other two coordinates are constant.
The coordinates \((u,v,w)\) are called orthogonal curvilinear if the three tangent vectors are mutually orthogonal at every point. We can then define an orthonormal basis \(\{\eb_u, \eb_v, \eb_w\}\) by normalising: \[ \eb_u = \frac{1}{h_u}\ddy{\xb}{u}, \quad \eb_v = \frac{1}{h_v}\ddy{\xb}{v}, \quad \eb_w = \frac{1}{h_w}\ddy{\xb}{w}, \] where \[ h_u = \left|\ddy{\xb}{u}\right|, \quad h_v = \left|\ddy{\xb}{v}\right|, \quad h_w = \left|\ddy{\xb}{w}\right| \] are called the scale factors.
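As a quick check of these definitions, here is a sketch for spherical polar coordinates; the particular map \(\xb(r,\theta,\phi)\) is our choice of illustration and is not fixed by the text above. Taking \(\xb = (r\sin\theta\cos\phi,\, r\sin\theta\sin\phi,\, r\cos\theta)\) with \(r>0\) and \(0<\theta<\pi\), the tangent vectors are \[ \ddy{\xb}{r} = \begin{pmatrix}\sin\theta\cos\phi\\ \sin\theta\sin\phi\\ \cos\theta\end{pmatrix}, \quad \ddy{\xb}{\theta} = \begin{pmatrix}r\cos\theta\cos\phi\\ r\cos\theta\sin\phi\\ -r\sin\theta\end{pmatrix}, \quad \ddy{\xb}{\phi} = \begin{pmatrix}-r\sin\theta\sin\phi\\ r\sin\theta\cos\phi\\ 0\end{pmatrix}. \] Each pairwise dot product vanishes, so the coordinates are orthogonal curvilinear, and taking lengths gives the scale factors \[ h_r = 1, \quad h_\theta = r, \quad h_\phi = r\sin\theta. \]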
The scale factors tell you how the \((u,v,w)\) coordinate system is stretched in space compared to Cartesian. An infinitesimal cube with sides \(du\), \(dv\), \(dw\) in \((u,v,w)\) coordinates has (actual) edge lengths \(h_u\,du\), \(h_v\,dv\), \(h_w\,dw\) in space.
Although they are mutually orthogonal, the basis vectors in a general orthogonal curvilinear coordinate system are not constant because their direction varies.
Any vector field can be expressed in the new basis.
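For example, assuming plane polar coordinates \(x=r\cos\theta\), \(y=r\sin\theta\) (chosen here purely as an illustration), normalising the tangent vectors gives \[ \evr = \cos\theta\,\evx + \sin\theta\,\evy, \quad \evt = -\sin\theta\,\evx + \cos\theta\,\evy. \] These depend on position, since \(\displaystyle\ddy{\evr}{\theta} = \evt\) and \(\displaystyle\ddy{\evt}{\theta} = -\evr\). Because the basis is orthonormal, the components of a vector field are found by taking dot products, \(f_r = \fb\cdot\evr\) and \(f_\theta=\fb\cdot\evt\). For the field \(\fb = -y\,\evx + x\,\evy\) this gives \[ f_r = -y\cos\theta + x\sin\theta = 0, \quad f_\theta = y\sin\theta + x\cos\theta = r, \] so that \(\fb = r\,\evt\).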
We can derive expressions for the various differential operators.
Proposition 5.1 If \((u,v,w)\) are orthogonal curvilinear coordinates, with \(\fb(\xb)\) a vector field and \(\phi(\xb)\) a scalar field, then
\((i)\;\;\) \(\displaystyle \nabla \phi = \frac{1}{h_u}\ddy{\phi}{u}\eb_u + \frac{1}{h_v}\ddy{\phi}{v}\eb_v + \frac{1}{h_w}\ddy{\phi}{w}\eb_w\),
\((ii)\;\;\) \(\displaystyle \nabla\cdot\fb = \frac{1}{h_uh_vh_w}\left[\ddy{}{u}(h_vh_wf_u) + \ddy{}{v}(h_uh_wf_v) + \ddy{}{w}(h_uh_vf_w) \right]\),
\((iii)\;\;\) \(\displaystyle \nabla\times\fb = \frac{1}{h_uh_vh_w}\begin{vmatrix}h_u\eb_u & h_v\eb_v & h_w\eb_w\\ \displaystyle\ddy{}{u} & \displaystyle\ddy{}{v} & \displaystyle\ddy{}{w}\\ h_uf_u & h_vf_v & h_wf_w\end{vmatrix}\),
\((iv)\;\;\) \(\displaystyle \nabla^2 \phi = \frac{1}{h_uh_vh_w}\left[\ddy{}{u}\left(\frac{h_vh_w}{h_u}\ddy{\phi}{u}\right) + \ddy{}{v}\left(\frac{h_uh_w}{h_v}\ddy{\phi}{v}\right) + \ddy{}{w}\left(\frac{h_uh_v}{h_w}\ddy{\phi}{w}\right)\right].\)
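To see the formulas in action, consider cylindrical polar coordinates \(\xb(r,\theta,z) = (r\cos\theta, r\sin\theta, z)\); this worked case is an illustration we have added, written in the notation of Proposition 5.1. The tangent vectors \((\cos\theta,\sin\theta,0)\), \((-r\sin\theta,r\cos\theta,0)\) and \((0,0,1)\) are mutually orthogonal, with scale factors \(h_r=1\), \(h_\theta=r\), \(h_z=1\). Substituting into \((i)\), \((ii)\) and \((iv)\) gives \[\begin{align*} \nabla\phi &= \ddy{\phi}{r}\eb_r + \frac{1}{r}\ddy{\phi}{\theta}\eb_\theta + \ddy{\phi}{z}\eb_z,\\ \nabla\cdot\fb &= \frac{1}{r}\left[\ddy{}{r}(rf_r) + \ddy{f_\theta}{\theta} + \ddy{}{z}(rf_z)\right] = \frac{1}{r}\ddy{}{r}(rf_r) + \frac{1}{r}\ddy{f_\theta}{\theta} + \ddy{f_z}{z},\\ \nabla^2\phi &= \frac{1}{r}\ddy{}{r}\left(r\ddy{\phi}{r}\right) + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\theta^2} + \frac{\partial^2\phi}{\partial z^2}. \end{align*}\] Setting all three scale factors equal to 1 recovers the usual Cartesian expressions, which is a useful sanity check.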
Proof. \((i)\) Gradient. The \(u\)-component of \(\nabla\phi\) is \[ \eb_u\cdot\nabla \phi = \left(\frac{1}{h_u}\ddy{\xb}{u}\right)\cdot\nabla \phi = \frac{1}{h_u}\ddy{x_j}{u}\ddy{\phi}{x_j} = \frac{1}{h_u}\ddy{\phi}{u}, \] where the last step used the chain rule. The other components are similar.
\((ii)\) Divergence. This is derived the same way as our original Cartesian expression – consider an infinitesimal cube aligned with \(\eb_u\), \(\eb_v\), \(\eb_w\).
The volume of this cube is \[\begin{align*} |V| &= \int_0^\delta\int_0^\delta\int_0^\delta\left|\ddy{\xb}{u}\cdot\left(\ddy{\xb}{v}\times\ddy{\xb}{w}\right)\right|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\Big|\eb_u\cdot(\eb_v\times\eb_w)\Big|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\Big|\eb_u\cdot(\pm\eb_u)\Big|\,du\,dv\,dw\\ &= \int_0^\delta\int_0^\delta\int_0^\delta h_uh_vh_w\,du\,dv\,dw\\ &= \delta^3 h_uh_vh_w \quad \textrm{[for $h_u, h_v, h_w$ evaluated at some point in the cube]}. \end{align*}\] Therefore \[ \nabla\cdot\fb = \lim_{|V|\to 0}\frac{1}{|V|}\oint_S\fb\cdot\dS = \lim_{\delta\to 0}\frac{1}{h_uh_vh_w\delta^3}\sum_{i=1}^6\int_{S_i}\fb\cdot\dS. \] The integral over the right-hand face gives \[ \int_{S_1}\fb\cdot\dS = \int_0^\delta\int_0^\delta\fb\cdot\eb_u \left|\ddy{\xb}{v}\times\ddy{\xb}{w}\right|\,dv\,dw = \int_0^\delta\int_0^\delta f_u h_vh_w\,dv\,dw=\delta^2h_vh_wf_u, \] evaluated at some point \(\xb_1\) on \(S_1\). Overall, we get \[\begin{align*} \nabla\cdot\fb &= \lim_{\delta\to 0}\frac{1}{h_uh_vh_w}\left( \frac{h_vh_wf_u\Big|_{\xb_1} - h_vh_wf_u\Big|_{\xb_2}}{\delta}\right.\\ &\qquad \left. + \frac{h_uh_wf_v\Big|_{\xb_3} - h_uh_wf_v\Big|_{\xb_4}}{\delta} + \frac{h_uh_vf_w\Big|_{\xb_5} - h_uh_vf_w\Big|_{\xb_6}}{\delta}\right),\\ \end{align*}\] which gives the required expression.
\((iii)\) Curl. We can derive the \(\eb_w\) component of \(\nabla\times\fb\) by considering a square in the \(uv\)-plane:
The area inside the loop is \[\begin{align*} |A| &= \int_0^\delta\int_0^\delta\left|\ddy{\xb}{u}\times\ddy{\xb}{v}\right|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\Big|\eb_u\times\eb_v\Big|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\Big|\pm\eb_w\Big|\,du\,dv\\ &= \int_0^\delta\int_0^\delta h_uh_v\,du\,dv\\ &= \delta^2 h_uh_v \quad \textrm{[for $h_u,h_v$ evaluated at some point inside $C$]}. \end{align*}\]
So from the definition of curl, \[ \eb_w\cdot(\nabla\times\fb) = \lim_{|A|\to 0}\frac{1}{|A|}\oint_C\fb\cdot\,d\xb = \lim_{\delta\to 0}\frac{1}{\delta^2h_uh_v}\sum_{i=1}^4\int_{C_i}\fb\cdot\,d\xb. \] Evaluating the line integral along \(C_1\) gives \[\begin{align*} \int_{C_1}\fb\cdot\,d\xb &= \int_0^\delta\fb\cdot\ddy{\xb}{u}\,du\\ &= \int_0^\delta\fb\cdot(h_u\eb_u)\,du\\ &= \int_0^\delta h_u f_u\,du\\ &= \delta h_uf_u \quad \textrm{[evaluated at some point $\xb_1$ on $C_1$].} \end{align*}\] Overall, we get \[\begin{align*} \eb_w\cdot(\nabla\times\fb) &= \lim_{\delta\to 0}\frac{1}{h_uh_v}\left(\frac{h_uf_u\Big|_{\xb_1} - h_uf_u\Big|_{\xb_2}}{\delta} + \frac{h_vf_v\Big|_{\xb_3} - h_vf_v\Big|_{\xb_4}}{\delta} \right)\\ &= \frac{1}{h_uh_v}\left[\ddy{}{u}(h_vf_v) - \ddy{}{v}(h_uf_u)\right]. \end{align*}\] The \(\eb_u\) and \(\eb_v\) components are obtained analogously.
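Written out, and assuming that \(\{\eb_u,\eb_v,\eb_w\}\) is a right-handed basis, cyclic permutation of \((u,v,w)\) in the result above suggests \[ \eb_u\cdot(\nabla\times\fb) = \frac{1}{h_vh_w}\left[\ddy{}{v}(h_wf_w) - \ddy{}{w}(h_vf_v)\right], \quad \eb_v\cdot(\nabla\times\fb) = \frac{1}{h_uh_w}\left[\ddy{}{w}(h_uf_u) - \ddy{}{u}(h_wf_w)\right]. \] As a consistency check, all three components agree with the cofactor expansion of the determinant in \((iii)\) along its first row.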
\((iv)\) Laplacian. [See Tutorial Sheet.]
5.2 Differentiability
A function \(\fb:\Real^n\to\Real^m\) is differentiable at \(\ab\in\Real^n\) if there exists an \(m\times n\) matrix \(D\fb(\ab)\) called the Jacobian matrix (or derivative) such that \[ \lim_{|\hb|\to 0}\frac{\Big(\fb(\ab+\hb) - \fb(\ab) \Big) - [D\fb(\ab)]\hb}{|\hb|} = \bfzero. \]
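Think of this as saying that \[ \fb(\ab+\hb) = \fb(\ab) + [D\fb(\ab)]\hb + o(|\hb|), \] meaning that \(\fb\) is well approximated by a linear function near \(\ab\), with an error that is small compared with \(|\hb|\). (If \(\fb\) is twice continuously differentiable, the error is in fact \(O(|\hb|^2)\).)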
To see the form of \(D\fb(\ab)\), start with some special cases.
\((i)\;\;n=m=1\). Then \(\fb\) is a single-variable function \(f(x)\). If \(Df(a)\in\Real\) exists, it must satisfy \[ \lim_{h\to 0}\frac{\Big(f(a+h)-f(a)\Big) - hDf(a)}{h} = 0\quad \implies Df(a)=\lim_{h\to 0}\frac{f(a+h)-f(a)}{h} = f'(a). \]
\((ii)\;\;m=1\). Then \(\fb\) is a scalar field \(f(\xb)\), so \(Df(\ab)\) is a \(1\times n\) matrix (a row vector). If it exists, it must satisfy \[ [Df(\ab)]\hb = (v_1 \ldots v_n)\begin{pmatrix}h_1\\\vdots\\h_n\end{pmatrix} = v_1h_1 + v_2h_2 + \ldots + v_nh_n. \] To find the \(x_1\)-component, \(v_1\), let \(\hb=h\eb_1\) so that \[ \lim_{h\to 0}\frac{\Big(f(\ab+h\eb_1) - f(\ab)\Big) - v_1h}{h} = 0 \quad \implies v_1 = \lim_{h\to 0}\frac{f(\ab+h\eb_1) - f(\ab)}{h} = \left.\ddy{f}{x_1}\right|_{\ab}. \] The other components follow in the same way. Hence when \(f\) is a scalar field, the Jacobian matrix reduces to \[ Df(\ab) = \nabla f\Big|_{\ab} = \left(\left.\ddy{f}{x_1}\right|_{\ab} \ldots \left.\ddy{f}{x_n}\right|_{\ab} \right). \]
For a general function \(\fb:\Real^n\to\Real^m\), the Jacobian matrix has the form \[ D\fb(\ab) = \begin{pmatrix} \displaystyle\left.\ddy{f_1}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_1}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_1}{x_n}\right|_{\ab}\\ \displaystyle\left.\ddy{f_2}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_2}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_2}{x_n}\right|_{\ab}\\ \vdots & \vdots & \ddots & \vdots\\ \displaystyle\left.\ddy{f_m}{x_1}\right|_{\ab} & \displaystyle\left.\ddy{f_m}{x_2}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_m}{x_n}\right|_{\ab} \end{pmatrix}. \]
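As a concrete case (our own choice of example), take the polar coordinate map \(\fb:\Real^2\to\Real^2\) with \(\fb(r,\theta) = (r\cos\theta, r\sin\theta)\). Here \(f_1=r\cos\theta\) and \(f_2=r\sin\theta\), and differentiating row by row gives \[ D\fb(r,\theta) = \begin{pmatrix}\cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta\end{pmatrix}, \quad \det[D\fb(r,\theta)] = r\cos^2\theta + r\sin^2\theta = r. \] Note that the columns of the matrix are the tangent vectors \(\displaystyle\ddy{\fb}{r}\) and \(\displaystyle\ddy{\fb}{\theta}\), while the rows are the gradients of \(f_1\) and \(f_2\).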
A vector field \(\fb(\xb)\) on \(\Real^3\) corresponds to \(n=m=3\). The divergence is the trace of \(D\fb(\ab)\), while the components of the curl are differences of pairs of off-diagonal entries.
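To make this explicit, write \(J_{ij} = \displaystyle\ddy{f_i}{x_j}\) for the entries of the Jacobian matrix (the symbol \(J_{ij}\) is shorthand introduced only for this illustration). Then \[ \nabla\cdot\fb = J_{11} + J_{22} + J_{33}, \quad \nabla\times\fb = \big(J_{32}-J_{23},\; J_{13}-J_{31},\; J_{21}-J_{12}\big). \] For instance, the rotational field \(\fb = (-x_2, x_1, 0)\) has \[ D\fb = \begin{pmatrix}0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}, \] so \(\nabla\cdot\fb = 0\) and \(\nabla\times\fb = (0,0,2)\).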
Warning: except for \(n=m=1\), the fact that the partial derivatives exist does not imply that a scalar field \(f(\xb)\) is differentiable.
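A standard counterexample, sketched here as an illustration, is the scalar field on \(\Real^2\) defined by \[ f(x,y) = \frac{xy}{x^2+y^2} \quad \textrm{for $(x,y)\neq(0,0)$}, \qquad f(0,0)=0. \] Since \(f(h,0)=f(0,h)=0\) for all \(h\), both partial derivatives exist at the origin and equal zero, so the only candidate for \(Df(\bfzero)\) is the zero matrix. But along the line \(y=x\) we have \(f(t,t)=\tfrac{1}{2}\) for \(t\neq 0\), so with \(\hb=(t,t)\) the quotient in the definition is \[ \frac{f(\hb) - f(\bfzero) - 0}{|\hb|} = \frac{1}{2\sqrt{2}\,|t|}, \] which does not tend to zero as \(t\to 0\). Hence \(f\) is not differentiable at the origin.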
A function \(\fb : \Real^n\to\Real^m\) is continuously differentiable (or a \(C^1\) function) at \(\ab\in\Real^n\) if all of its partial derivatives exist and are continuous at \(\ab\).
Theorem 5.1 If a function \(\fb:\Real^n\to\Real^m\) is continuously differentiable in a neighbourhood of \(\ab\in\Real^n\), then \(\fb\) is differentiable at \(\ab\).
Proof. Omitted.
We have \[ \textrm{continuous partials} \implies \textrm{differentiable} \implies \textrm{partials exist}, \] but neither of the converse statements is true.
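The failure of the second converse is the content of the Warning above. For the first, a standard single-variable example (added here as an illustration) is \[ f(x) = x^2\sin\left(\frac{1}{x}\right) \quad \textrm{for $x\neq 0$}, \qquad f(0)=0. \] This is differentiable at \(0\), because \[ f'(0) = \lim_{h\to 0}\frac{h^2\sin(1/h)}{h} = \lim_{h\to 0}h\sin\left(\frac{1}{h}\right) = 0, \] but for \(x\neq 0\) we have \(f'(x) = 2x\sin(1/x) - \cos(1/x)\), which has no limit as \(x\to 0\). So \(f\) is differentiable at \(0\) without being continuously differentiable there.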
5.3 Inverse functions
A function \(\fb:\Real^n\to\Real^n\) has an inverse \(\fb^{-1}\) if and only if \[ \fb^{-1}\big(\fb(\xb)\big) = \xb \quad \textrm{and} \quad \fb\big(\fb^{-1}(\ub)\big)=\ub \quad \textrm{for all $\xb, \ub$}. \]
Note that we need \(m=n\) here.
If \(\fb(\xb)\) is nonlinear, then finding a global inverse is hard. However, if \(\fb\) is differentiable, then \[ \fb(\ab+\hb) = \fb(\ab) + [D\fb(\ab)]\hb + o(|\hb|), \] with an error that is small compared with \(|\hb|\), so \(\fb\) looks like a linear function near \(\ab\), with matrix \(D\fb(\ab)\). This suggests that we can find a local differentiable inverse precisely when \(D\fb(\ab)\) is invertible.
Theorem 5.2 (Inverse Function Theorem) A continuously differentiable function \(\fb:\Real^n\to\Real^n\) has a local differentiable inverse near \(\ab\) if \(\det[D\fb(\ab)]\neq 0\).
Proof. Omitted (need to justify that the higher-order terms behave themselves!). Note that the derivative of \(\fb^{-1}\) at the point \(\fb(\ab)\) will be the inverse matrix \([D\fb(\ab)]^{-1}\).
For \(n=1\), Theorem 5.2 says that a function \(f(x)\) has a differentiable inverse near \(a\) if \(f'(a)\neq 0.\) This makes sense graphically if you recall that \(y=f^{-1}(x)\) is the reflection of \(y=f(x)\) in the line \(y=x\).
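As an illustration of Theorem 5.2 (the example is our own choice), consider \(\fb:\Real^2\to\Real^2\) given by \(\fb(x,y) = (e^x\cos y, e^x\sin y)\). Its Jacobian matrix is \[ D\fb(x,y) = \begin{pmatrix}e^x\cos y & -e^x\sin y\\ e^x\sin y & e^x\cos y\end{pmatrix}, \quad \det[D\fb(x,y)] = e^{2x} > 0, \] so \(\fb\) has a local differentiable inverse near every point, and the derivative of that local inverse at \(\fb(x,y)\) is \[ [D\fb(x,y)]^{-1} = e^{-x}\begin{pmatrix}\cos y & \sin y\\ -\sin y & \cos y\end{pmatrix}. \] However, \(\fb(x,y+2\pi)=\fb(x,y)\), so \(\fb\) is not injective and has no global inverse: the theorem is genuinely local.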
A continuous inverse function \(f^{-1}(x)\) exists everywhere in this example, but \(f^{-1}\) is not differentiable at the point where \(f'(x)=0\).
You may also recall from Calculus I that where \(f'(x)\neq 0\), we have \(\displaystyle\frac{\mathrm{d}}{\mathrm{d}y}f^{-1}(y) = \frac{1}{f'(x)}\), where \(y=f(x)\).
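For instance (an example we add for illustration), take \(f(x)=\sin x\) on \(-\pi/2<x<\pi/2\), where \(f'(x)=\cos x>0\). Writing \(y=\sin x\), so that \(x = f^{-1}(y)=\arcsin y\), the rule gives \[ \frac{\mathrm{d}}{\mathrm{d}y}\arcsin y = \frac{1}{\cos x} = \frac{1}{\sqrt{1-y^2}}, \] which is the familiar derivative of \(\arcsin\). The formula breaks down as \(x\to\pm\pi/2\), exactly where \(f'(x)\to 0\).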
A function \(\fb:\Real^n\to\Real^n\) is called orientation preserving if \(\det[D\fb(\xb)]>0\), or orientation reversing if \(\det[D\fb(\xb)]<0\).
Let \(U,V\) be two subsets of \(\Real^n\). A differentiable function \(\fb:U\to V\) is called a diffeomorphism if (i) it is bijective and (ii) its inverse function \(\fb^{-1}:V\to U\) is also differentiable. The domains \(U,V\) are then said to be diffeomorphic.
Think of a diffeomorphism as a “non-folding distortion” of a grid on \(U\). In fluid mechanics, for example, this represents the flow of a fluid – reversing the motion moves all of the fluid particles back to where they started.
Here is a diffeomorphism of the unit disk (to itself) given by \[ \fb(r,\theta)= \Big(r^2 \cos\Big[\theta + 2\pi r^3(1-r)\Big], r^2\sin\Big[\theta + 2\pi r^3(1-r)\Big]\Big). \]
The set of all diffeomorphisms from some domain \(U\) to itself forms a group under composition, called \(\mathrm{Diff}(U)\). Those that leave points on the boundary unchanged (as in the unit disk example just above) form a subgroup.
5.4 Implicit functions
Suppose a curve is given implicitly by \(f(x,y)=0\). When can we solve for \(y=g(x)\)?
If we assume that \(y=g(x)\), then \[\begin{align*} \frac{\mathrm{d}}{\mathrm{d}x}f\big(x,g(x)\big) = 0 \quad &\iff \ddy{f}{x} + g'(x)\ddy{f}{y} = 0 \quad \textrm{[by the Chain Rule]}\\ &\iff g'(x) = -\displaystyle\ddy{f}{x}\bigg/\displaystyle\ddy{f}{y}. \end{align*}\] So there is a problem when \(\displaystyle\ddy{f}{y}=0\).
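As a simple check (our own choice of example), take the unit circle \(f(x,y) = x^2+y^2-1 = 0\). Then \(\displaystyle\ddy{f}{x}=2x\) and \(\displaystyle\ddy{f}{y}=2y\), so the formula gives \[ g'(x) = -\frac{2x}{2y} = -\frac{x}{y}. \] On the upper semicircle we can solve explicitly, \(y=g(x)=\sqrt{1-x^2}\), and indeed \(g'(x) = -x/\sqrt{1-x^2} = -x/y\). At the points \((\pm 1,0)\), where \(\displaystyle\ddy{f}{y}=0\), the tangent to the circle is vertical and the curve cannot be written as \(y=g(x)\) in any neighbourhood.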
This idea generalises.
Theorem 5.3 (Implicit Function Theorem) Given a continuously differentiable function \(\fb:\Real^{n+\textcolor{red}{m}}\to\Real^{\textcolor{red}{m}}\) with \(\xb\in\Real^n\) and \(\yb\in\Real^{\textcolor{red}{m}}\), solutions to \(\fb(\xb,\yb)=\bfzero\) near a point \((\xb,\yb)=\ab\) with \(\fb(\ab)=\bfzero\) can be realised as an implicit function \[ \yb = \gb(\xb) \quad \textrm{if} \quad \det[D_{\yb}\fb(\ab)] \neq 0. \] Moreover, this local solution is unique and differentiable with \(D_{\xb}\gb(\ab) = -[D_\yb\fb(\ab)]^{-1}D_{\xb}\fb(\ab).\)
Note: these “partial” Jacobian matrices include only the columns corresponding to \(\xb\) or \(\yb\) – for example, \[ D_{\yb}\fb(\ab) = \begin{pmatrix} \displaystyle\left.\ddy{f_{1}}{y_1}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_{1}}{y_m}\right|_{\ab}\\ \vdots & \ddots & \vdots\\ \displaystyle\left.\ddy{f_{m}}{y_1}\right|_{\ab} & \ldots & \displaystyle\left.\ddy{f_{m}}{y_m}\right|_{\ab} \end{pmatrix}. \]
Although it tells you the derivative \(D_{\xb}\gb(\ab)\), Theorem 5.3 does not give you a formula for the function \(\gb(\xb)\) itself.
Proof. Omitted.
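Here is a small worked case with \(n=1\), \(m=2\); the function and the point are hypothetical choices made only to illustrate the formula in Theorem 5.3. Let \[ \fb(x,y_1,y_2) = \begin{pmatrix}x^2+y_1^2+y_2^2-3\\ x+y_1-y_2-1\end{pmatrix}, \quad \ab = (1,1,1), \] so that \(\fb(\ab)=\bfzero\). The partial Jacobian matrices at \(\ab\) are \[ D_{\yb}\fb(\ab) = \begin{pmatrix}2y_1 & 2y_2\\ 1 & -1\end{pmatrix}\bigg|_{\ab} = \begin{pmatrix}2 & 2\\ 1 & -1\end{pmatrix}, \quad D_{\xb}\fb(\ab) = \begin{pmatrix}2x\\ 1\end{pmatrix}\bigg|_{\ab} = \begin{pmatrix}2\\ 1\end{pmatrix}. \] Since \(\det[D_{\yb}\fb(\ab)] = -4\neq 0\), there is a local solution \(\yb=\gb(x)\) near \(\ab\), with \[ D_{\xb}\gb(\ab) = -\begin{pmatrix}2 & 2\\ 1 & -1\end{pmatrix}^{-1}\begin{pmatrix}2\\ 1\end{pmatrix} = -\begin{pmatrix}\tfrac14 & \tfrac12\\ \tfrac14 & -\tfrac12\end{pmatrix}\begin{pmatrix}2\\ 1\end{pmatrix} = \begin{pmatrix}-1\\ 0\end{pmatrix}. \] This agrees with differentiating the two equations directly with respect to \(x\) at \(\ab\): \(2+2y_1'+2y_2'=0\) and \(1+y_1'-y_2'=0\) give \(y_1'=-1\), \(y_2'=0\).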
The idea of linearisation – approximating a nonlinear function by a linear function using its derivative – will be invaluable in future courses, e.g. Mathematical Biology III or Dynamical Systems III (among others).