$$ \def\ab{\boldsymbol{a}} \def\bb{\boldsymbol{b}} \def\cb{\boldsymbol{c}} \def\db{\boldsymbol{d}} \def\eb{\boldsymbol{e}} \def\fb{\boldsymbol{f}} \def\gb{\boldsymbol{g}} \def\hb{\boldsymbol{h}} \def\kb{\boldsymbol{k}} \def\nb{\boldsymbol{n}} \def\tb{\boldsymbol{t}} \def\ub{\boldsymbol{u}} \def\vb{\boldsymbol{v}} \def\xb{\boldsymbol{x}} \def\yb{\boldsymbol{y}} \def\Ab{\boldsymbol{A}} \def\Bb{\boldsymbol{B}} \def\Cb{\boldsymbol{C}} \def\Eb{\boldsymbol{E}} \def\Fb{\boldsymbol{F}} \def\Jb{\boldsymbol{J}} \def\Lb{\boldsymbol{L}} \def\Rb{\boldsymbol{R}} \def\Ub{\boldsymbol{U}} \def\xib{\boldsymbol{\xi}} \def\evx{\boldsymbol{e}_x} \def\evy{\boldsymbol{e}_y} \def\evz{\boldsymbol{e}_z} \def\evr{\boldsymbol{e}_r} \def\evt{\boldsymbol{e}_\theta} \def\evp{\boldsymbol{e}_r} \def\evf{\boldsymbol{e}_\phi} \def\evb{\boldsymbol{e}_\parallel} \def\omb{\boldsymbol{\omega}} \def\dA{\;d\Ab} \def\dS{\;d\boldsymbol{S}} \def\dV{\;dV} \def\dl{\mathrm{d}\boldsymbol{l}} \def\bfzero{\boldsymbol{0}} \def\Rey{\mathrm{Re}} \def\Real{\mathbb{R}} \newcommand{\dds}[1]{\frac{d{#1}}{ds}} \newcommand{\ddy}[2]{\frac{\partial{#1}}{\partial{#2}}} \newcommand{\ddt}[1]{\frac{d{#1}}{dt}} \newcommand{\DDt}[1]{\frac{\mathrm{D}{#1}}{\mathrm{D}t}} $$
1.1 Introduction
This term we will study how to generalise ordinary calculus to functions/maps of the form \[ f\,:\,\Real^m \to \Real^n, \quad m,n \in \mathbb{N}. \] The objects of interest are vectors, hence this subject is usually called vector calculus (or sometimes multivariable calculus).
Depending on the choice of \(m\) and \(n\), there are different applications. We will mostly stick to 3-d space, but even this gives many possibilities. Maps from \(\Real\to\Real^3\) describe curves, while those from \(\Real^2\to\Real^3\) describe surfaces. Once we have a curve or a surface, we will be able to use these as domains over which to integrate either scalar fields, which are maps \(\Real^3\to\Real\), or vector fields, which are maps \(\Real^3\to\Real^3\). We will also study differentiation of scalar and vector fields, and see how the Fundamental Theorem of Calculus generalises to higher dimensions.
The subject is interesting in its own right, but this course is also a basic foundation for many Level III and IV courses in both Pure and (especially) Applied Mathematics.
In this first topic we will review what you have already learnt about scalar fields in Calculus I, and about vectors in Linear Algebra I. We will also learn how to use index notation for simpler manipulation of vector equations.
1.2 Revision: Scalar fields
A scalar field on \(\Real^n\) is a map \(f\,:\,\Real^n\to\Real.\)
In Calculus I you saw examples on \(\Real\) and \(\Real^2\). This term we will also need scalar fields on \(\Real^3\). Here we will review the important concepts from Calculus and give some 3-d examples.
1.2.1 Visualisation
We can visualise a scalar field \(f\) on \(\Real^2\) as the height of a surface, \(z=f(x,y).\) In this case the level sets \(f(x,y)=c\) for \(c\in\Real\) are called contours of \(f.\)
For scalar fields on \(\Real^3\), we can no longer view the field itself as a surface. Moreover, the level sets \(f(x,y,z)=c\) are no longer curves but are themselves 2-d surfaces, sometimes called isosurfaces.
Calculating isosurfaces from a more complicated scalar field on a computer is a non-trivial computational problem.
1.2.2 Differentiation
The partial derivatives of \(f\) are encoded in its gradient vector, which for \(f:\Real^3\to\Real\) has the form \[\begin{eqnarray} \nabla f = \ddy{f}{x}\eb_1 + \ddy{f}{y}\eb_2 + \ddy{f}{z}\eb_3 = \left(\ddy{f}{x}, \ddy{f}{y}, \ddy{f}{z}\right)^\top. \end{eqnarray}\] Each component of \(\nabla f\) tells us the rate of change of \(f\) if we move in the corresponding coordinate direction.
If we want to know the rate of change of \(f\) in a general direction, not necessarily aligned with one of the coordinate axes, then we take the scalar product \(\hat{\nb}\cdot\nabla f\), where \(\hat{\nb}\) is a unit vector pointing in the required direction. This is called a directional derivative of \(f\).
Note that \[ \hat{\nb}\cdot\nabla f = |\hat{\nb}||\nabla f|\cos\theta = |\nabla f|\cos\theta, \] where \(\theta\) is the angle between \(\hat{\nb}\) and \(\nabla f\). The direction with largest increase of the function is therefore given by \(\theta=0\), showing that \(\nabla f\) points in the direction of fastest increase of \(f\), and its magnitude \(|\nabla f|\) is the rate of increase in that direction.
This property can be used to define \(\nabla f\) without the need to choose particular coordinates.
In fact, \(\nabla f\) is always orthogonal to the contours (in 2-d) or isosurfaces (in 3-d). This is because the contours/isosurfaces by definition correspond to directions \(\hat{\nb}\) with zero directional derivative, so (assuming \(\nabla f\neq\bfzero\)) \[ |\nabla f|\cos\theta = 0\qquad \implies \theta=\pm\frac{\pi}{2}. \]
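These properties of the gradient are easy to check numerically. The following is a minimal sketch in Python with NumPy (the function \(f(x,y,z)=x^2+y^2+z^2\) and the point are arbitrary illustrative choices, and `grad` is a hypothetical finite-difference helper, not part of any library):

```python
import numpy as np

# Illustrative scalar field f(x, y, z) = x^2 + y^2 + z^2,
# whose exact gradient is (2x, 2y, 2z).
def f(p):
    x, y, z = p
    return x**2 + y**2 + z**2

def grad(f, p, h=1e-6):
    """Approximate the gradient of f at p by central differences."""
    p = np.asarray(p, dtype=float)
    g = np.zeros(3)
    for i in range(3):
        dp = np.zeros(3)
        dp[i] = h
        g[i] = (f(p + dp) - f(p - dp)) / (2 * h)
    return g

p = np.array([1.0, 2.0, 2.0])
g = grad(f, p)                       # approximately (2, 4, 4)

# The directional derivative n̂·∇f in the direction of ∇f itself
# equals |∇f|, the maximum rate of increase.
n_hat = g / np.linalg.norm(g)
print(n_hat @ g, np.linalg.norm(g))
```

Choosing \(\hat{\nb}\) perpendicular to \(\nabla f\) instead would give a directional derivative of zero, consistent with \(\hat{\nb}\) lying tangent to the isosurface.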
1.3 Revision: Vectors in \(\mathbb{R}^3\)
Recall from Linear Algebra I that a vector in \(\Real^3\) (or \(\Real^2\)) is a quantity with both magnitude and direction, unlike a scalar which has only magnitude.
A vector in \(\Real^3\) is specified by its three components \(a_1\), \(a_2\), \(a_3\) with respect to the standard basis \(\{\eb_1, \eb_2, \eb_3\}\). We will write \[ \ab = a_1\eb_1 + a_2\eb_2 + a_3\eb_3 \] or (when it is clear that we are dealing with the standard basis) alternatively \[ \ab = \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} = (a_1, a_2, a_3)^\top. \]
When writing by hand, boldface is unavailable so we usually denote vectors by underlining: \(\underline{a}\) (although some people prefer a horizontal arrow on top: \(\vec{a}\)). Pure mathematicians often just write \(a\) for a vector without any special indication. However, this can lead to confusion for beginners so in this course you must underline (or put arrows on) your vectors.
Of course, if we change the basis, the components will change, but they are still describing the same geometrical object.
In fact, one can define a vector by the way in which its components change under a coordinate rotation. Namely \(a_i'=R_{ij}a_j\) where \(R\) is any rotation matrix. Although we won’t develop it further in this course, this viewpoint will be important in many later courses that generalise vectors to tensors, e.g., Differential Geometry, Fluid Mechanics, General Relativity, …
We can also have a change of coordinates that doesn’t amount to a simple rotation of the Cartesian frame. Polar coordinates are the most familiar example of such an alternative basis. In this course you will meet the 3-d versions: cylindrical and spherical polar coordinates.
The magnitude/length of a vector is given in terms of its components by Pythagoras’s Theorem: \[ |\ab| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \] Stretching a vector (changing the magnitude but not direction) corresponds to multiplying by a scalar: \[ \lambda\ab = \lambda a_1\eb_1 + \lambda a_2\eb_2 + \lambda a_3\eb_3. \]
We add two vectors by the “triangle rule”:
Assuming both vectors are expressed in the standard basis, this just corresponds to adding each component: \[ \ab + \bb = (a_1 + b_1)\eb_1 + (a_2 + b_2)\eb_2 + (a_3 + b_3)\eb_3. \]
1.3.1 The scalar product
The simplest way to multiply two vectors in \(\Real^3\) is the scalar product (or dot product), which is defined geometrically as \[ \ab\cdot\bb = |\ab||\bb|\cos\theta, \] where \(\theta\) is the angle between the two vectors.
It doesn’t matter whether \(\theta\) is the smaller or larger angle, because \(\cos(2\pi - \theta)=\cos\theta\).
Note that \(|\bb|\cos\theta\) is the component of \(\bb\) along \(\ab\), and vice versa.
We can see geometrically that the component of \(\bb+\cb\) along \(\ab\) is the sum of the components of \(\bb\) and \(\cb\) along \(\ab\):
It follows that \[\begin{align} \ab\cdot\bb &= (a_1\eb_1 + a_2\eb_2 + a_3\eb_3)\cdot(b_1\eb_1 + b_2\eb_2 + b_3\eb_3)\\ &= a_1b_1\eb_1\cdot\eb_1 + a_2b_2\eb_2\cdot\eb_2 + a_3b_3\eb_3\cdot\eb_3 \quad\textrm{[using orthogonality of $\eb_i$ and $\eb_j$ when $i\neq j$]}\\ &= a_1b_1 + a_2b_2 + a_3b_3. \end{align}\]
So the scalar product has the same component formula in any orthonormal basis – in particular the cylindrical or spherical polar coordinates that you will meet later.
Note that the component formula for the scalar product works for any \(\Real^n\), and can be used to define the notions of angle and length for \(n>3\).
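The equivalence of the geometric and component formulas can be checked with a quick sketch in Python with NumPy (the vectors are arbitrary illustrative values):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])   # |a| = 3
b = np.array([3.0, 0.0, 4.0])   # |b| = 5

dot = a @ b                     # component formula: a_1 b_1 + a_2 b_2 + a_3 b_3

# Recover the angle from the geometric definition a·b = |a||b|cos(theta)
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.arccos(cos_theta)
print(dot, np.degrees(theta))
```

Here `dot` is \(11\) and \(\cos\theta = 11/15\), so the two definitions agree.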
1.3.2 The vector product
The other way to multiply two vectors in \(\Real^3\) (but not any other \(\Real^n\)) is the vector product (or cross product), defined as \[ \ab\times\bb = |\ab|\,|\bb|\sin\theta\,\hat{\nb}, \] where \(\hat{\nb}\) is a unit vector perpendicular to both \(\ab\) and \(\bb\), in the right-handed sense, and \(\theta\) is the angle from \(\ab\) to \(\bb\) with respect to \(\hat{\nb}\).
It follows from the definition that \(\bb\times\ab = -\ab\times\bb\).
Note that \(|\bb|\sin\theta\) is the component of \(\bb\) perpendicular to \(\ab\). It follows that \(\ab\times\ab = \bfzero\), and that the magnitude \(|\ab\times\bb|\) is the area of the parallelogram defined by \(\ab\) and \(\bb\).
As with the scalar product, we can show geometrically that \[ \ab\times(\bb + \cb) = \ab\times\bb + \ab\times\cb. \]
Proof. Let \(\ab\) point into the page. Then the effect of taking the cross product with \(\ab\) of any other vector, say \(\bb,\) is to:
1. Project \(\bb\) into the page (giving magnitude \(|\bb|\sin\theta\)).
2. Rotate through \(90^\circ\) clockwise (giving \(|\bb|\sin\theta\,\hat{\nb}\), choosing the direction \(\hat{\nb}\) to be perpendicular to \(\bb\) as well as \(\ab\)).
3. Multiply by \(|\ab|\) (giving \(|\ab||\bb|\sin\theta\,\hat{\nb}\)).
Doing this to \(\bb\), \(\cb\) and \(\bb+\cb\) gives the following picture:
In particular, \(\ab\times(\bb+\cb)\) is the same point as \(\ab\times\bb + \ab\times\cb\).
To find the components, consider first \(\eb_1\times\eb_2\). These two vectors have magnitude 1 and are perpendicular, so \(\sin\theta=1\). Moreover, \(\hat{\nb}=\eb_3\) because the coordinate system is right-handed. So \(\eb_1\times\eb_2=\eb_3\). Similarly \(\eb_2\times\eb_3=\eb_1\) and \(\eb_3\times\eb_1=\eb_2\). Thus we have \[\begin{align*} \ab\times\bb &= (a_1\eb_1 + a_2\eb_2 + a_3\eb_3)\times(b_1\eb_1 + b_2\eb_2 + b_3\eb_3)\\ &= a_1b_1\cancel{\eb_1\times\eb_1} + a_1b_2\eb_1\times\eb_2 + a_1b_3\eb_1\times\eb_3 \\ &\qquad + a_2b_1\eb_2\times\eb_1 + a_2b_2\cancel{\eb_2\times\eb_2} + a_2b_3\eb_2\times\eb_3\\ &\qquad\qquad + a_3b_1\eb_3\times\eb_1 + a_3b_2\eb_3\times\eb_2 + a_3b_3\cancel{\eb_3\times\eb_3}\\ &= (a_1b_2 - a_2b_1)\eb_1\times\eb_2 + (a_2b_3 - a_3b_2)\eb_2\times\eb_3 + (a_3b_1 - a_1b_3)\eb_3\times\eb_1\\ &= (a_2b_3 - a_3b_2)\eb_1 + (a_3b_1 - a_1b_3)\eb_2 + (a_1b_2 - a_2b_1)\eb_3. \end{align*}\] This is easier to remember by recognising it as a (formal) determinant \[ \ab\times\bb = \begin{vmatrix} \eb_1 & \eb_2 & \eb_3\\ a_1 & a_2 & a_3\\ b_1 & b_2 & b_3 \end{vmatrix}. \]
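The component formula can be sketched directly in Python with NumPy (the vectors are arbitrary illustrative values, and `cross` is a hypothetical helper written for this check):

```python
import numpy as np

def cross(a, b):
    """Cross product from the component formula."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return np.array([a2*b3 - a3*b2,
                     a3*b1 - a1*b3,
                     a1*b2 - a2*b1])

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(cross(a, b))              # (-3, 6, -3), agreeing with np.cross(a, b)
print(cross(b, a))              # the negative: b×a = -a×b
```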
1.4 Index notation
Remember that the vector equation \[ \cb = \ab + \bb \] is equivalent to the system of three scalar equations \[\begin{align} c_1 &= a_1 + b_1,\\ c_2 &= a_2 + b_2,\\ c_3 &= a_3 + b_3. \end{align}\] It is often surprisingly useful to write vector equations like this in so-called index notation, as \[ c_i = a_i + b_i. \] It is understood that this equation holds for \(i=1, 2\) and \(3\), and that it refers to any of the components in the standard basis. Here \(i\) is called a free index; the choice of letter is arbitrary: we could equally well write \(c_k = a_k + b_k\), as long as all terms in the equation have the same free index.
In this section we will learn the “tricks” that make index notation so useful; these mainly relate to scalar and vector products.
1.4.1 Scalar products
In index notation, we write the scalar product as \[ \ab\cdot\bb = a_jb_j, \] where the repeated index indicates that the term should be summed from \(j=1\) to \(3\) – called the (Einstein) summation convention. This repeated index is called a dummy index, and again the chosen letter doesn’t matter. Note that \(\ab\cdot\bb\) is a scalar, so there is no free index when we write it in index notation.
To avoid ambiguity, it is essential when using index notation that no index appears more than twice in any term.
Another way to think of the scalar product \(\ab\cdot\bb\) is as a quadratic form \[ \ab\cdot\bb = (a_1,a_2,a_3)\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix}b_1\\b_2\\b_3\end{pmatrix} = a_1b_1 + a_2b_2 + a_3b_3. \] In index notation, this may be written \[ \ab\cdot\bb = \delta_{ij}a_ib_j, \] where the components of the identity matrix are given by the Kronecker delta, defined as \[ \delta_{ij} = \begin{cases} 1 & \textrm{if $i=j$},\\ 0 & \textrm{if $i\neq j$}. \end{cases} \] This will often be useful when manipulating expressions in index notation.
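The summation convention maps directly onto NumPy's `einsum`, where repeated indices in the subscript string are summed. A minimal sketch (the vectors are arbitrary illustrative values):

```python
import numpy as np

delta = np.eye(3)               # Kronecker delta as the 3x3 identity matrix
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# delta_ij a_i b_j: i and j are both repeated, so both are summed
result = np.einsum('ij,i,j->', delta, a, b)
print(result)                   # equals a·b = 4 + 10 + 18 = 32
```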
1.4.2 Vector products
We can write vector products in index notation by introducing the alternating tensor (or Levi-Civita symbol), \[ \epsilon_{ijk} = \begin{cases} 0 & \textrm{if any of $i,j,k$ are equal},\\ +1 & \textrm{if $(i,j,k)=(1,2,3), (2,3,1)$ or $(3,1,2)$},\\ -1 & \textrm{if $(i,j,k)=(1,3,2), (2,1,3)$ or $(3,2,1)$}. \end{cases} \] This object has 27 components, but only 6 of them are non-zero. You can check that \[ \epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij}, \] and that \[ \epsilon_{ijk} = -\epsilon_{jik}. \] This allows us to express the vector product in index notation as \[ [\ab\times\bb]_i = \epsilon_{ijk}a_jb_k. \] For example, the first component of the right-hand side is \[\begin{align*} \epsilon_{1jk}a_jb_k &= \cancel{\epsilon_{11k}}a_1b_k + \epsilon_{12k}a_2b_k + \epsilon_{13k}a_3b_k\\ &= \cancel{\epsilon_{121}}a_2b_1 + \cancel{\epsilon_{122}}a_2b_2 + \epsilon_{123}a_2b_3 + \cancel{\epsilon_{131}}a_3b_1 + \epsilon_{132}a_3b_2 + \cancel{\epsilon_{133}}a_3b_3\\ &= \epsilon_{123}a_2b_3 + \epsilon_{132}a_3b_2\\ &= a_2b_3 - a_3b_2. \end{align*}\]
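The formula \([\ab\times\bb]_i = \epsilon_{ijk}a_jb_k\) can be verified numerically by building \(\epsilon_{ijk}\) as a \(3\times3\times3\) array. A sketch in Python with NumPy (zero-based indices stand in for \(1,2,3\); the vectors are arbitrary illustrative values):

```python
import numpy as np

# Build the alternating tensor from its definition: only the 6
# permutation entries are non-zero.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0          # even permutations
for i, j, k in [(0, 2, 1), (1, 0, 2), (2, 1, 0)]:
    eps[i, j, k] = -1.0         # odd permutations

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# [a×b]_i = eps_ijk a_j b_k, with the dummy indices j and k summed
cross_ab = np.einsum('ijk,j,k->i', eps, a, b)
print(cross_ab)                 # agrees with np.cross(a, b)
```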
1.4.3 Triple products
There are two types of product that involve three vectors:
- The scalar triple product, \(\ab\cdot(\bb\times\cb)\), where the result is a scalar.
- The vector triple product, \(\ab\times(\bb\times\cb)\), where the result is a vector.
When simplifying more complicated expressions, there is a useful identity for the product of two Levi-Civita symbols that share one index.
Lemma 1.1 (Useful formula.) \[ \epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}. \tag{1.1}\]
Proof. The left hand side is \[ \epsilon_{ijk}\epsilon_{klm} = \begin{cases} +1 & \textrm{if $(klm)$ is an even permutation of $(ijk)$},\\ -1 & \textrm{if $(klm)$ is an odd permutation of $(ijk)$},\\ 0 & \textrm{if $(klm)$ is not a permutation of $(ijk)$}. \end{cases} \] So there are six possibilities: \[ \epsilon_{ijk}\epsilon_{klm} = \delta_{ik}\delta_{jl}\delta_{km} - \delta_{ik}\delta_{jm}\delta_{kl} + \delta_{il}\delta_{jm}\delta_{kk} - \delta_{il}\delta_{jk}\delta_{km} + \delta_{im}\delta_{jk}\delta_{kl} - \delta_{im}\delta_{jl}\delta_{kk}. \] Using \(\delta_{kk}=3\) and \(\delta_{ij}\delta_{jk}=\delta_{ik}\), we find \[\begin{align*} \epsilon_{ijk}\epsilon_{klm} &= \delta_{im}\delta_{jl} - \delta_{il}\delta_{jm} + 3\delta_{il}\delta_{jm} - \delta_{il}\delta_{jm} + \delta_{im}\delta_{jl} - 3\delta_{im}\delta_{jl}\\ &= \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}. \end{align*}\]
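As a sanity check on the lemma, identity (1.1) can be verified over all index values numerically. A sketch in Python with NumPy, reusing the explicit \(3\times3\times3\) array for \(\epsilon_{ijk}\) (zero-based indices stand in for \(1,2,3\)):

```python
import numpy as np

# Alternating tensor eps_ijk as an explicit array
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0          # even permutations
for i, j, k in [(0, 2, 1), (1, 0, 2), (2, 1, 0)]:
    eps[i, j, k] = -1.0         # odd permutations

delta = np.eye(3)               # Kronecker delta

# eps_ijk eps_klm, summed over the shared dummy index k
lhs = np.einsum('ijk,klm->ijlm', eps, eps)

# delta_il delta_jm - delta_im delta_jl, for all free indices i, j, l, m
rhs = (np.einsum('il,jm->ijlm', delta, delta)
       - np.einsum('im,jl->ijlm', delta, delta))

print(np.allclose(lhs, rhs))    # the identity holds for all 81 entries
```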