Brownian motion

Gaussian random variables

A Brownian motion is an example of a Gaussian process. Such processes are characterized by the property that every finite collection of their values sampled at different times has a (multivariate) Gaussian distribution. We recall some well-known properties of Gaussian distributions.

A random variable \(X\) is Gaussian with mean \(\mu\) and variance \(\sigma^2\) (written \(X\sim\mathcal{N}(\mu,\sigma^2)\)) if its moment generating function satisfies \[\label{eq:mgf-for-Gaussian-distribution} M_X(t):=\mathsf{E}\,e^{tX}\equiv\exp\Bigl\{\mu t+\frac{\sigma^2t^2}{2}\Bigr\}\,,\qquad t\in\mathbb{C}\,.\] If \(\sigma^2>0\) then \(X\) has density \[\label{eq:Gaussian-variable-density} f_X(x)=\frac{1}{\sqrt{2\pi}\sigma}\exp\Bigl\{-\frac{(x-\mu)^2}{2\sigma^2}\Bigr\}\,,\] otherwise \(X\) is a degenerate variable, \(\mathsf{P}(X=\mu)=1\). If \(X\sim\mathcal{N}(0,1)\), then \(X\) is a standard (Gaussian or normal) variable. A standard Normal variable is symmetric, that is, \(-X\) has the same distribution as \(X\). Indeed, note that the density of \(-X\) is \(f_X(-x)\), which is the same as \(f_X(x)\) when \(X \sim \mathcal{N}(0,1)\).
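For real \(t\), the formula for a standard variable \(Z\sim\mathcal{N}(0,1)\) follows by completing the square in the Gaussian integral, \[\mathsf{E}\,e^{tZ}=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{tx-x^2/2}\,dx =e^{t^2/2}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-(x-t)^2/2}\,dx=e^{t^2/2}\,,\] and the general case follows by writing \(X=\mu+\sigma Z\) (see [exse:Gaussian-variable-transform] below), so that \(\mathsf{E}\,e^{tX}=e^{\mu t}\,\mathsf{E}\,e^{(t\sigma)Z}=\exp\{\mu t+\sigma^2t^2/2\}\).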

Lemma 1. Let \(Z \sim \mathcal{N}(0,1)\). Then for every \(a \geq 1\), \[\mathsf{P}(Z > a) \leq e^{-\frac{a^2}{2}}.\] As a result, \(\mathsf{P}(|Z| \geq a) \leq 2e^{-a^2/2}\) for every \(a \geq 1\).

Proof. We have, since \(x \geq a \geq 1\) on the domain of integration, \[\begin{aligned} \mathsf{P}(Z \geq a) &= \frac{1}{\sqrt{2\pi}} \int_a^{\infty} e^{-\frac{x^2}{2}} \, dx \\ & \leq \frac{1}{\sqrt{2\pi}} \int_a^{\infty} x e^{-\frac{x^2}{2}} \, dx \\ &= - \frac{e^{-x^2/2}}{\sqrt{2\pi}} \Big|_{a}^{\infty} \\ &= \frac{e^{-a^2/2}}{\sqrt{2\pi}} \leq e^{-a^2/2}. \end{aligned}\] Now by symmetry we have that \(\mathsf{P}(|Z| \geq a) =2 \mathsf{P}(Z \geq a)\) and the conclusion follows. ◻

 [exse:Gaussian-variable-transform] (*).

Show that \(X\sim\mathcal{N}(\mu,\sigma^2)\) if and only if \(X=\mu+\sigma Z\), where \(Z\sim\mathcal{N}(0,1)\).

 [exse:Bm-Gaussian-tail-probabilities] (**).

Let \(X\sim\mathcal{N}(0,1)\) be a standard Gaussian random variable. Show that, as \(a\nearrow\infty\), \[\sqrt{2\pi}\,ae^{a^2/2}\mathsf{P}(X\ge a)=ae^{a^2/2}\int_a^\infty e^{-x^2/2}\,dx\nearrow1\,.\] [Hint: Change the variables \(x\mapsto y\) with \(x=\sqrt{a^2+2y}\).]

A vector \(\mathbf{X}=(X_1,X_2)^\mathsf{T}\equiv\left(\begin{smallmatrix}X_1\\X_2\end{smallmatrix}\right)\) is Gaussian with mean \(\boldsymbol{\mu}=(\mu_1,\mu_2)^\mathsf{T}\) and covariance matrix \(\Sigma=\left(\begin{smallmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{smallmatrix}\right)\) (written \(\mathbf{X}\sim\mathcal{N}(\boldsymbol{\mu},\Sigma)\)) iff \[M_\mathbf{X}(\mathbf{t}):=\mathsf{E}e^{t_1X_1+t_2X_2}\equiv\exp\Bigl\{(\mathbf{t},\boldsymbol{\mu})+\tfrac12\mathbf{t}\Sigma\mathbf{t}^\mathsf{T}\Bigr\}\] for all \(\mathbf{t}=(t_1,t_2)\in\mathbb{C}^2\), where \[(\mathbf{t},\boldsymbol{\mu})=t_1\mu_1+t_2\mu_2\,,\qquad \mathbf{t}\Sigma\mathbf{t}^\mathsf{T}=\sigma_{11}(t_1)^2+(\sigma_{21}+\sigma_{12})t_1t_2+\sigma_{22}(t_2)^2\,;\] of course, \(\sigma_{11}=\mathsf{Var}(X_1)\), \(\sigma_{22}=\mathsf{Var}(X_2)\), \(\sigma_{21}=\sigma_{12}=\mathsf{Cov}(X_1,X_2)\). If \(\mathbf{X}\sim\mathcal{N}(\left(\begin{smallmatrix}0\\0\end{smallmatrix}\right),\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right))\), the vector \(\mathbf{X}\) has standard (Gaussian or normal) distribution.

 [exse:Gaussian-bivariate-vector-transform] (**).

Show that \(\mathbf{X}=(X_1,X_2)^\mathsf{T}\sim\mathcal{N}(\boldsymbol{\mu},\Sigma)\) if and only if \(\mathbf{X}=\boldsymbol{\mu}+A\mathbf{Z}\), where \(\mathbf{Z}\sim\mathcal{N}(\left(\begin{smallmatrix}0\\0\end{smallmatrix}\right),\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right))\) and \(A\) is a suitable \(2\times2\) matrix. Then, in particular, \(\Sigma=AA^\mathsf{T}\).

 [exse:Gaussian-bivariate-general-density] (**).

If \(\left(\begin{smallmatrix} a & b \\ b & c \end{smallmatrix}\right)\) is a positive definite matrix, show that \((X,Y)^\mathsf{T}\sim\mathcal{N}\bigl((0,0)^\mathsf{T},\left(\begin{smallmatrix} a & b \\ b & c \end{smallmatrix}\right)\bigr)\) has pdf \(f_{X,Y}(x,y)=\frac{1}{2\pi\sqrt{ac-b^2}}\exp\bigl\{-\frac{cx^2-2bxy+ay^2}{2(ac-b^2)}\bigr\}\).
[Hint: Reduce the problem to the case of diagonal covariance matrix.]

 [exse:independence-of-Gaussian-variables] (**).

Let \((X,Y)^\mathsf{T}\sim\mathcal{N}(\boldsymbol{\mu},\Sigma)\) be a Gaussian vector. Show that \(X\) and \(Y\) are independent if and only if \(\mathsf{Cov}(X,Y)=\mathsf{E}\bigl[(X-\mu_X)(Y-\mu_Y)\bigr]=0\). Deduce that a Gaussian vector \((X_1,X_2,\dots,X_n)\) has independent entries if and only if all its covariances vanish, \(\mathsf{Cov}(X_i,X_j)=0\) for all \(i\neq j\) (i.e., \(\Sigma\) is a diagonal matrix).

It is a general fact that if \(X\) and \(Y\) are independent, then \(\mathsf{Cov}(X,Y)=\mathsf{E}\bigl[(X-\mu_X)(Y-\mu_Y)\bigr]=\mathsf{E}(X-\mu_X)\mathsf{E}(Y-\mu_Y)=0\). Conversely, if \(\mathsf{Cov}(X,Y)=0\), the joint density factorises, \(f_{X,Y}(x,y)=f_X(x)f_Y(y)\), where each factor is a Gaussian density \(f_Z(z)=\tfrac1{\sqrt{2\pi}\sigma_Z}\exp\{-(z-\mu_Z)^2/(2\sigma_Z^2)\}\); hence, \(X\) and \(Y\) are independent.
The general case is similar.
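In the centred bivariate case the factorisation is visible directly from the density in [exse:Gaussian-bivariate-general-density]: taking \(b=\mathsf{Cov}(X,Y)=0\) there gives \[f_{X,Y}(x,y)=\frac{1}{2\pi\sqrt{ac}}\exp\Bigl\{-\frac{cx^2+ay^2}{2ac}\Bigr\} =\frac{1}{\sqrt{2\pi a}}e^{-x^2/(2a)}\cdot\frac{1}{\sqrt{2\pi c}}e^{-y^2/(2c)}=f_X(x)\,f_Y(y)\,.\]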

Notice that the equivalence in [exse:independence-of-Gaussian-variables] does not hold in general: if \((X,Y)\) is not jointly Gaussian, then \(\mathsf{Cov}(X,Y)=0\) need not imply that \(X\) and \(Y\) are independent.

 [exse:Bm-conditional-Gaussian] (***).

For \(\rho^2<1\), let \((X,Y)\) be a random Gaussian vector with joint pdf \(f_{X,Y}(x,y)=\tfrac1{2\pi\sqrt{1-\rho^2}}\exp\bigl\{-\tfrac{x^2-2\rho xy+y^2}{2(1-\rho^2)}\bigr\}\), i.e., \((X,Y)^{\mathsf{T}}\sim\mathcal{N}\bigl((\begin{smallmatrix} 0 \\ 0 \end{smallmatrix}), (\begin{smallmatrix} 1&\rho\\ \rho&1 \end{smallmatrix})\bigr)\). Show that \((Y\mid X=x)\sim\mathcal{N}(\rho x,1-\rho^2)\). More generally, let \((Z,U)^{\mathsf{T}}\sim\mathcal{N}\Bigl((\begin{smallmatrix} \mu_1 \\ \mu_2 \end{smallmatrix}), \bigl(\begin{smallmatrix} \sigma_1^2&\rho\sigma_1\sigma_2\\ \rho\sigma_1\sigma_2&\sigma_2^2 \end{smallmatrix}\bigr)\Bigr)\). Find the distribution of \((U\mid Z=z)\).

 [exse:rotation-of-Gaussian-vector] (**).

Show that if \(X\) and \(Y\) are independent with common \(\mathcal{N}(0,\sigma^2)\) distribution, then \(U=(X+Y)/\sqrt2\) and \(V=(X-Y)/\sqrt2\) are independent Gaussian variables with common distribution \(\mathcal{N}(0,\sigma^2)\).

The following result is often useful in applications. It holds for vectors \(\mathbf{X}=(X_1,X_2,\ldots,X_n)^\mathsf{T}\) in each dimension \(n\ge1\).

 [exse:bivariate-Gaussian-scalar-product] (***).

Show that a vector \(\mathbf{X}=(X_1,X_2)^\mathsf{T}\) is Gaussian if and only if \(Y:=(\mathbf{t},\mathbf{X})\equiv t_1X_1+t_2X_2\) is a Gaussian variable for all \(\mathbf{t}=(t_1,t_2)\in\mathbb{R}^2\).
[Hint: Reduce the problem to the case of diagonal covariance matrix.]

 [exse:multivariate-Gaussian-scalar-product] (***).

Fix integer \(n\ge1\). Show that a vector \(\mathbf{X}=(X_1,\ldots,X_n)^\mathsf{T}\) is Gaussian if and only if \(Y:=(\mathbf{t},\mathbf{X})\equiv t_1X_1+t_2X_2+\dots+t_nX_n\) is a Gaussian variable for all \(\mathbf{t}=(t_1,t_2,\ldots,t_n)\in\mathbb{R}^n\).
[Hint: Reduce the problem to the case of diagonal covariance matrix.]

Brownian motion: definition and basic properties

Brownian motion is perhaps the most important stochastic process in probability theory. It is characterized by a remarkably simple set of properties. The fact that these properties characterize the process is a non-trivial theorem.

Brownian motion is ubiquitous within the sciences because it describes the large-scale behaviour of random walks and random-walk-like processes. It is named after the botanist Robert Brown, who was interested in modelling the motion of pollen in water. The pollen particles are randomly kicked around by water molecules and perform Brownian motion. Albert Einstein studied Brownian motion as the scaling limit of random walks. A mathematically rigorous treatment of Brownian motion was given by Norbert Wiener, who constructed it as a random element of the space of continuous functions. The modern study of Brownian motion is based on the approach of Paul Lévy, which we adopt below.

Definition 2. A real-valued stochastic process \(\bigl( B_t \bigr)_{t\ge0}\) is called a (one-dimensional standard) Brownian motion if
i) \(B_0=0\).
ii) \(B_t\) has independent increments: whenever \(0=t_0<t_1<\ldots<t_n\), the differences \(B_{t_1}-B_{t_0}\), \(B_{t_2}-B_{t_1}\), …, \(B_{t_n}-B_{t_{n-1}}\) are independent.
iii) For every \(0 \leq s < t\), \(B_{t}-B_{s} \sim \mathcal{N}(0,t-s)\).
iv) The trajectory \(t\mapsto B_t\) is (almost surely) continuous.

It is the continuity property (iv) that makes Brownian motion hard to construct as a stochastic process on a common probability space. The “space” where Brownian motion is defined is the set of continuous functions \(f: [0,\infty) \to \mathbb{R}\). This space comes with a natural \(\sigma\)-algebra and Brownian motion is defined as a probability measure on that \(\sigma\)-algebra. To avoid technicalities we are not going to dwell much on these issues and assume, for now, that a stochastic process \(B_t\) satisfying the properties above exists. We are interested in deriving properties of Brownian motion from properties (i)-(iv), which we will treat as axioms. Later on we will give a proper construction of Brownian motion.

Remark 1. By independence of increments, whenever \(0\le s<t\), we have \[\mathsf{E}(B_sB_t)=\mathsf{E}\bigl(B_s^2+B_s(B_t-B_s)\bigr)= \mathsf{E}\bigl(B_s^2\bigr)+\mathsf{E}\bigl(B_s\bigr)\mathsf{E}\bigl(B_t-B_s\bigr) =s =s\land t:=\min(s,t).\] Therefore the covariance function of Brownian motion, viewed as a Gaussian process, is \(\mathsf{E}(B_sB_t)=s \land t\).

Brownian motion is a Gaussian process in that all its finite-dimensional distributions are Gaussian: whenever \(0<t_1<t_2<\dots<t_n\), the vector \(\bigl(B_{t_1},\dots,B_{t_n}\bigr)\) has a multivariate Gaussian (also called Normal) distribution. Recall that each multivariate Gaussian distribution is uniquely determined by its vector of expectations and its covariance matrix. As a result, the Brownian motion is the only Gaussian process with continuous trajectories for which \(\mathsf{E}B_t=0\) and \(\mathsf{E}(B_sB_t)=s\land t\).

Remark 2. By the well-known property of multivariate Gaussian distributions (see [exse:independence-of-Gaussian-variables] above), to check independence of increments of \((B_t)_{t\ge0}\), it is sufficient to verify that the corresponding covariances vanish; e.g., whenever \(0\le s<t<u\), the relation \[\mathsf{E}\bigl((B_t-B_s)(B_u-B_t)\bigr)=\mathsf{E}\bigl(B_tB_u-B_sB_u-B_t^2+B_sB_t\bigr)=t-s-t+s=0\] implies that \(B_u-B_t\) and \(B_t-B_s\) are indeed independent.

Various transformations keep the Brownian motion invariant in distribution. Three important examples are below.

Example 3. The reflection of Brownian motion \(B_t\) is the process \(X_t = - B_t\). Reflection invariance of Brownian motion states that \(-B_t\) is also a Brownian motion. It is immediate to check that \(-B_t\) satisfies properties (i)–(iv) in Definition 2.

Example 4. Brownian motion is scale invariant. If \((B_t)_{t\ge0}\) is a standard Brownian motion, then the process \((X_t)_{t\ge0}\) defined via \[X_t=\tfrac1cB_{c^2t}\] is also a Brownian motion for every \(c > 0\). To verify this we check properties (i)–(iv) in Definition 2 for \(X_t\). Clearly, \(X_0 = 0\) and \(X_t\) is continuous since \(B_t\) is continuous. Since \(X_t\) is a Gaussian process, to check properties (ii) and (iii) it is enough to check that \(\mathsf{E}(X_t X_s) = s \land t\) and \(\mathsf{E}(X_t) =0\). The latter is obvious. For the former, we have \[\mathsf{E}(X_t X_s) = c^{-2} \mathsf{E}(B_{c^2t}B_{c^2s}) = c^{-2} (c^2s \land c^2 t) = s \land t.\] So \(X_t\) satisfies the defining properties of Brownian motion.

Example 5. The time inversion of Brownian motion is defined by \[X_0=0, \quad X_t=tB_{1/t} \;\; \text{for}\;\; t > 0.\] The process \(X_t\) is again a standard Brownian motion (see [exse:Bm-time-inverse]).
We check the latter claim assuming the property \(\mathsf{P}(\lim_{t\to\infty}\tfrac1tB_t=0)=1\), to be verified separately. Property (i) is given in the definition. Next, notice that for \(0<s<t\) we have \(\mathsf{E}(X_sX_t)=st\,\mathsf{E}(B_{1/s}B_{1/t})=s=s\land t\); therefore, whenever \(0\le t_1<t_2<\dots<t_n\), the vector \((X_{t_1},\dots,X_{t_n})\) has a multivariate Gaussian distribution with expectations \(\mathsf{E}X_{t_k}=0\) and covariances \(\mathsf{E}(X_{t_k}X_{t_j})=t_k\land t_j\). This implies the properties (ii) and (iii). Finally, we need to check property (iv), which is continuity. For \(t>0\) the trajectories of \(X_t\) are continuous by property (iv) in Definition 2, and for \(t=0\) this follows from the assumed property \[\mathsf{P}(\lim_{t\to0}X_t=0)=\mathsf{P}(\lim_{t\to0}t B_{1/t}=0)=\mathsf{P}(\lim_{s\to\infty}\tfrac1s B_s=0)=1\,.\]

Markov property and stopping times

Similarly to other processes discussed previously, Brownian motion has the Markov property: given the present state \(B_s\) the future increment \(B_{t+s}-B_s\) is a Brownian motion independent of the past behaviour; formally, whenever \(0\le r_1<r_2<\dots<r_n=s\), the process \((X_t)_{t\ge0}\) given by \[X_t:=B_{t+s}-B_s\] is a Brownian motion independent of the vector \((B_{r_1},\dots,B_{r_n})\).

We say that a random variable \(T\) is a stopping time for \((B_t)\), if for each \(s\ge0\) the event \(\{T\le s\}\) is determined by \((B_r)_{r\in[0,s]}\). A classical example of a stopping time is the hitting time \(T_a:=\min\{t\ge0:B_t=a\}\), where \(a\in\mathbb{R}\). Indeed, the event \(\{T_a \leq s\}\) means \(B_t\) has reached \(a\) by time \(s\), which depends on the trajectory of \(B_t\) until time \(s\).

The strong Markov property states that if \(T\) is a stopping time for \((B_t)\), then the process 1 \((X_t)_{t\ge0}\) given by \[X_t:=B_{T+t}-B_T\] is a Brownian motion independent of \((B_r)_{r\in[0,T]}\).

Lemma 6 (Reflection principle). For \(a>0\), let \(T_a:=\min\{t\ge0:B_t=a\}\) be the first time the Brownian motion \((B_t)\) hits \(a\). Then \[\label{eq:Brownian-motion-reflection-principle} \mathsf{P}\bigl(T_a\le t,B_t>a\bigr)=\mathsf{P}\bigl(T_a\le t, B_t<a\bigr)\,.\]

Proof. By the strong Markov property, the process \(X_s:=B_{T_a+s}-B_{T_a}=B_{T_a+s}-a\) is a Brownian motion independent of \((B_r)_{r\in[0,T_a]}\). The event \(\{ T_a\le t,B_t>a\}\) is the same as \(\{ T_a \leq t, X_{t-T_a} > 0\}\). Now the process \((X_s)_{s\ge0}\) has the same distribution as \((-X_s)_{s\ge0}\) by reflection invariance of Brownian motion. So the probability of the event \(\{ T_a \leq t, X_{t-T_a} > 0\}\) is the same as the probability of the event \(\{ T_a \leq t, -X_{t-T_a} > 0\}\), which is precisely the event \(\{ T_a \leq t, B_t < a\}\). So ([eq:Brownian-motion-reflection-principle]) follows. ◻

Corollary 7. We have \(\mathsf{P}(T_a\le t)=2\mathsf{P}(B_t>a)\) and therefore \(T_a\) has density \(f_{T_a}(s)=\tfrac{a}{\sqrt{2\pi s^3}}e^{-a^2/(2s)}\), \(s>0\).

Proof. It is immediate that \(\{T_a\le t,B_t>a\}=\{B_t>a\}\). Using \(\mathsf{P}(B_t=a)=0\), we deduce \[\mathsf{P}(T_a\le t)=\mathsf{P}\bigl(T_a\le t,B_t>a\bigr)+\mathsf{P}\bigl(T_a\le t, B_t<a\bigr)=2\mathsf{P}(B_t>a)\,.\] Now, changing variables \(x\mapsto a\sqrt{t/s}\) we get \[\mathsf{P}(T_a\le t)=\int_a^\infty\frac{2}{\sqrt{2\pi t}}e^{-x^2/(2t)}\,dx=\int_0^t\frac{a}{\sqrt{2\pi s^3}}e^{-a^2/(2s)}\,ds\,,\] implying the claim. ◻
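As a quick numerical illustration (not needed for anything that follows), the identity \(\mathsf{P}(T_a\le t)=2\mathsf{P}(B_t>a)\) can be checked by simulation. The sketch below assumes NumPy is available and approximates Brownian paths by scaled random walks on a finite grid, so the running maximum (and hence the hitting probability) is slightly underestimated.

```python
import numpy as np
from math import erf, sqrt

# Monte Carlo check of Corollary 7: P(T_a <= t) = P(max_{s<=t} B_s >= a) = 2 P(B_t > a).
rng = np.random.default_rng(seed=1)
a, t, n_steps, n_paths = 1.0, 1.0, 1_000, 10_000
dt = t / n_steps
increments = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
running_max = np.cumsum(increments, axis=1).max(axis=1)

estimate = (running_max >= a).mean()   # empirical P(T_a <= t)
exact = 1 - erf(a / sqrt(2 * t))       # 2 * P(B_t > a) for B_t ~ N(0, t)
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```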

Remark 3. By the scaling property of the Brownian motion, \[\mathsf{P}(T_a\le t)=2\mathsf{P}(B_1>a/\sqrt{t})\nearrow2\mathsf{P}(B_1>0)=1\] as \(t\nearrow\infty\). As a result, \(\mathsf{P}(T_a<\infty)=1\), i.e., \(T_a\) is a finite random variable. This can be used to deduce that \(\limsup\limits_{t\to\infty}B_t=\infty\) with probability one. Indeed, the event \(\{ \limsup_{t \to \infty} B_t = \infty\}\) is implied by the event \(\{T_a < \infty\; \text{for every}\; a\}\). The latter event is \(\cap_{a > 0} \{T_a < \infty\}\), which is a monotone intersection because \(T_a \leq T_b\) if \(a \leq b\). So \[\mathsf{P}\bigl(\limsup_{t \to \infty} B_t = \infty \bigr) \geq \lim_{a \to \infty} \mathsf{P}\bigl(T_a < \infty \bigr) = 1.\] By symmetry, \(\liminf\limits_{t\to\infty}B_t=-\infty\), implying that a typical trajectory of \(B_t\) never stops visiting \(0\), i.e., \(B_t\) is recurrent.

Remark 4. Let \(M_s:=\max_{t\in[0,s]}B_t\) be the maximal value of \((B_t)_{t\in[0,s]}\). By the above, \(\mathsf{P}(M_s \ge a)=\mathsf{P}(T_a\le s)=2\mathsf{P}(B_s\ge a)\searrow0\) as \(a\nearrow\infty\), so that \(\mathsf{P}(M_s <\infty)=1\). Similarly, \(M_s^+:=\max_{t\in[0,s]}|B_t|\) is finite almost surely.

Example 8. The process \((T_a)_{a\ge0}\) has stationary independent increments; see [exse:Bm-first-hitting-time-increments].

Further properties of Brownian motion

Here we discuss some sample path properties of Brownian motion.

Theorem 9. With probability one, \(B_t\) is not differentiable at \(t=0\), meaning \(\lim_{t \to 0} B_t/t\) does not exist.

Applying the above theorem to the process \(X_t = B_{t+s} - B_s\), which is also a Brownian motion, we find that \(B_t\) is not differentiable at any given time \(s\).

Proof. Let \(X_t = t B_{1/t}\) be the time inversion of Brownian motion. Then by changing \(t \to 1/t\) we find that \[\lim_{t \to 0} \frac{B_t}{t} = \lim_{t \to \infty} X_t.\] But \(X_t\) is also a Brownian motion and we have seen that \(\limsup_{t \to \infty} B_t = \infty\) almost surely, so that \(\limsup_{t \to \infty} X_t = \infty\) almost surely as well. Similarly, \(\liminf_{t \to \infty} X_t = -\infty\) almost surely. So \(X_t\) does not have a limit as \(t \to \infty\). ◻

Theorem 10. With probability one, \(B_t\) is not monotone on any interval of positive length.

Proof. If \(B_t\) is monotone on an interval \([a,b]\) then it is monotone on a subinterval of \([a,b]\) with rational endpoints. Thus it is enough to show \(B_t\) is not monotone on any interval with rational endpoints with probability one. As there are countably many such intervals, by a union bound, it is enough to show that given a fixed interval \([a,b]\), \(B_t\) has zero probability of being monotone over \([a,b]\).

Suppose \([a,b]\) is a fixed interval of positive length. Subdivide the interval into \(n\) intervals of equal length with endpoints \(t_0 = a < t_1 < t_2 < \cdots < t_n = b\) and \(t_{k} - t_{k-1} = (b-a)/n\). If \(B_t\) is monotone on \([a,b]\) then \(B_{t_k} - B_{t_{k-1}}\) all have the same sign. These increments are independent and have the distribution \(\mathcal{N}(0, (b-a)/n)\). Thus the probability they all have the same sign is \(2 \cdot 2^{-n}\). So for every \(n\), \[\mathsf{P}\bigl(B_t \;\text{is monotone over}\; [a,b]\bigr) \leq 2^{-n+1}.\] Taking \(n \to \infty\) we find that the probability is zero. ◻

The following strong law of large numbers implies the missing property in Example 5.

Theorem 11. With probability one, \(\tfrac1tB_t\to0\) as \(t\to\infty\).

Proof. For every positive integer \(n\), let \(X_n = B_n - B_{n-1}\). Then the \(X_n\) are independent and identically distributed with law \(\mathcal{N}(0,1)\). By the law of large numbers, \[\frac{B_n}{n} = \frac{1}{n} \sum_{k=1}^n X_k \to 0 \quad \text{almost surely}.\] Let \(A\) be the event where the above convergence takes place, so that \(\mathsf{P}(A) = 1\).

Define the random variables \(M^{+}_n = \max_{t \in [n,n+1)} |B_t - B_n|\). For every \(\epsilon > 0\) consider the event \(\{ M^{+}_n > \epsilon n \; \text{infinitely often}\}\). We will use the Borel-Cantelli lemma below to show that this event has probability zero for every \(\epsilon > 0\). By taking the union of these events over every \(\epsilon = 1, 1/2, 1/3\) and so on, we find that the event \[\{\text{there exists}\; k\; \text{such that}\; M^{+}_n > n/k \; \text{infinitely often over}\; n\}\] has probability zero as well. Let \(B\) be the complement of the above event, so that \(\mathsf{P}(B) = 1\).

We find that \(\mathsf{P}(A \cap B) = 1\). We claim that on the event \(A \cap B\), \(B_t/t\) tends to zero as required. Indeed, choose an \(\epsilon > 0\). On the event \(A\) there is a random \(N\) such that \(|B_n|/n \leq \epsilon/2\) for every \(n \geq N\). On the event \(B\), for every \(k\) there is a random \(N_k\) such that if \(n \geq N_k\) then \(M^{+}_n \leq n/k\). Suppose we choose \(k\) so that \(1/k \leq \epsilon /2\). Denote by \(\lfloor t \rfloor\) the integer part of \(t\). Then if \(t \geq \max \{N, N_k\}\), \(\lfloor t \rfloor \geq \max \{N, N_k\}\) as well, and so \[\frac{|B_t|}{t} \leq \frac{|B_t - B_{\lfloor t \rfloor}|}{t} + \frac{|B_{\lfloor t \rfloor}|}{t} \leq \frac{M^{+}_{\lfloor t \rfloor}}{t} + \frac{|B_{\lfloor t \rfloor}|}{t} \leq \epsilon/2 + \epsilon/2 = \epsilon.\] As \(\epsilon\) is arbitrary we find that \(B_t/t\) tends to zero as \(t \to \infty\) on the event \(A \cap B\).

Now to prove that \(\{ M^{+}_n > \epsilon n \; \text{infinitely often}\}\) has probability zero, by the Borel-Cantelli lemma it suffices to show that \[\sum_{n \geq 1} \mathsf{P}(M^{+}_n > \epsilon n) < \infty.\] Observe that, by the Markov property of Brownian motion, each \(M^{+}_n\) has the same distribution as \(\max_{t \in [0,1]} |B_t|\). Thus, \[\sum_{n \geq 1} \mathsf{P}(M^{+}_n > \epsilon n) = \sum_{n \geq 1} \mathsf{P}\bigl(\max_{t \in [0,1]} |B_t| > \epsilon n\bigr).\] We now relate \(\max_{t \in [0,1]} |B_t|\) to \(M_1 = \max_{t \in [0,1]} B_t\). We have \[\max_{t \in [0,1]} |B_t| = \max \left \{ \max_{t \in [0,1]} B_t, \max_{t \in [0,1]} (- B_t) \right \}.\] Now note that \(\mathsf{P}(\max \{X,Y\} > a) \leq \mathsf{P}(X > a) + \mathsf{P}(Y> a)\) by a union bound. If we take \(X = \max_{t \in [0,1]} B_t\) and \(Y = \max_{t \in [0,1]} (- B_t)\), we find that they both have the distribution of \(M_1\) (since \(-B_t\) has the same distribution as \(B_t\)). So \(\mathsf{P}\bigl( \max_{t \in [0,1]} |B_t| > a\bigr) \leq 2 \mathsf{P}(M_1 > a)\) for every \(a\). We have seen before using the reflection principle that \(\mathsf{P}(M_1 > a) = 2 \mathsf{P}(B_1 > a)\). So, \[\sum_{n \geq 1} \mathsf{P}(M^{+}_n > \epsilon n) \leq 4 \sum_{n \geq 1} \mathsf{P}(B_1 > \epsilon n).\] For \(\epsilon n \geq 1\) we can use Lemma 1 to conclude that \(\mathsf{P}(B_1 > \epsilon n) \leq e^{- \epsilon^2 n^2/2}\). Since \(\sum_n e^{-\epsilon^2 n^2/2}\) is finite for every \(\epsilon > 0\), we deduce that \[\sum_{n \geq 1} \mathsf{P}(M^{+}_n > \epsilon n) < \infty.\] This is what we had to show. ◻

Martingales and the optional stopping theorem

Let \((B_t)_{t\ge0}\) be a standard Brownian motion and let \(\mathcal{F}^B:=(\mathcal{F}^B_s)_{s\ge0}\) be its natural filtration, i.e., the increasing collection of \(\sigma\)-fields where \[\mathcal{F}^B_s:=\sigma\bigl(B_r,0\le r\le s\bigr)\] contains all events in terms of the trajectories \((B_r)_{r\in[0,s]}\).

A process \(M_t\) is a martingale w.r.t. \(\mathcal{F}^B\) if:

  1. For all \(t\ge0\), \(\mathsf{E}|M_t|<\infty\).

  2. For all \(0\le s<t\), \(\mathsf{E}(M_t\mid\mathcal{F}^B_s)\equiv\mathsf{E}\bigl(M_t\mid B_r,0\le r\le s\bigr)=M_s\) .

Example 12. The process \(B_t\) is a martingale (w.r.t. \(\mathcal{F}^B\)).

Solution. Since \(B_t\sim\mathcal{N}(0,t)\), we have \(\mathsf{E}|B_t|<\infty\). Moreover, for \(0\le s<t\) the increment \(B_t-B_s\) is independent of \(\mathcal{F}^B_s\) and has zero mean, so \[\mathsf{E}(B_t\mid\mathcal{F}^B_s)=\mathsf{E}(B_t-B_s\mid\mathcal{F}^B_s)+B_s=\mathsf{E}(B_t-B_s)+B_s=B_s\,.\]

Example 13. For each \(\alpha\in\mathbb{R}\), the process \(M_t:=\exp\bigl\{\alpha B_t-\alpha^2t/2\bigr\}\) is a martingale (w.r.t. \(\mathcal{F}^B\)).

Solution. Since \(\mathsf{E}e^{\alpha B_t}=e^{\alpha^2t/2}\), we have \(\mathsf{E}|M_t|=\mathsf{E}M_t=1<\infty\). For \(0\le s<t\), the increment \(B_t-B_s\) is independent of \(\mathcal{F}^B_s\), so \[\mathsf{E}(M_t\mid\mathcal{F}^B_s)=M_s\,e^{-\alpha^2(t-s)/2}\,\mathsf{E}\bigl(e^{\alpha(B_t-B_s)}\mid\mathcal{F}^B_s\bigr)=M_s\,e^{-\alpha^2(t-s)/2}\,e^{\alpha^2(t-s)/2}=M_s\,.\]

Example 14. For integer \(k\ge1\), let \(N^k_t:=\bigl(\tfrac{d}{d\alpha}\bigr)^kM_t\bigm|_{\alpha=0}\), where \(M_t\) is the exponential martingale from Example 13. One can show that each \(N^k_t\) is a martingale (w.r.t. \(\mathcal{F}^B\)), see, e.g., [exse:N2t-martingale]–[exse:N4t-martingale].
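For instance, differentiating \(M_t=\exp\{\alpha B_t-\alpha^2t/2\}\) with respect to \(\alpha\) and setting \(\alpha=0\) gives \[N^1_t=B_t\,,\qquad N^2_t=(B_t)^2-t\,,\qquad N^3_t=(B_t)^3-3tB_t\,,\qquad N^4_t=(B_t)^4-6t(B_t)^2+3t^2\,,\] which are exactly the processes appearing in [exse:N2t-martingale]–[exse:N4t-martingale].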

Theorem 15 (Optional stopping theorem). Let \((M_t)_{t\ge0}\) be a martingale (w.r.t. \(\mathcal{F}^B\)) with continuous paths. Assume that: a) \(T\) is a finite stopping time for \((B_t)\), i.e., \(\mathsf{P}(T<\infty)=1\), b) the “stopped process” \(M_{T\land t}\) is bounded, i.e., for some constant \(K>0\) and all \(t\ge0\), we have \(|M_{T\land t}|\le K\). Then \(\mathsf{E}M_T=\mathsf{E}M_0\).

Remark 5. It is analogous to the discrete time optional stopping theorem, as the “stopped process” \(\bigl(M_{T\land t}\bigr)_{t\ge0}\) is a bounded martingale.

Example 16. Let \((B_t)\) be a standard Brownian motion. Given positive \(a\), \(b\) let \(T\) be the exit time \[T:=\min\bigl\{t\ge0: B_t\not\in(-a,b)\bigr\}.\] This is the first time \(B_t\) hits \(-a\) or \(b\). Notice that for each \(s\ge0\), \(\{T> s\} =\bigl\{B_r\in(-a,b)\text{ for all } 0\le r \le s\bigr\}\), which is \(\mathcal{F}^B_s\)-measurable, and hence so is its complement \(\{T\le s\}\). So \(T\) is a stopping time for \(B_t\). Also, \(T\equiv\min\{T_{-a},T_b\}\), where \(T_c:=\min\{t\ge0:B_t=c\bigr\}\) is the hitting time of \(c\in\mathbb{R}\). By Remark 3, \(\mathsf{P}(T<\infty)=1\). We also have \(|B_{T\land t}|\le\max(a,b)\le a+b\). Consequently, the optional stopping theorem implies \[0=\mathsf{E}B_0=\mathsf{E}B_T=-a\mathsf{P}(B_T=-a)+b\mathsf{P}(B_T=b)\,.\] As \(\mathsf{P}(B_T=-a) + \mathsf{P}(B_T=b) = 1\), straightforward algebra gives \[\mathsf{P}\bigl(B_T=b\bigr)=\mathsf{P}\bigl(T_b<T_{-a}\bigr)=\tfrac{a}{a+b}\,,\qquad \mathsf{P}\bigl(B_T=-a\bigr)=\mathsf{P}\bigl(T_b>T_{-a}\bigr)=\tfrac{b}{a+b}\,.\]
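A simulation sketch of this computation (assuming NumPy; the step size \(dt\), the number of paths, and the cap on the number of steps are illustrative choices) estimates \(\mathsf{P}(B_T=b)\) by running discretised paths until they leave \((-a,b)\) and compares the result with \(a/(a+b)\).

```python
import numpy as np

# Estimate P(hit b before -a) for Brownian motion started at 0 and compare with a/(a+b).
rng = np.random.default_rng(seed=2)
a, b, dt, n_paths, max_steps = 1.0, 2.0, 1e-3, 5_000, 200_000
x = np.zeros(n_paths)
result = np.full(n_paths, np.nan)      # 1.0 if b is hit first, 0.0 if -a is hit first
for _ in range(max_steps):
    alive = np.isnan(result)
    if not alive.any():
        break
    x[alive] += np.sqrt(dt) * rng.standard_normal(alive.sum())
    result[alive & (x >= b)] = 1.0
    result[alive & (x <= -a)] = 0.0
print(np.nanmean(result), a / (a + b))  # both should be close to 1/3
```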

Remark 6. A similar argument applied to the martingale \((B_t)^2-t\) allows one to compute \(\mathsf{E}T\), as in the discrete-time case; see [exse:exit-time-first-moment].

Example 17. Similarly to Example 16, let \(T:=\min\{t\ge0:|B_t|\ge K\}\) be the exit time from the symmetric interval \((-K,K)\). Since the exponential martingale \(M_t\) from Example 13 satisfies \(|M_{T\land t}|\le e^{|\alpha|K}\), the optional stopping theorem gives \[1=\mathsf{E}M_0=\mathsf{E}M_T=e^{\alpha K}\mathsf{E}(e^{-\alpha^2T/2}\mathbf{1}_{B_T=K})+e^{-\alpha K}\mathsf{E}(e^{-\alpha^2T/2}\mathbf{1}_{B_T=-K})\,,\] where by symmetry \(\mathsf{E}(e^{-\alpha^2T/2}\mathbf{1}_{B_T=K})=\mathsf{E}(e^{-\alpha^2T/2}\mathbf{1}_{B_T=-K})=\tfrac12\mathsf{E}(e^{-\alpha^2T/2})\), so that the previous display becomes \[\label{eq:Bm-exit-time-Laplace-transform} \mathsf{E}e^{-\alpha^2T/2}=1/\cosh(\alpha K)\,.\]

Remark 7. With \(\beta:=\alpha^2/2\), the identity ([eq:Bm-exit-time-Laplace-transform]) becomes \[\mathsf{E}e^{-\beta T}=1/\cosh(\sqrt{2\beta}K)=:\varphi(\beta)\,.\] By repeatedly differentiating \(\varphi(\beta)\) at \(\beta=0\), one can compute polynomial moments of the exit time \(T\).
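For example, expanding for small \(\beta>0\), \[\varphi(\beta)=\frac{1}{\cosh(\sqrt{2\beta}\,K)}=\frac{1}{1+\beta K^2+O(\beta^2)}=1-\beta K^2+O(\beta^2)\,,\] so that \(\mathsf{E}T=-\varphi'(0)=K^2\), in agreement with [exse:exit-time-first-moment] applied with \(a=b=K\).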

Construction of Brownian motion

Existence of Brownian motion is a highly non-trivial fact. Indeed, any “reasonable” measure in the space of all trajectories, say, for \(t\in[0,1]\), will assign zero probability to the set of all continuous trajectories.

Two main approaches to constructing Brownian motion are an abstract construction of the corresponding measure on the space of trajectories 2 and Lévy's explicit construction of the process as a limit of random polygonal paths; we follow the latter below.

Lévy’s construction of Brownian motion

We construct \((B_t)_{t\in[0,1]}\) as a uniform limit of suitably chosen (random) polygonal paths \((B_n(t))_{t\in[0,1]}\). The approach is due to Lévy and uses the so-called Schauder basis on \([0,1]\).

The family of Haar functions \(\{ h_k(t)\}_{k \geq 0}\) is defined for \(0 \leq t \leq 1\) as follows.

\[h_0(t) = 1 \quad \text{for}\quad 0 \leq t \leq 1.\] \[h_1(t) = \left \{ \begin{array}{lr} 1& \quad\text{for}\quad 0 \leq t \leq 1/2\\ -1& \quad\text{for}\quad 1/2 < t \leq 1. \end{array} \right.\]

For \(n \geq 1\) and \(2^n \leq k < 2^{n+1}\), set \[h_k(t) = 2^{n/2}\, h_1\big(2^nt - (k-2^n)\big).\]

Lemma 18. The Haar functions \(\{h_k(t)\}_{k \geq 0}\) form an orthonormal basis of \(L^2([0,1])\).

Proof. For \(2^n \leq k < 2^{n+1}\) we have \(\int_{0}^1 h_k^2(t) \,dt = 2^n (2\cdot \frac{1}{2^{n+1}}) = 1\), and clearly \(\int_0^1 h_0^2\,dt=\int_0^1 h_1^2\,dt=1\). For \(m > n\), either \(h_m\) and \(h_n\) have disjoint supports or \(h_n\) is constant on the support of \(h_m\). In the latter case, denoting this constant value by \(c\), \[\int_0^1 h_m h_n \,dt = c \int_0^1 h_m \,dt = 0.\]

Now we check that if \(\int_0^1 f h_k \,dt = 0\) for all \(k\) then \(f \equiv 0\) in \(L^2([0,1])\). For \(k=0\) this means that \(\int_0^1 f \,dt = 0\). For \(k=1\) we get \(\int_0^{1/2} f \,dt = \int_{1/2}^1 f \,dt\). The \(k=0\) case implies that both of these integrals are zero. Continuing inductively we deduce that \(\int_{k/2^n}^{(k+1)/2^n} f \,dt =0\) for all \(0 \leq k < 2^n\). Thus, for all dyadic rationals \(0 \leq r < s \leq 1\) we deduce that \(\int_r^s f \,dt = 0\). Hence, by approximation, this also holds for all real \(0 \leq r < s \leq 1\). This implies that \(f(t) = 0\) for almost every \(0 \leq t \leq 1\). ◻

The \(k\)-th Schauder function \(s_k(t)\) is defined as \[\label{Schauder} s_k(t) = \int_0^t h_k(s) \, ds \quad \text{for}\quad 0 \leq t \leq 1\;.\]

The graph of \(s_k\) is a tent of height \(2^{-n/2-1}\) lying above the interval \([\frac{k-2^n}{2^n}, \frac{k-2^n +1}{2^n}]\): indeed, \(s_k\) increases with slope \(2^{n/2}\) on the first half of that interval, so its maximum is \(2^{n/2}\cdot 2^{-(n+1)}=2^{-n/2-1}\). As a result, \[||s_k||_{\infty} = \max_{t\in [0,1]} |s_k(t)| = 2^{-n/2 -1} \quad \text{for}\;\; 2^n \leq k < 2^{n+1}.\]

Now, we choose a collection of independent standard Gaussian random variables, say \(\{Z_k\}_{k \geq 0}\), on a common probability space. We define \[\label{SBM} B(t) = \sum_{k \geq 0} Z_k s_k(t) \quad (0 \leq t \leq 1).\] The following two lemmas guarantee that the series converges almost surely and defines a continuous function.

Lemma 19. Let \(z_k\) be a sequence of real numbers such that \[|z_k| = O(k^{\delta})\] for some \(0 \leq \delta < 1/2\). Then the series \[\sum_{k \geq 0} z_k s_k(t)\] converges uniformly for \(0 \leq t \leq 1\) and, therefore, defines a continuous function in \(t\).

Proof. Note that for \(2^n \leq k < 2^{n+1}\), the functions \(s_k\) have disjoint support and their sup norm equals \(2^{-n/2 -1}\). Therefore, \[\sum_{k: 2^n \leq k < 2^{n+1}} |s_k(t)| \leq 2^{-n/2 -1} \quad \text{for every}\; 0 \leq t \leq 1.\] Set \[m_n = \max_{2^n \leq k < 2^{n+1}} |z_k| = O(2^{\delta(n+1)}).\] Then for all \(0 \leq t \leq 1\), \[\begin{aligned} \sum_{k \geq 2^m} |z_k s_k(t)| =& \sum_{n \geq m} \sum_{k: 2^n \leq k < 2^{n+1}} |z_k| |s_k(t)| \\ &\leq \sum_{n \geq m} m_n \sum_{k: 2^n \leq k < 2^{n+1}} |s_k(t)| \\ & \leq \sum_{n \geq m} m_n 2^{-n/2-1}\\ &= O(1) \sum_{n \geq m} 2^{\{\delta(n+1) -n/2\}}\\ &\leq O(1) 2^{(\delta - 1/2) m} \\ &= o(1) \quad \text{as}\; m \to \infty \;\text{since}\; \delta < 1/2. \end{aligned}\] The statement of the lemma follows since the series is Cauchy uniformly. ◻

Lemma 20. Given standard independent Gaussian random variables \(\{Z_k\}_{k \geq 0}\), there exists a random constant \(K\) such that almost surely \[|Z_k| \leq 2 \sqrt{\log k} \quad \text{for}\; k \geq K.\]

Proof. If \(Z\) is a standard Gaussian we can use the bound \(\mathsf{P}( |Z| > a) \leq 2 e^{-a^2/2}\) from Lemma 1. Thus, \[\sum_{k \geq 1} \mathsf{P}( |Z_k| > 2\sqrt{\log k}) \leq \sum_{k \geq 1} 2 e^{- 2\,\log k} = \sum_{k \geq 1} \frac{2}{k^2} < \infty\,.\] By the Borel-Cantelli lemma, the event \(\{ |Z_k| > 2 \sqrt{\log k} \;\text{infinitely often} \}\) occurs with zero probability. So there is a random constant \(K\) such that \[|Z_k| \leq 2 \sqrt{\log k} \quad \text{if}\; k \geq K\,.\] ◻

Combining Lemmas 19 and 20, it follows that equation ([SBM]) defines \(B(t)\) as a random continuous function on \(0 \leq t \leq 1\) for almost every realization of the \(Z_k\)'s. Clearly, \(B(0) = 0\) since \(s_k(0) = 0\) for every \(k\). We now show that \(B(t)\) satisfies the remaining two properties, (ii) and (iii), from Definition 2 of a standard Brownian motion. We will then extend \(B(t)\) to all \(t \geq 0\).
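The construction is easy to visualise numerically. The following sketch (assuming NumPy is available; the function names and the number of levels are illustrative) evaluates the partial sums of ([SBM]) over all \(k<2^{L}\) for a chosen number of levels \(L\), producing a random continuous path on \([0,1]\).

```python
import numpy as np

def schauder(k, t):
    """Evaluate the k-th Schauder function s_k(t), t in [0,1]."""
    t = np.asarray(t, dtype=float)
    if k == 0:
        return t                        # integral of h_0 = 1
    n = int(np.floor(np.log2(k)))       # level: 2^n <= k < 2^{n+1} (k = 1 gives n = 0)
    j = k - 2 ** n                      # position of the tent within the level
    # tent of height 2^(-n/2-1) above [j/2^n, (j+1)/2^n]
    return 2 ** (n / 2) * np.minimum(np.maximum(t - j / 2 ** n, 0.0),
                                     np.maximum((j + 1) / 2 ** n - t, 0.0))

def levy_partial_sum(t_grid, levels=10, seed=None):
    """Partial sum of B(t) = sum_k Z_k s_k(t) over all k < 2**levels."""
    rng = np.random.default_rng(seed)
    B = np.zeros_like(t_grid, dtype=float)
    for k in range(2 ** levels):
        B += rng.standard_normal() * schauder(k, t_grid)
    return B

t = np.linspace(0.0, 1.0, 1001)
path = levy_partial_sum(t, levels=10, seed=0)
print(path[-1])   # equals Z_0, an N(0,1) sample, since s_k(1) = 0 for k >= 1
```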

Lemma 21. For \(0 \leq s, t \leq 1\), we have \[\sum_{k \geq 0} s_k(s)\,s_k(t) = \min\{s,t\}\,.\]

Proof. Assume that \(s \leq t\). Then, by Lemma 18 and Parseval's identity, \[s = \int_0^1 \mathbf{1}_{x \leq s} \,\mathbf{1}_{x \leq t}\, dx = \sum_{k \geq 0} a_k b_k\,,\] where \[a_k = \int_0^1 \mathbf{1}_{x \leq s} \, h_k(x) \,dx = s_k(s), \quad b_k = \int_0^1 \mathbf{1}_{x \leq t} \, h_k(x) \,dx = s_k(t)\,.\] ◻

Now we prove that for \(0 \leq s \leq t \leq 1\), \(B(t) - B(s) \sim \mathcal{N}(0,t-s)\). To this end, we compute the moment generating function of \(B(t) - B(s)\).

\[\begin{aligned} \mathsf{E}\left [\exp \{\lambda (B(t)-B(s))\}\right ] &= \mathsf{E}\left [ \exp \{\lambda \sum_{k \geq 0} Z_k(s_k(t)-s_k(s))\}\right ]\\ &= \prod_{k \geq 0} \mathsf{E}\left [ \exp \{\lambda \, Z_k\,(s_k(t)-s_k(s)) \}\right ] \quad (\text{independence})\\ &= \prod_{k \geq 0} \exp \{\frac{\lambda^2}{2} \, (s_k(t) - s_k(s))^2 \} \quad (\text{since $Z_k$ is a standard Gaussian})\\ &= \exp \left \{\frac{\lambda^2}{2} \sum_{k \geq 0} (s_k(t)^2 - 2s_k(s)s_k(t) + s_k(s)^2) \right \}\\ &= \exp \left \{\frac{\lambda^2}{2} (t -2s+s) \right \} \quad (\text{by Lemma 21})\\ &= \exp \{\frac{\lambda^2}{2} (t-s)\} \,. \end{aligned}\] The uniqueness of moment generating functions implies that \(B(t) - B(s) \sim \mathcal{N}(0,t-s)\).

Now we show that if \(0 = t_0 < t_1 < t_2 < \cdots < t_n\) then \[\label{eqn:indinc} \mathsf{E}\left [\exp \left \{\sum_{j=1}^n \lambda_j \, (B(t_j)-B(t_{j-1})) \right \}\right ] = \prod_{j=1}^n \exp \left \{\frac{\lambda_j^2}{2} \, (t_j-t_{j-1}) \right \}\,.\]

This will imply, from the uniqueness of moment generating functions, that the Gaussian random variables \(B(t_1), B(t_2)-B(t_1), \ldots, B(t_n) - B(t_{n-1})\) are all independent. The proof of Equation ([eqn:indinc]) is a tedious calculation similar to the one above. We shall show it for \(n=2\) and leave the reader to finish the rest by induction.

We note that \[\mathsf{E}\left[ \exp \left \{\lambda_1B(t_1) + \lambda_2(B(t_2) - B(t_1)) \right \}\right ] = \mathsf{E}\left [ \exp \left \{(\lambda_1-\lambda_2)B(t_1) + \lambda_2B(t_2) \right \} \right ]\,.\] Then, \[\begin{aligned} \mathsf{E}\left [ \exp \left \{(\lambda_1-\lambda_2)B(t_1) + \lambda_2B(t_2) \right \} \right ] &= \mathsf{E}\left [ \exp \left \{(\lambda_1-\lambda_2) \sum_{k \geq 0} Z_k s_k(t_1) + \lambda_2 \sum_{k \geq 0} Z_k s_k(t_2) \right \} \right ]\\ &= \prod_{k \geq 0} \mathsf{E}\left [ \exp \{Z_k [(\lambda_1-\lambda_2)s_k(t_1) + \lambda_2 s_k(t_2)]\}\right ]\\ &= \prod_{k \geq 0} \exp \left \{ \frac{1}{2} [(\lambda_1-\lambda_2)s_k(t_1) + \lambda_2s_k(t_2)]^2 \right \}\\ &= \exp \left \{\frac{1}{2} [(\lambda_1-\lambda_2)^2t_1 + 2(\lambda_1-\lambda_2)\lambda_2t_1 + \lambda_2^2t_2] \right \} \quad (\text{by Lemma 21})\\ &= \exp \left \{\frac{1}{2}[\lambda_1^2t_1 + \lambda_2^2(t_2-t_1)] \right \}\,. \end{aligned}\]

Theorem 22. Let \((\Omega, \Sigma, \mathsf{P})\) be a probability space where countably many independent standard Gaussian random variables are defined. Then there exists a standard one dimensional Brownian motion \(B(t,\omega)\) defined for \(t \geq 0\) and \(\omega \in \Omega\).

Proof. We have demonstrated above how to build a standard one dimensional Brownian motion \(B(t)\) for \(0 \leq t \leq 1\). Now, splitting a countable collection of independent standard Gaussians into countably many infinite subfamilies, we can construct independent standard Brownian motions \(B_n(t)\) for \(0 \leq t \leq 1\) and \(n \geq 1\). Define \(B(t)\) for \(t \geq 0\) inductively by \[B(t) = B(n-1) + B_n(t-(n-1)) \quad \text{for}\; n-1 \leq t < n.\]

Then \(B(t)\) is a standard one dimensional Brownian motion defined for all \(t \geq 0\). ◻

Nondifferentiability of Brownian trajectories

This chapter is optional and will not be examined.

Notice that if a function \(f:[0,1]\to\mathbb{R}\) is differentiable at \(s\in[0,1]\), then the ratio \(|f(t)-f(s)|/|t-s|\) must be uniformly bounded for all \(t\neq s\) with \(|t-s|\) small enough. Fix a positive constant \(K<\infty\) and let \[A_n\equiv A_n(K):=\Bigl\{\exists\, s\in[0,1]\text{ such that }|B_t-B_s|\le K|t-s|\text{ whenever }|t-s|\le\tfrac3{2^n}\Bigr\}\,.\] We will use the Borel-Cantelli lemma to show that \[\mathsf{P}(A_n \; \text{infinitely often})=0.\] This implies that typical Brownian paths are not differentiable (even not Lipschitz!) at any point \(s\in[0,1]\).

We approximate the last event by a simpler one, the probability of which can be bounded above. Let \(n\ge3\) and assume that the event \(A_n(K)\) occurs for some \(s\in[0,1/2]\) 3 ; let \(k=\lfloor{2^ns}\rfloor\), so that \(2^n-k\ge 2^n(1-s)\ge4\). Then for each \(j=0,1,2\) we have (draw a picture!) \[\bigl|B(\tfrac{k+j+1}{2^n})-B(\tfrac{k+j}{2^n})\bigr|\le\bigl|B(\tfrac{k+j+1}{2^n})-B(s)\bigr|+\bigl|B(s)-B(\tfrac{k+j}{2^n})\bigr|\le\tfrac{5K}{2^n}\,.\] Consider the deviation event \[D_{n,k}(K):=\bigcap_{j=0}^2\Bigl\{\bigl|B(\tfrac{k+j+1}{2^n})-B(\tfrac{k+j}{2^n})\bigr|\le\tfrac{5K}{2^n}\Bigr\}\] and let \(E_n(K):=\cup_{k=0}^{2^n-3}D_{n,k}(K)\). We obviously have \(A_n(K)\subseteq E_n(K)\) with \[\mathsf{P}\bigl(E_n(K)\bigr)\le 2^n\mathsf{P}\bigl(D_{n,0}(K)\bigr)\,,\] where, by independence and stationarity of increments, and scaling, \[\mathsf{P}\bigl(D_{n,0}(K)\bigr)=\bigl(\mathsf{P}(|B(\tfrac1{2^n})|\le\tfrac{5K}{2^n})\bigr)^3=\bigl(\mathsf{P}(|B(1)|\le\tfrac{5K}{\sqrt{2^n}})\bigr)^3\,.\] Since \(B(1)\) has a standard Normal distribution, we can use that \(\mathsf{P}(|B(1)| \leq \varepsilon) \leq 2 \varepsilon\) to deduce that \[\mathsf{P}\bigl( D_{n,0}(K)\bigr) \leq M(K) \, 2^{-3n/2}\] for a constant \(M(K)\) depending only on \(K\). So then \(\mathsf{P}(A_n(K))\le\mathsf{P}\bigl(E_n(K)\bigr)\le M(K)\,2^{-n/2}\).

Since \(\sum_n 2^{-n/2}<\infty\), the Borel-Cantelli lemma implies that \(\mathsf{P}(A_n(K)\; \text{infinitely often}) = 0\). Since this holds for every \(K\), a union bound over integer values of \(K\) (\(K = 1,2,3, \ldots\)) shows that \[\mathsf{P}(\exists \; K \; \text{s.t.}\; A_n(K)\; \text{infinitely often over}\; n) = 0.\]

Finally, if \((B_t)_{t\in[0,1]}\) is differentiable at some point in \([0,1]\) then for some integer \(K\), all \(A_n(K)\) with sufficiently large \(n\) must hold. However, this event has probability zero as shown above. So, \[\mathsf{P}\bigl(B_s\text{ is nowhere differentiable in }[0,1]\bigr)=1\,.\]

Problems

 [exse:Bm-time-scale] (**).

Let \((B_t)_{t\ge0}\) be a standard Brownian motion. For fixed \(c>0\), show that the process \((X_t)_{t\ge0}\) with \(X_t=\tfrac1{c}B_{c^2t}\) is a standard Brownian motion. [Hint: Check that \(X_t\) has all properties required for Brownian motions.]

 [exse:Bm-time-inverse] (***).

Let \((B_t)_{t\ge0}\) be a standard Brownian motion. Define \((X_t)_{t\ge0}\) via \(X_0=0\) and, for \(t>0\), via \(X_t=tB_{1/t}\). Show that \(X_t\) is a standard Brownian motion.
[Hint: You may assume, without proof, that \(\lim\limits_{t\to\infty}\tfrac 1tB_t=0\) with probability one.]

 [exse:Bm-Brownian-bridge] (***).

Let \((B_t)_{t\ge0}\) be a standard Brownian motion. For \(t\in[0,1]\), define \(B^0_t:=B_t-tB_1\). For \(0\le s<t\le1\), find the distribution of the vector \((B^0_s,B^0_t)\). The process \((B^0_t)_{t\in[0,1]}\) is known as the Brownian bridge.

 [exse:Bm-average-future-value] (**).

Let \((B_t)_{t\ge0}\) be a standard Brownian motion. For all \(0\le s<t\), show that \(\mathsf{E}(B_t\mid B_s)=B_s\). Deduce that for all \(n\ge1\) and all \(0=r_0<r_1<\ldots<r_n<s\), we have \(\mathsf{E}(B_t\mid B_s,B_{r_n},\dots,B_{r_1})=B_s\).

 [exse:Bm-first-hitting-time-increments] (***).

For \(a\ge0\), let \(T_a=\min\{t\ge0:B_t=a\}\) be the first time the standard Brownian motion \((B_t)_{t\ge0}\) hits \(a\). Show that \((T_a)_{a\ge0}\) has stationary independent increments, namely:
1) if \(a<b\), then the distribution of \(T_b-T_a\) is the same as \(T_{b-a}\);
2) if \(0=a_0<a_1<\dots<a_n\), then \(T_{a_1}-T_{a_0}\), …, \(T_{a_n}-T_{a_{n-1}}\) are independent.

 [exse:Bm-bridge-via-conditioning] (**).

Let \((B_t)_{t\ge0}\) be a standard Brownian motion. By using [exse:Bm-conditional-Gaussian] or otherwise, find the distribution of \((B_t\mid B_1=0)_{t\in[0,1]}\).

 [exse:Bm-integral-of-the-path] (***).

Let \(Y_t:=\int_0^tB_s\,ds\), where \((B_t)_{t\ge0}\) is a standard Brownian motion. Find \(\mathsf{E}Y_t\), \(\mathsf{E}\bigl[(Y_t)^2\bigr]\), and \(\mathsf{E}\bigl[Y_sY_t\bigr]\) with \(0\le s<t\). What is the distribution of the vector \((Y_s,Y_t)\)?

 [exse:N2t-martingale] (**).

Show that \(N^2_t:=(B_t)^2-t\) is a martingale w.r.t. \(\mathcal{F}^B\).

 [exse:N3t-martingale] (***).

Show that \(N^3_t:=(B_t)^3-3tB_t\) is a martingale w.r.t. \(\mathcal{F}^B\).

 [exse:N4t-martingale] (***).

Show that \(N^4_t:=(B_t)^4-6t(B_t)^2+3t^2\) is a martingale w.r.t. \(\mathcal{F}^B\).

 [exse:exit-time-first-moment] (**).

By carefully applying the optional stopping theorem to the martingale \(N^2_t\) from [exse:N2t-martingale], show that the exit time \(T:=\min\{t\ge0:B_t\not\in(-a,b)\}\) satisfies \(\mathsf{E}T=ab\).

 [exse:exit-time-second-moment] (**).

By carefully applying the optional stopping theorem to the martingale \(N^4_t\) from [exse:N4t-martingale], compute the second moment of the exit time \(T:=\min\{t\ge0:|B_t|\ge K\}\). [Hint: Use the fact that by [exse:exit-time-first-moment], \(\mathsf{E}T=K^2\).]

 [exse:ruin-probability-for-positively-dritting-difffusion] (****).

For positive \(a\) and \(\mu\), let \(X_t=a+\mu t+\sigma B_t\), where \((B_t)\) is a standard Brownian motion. By carefully applying the optional stopping theorem to the exponential martingale \(M_t=e^{\alpha B_t-\alpha^2t/2}\) with \(\alpha=-\tfrac{2\mu}\sigma\), show that the “ruin time” \(R:=\min\{t\ge0:X_t=0\}=\min\{t\ge0:B_t=-(a+\mu t)/\sigma\}\) satisfies \(\mathsf{P}(R<\infty)=e^{-2\mu a/\sigma^2}\).
[Hint: Show that \(\mathsf{E}\bigl(M_{R\land t}\mathbf{1}_{R\le t}\bigr)=e^{-\alpha a/\sigma}\mathsf{P}(R\le t)\) and \(\mathsf{E}\bigl(M_{R\land t}\mathbf{1}_{R>t}\bigr)\to 0\) as \(t\to\infty\).]


  1. Strictly speaking, we need to consider \((X_t|T<\infty)_{t\ge0}\) here, as otherwise \(X_t\) is undefined.↩︎

  2. we will not do this here!↩︎

  3. otherwise \(s\in(1/2,1]\) and we take \(k:=\lceil{2^ns}\rceil\ge 2^ns\ge4\) with \(j=-1,-2,-3\).↩︎