Stochastic Processes (2021/22)
Michaelmas term 1

4 Martingales

4.1 Definition and some examples

A martingale is a generalized version of a “fair game”.

A process \((M_n)_{n\ge0}\) is a martingale if

  1. for every \(n\ge0\) the expectation \({\mathsf{E}}M_n\) is finite, equivalently, \({\mathsf{E}}|M_n|<\infty\);

  2. for every \(n\ge0\) and all \(m_n\), \(m_{n-1}\), …, \(m_0\) we have \[\begin{equation} \label{eq:martingale-def} {\mathsf{E}}\bigl(M_{n+1}\mid M_n=m_n,\dots,M_0=m_0\bigr)=m_n\,. \end{equation}\]

For those familiar with the notion of conditioning on a random variable (see next section), \(\eqref{eq:martingale-def}\) can just be written as \({\mathsf{E}}\bigl(M_{n+1}\mid M_n,\dots,M_0\bigr)=M_n\).

We say that \((M_n)_{n\ge0}\) is a supermartingale2 if the equality in \(\eqref{eq:martingale-def}\) holds with \(\le\), ie., \({\mathsf{E}}\bigl(M_{n+1}\mid M_n,\dots,M_0\bigr)\le M_n\); and we say that \((M_n)_{n\ge0}\) is a submartingale, if \(\eqref{eq:martingale-def}\) holds with \(\ge\), ie., \({\mathsf{E}}\bigl(M_{n+1}\mid M_n,\dots,M_0\bigr)\ge M_n\).

Let \((\xi_n)_{n\ge1}\) be independent random variables  3 with \[{\mathsf{E}}\xi_n=0\] for all \(n\ge1\). Then the process \((M_n)_{n\ge0}\) defined via \[M_n\stackrel{{\sf def}}{=}M_0+\xi_1+\dots+\xi_n\] is a martingale as long as the random variable \(M_0\) is independent of \((\xi_n)_{n\ge1}\) and \({\mathsf{E}}|M_0|<\infty\). For example, we will often take \(M_0=0\) or some other deterministic constant.

Indeed, by the triangle inequality, \({\mathsf{E}}|M_n|\le{\mathsf{E}}|M_0|+\sum_{j=1}^n{\mathsf{E}}|\xi_j|<\infty\) for all \(n\ge0\); on the other hand, the independence property implies \[{\mathsf{E}}\bigl(M_{n+1}-M_n\mid M_n,\dots,M_0\bigr)\equiv{\mathsf{E}}\bigl(\xi_{n+1}\mid M_n,\dots,M_0\bigr)={\mathsf{E}}\xi_{n+1}=0\,.\]

Notice that if \({\mathsf{E}}\xi_n\ge0\) for all \(n\ge1\), then \((M_n)_{n\ge0}\) is a submartingale, whereas if \({\mathsf{E}}\xi_n\le0\) for all \(n\ge1\), then \((M_n)_{n\ge0}\) is a supermartingale. More generally, if \((\xi_n)_{n\ge1}\) are independent random variables with \({\mathsf{E}}|\xi_n|<\infty\) for all \(n\ge1\), then the process \(M_n=M_0+(\xi_1-{\mathsf{E}}\xi_1)+\dots+(\xi_n-{\mathsf{E}}\xi_n)\), \(n\ge0\), is a martingale.
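
Taking expectations in \(\eqref{eq:martingale-def}\) shows that \({\mathsf{E}}M_n={\mathsf{E}}M_0\) for every \(n\ge0\), and this is easy to probe numerically. The following Python sketch (purely illustrative; the centred step distributions and all parameter values are arbitrary choices of ours) estimates \({\mathsf{E}}M_n\) by Monte Carlo.

```python
import random

def estimate_mean(n_steps, n_paths=100_000, seed=0):
    """Monte Carlo estimate of E[M_n] for M_n = M_0 + xi_1 + ... + xi_n with M_0 = 0,
    where the xi_j are independent and centred (here: +/-1 steps and uniform steps,
    an arbitrary illustrative choice)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        m = 0.0                              # M_0 = 0
        for j in range(1, n_steps + 1):
            if j % 2:                        # odd steps: symmetric +/-1
                m += rng.choice((-1, 1))
            else:                            # even steps: uniform on [-1, 1], mean 0
                m += rng.uniform(-1.0, 1.0)
        total += m
    return total / n_paths

for n in (1, 5, 20):
    print(n, estimate_mean(n))               # all estimates should be close to E[M_0] = 0
```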

Given a sequence \((\xi_k)_{k\ge1}\) of independent Bernoulli variables with common distribution \({\mathsf{P}}(\xi=1)=p\) and \({\mathsf{P}}(\xi=-1)=q=1-p\), define the generated random walk via \(X_n=\sum_{k=1}^n\xi_k\), \(n\ge0\). Show that the process \(M_n=(q/p)^{X_n}\) is a martingale.

Let \((Z_n)_{n\ge0}\) be a branching process with \({\mathsf{E}}Z_1=m<\infty\). We have \({\mathsf{E}}|Z_n|<\infty\) and \({\mathsf{E}}(Z_{n+1}\mid Z_n,\dots,Z_0)=mZ_n\) for all \(n\ge0\). In other words, the process \((Z_n)_{n\ge0}\) is a martingale, a submartingale or a supermartingale depending on whether \(m=1\), \(m>1\) or \(m<1\).
Notice also that for every \(m\in(0,\infty)\) the process \((Z_n/m^n)_{n\ge0}\) is a martingale.
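
As a quick numerical illustration of the last claim (not part of the argument above), the following Python sketch estimates \({\mathsf{E}}(Z_n/m^n)\) for an arbitrarily chosen offspring law with \(m=1.5\); the estimates should stay close to \(Z_0=1\).

```python
import random

def estimate_Zn_over_mn(n, n_paths=50_000, seed=1):
    """Estimate E[Z_n / m^n] for a Galton-Watson process whose offspring law
    (an arbitrary illustrative choice) puts mass 1/4 on 0 and 3/4 on 2,
    so that m = E[Z_1] = 1.5."""
    rng = random.Random(seed)
    m = 1.5
    total = 0.0
    for _ in range(n_paths):
        z = 1                                           # Z_0 = 1
        for _ in range(n):
            z = sum(2 if rng.random() < 0.75 else 0 for _ in range(z))
        total += z / m**n
    return total / n_paths

for n in (1, 3, 6):
    print(n, estimate_Zn_over_mn(n))                    # estimates should be close to 1
```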

Let \(\rho\) be the extinction probability for a branching process \((Z_n)_{n\ge0}\); show that \(M_n=\rho^{Z_n}\) is a martingale.

We can also construct many examples of submartingales and supermartingales if we have a martingale and apply certain functions to it. This is a consequence of:

[Conditional Jensen’s inequality] Suppose that \(X\) is an integrable random variable on \((\Omega, \mathcal{F},{\mathsf{P}})\) and \(A\in \mathcal{F}\) with \({\mathsf{P}}(A)>0\).

Suppose that \(\varphi\) is a convex function such that \(\varphi(X)\) is also integrable. Then \[{\mathsf{E}}(\varphi(X)|A)\ge \varphi({\mathsf{E}}(X|A)).\]

This follows from the non-conditional version of Jensen’s inequality, and using the fact that \({\mathsf{P}}(\cdot|A)\) is a valid probability measure.

Let \((X_n)_{n\ge0}\) be a martingale. If, for some convex function \(f(\,\cdot\,)\) we have \({\mathsf{E}}\bigl(|f(X_n)|\bigr)<\infty\) for all \(n\ge0\), then the process \(f(X):=\bigl(f(X_n)\bigr)_{n\ge 0}\) is a submartingale.

Similarly, if for some concave \(f(\,\cdot\,)\) we have \({\mathsf{E}}\bigl(|f(X_n)|\bigr)<\infty\) for all \(n\ge0\), then the process \(f(X)\) is a supermartingale.

Both these claims follow immediately from the Jensen inequalities for conditional expectations.

4.2 A few remarks on conditional expectation

Recall the following basic definition:

Let \((\Omega,{\mathcal{F}},{\mathsf{P}})\) be a probability triple and let \(X\) and \(Y\) be random variables taking values \(x_1\), \(x_2\), …, \(x_m\) and \(y_1\), \(y_2\), …, \(y_n\) respectively. On the event \(\{Y=y_j\}\) one defines the conditional probability \[{\mathsf{P}}\bigl(X=x_i\mid Y=y_j\bigr)\stackrel{{\sf def}}{=}\frac{{\mathsf{P}}(X=x_i,Y=y_j)}{{\mathsf{P}}(Y=y_j)}\] and the conditional expectation: \({\mathsf{E}}\bigl(X\mid Y=y_j\bigr)\equiv\sum_ix_i{\mathsf{P}}\bigl(X=x_i\mid Y=y_j\bigr)\). Then the conditional expectation \(Z={\mathsf{E}}(X\,|\,Y)\) of \(X\) given \(Y\) is defined as follows: \[\text{if } Y(\omega)=y_j, \quad\text{ then }\quad Z(\omega)\stackrel{{\sf def}}{=}z_j\equiv{\mathsf{E}}\bigl(X\mid Y=y_j\bigr)\,.\] Notice that the value \(z_j\) is completely determined by \(y_j\); in other words, \(Z\) is a function of \(Y\), and as such, a random variable. Of course, if \(X\) and \(Y\) are independent, we have \(Z(\omega)\equiv{\mathsf{E}}X\).
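
The construction above is purely arithmetic, so it can be carried out mechanically from a table of joint probabilities. The following Python sketch (the joint law and the function name are illustrative choices of ours) computes the map \(y_j\mapsto z_j\) and checks that \({\mathsf{E}}Z={\mathsf{E}}X\).

```python
from collections import defaultdict

def conditional_expectation(joint):
    """Given {(x, y): P(X=x, Y=y)}, return ({y: E(X|Y=y)}, {y: P(Y=y)})."""
    p_y = defaultdict(float)                 # marginal P(Y = y)
    num = defaultdict(float)                 # sum over x of x * P(X=x, Y=y)
    for (x, y), p in joint.items():
        p_y[y] += p
        num[y] += x * p
    return {y: num[y] / p_y[y] for y in p_y}, dict(p_y)

# An arbitrary joint law of (X, Y), used purely for illustration.
joint = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.4, (2, 1): 0.2}
z, p_y = conditional_expectation(joint)
print(z)                                     # {0: 0.75, 1: 0.666...}

# Z = E(X|Y) is the random variable  omega -> z[Y(omega)]; averaging its values
# against P(Y = y) recovers E(X) = 0.7 (property 1 of the lemma below).
print(sum(z[y] * p_y[y] for y in p_y))       # 0.7
```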

If \({\mathcal{D}}=\{D_1,D_2,\dots,D_m\}\) is a finite partition  4 of \(\Omega\), the collection \({\mathcal{G}}=\sigma({\mathcal{D}})\) of all \(2^m\) possible subsets of \(\Omega\) constructed from blocks \(D_i\) is called the \(\sigma\)-field generated by \({\mathcal{D}}\). A random variable \(Y\) is measurable with respect to \({\mathcal{G}}\) if it is constant on every block \(D_i\) of the initial partition \({\mathcal{D}}\).

In [exmle:conditional-expectation] above we see that \(Z\) is constant on the events \(\{Y=y_j\}\); ie., \(Z\) is measurable w.r.t. the \(\sigma\)-field \(\sigma(Y)\equiv\sigma(\{Y=y_j\}_j)\). Moreover, for every \(G_j\equiv\{Y=y_j\}\), we have \[\begin{aligned} {\mathsf{E}}\bigl(Z\,\mathsf{1}_{G_j})&=z_j{\mathsf{P}}(Y=y_j)=\sum_ix_i{\mathsf{P}}\bigl(X=x_i\mid Y=y_j\bigr){\mathsf{P}}(Y=y_j) \\&=\sum_ix_i{\mathsf{P}}\bigl(X=x_i,Y=y_j\bigr)={\mathsf{E}}\bigl(X\mathsf{1}_{G_j})\,, \end{aligned}\] where the last equality follows from the observation that the random variable \(X\mathsf{1}_{G_j}\) equals \(x_i\) on every event \(G_j\cap\{X=x_i\}=\{Y=y_j\}\cap\{X=x_i\}\) and vanishes identically outside the set \(G_j\).

The last two properties can be used to define conditional expectation in general: Let \((\Omega, {\mathcal{F}},{\mathsf{P}})\) be a probability triple, let \({\mathcal{G}}\subseteq{\mathcal{F}}\) be a \(\sigma\)-field, and let \(X\) be a rv, \(X:\Omega\to{\mathbb{R}}\). The conditional expectation of \(X\) w.r.t. \({\mathcal{G}}\) is the unique random variable \(Z\) such that: \(Z\) is \({\mathcal{G}}\) measurable and for every set \(G\in{\mathcal{G}}\) we have \(\int_GZ{\mathsf{d}}{\mathsf{P}}\equiv{\mathsf{E}}\bigl(Z\mathsf{1}_G)={\mathsf{E}}\bigl(X\mathsf{1}_G)\equiv\int_GX{\mathsf{d}}{\mathsf{P}}\).

Notice that when \(\mathcal{G}=\sigma(\mathcal{D})\), the definition in the remark above coincides with Definition [def:ce_partition].

The following are the most important properties of conditional expectation:

Let \((\Omega, {\mathcal{F}},{\mathsf{P}})\) be a probability triple, and let \({\mathcal{G}}\subset{\mathcal{F}}\) be a \(\sigma\)-field. Then for all random variables \(X\), \(X_1\), \(X_2\) and constants \(a_1\), \(a_2\), the following properties hold:

  1. If \(Z={\mathsf{E}}\bigl(X\,|\,{\mathcal{G}}\bigr)\) then \({\mathsf{E}}Z={\mathsf{E}}X\);

  2. If \(X\) is \({\mathcal{G}}\)-measurable, then \({\mathsf{E}}\bigl(X\,|\,{\mathcal{G}}\bigr)=X\);

  3. \({\mathsf{E}}\bigl(a_1X_1+a_2X_2\,|\,{\mathcal{G}}\bigr)=a_1{\mathsf{E}}\bigl(X_1\,|\,{\mathcal{G}}\bigr)+a_2{\mathsf{E}}\bigl(X_2\,|\,{\mathcal{G}}\bigr)\);

  4. If \(X\ge0\), then \({\mathsf{E}}\bigl(X\,|\,{\mathcal{G}}\bigr)\ge0\);

  5. [tower-property-conditional-expectation] If \({\mathcal{H}}\) and \({\mathcal{G}}\) are two \(\sigma\)-fields in \(\Omega\) such that \({\mathcal{H}}\subseteq{\mathcal{G}}\), then \[{\mathsf{E}}\Bigl[{\mathsf{E}}\bigl(X\,|\,{\mathcal{G}}\bigr)\,|\,{\mathcal{H}}\Bigr]={\mathsf{E}}\Bigl[{\mathsf{E}}\bigl(X\,|\,{\mathcal{H}}\bigr)\,|\,{\mathcal{G}}\Bigr]={\mathsf{E}}\bigl(X\,|\,{\mathcal{H}}\bigr)\,;\]

  6. If \(Z\) is \({\mathcal{G}}\)-measurable, then \({\mathsf{E}}\bigl[ZX\,|\,{\mathcal{G}}\bigr]=Z\,{\mathsf{E}}(X\,|\,{\mathcal{G}})\).

  7. If \(Z\) is independent of \({\mathcal{G}}\), then \({\mathsf{E}}\bigl[Z\,|\,{\mathcal{G}}\bigr]={\mathsf{E}}[Z]\).

If \((X_k)_{k\ge1}\) is a sequence of random variables, we can define the generated \(\sigma\)-fields \({\mathcal{F}}_1^X\), \({\mathcal{F}}_2^X\), …, via \[\begin{equation} \label{eq:Xn-sigma-field-def} {\mathcal{F}}_k^X\stackrel{{\sf def}}{=}\sigma(X_1,X_2,\dots,X_k)\,; \end{equation}\] here, the \(\sigma\)-field \({\mathcal{F}}_k^X\) stores all information about the process \((X_n)_{n\ge1}\) up to time \(k\). Observe that these \(\sigma\)-fields form a filtration \(\bigl({\mathcal{F}}_n^X\bigr)_{n\ge1}\) in the sense that \[\begin{equation} \label{eq:FXn-filtration} {\mathcal{F}}_1^X\subseteq{\mathcal{F}}_2^X\subseteq\ldots\subseteq{\mathcal{F}}_k^X\subseteq\ldots\,. \end{equation}\] The notion of filtration is very useful when working with martingales. Indeed, a process \((M_n)_{n\ge0}\) is a martingale if for all \(n\ge0\), \({\mathsf{E}}|M_n|<\infty\) and \({\mathsf{E}}\bigl(M_{n+1}\mid{\mathcal{F}}^M_n)=M_n\). We can also generalise the definition of a martingale:

\((M_n)_{n\ge 0}\) is a martingale with respect to a filtration \((\mathcal{F}_n)_{n\ge 0}\) if for all \(n\ge 0\), \({\mathsf{E}}(|M_n|)<\infty\) and \[{\mathsf{E}}(M_{n+1}|\mathcal{F}_n)=M_n.\]

The original definition of martingale is sometimes referred to as “being a martingale with respect to the natural filtration”.

We say that \(M\) is a martingale with respect to a sequence \((X_n)_{n\ge 0}\) if it is a martingale with respect to the filtration \((\mathcal{F}_n^X)_{n\ge 0}\).
We also notice that by repeatedly applying the tower property in claim [tower-property-conditional-expectation] of [lem:conditional-expectation-properties] above to the sequence \(\eqref{eq:FXn-filtration}\), we get the following result:

If \((M_n)_{n\ge0}\) is a submartingale w.r.t. \((X_n)_{n\ge0}\), then for all integers \(n\ge k\ge0\), we have \({\mathsf{E}}M_n\ge{\mathsf{E}}M_k\).

Changing signs, we get the inequality \({\mathsf{E}}M_n\le{\mathsf{E}}M_k\) for supermartingales, and therefore the equality \({\mathsf{E}}M_n={\mathsf{E}}M_k\) for martingales.

4.3 Martingales and stopping times

Martingales are extremely useful in studying stopping times:

Let \((X_k)_{k\ge1}\) be a sequence of i.i.d. Bernoulli random variables with common distribution \({\mathsf{P}}(X_1=1)=p\), \({\mathsf{P}}(X_1=-1)=q=1-p\), where \(p\in(0,1)\), \(p\neq1/2\). For fixed \(k\in\{0,1,\dots,N\}\), define the random walk \((S_n)_{n\ge0}\) via \(S_n=k+\sum_{j=1}^nX_j\), \(n\ge0\), and consider the process \((Y_n)_{n\ge0}\), where \(Y_n=(q/p)^{S_n}\). Clearly, \((Y_n)_{n\ge0}\) is an \(({\mathcal{F}}_n^X)_{n\ge0}\)-martingale, so that \({\mathsf{E}}(Y_n)={\mathsf{E}}(Y_0)=(q/p)^k\) for all \(n\ge0\) (recall [lem:martingale-property]).

Let \(T\) be the first time \(S_n\) hits \(0\) or \(N\). If an analogue of the above equality, \({\mathsf{E}}(Y_T)={\mathsf{E}}(Y_0)=(q/p)^k\) were true for (random) time \(T\), we could find the exit probability \({\mathsf{p}}_k={\mathsf{P}}(\, S \text{ hits $0$ before $N$ }|S_0=k)\) from the expression \[{\mathsf{E}}\bigl(Y_T\bigr)=(q/p)^0{\sf p}_k+(q/p)^N\bigl(1-{\sf p}_k\bigr)\,,\] thus obtaining \(\displaystyle{\sf p}_k=\bigl((q/p)^k-(q/p)^N\bigr)/\bigl(1-(q/p)^N\bigr)\).
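
Before the heuristic step \({\mathsf{E}}(Y_T)={\mathsf{E}}(Y_0)\) is justified (this is the subject of the optional stopping theorem below), the resulting formula can be sanity-checked by simulation; here is a minimal Python sketch with arbitrary illustrative values of \(k\), \(N\) and \(p\).

```python
import random

def exit_prob_estimate(k, N, p, n_paths=100_000, seed=2):
    """Monte Carlo estimate of p_k = P(the walk started at k hits 0 before N)."""
    rng = random.Random(seed)
    hits_zero = 0
    for _ in range(n_paths):
        s = k
        while 0 < s < N:
            s += 1 if rng.random() < p else -1
        hits_zero += (s == 0)
    return hits_zero / n_paths

k, N, p = 3, 10, 0.45                        # arbitrary illustrative values
r = (1 - p) / p                              # r = q/p
print(exit_prob_estimate(k, N, p))           # simulation
print((r**k - r**N) / (1 - r**N))            # the formula for p_k
```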

The method used in the previous example relies on the assumption \({\mathsf{E}}(Y_T)={\mathsf{E}}(Y_0)\) and on the formula for \({\mathsf{E}}(Y_T)\) for a certain random variable \(T\). An important part of the theory of martingales is to study random variables \(T\) for which the above statements are true.  5

Let \(S_n=\sum_{k=1}^nX_k\) be the simple symmetric random walk in \(\{-K,\dots,K\}\), generated by a sequence of i.i.d. symmetric Bernoulli r.v. \(X_k\), where \({\mathsf{P}}(X_1=\pm1)=1/2\). Similarly to [exmle:gamblers-ruin] one can study the hitting time \(T\) of the boundary, \(T=\inf\bigl\{n\ge0:|S_n|=K\bigr\}\): namely, since  6 \[{\mathsf{E}}\bigl((S_T)^2\bigr)=K^2{\mathsf{P}}(S_T=K)+K^2{\mathsf{P}}(S_T=-K)=K^2\,,\] the same heuristics applied to the martingale \((Y_n)_{n\ge0}\), \(Y_n\stackrel{{\sf def}}{=}\bigl(S_n\bigr)^2-n\), leads to \(0={\mathsf{E}}(Y_0)={\mathsf{E}}(Y_T)={\mathsf{E}}\bigl((S_T)^2\bigr)-{\mathsf{E}}(T)\); i.e., it suggests that \({\mathsf{E}}(T)=K^2\).
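
The heuristic prediction \({\mathsf{E}}(T)=K^2\) can likewise be probed numerically before it is justified; a minimal Python sketch (illustrative values only):

```python
import random

def mean_exit_time(K, n_paths=20_000, seed=3):
    """Monte Carlo estimate of E[T], where T is the first time the simple
    symmetric random walk started at 0 hits +K or -K."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        s, t = 0, 0
        while abs(s) < K:
            s += rng.choice((-1, 1))
            t += 1
        total += t
    return total / n_paths

for K in (2, 5, 10):
    print(K, mean_exit_time(K), K**2)        # Monte Carlo estimate vs the predicted K^2
```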

One of our aims is to discuss general results that justify the above heuristics. To this end, we need to carefully define what we mean by a “stopping time”.

A variable \(T\) is a stopping time for a process \((X_n)_{n\ge0}\), if the occurrence/non-occurrence of the event \(\{T=n\}=\)“we stop at time \(n\)” can be determined by looking at the values \(X_0\), \(X_1\), …, \(X_n\) of the process up to time \(n\). Equivalently, if we have \(\{T\le n\}\in{\mathcal{F}}^X_n\equiv\sigma(X_0,\dots,X_n)\) for every \(n\ge 0\).

Let \((X_n)_{n\ge0}\) be a stochastic process with values in \(S\), and let \(T\) be the hitting time of a set \(A\subset S\), namely, \(T=\min\bigl\{n\ge0:X_n\in A\bigr\}\). Then \(T\) is a stopping time for \((X_n)_{n\ge0}\).

Indeed, for every fixed \(n\ge0\), we have \(\{T>n\}\equiv\{X_0\notin A,X_1\notin A,\dots,X_n\notin A\}\); therefore, the event \(\{T>n\}\) and its complement \(\{T\le n\}\) both belong to \({\mathcal{F}}^X_n\).

By contrast, the last time that \(X\) visits \(A\), namely \(\tilde{T}=\max\{n\ge 0: X_n\in A\}\), is not in general a stopping time, because we typically cannot tell just by looking at \(X_0,\dots, X_n\) whether the process will visit \(A\) after time \(n\) or not.

Let \(k\in{\mathbb{N}}\) be fixed, and let \(S\) and \(T\) be stopping times for a process \((X_n)_{n\ge0}\). Show that the following are stopping times:

  1. \(T\equiv k\),

  2. \(S\land T\equiv\min(S,T)\),

  3. \(S\lor T\equiv\max(S,T)\).

Let \(A\) be a set of states, and let \(T=T_A\) be the moment of the first visit to \(A\), ie., \(T=\min\{n\ge0:X_n\in A\}\). Consider \(S=S_A=\min\{n>T_A:X_n\in A\}\), the moment of the second visit to \(A\). Show that \(S_A\) is a stopping time for \((X_n)_{n\ge0}\). Is the variable \(L=\max\bigl\{n\ge0:X_n\in A\bigr\}\) a stopping time for \((X_n)_{n\ge0}\)?

Let \((M_n)_{n\ge0}\) be a submartingale w.r.t. \((X_n)_{n\ge0}\), and let \(T\) be a stopping time for \((X_n)_{n\ge0}\). Show that the process \((L_n^T)_{n\ge0}\), \[L_n^T\stackrel{{\sf def}}{=}M_{n\land T}\equiv\begin{cases}M_n\,,&\quad n\le T\,,\\ M_T\,,&\quad n>T\,,\end{cases}\] is a submartingale w.r.t. \((X_n)_{n\ge0}\). Deduce that if \((M_n)_{n\ge0}\) is a martingale w.r.t. \((X_n)_{n\ge0}\), then the stopped process \((L_n^T)_{n\ge0}\) is also a martingale w.r.t. \((X_n)_{n\ge0}\).

4.4 Optional stopping theorem

The optional stopping (or sampling) theorem (OST) tells us that, under quite general assumptions on a martingale \((M_n)_{n\ge0}\) and a stopping time \(T\), the “fair game” identity \({\mathsf{E}}(M_T)={\mathsf{E}}(M_0)\) remains valid at the random time \(T\). Such results are very useful for deriving inequalities and computing probabilities of various events associated with the process.

[Optional Stopping Theorem][thm:OST] Let \((M_n)_{n\ge0}\) be a martingale w.r.t. \((X_n)_{n\ge0}\), and let \(T\) be a stopping time for \((X_n)_{n\ge0}\). Then the equality \[\begin{equation} \label{eq:OST} {\mathsf{E}}\bigl[M_T\bigr]={\mathsf{E}}[M_0] \end{equation}\] holds whenever one of the following conditions holds:

(OST-1)

the stopping time \(T\) is bounded, i.e., \({\mathsf{P}}(T\le N)=1\) for some \(N<\infty\);

(OST-2)

\({\mathsf{E}}T<\infty\), and the martingale \((M_n)_{n\ge0}\) has bounded increments, i.e., \(|M_{n+1}-M_n|\le K\) for all \(n\) and some constant \(K\);

(OST-3)

\({\mathsf{P}}(T<\infty)=1\), and the martingale \((M_n)_{n\ge0}\) is bounded, i.e., \(|M_n|\le K\) for all \(n\) and some constant \(K\).

If \(M_n\) records a gambler’s fortune, then (OST-3) says that one cannot make money, on average, from a fair game, unless an unlimited amount of credit is available.

One can generalize (OST-3) by replacing the condition of boundedness, \(|M_n|\le K\), by that of uniform integrability for the martingale \((M_n)_{n\ge0}\): a sequence of random variables \((Y_n)_{n\ge0}\) is uniformly integrable if \[\begin{equation} \label{eq:UI-def} \lim_{K\to\infty}\sup_{n\ge0}{\mathsf{E}}\bigl(|Y_n|\mathsf{1}_{\{|Y_n|>K\}}\bigr)=0\,. \end{equation}\]

Let the SSRW \((S_n)_{n\ge0}\) be generated as in [exmle:SSRW-expected-exit-time-via-martingales]. Put \[H\stackrel{{\sf def}}{=}\inf\{n\ge0:S_n=1\}\,.\] Since this RW is recurrent,  7 we deduce that \({\mathsf{P}}(H<\infty)=1\). However, the (OST) does not apply, as \(0={\mathsf{E}}(S_0)\neq{\mathsf{E}}(S_H)\equiv1\). It is an instructive exercise to check which of the conditions in (OST-1), (OST-2) and (OST-3) fail here.

We now give the proof of [thm:OST].

Let us first assume that \(T\) satisfies (OST-1); that is, there is some \(N<\infty\) with \(0\le T\le N\). Then the decomposition \[M_T=\sum_{n=0}^NM_T\mathsf{1}_{\{T=n\}}=\sum_{n=0}^NM_n\mathsf{1}_{\{T=n\}}=M_0+\sum_{n=0}^{N-1}\bigl(M_{n+1}-M_n\bigr)\mathsf{1}_{\{T>n\}}\] holds. Now for each \(0\le n\le N-1\), property 1 of Lemma [lem:conditional-expectation-properties] allows us to rewrite \({\mathsf{E}}\bigl( (M_{n+1}-M_n)\mathsf{1}_{\{T>n\}}\bigr)\) as \[{\mathsf{E}}\bigl({\mathsf{E}}\bigl((M_{n+1}-M_n)\mathsf{1}_{\{T>n\}}|\,{\mathcal{F}}^X_n\bigr)\bigr)={\mathsf{E}}\bigl(\mathsf{1}_{\{T>n\}}{\mathsf{E}}\bigl(M_{n+1}-M_n|\,{\mathcal{F}}^X_n\bigr)\bigr),\] where we have used property 6 together with the fact that \(\{T>n\}\) is \(\mathcal{F}_n^X\)-measurable (by the definition of a stopping time). Since \(M\) is a martingale, \({\mathsf{E}}\bigl(M_{n+1}-M_n|\,{\mathcal{F}}^X_n\bigr)=0\) and hence \({\mathsf{E}}\bigl( (M_{n+1}-M_n)\mathsf{1}_{\{T>n\}}\bigr)=0\) for each \(n\). Linearity of expectation then implies that \({\mathsf{E}}(M_T)={\mathsf{E}}(M_0)\) as required.

Now let us assume that \(T\) satisfies (OST-2). For fixed \(n\), \(\min(T,n)=T\wedge n\) is clearly a bounded stopping time (it is at most \(n\) with probability one), so by (OST-1) we have that \({\mathsf{E}}M_{T\land n}={\mathsf{E}}M_0\) for all \(n\ge0\). We now want to take a limit as \(n\to \infty\). For this we write \[\bigl|M_{T}-M_{T\land n}\bigr|=\Bigl|\sum_{ k>n}(M_k-M_n)\mathsf{1}_{\{T=k\}}\Bigr|=\Bigl|\sum_{k>n}\bigl(M_k-M_{k-1}\bigr)\mathsf{1}_{\{T\ge k\}}\Bigr|\] (in particular, taking \(n=0\) this gives \(|M_T|\le|M_0|+KT\), so \(M_T\) is integrable). Taking expectations then gives that \({\mathsf{E}}(|M_T-M_{T\land n}|)\le K\sum_{k>n} {\mathsf{P}}(T\ge k)\), which tends to \(0\) as \(n\to \infty\) since \({\mathsf{E}}T=\sum_{k\ge1}{\mathsf{P}}(T\ge k)<\infty\). Hence \({\mathsf{E}}(M_T)=\lim_{n\to \infty}{\mathsf{E}}(M_{T\land n})=\lim_{n\to \infty} {\mathsf{E}}(M_0)={\mathsf{E}}(M_0)\).

Finally, we can assume (OST-3) and deduce the result in a similar way. The strategy is the same, but instead of writing \(M_T-M_{T\wedge n}\) as a telescoping sum, we note that \(|M_{T}-M_{T\wedge n}|\le 2K\mathsf{1}_{\{T>n\}}\) so that \[\bigl|{\mathsf{E}}M_T-{\mathsf{E}}M_{T\land n}\bigr|\le 2K{\mathsf{P}}(T>n)\to0\qquad\text{ as $n\to\infty$\,.}\]

For those who are familiar with the dominated convergence theorem: in (OST-3) it is in fact sufficient to assume that \(\bigl|M_{n\land T}\bigr|\le K\) for all \(n\ge0\) and \({\mathsf{P}}(T<\infty)=1\). Indeed, by the dominated convergence theorem we deduce that then \(\bigl|M_T\bigr|\le K\), so that the argument in the proof above applies.

With suitable martingales (OST) gives very powerful results.

The following instructive example is due to D. Williams:

(ABRACADABRA) A monkey types symbols at random, one per unit time, on a typewriter having \(26\) keys. How long on average will it take him to produce the sequence ‘ABRACADABRA’?

Solution. To compute the expected time, imagine a sequence of gamblers, each initially having £1, playing at a fair gambling casino. The gambler arriving just before time \(n\) (\(n\ge1\)) bets £1 on the event \(\{n\text{th letter will be {\sf A}}\}\). If he loses, he leaves. If he wins, he receives £26 all of which he bets on the event \(\{n+1\text{\,st letter will be {\sf B}}\}\). If he loses, he leaves. If he wins, he receives £\(26^2\) all of which he bets on the event \(\{n+2\text{\,nd letter will be {\sf R}}\}\) and so on through the whole ABRACADABRA sequence.

Let \(X_n\) denote the total winnings of the casino after the \(n\)th day. Since all bets are fair the process \((X_n)_{n\ge0}\) is a martingale with mean zero. Let \(N\) denote the time until the sequence ABRACADABRA appears. At time \(N\), gambler \(N-10\) would have won £\(26^{11}-1\), gambler \(N-3\) would have won £\(26^4-1\), gambler \(N\) would have won £\(26-1\) and all other \(N-3\) gamblers would have lost their initial fortune. Therefore, \[X_N=N-3-(26^{11}-1)-(26^4-1)-(26-1)=N-26^{11}-26^4-26\] and since (OST-2) can be applied (check this!), we deduce that \({\mathsf{E}}(X_N)={\mathsf{E}}(X_0)=0\), that is \[{\mathsf{E}}(N)=26+26^4+26^{11}\,.\]
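
More generally, for a word \(w\) typed from an alphabet of \(A\) equally likely symbols, the same gambling argument gives \({\mathsf{E}}(N)=\sum_k A^k\), where the sum runs over those \(k\) for which the length-\(k\) prefix of \(w\) coincides with its length-\(k\) suffix (one term per winning gambler). A short Python sketch of this computation (the function name is ours):

```python
def expected_waiting_time(word, alphabet_size):
    """E[N] for the first occurrence of `word` in a stream of i.i.d. uniformly
    chosen letters: sum alphabet_size**k over those k for which the length-k
    suffix of `word` equals its length-k prefix (one term per winning gambler)."""
    return sum(alphabet_size**k
               for k in range(1, len(word) + 1)
               if word[:k] == word[-k:])

print(expected_waiting_time("ABRACADABRA", 26))   # 26 + 26**4 + 26**11
print(26 + 26**4 + 26**11)
```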

Notice that the answer could also be obtained by considering a finite state Markov chain \(X_n\) on the state space of \(12\) strings representing the longest possible intersection of the tail of the typed text with the target word ABRACADABRA, ie., \(\bigl\{{\sf ABRACADABRA, ABRACADABR, \dots, ABRA, ABR, AB, A},\varnothing \bigr\}\); in that chain \(N\) is just the hitting time of the state ABRACADABRA from the initial condition \(X_0=\varnothing\).

Use an appropriate (OST) to carefully derive the probability \({\sf p}_k\) in [exmle:gamblers-ruin].

Use an appropriate (OST) to carefully derive the expectation \({\mathsf{E}}(T)\) in [exmle:SSRW-expected-exit-time-via-martingales].

Consider the simple symmetric random walk \((S_n)_{n\ge0}\), generated by a sequence of i.i.d. Bernoulli r.v. \(X_k\) with \({\mathsf{P}}(X_1=\pm1)=1/2\), ie., \(S_n=\sum_{k=1}^nX_k\). For integer \(a<0<b\), let \(T\) be the stopping time \(T=\inf\bigl\{n\ge0:S_n\notin(a,b)\bigr\}\equiv\inf\bigl\{n\ge0:S_n\in\{a,b\}\bigr\}\).

  1. Show that \((S_n)_{n\ge0}\) is a martingale and use an appropriate (OST) to find \({\mathsf{P}}(S_T=a)\) and \({\mathsf{P}}(S_T=b)\).

  2. Show that \((M_n)_{n\ge0}\) defined via \(M_n=(S_n)^2-n\) is a martingale w.r.t. the process \((S_n)_{n\ge0}\).

  3. For fixed integer \(K>0\), carefully apply an appropriate (OST) to \(M_n\) and prove that \({\mathsf{E}}(T\land K)={\mathsf{E}}\bigl[(S_{T\land K})^2\bigr]\).

  4. Deduce that \({\mathsf{E}}(T)=-ab\).

A coin showing heads with probability \(p\) is tossed repeatedly. Let \(w\) be a fixed sequence of outcomes such as ‘HTH’, and let \(N\) denote the number of (independent) tosses until the word \(w\) is observed. Using an appropriate martingale, find the expectation \({\mathsf{E}}N\) for each of the following sequences: ‘HH’, ‘HTH’, ‘HHTTHH’.

Let \((Y_n)_{n\ge0}\) be a supermartingale w.r.t. a sequence \((X_n)_{n\ge0}\) and let \(H_n\in{\mathcal{F}}_{n-1}^X=\sigma(X_0,\dots,X_{n-1})\) satisfy \(0\le H_n\le c_n\), where the constant \(c_n\) might depend on \(n\). Then the process \(W_n=W_0+\sum_{m=1}^nH_m\bigl(Y_m-Y_{m-1}\bigr)\), \(n\ge0\), is a supermartingale w.r.t. \((X_n)_{n\ge0}\).

Since \(0\le H_m\le c_m\) and each \(Y_m\) is integrable, the process \((W_n)_{n\ge0}\) is integrable. Moreover, following the proof of the optional stopping theorem, since \(H_m\) is \({\mathcal{F}}_{m-1}^X\)-measurable and non-negative, and \((Y_n)_{n\ge0}\) is a supermartingale w.r.t. \((X_n)_{n\ge0}\), \[{\mathsf{E}}\bigl(W_m-W_{m-1}\,|\,{\mathcal{F}}^X_{m-1}\bigr)={\mathsf{E}}\bigl(H_m(Y_m-Y_{m-1})\,|\,{\mathcal{F}}^X_{m-1}\bigr)=H_m\,{\mathsf{E}}\bigl(Y_m-Y_{m-1}\,|\,{\mathcal{F}}^X_{m-1}\bigr)\le0\,.\]

If \((Y_n)_{n\ge0}\) describes the stock price process, and \(H_m\) is the number of stocks held during the time \((m-1,m]\) (decided when the price \(Y_{m-1}\) is known), then \(W_n\) describes the fortune of an investor at time \(n\). As \((W_n)_{n\ge0}\) is a supermartingale w.r.t. \((X_n)_{n\ge0}\), we have \({\mathsf{E}}W_n\le{\mathsf{E}}W_0\) for all \(n\ge0\).

The famous “doubling martingale” corresponds to doubling the bet size until one wins, ie., to taking \(H_m=2^{m-1}\mathsf{1}_{\{T\ge m\}}\), where \(T\) is the first moment when the price goes up, ie., \(T=\min\{m>0:Y_m-Y_{m-1}=1\}\) (notice that \(\{T\ge m\}\in{\mathcal{F}}^X_{m-1}\), so this strategy is of the form required in [lem:betting-martingale-transform]). Since the stopped process \((W_{n\land T})_{n\ge0}\) is a supermartingale, for all \(n\ge0\) we have \({\mathsf{E}}(W_{n\land T})\le{\mathsf{E}}(W_0)\), ie., on average, the doubling strategy does not produce money if one bets against a (super)martingale.
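
A simulation makes the point concrete: for fair \(\pm1\) price increments, \(W_T=1\) almost surely, yet the stopped fortune \(W_{n\land T}\) has mean \({\mathsf{E}}(W_0)=0\) at every fixed \(n\). The following Python sketch (illustrative parameter values of ours) estimates \({\mathsf{E}}(W_{n\land T})\).

```python
import random

def stopped_doubling_fortune(n, n_paths=200_000, seed=4):
    """Estimate E[W_{min(n,T)}] for the doubling strategy H_m = 2^(m-1) on {T >= m},
    played against fair +/-1 price increments; T is the first increment equal to +1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_paths):
        w, stake = 0, 1
        for _ in range(n):
            if rng.random() < 0.5:           # price goes up: win the stake and stop
                w += stake
                break
            w -= stake                       # price goes down: lose and double the stake
            stake *= 2
        total += w
    return total / n_paths

for n in (1, 5, 10):
    print(n, stopped_doubling_fortune(n))    # estimates should be close to E[W_0] = 0
```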

Let \((S_n)_{n\ge0}\) be a random walk generated by a sequence \((X_n)_{n\ge0}\) of i.i.d. steps with \({\mathsf{E}}|X|<\infty\) and \({\mathsf{E}}(X)=m\). If \(T\) is a stopping time for \((X_n)_{n\ge0}\) with \({\mathsf{E}}(T)<\infty\), then the optional stopping theorem implies that \(S_T\) is integrable and \[{\mathsf{E}}(S_T-S_0)=m\,{\mathsf{E}}(T).\]

To show this, first notice that \(S_n-n{\mathsf{E}}X=S_n-mn\) is a martingale and for every \(n\ge0\) the variable \(T\land n\) is a bounded stopping time. By (OST-1), we have \[\begin{equation} \label{eq:Wald-stopped-time} {\mathsf{E}}(S_0)={\mathsf{E}}\bigl(S_{n\land T}-m(n\land T)\bigr). \end{equation}\] This rearranges to \[{\mathsf{E}}\bigl(S_{n\land T}-S_0\bigr)=m{\mathsf{E}}(n\land T)\] for every \(n\). Now, the RHS converges to \(m\,{\mathsf{E}}(T)\) as \(n\to \infty\), since \(|{\mathsf{E}}(T)-{\mathsf{E}}(n\land T)|\le {\mathsf{E}}(T\mathsf{1}_{\{T>n\}})=\sum_{k>n} k{\mathsf{P}}(T=k)\), where the tail sums on the right go to zero as \(n\to \infty\) by the assumption that \(T\) is integrable. Next, writing \(S_T-S_0=\sum_{k\ge 1} X_k\mathsf{1}_{\{T\ge k\}}\) and noting that \({\mathsf{E}}(|X_k|\mathsf{1}_{\{T\ge k\}})={\mathsf{P}}(T\ge k){\mathsf{E}}(|X_1|)\) (the event \(\{T\ge k\}\) is determined by the steps before time \(k\), hence is independent of \(X_k\)) with \(\sum_{k\ge1} {\mathsf{P}}(T\ge k)={\mathsf{E}}(T)<\infty\), we see that \(S_T\) is integrable. 8 Similarly, we can bound \({\mathsf{E}}(|S_T-S_{T\wedge n}|)\le {\mathsf{E}}(|X_1|)\sum_{k>n} {\mathsf{P}}(T\ge k)\), which tends to \(0\) as \(n\to \infty\). This implies that \({\mathsf{E}}(S_{T\land n})\to {\mathsf{E}}(S_T)\) as \(n\to \infty\), and combining the above completes the argument.
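
The identity is easy to check numerically. A minimal Python sketch, assuming \(\pm1\) steps with \(p=0.6\) and taking \(T\) to be the hitting time of the level \(b=3\) (which is integrable because the drift is positive):

```python
import random

def check_wald(p=0.6, b=3, n_paths=100_000, seed=5):
    """Check E[S_T - S_0] = m * E[T] for a +/-1 walk with drift m = 2p - 1 > 0
    and T the hitting time of level b (integrable since the drift is positive)."""
    rng = random.Random(seed)
    m = 2 * p - 1
    sum_T = 0
    for _ in range(n_paths):
        s, t = 0, 0
        while s < b:
            s += 1 if rng.random() < p else -1
            t += 1
        sum_T += t                           # on every path S_T - S_0 = b
    print("E[S_T - S_0] =", b, "  m * E[T] ~", m * sum_T / n_paths)

check_wald()
```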

4.5 Martingale convergence theorem

This subsection is optional and will not be examined.
The following example has a number of important applications.

[Pólya’s urn][exmle:Polya-urn] An urn contains one green and one red ball. At every step a ball is selected at random, and then replaced together with another ball of the same colour. Let \(X_n\) be the number of green balls after the \(n\)th draw, \(X_0=1\). Then the fraction \(M_n=X_n/(n+2)\) of green balls is a martingale w.r.t. the filtration \(({\mathcal{F}}^X_n)_{n\ge0}\).

Indeed, as \(|M_n|\le1\) we have \({\mathsf{E}}|M_n|\le1\) for all \(n\ge0\), and since \[{\mathsf{P}}(X_{n+1}=k+1\mid X_n=k)=\frac{k}{n+2}\,,\qquad{\mathsf{P}}(X_{n+1}=k\mid X_n=k)=1-\frac{k}{n+2}\,,\] we get \({\mathsf{E}}\bigl(X_{n+1}\mid{\mathcal{F}}_n^X\bigr)=\frac{n+3}{n+2}\,X_n\), equivalently, \({\mathsf{E}}(M_{n+1}\mid{\mathcal{F}}^X_n)=M_n\).
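
A short simulation of the urn (illustrative only) confirms that \({\mathsf{E}}(M_n)\) stays equal to \(M_0=\tfrac12\), as it must for a martingale.

```python
import random

def polya_green_count(n, rng):
    """Run Polya's urn (one green, one red ball initially) for n draws;
    return X_n, the number of green balls."""
    green, total = 1, 2
    for _ in range(n):
        if rng.random() < green / total:
            green += 1
        total += 1
    return green

rng = random.Random(6)
n_paths = 50_000
for n in (1, 10, 50):
    est = sum(polya_green_count(n, rng) / (n + 2) for _ in range(n_paths)) / n_paths
    print(n, est)                            # each estimate should be close to E[M_0] = 1/2
```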

Show that \({\mathsf{P}}(M_n=\frac{k}{n+2})=\frac{1}{n+1}\) for \(1\le k\le n+1\), ie., \(M_n\) is uniformly distributed in \(\bigl\{\tfrac1{n+2},\tfrac2{n+2},\ldots,\tfrac{n+1}{n+2}\bigr\}\).

[exse:Polya-urn-one] suggests that in the limit \(n\to\infty\) the distribution of \(M_n\) becomes uniform in \((0,1)\):

Show that \(\lim\limits_{n\to\infty}{\mathsf{P}}\bigl(M_n<x\bigr)=x\) for every \(x\in(0,1)\).

In view of [exse:Polya-urn-one], a natural question is: does the proportion \(M_n\) of green balls fluctuate between \(0\) and \(1\) infinitely often or does it eventually settle down to a particular value? The following example shows that the latter is true. Our argument is based upon the following observation: if a bounded real sequence \(y_n\) does not converge, then for some real \(a\), \(b\) with \(-\infty<a<b<\infty\) the sequence \(y_n\) must go from the region below \(a\) to the region above \(b\) (and back) infinitely often.

For fixed \(n\ge0\) let \(M_n<a\in(0,1)\), and let \(N=\min\{k>n: M_k>b\}\) for some \(b\in(a,1)\). Since \(N_m=N\land m\) is a bounded stopping time, by (OST-1) we have \({\mathsf{E}}{M_{N_m}}={\mathsf{E}}{M_n}<a\) whenever \(m>n\). On the other hand, \[{\mathsf{E}}{M_{N_m}}\ge{\mathsf{E}}\bigl(M_{N_m}\mathsf{1}_{N\le m}\bigr)\equiv{\mathsf{E}}\bigl(M_{N}\mathsf{1}_{N\le m}\bigr)>b\,{\mathsf{P}}(N\le m)\,.\] In other words, \({\mathsf{P}}(N\le m)<\frac{a}{b}\) and consequently \({\mathsf{P}}(N<\infty)\le\frac{a}{b}<1\), ie., the fraction \(M_n\) of green balls ever gets above level \(b\) with probability at most \(\frac{a}{b}\). Suppose that at a certain moment \(N\in(n,\infty)\) the fraction of green balls became bigger than \(b\). Then a similar argument shows that with probability at most \((1-b)/(1-a)\) the value \(M_n\) becomes smaller than \(a\) at a later moment.

Put \(S_0=\min\{n\ge0:M_n<a\}\), and then, inductively, for \(k\ge0\), \[\begin{equation} \label{eq:Polya-urn-ST-times} T_k=\min\bigl\{n>S_k:M_n>b\bigr\}\,,\qquad S_{k+1}=\min\bigl\{n>T_k:M_n<a\bigr\}\,. \end{equation}\] The argument above implies that \[\begin{equation} \label{eq:probability-of-upcrossings} {\mathsf{P}}(S_k<\infty)\le\prod_{j=1}^k\Bigl({\mathsf{P}}(T_{j-1}<\infty\mid S_{j-1}<\infty){\mathsf{P}}(S_j<\infty\mid T_{j-1}<\infty)\Bigr) \end{equation}\] with the RHS bounded above by \(\bigl(\frac{a}{b}\bigr)^k\bigl(\frac{1-b}{1-a}\bigr)^k\to0\) as \(k\to\infty\). As a result, the probability of infinitely many crossings (ie., \(S_k<\infty\) for all \(k\ge0\)) vanishes.

Clearly, the argument above applies to all strips \((a,b)\subset(0,1)\) with rational endpoints. Thus, with probability one, 9 trajectories of \(M_n\) eventually converge to a particular value. 10

Write \(U_{(a,b)}\) for the total number of upcrossings of the strip \((a,b)\) by the process \((M_n)_{n\ge0}\). By using the approach of [exmle:Polya-urn-convergence] and noticing that \(\bigl\{U_{(a,b)}\ge m\bigr\}\subset\bigl\{S_{m-1}<\infty\bigr\}\) or otherwise, show that \({\mathsf{E}}U_{(a,b)}<\infty\).

The argument in [exmle:Polya-urn-convergence] also works in general. Let \((M_n)_{n\ge0}\) be a martingale w.r.t. filtration \(({\mathcal{F}}^X_n)_{n\ge0}\). For real \(a\), \(b\) with \(-\infty<a<b<\infty\) let \(U_{(a,b)}\) be the total number of upcrossings of the strip \((a,b)\). The following result (or some of its variants) is often referred to as Doob’s Upcrossing Lemma:

Let the martingale \((M_n)_{n\ge0}\) have uniformly bounded expectations, ie., for some constant \(K\) and all \(n\ge0\), \({\mathsf{E}}|M_n|<K<\infty\). If \(U_{(a,b)}\) is the number of upcrossings of a strip \((a,b)\), then \({\mathsf{E}}U_{(a,b)}<\infty\).

With stopping times as in \(\eqref{eq:Polya-urn-ST-times}\), put \(H_n=1\) if \(S_m<n\le T_{m}\) for some \(m\ge0\), and put \(H_n=0\) otherwise. Then the process \(W_n=\sum_{k=1}^nH_k(M_k-M_{k-1})\) is a martingale w.r.t. \((M_n)_{n\ge0}\), cf. [lem:betting-martingale-transform]. It is easy to check that \(W_n\ge(b-a)\,U_n-|M_n-a|\) (draw the picture!), where \(U_n=\max\{m\ge0:T_m\le n\}\) is the number of upcrossings of the strip \((a,b)\) up to time \(n\). As a result \[0={\mathsf{E}}W_0={\mathsf{E}}W_n\ge(b-a)\,{\mathsf{E}}U_n-{\mathsf{E}}|M_n-a|\ge(b-a)\,{\mathsf{E}}U_n-(K+|a|)\,,\] so that \({\mathsf{E}}U_n\le(K+|a|)/(b-a)\) for all \(n\ge0\), and thus \({\mathsf{E}}U_{(a,b)}<\infty\).

Let \((M_n)_{n\ge0}\) be a martingale as in [lem:Doob-upcrossings]. Then there exists a random variable \(M_\infty\) such that \(M_n\to M_\infty\) with probability one.

If \(M_n\) does not converge, for some rational \(a\), \(b\) with \(-\infty<a<b<\infty\) we must have \(U_{(a,b)}=\infty\). However, by [lem:Doob-upcrossings], \({\mathsf{E}}U_{(a,b)}<\infty\) implying that \({\mathsf{P}}(U_{(a,b)}=\infty)=0\). As the number of such pairs \((a,b)\) is countable, the result follows.

Let \((X_k)_{k\ge1}\) be independent variables with \[{\mathsf{P}}\Bigl(X=\frac32\Bigr)={\mathsf{P}}\Bigl(X=\frac12\Bigr)=\frac12\,.\] Put \(M_n=X_1\cdot\ldots\cdot X_n\) with \(M_0=1\). Show that \(M_n\) is an \(({\mathcal{F}}^X_n)_{n\ge0}\) martingale. Deduce that \(M_n\to M_\infty\) with probability one. Can you compute \({\mathsf{E}}(M_\infty)\)?

4.6 Additional problems

Let \((\eta_n)_{n\ge1}\) be independent positive random variables with \({\mathsf{E}}\eta_n=1\) for all \(n\ge1\). If a random variable \(M_0>0\) is independent of \((\eta_n)_{n\ge1}\) and \({\mathsf{E}}M_0<\infty\), then the process \((M_n)_{n\ge0}\) defined via \(M_n=M_0\prod_{j=1}^n\eta_j\) is a martingale w.r.t. \((\eta_n)_{n\ge1}\).

Interpreting \(\eta_n-1\) as the (fractional) change in the value of a stock during the \(n\)th time interval, the martingale \((M_n)_{n\ge0}\) can be used to model stock prices. Two often used examples are:
Discrete Black-Scholes model: take \(\eta_j=e^{\zeta_j}\), where \(\zeta_j\) is Gaussian, \(\zeta_j\sim{\mathcal{N}}(\mu,\sigma^2)\);
Binomial model: take \(\eta_j=(1+a)e^{-r}\) and \(\eta_j=(1+a)^{-1}e^{-r}\) with probabilities \(p\) and \(1-p\) respectively.
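
Note that in the discrete Black-Scholes model the martingale condition \({\mathsf{E}}\eta_j=1\) forces \(\mu=-\sigma^2/2\), since \({\mathsf{E}}e^{\zeta_j}=e^{\mu+\sigma^2/2}\) (and similarly constrains \(p\), \(a\), \(r\) in the binomial model). With this choice, the following Python sketch (illustrative values only) checks numerically that \({\mathsf{E}}M_n\) stays at \(M_0\).

```python
import math
import random

def mean_price(n, sigma=0.2, s0=100.0, n_paths=50_000, seed=7):
    """Mean of M_n = M_0 * prod_j exp(zeta_j), zeta_j ~ N(mu, sigma^2) i.i.d. with
    mu = -sigma**2 / 2, so that E[exp(zeta_j)] = 1 and (M_n) is a martingale."""
    rng = random.Random(seed)
    mu = -sigma**2 / 2
    total = 0.0
    for _ in range(n_paths):
        m = s0
        for _ in range(n):
            m *= math.exp(rng.gauss(mu, sigma))
        total += m
    return total / n_paths

for n in (1, 10, 50):
    print(n, mean_price(n))                  # estimates should be close to M_0 = 100
```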

Let \((S_n)_{n\ge0}\) be the random walk from [exmle:SSRW-expected-exit-time-via-martingales]. Find constants \(a\), \(b\), \(c\) such that the process \((S_n)^4+an(S_n)^2+bn^2+cn\) is an \((X_n)_{n\ge0}\)-martingale. Use the heuristic in [exmle:SSRW-expected-exit-time-via-martingales] to predict the value of the second moment \({\mathsf{E}}(T^2)\) of the exit time \(T\).

A standard symmetric die is rolled repeatedly. Let \(N\) be the number of (independent) rolls until a fixed pattern is observed. Using an appropriate martingale, find \({\mathsf{E}}N\) for the sequences ‘123456’ and ‘123321’.

Suppose that the process in [exmle:Polya-urn] is modified as follows: for a fixed integer \(c>1\), every time a random ball is selected, it is replaced together with \(c\) further balls of the same colour. If, as before, \(X_n\) denotes the total number of green balls after \(n\) draws, show that the fraction \(M_n=\frac{X_n}{2+nc}\) of green balls forms a martingale w.r.t. \(({\mathcal{F}}^X_n)_{n\ge0}\).

Find the large-\(n\) limit of the distribution of the martingale \((M_n)_{n\ge0}\) from [exse:Polya-urn-general].

Let \((X_n)_{n\ge0}\) be a birth-and-death process in \({\mathcal{S}}=\{0,1,\dots\}\), ie., a Markov chain in \({\mathcal{S}}\) with transition probabilities \(p_{00}=r_0\), \(p_{01}=p_0\), and \(p_{m,m-1}=q_m\), \(p_{m,m}=r_m\), \(p_{m,m+1}=p_m\) for \(m>0\), while \(p_{m,k}=0\) for all other pairs \((m,k)\in{\mathcal{S}}^2\). Let \(X_0=x\), and for \(y\ge0\) denote \(T_y\stackrel{{\sf def}}{=}\min\{n\ge0:X_n=y\}\).

  1. Show that the process \(\bigl(\varphi(X_n)\bigr)_{n\ge0}\) with \(\varphi(z)\stackrel{{\sf def}}{=}\sum\limits_{y=1}^z\prod\limits_{x=1}^{y-1}\dfrac{q_x}{p_x}\) is a martingale.

  2. Show that for all \(0\le a<X_0=x<b\) we have \[{\mathsf{P}}\bigl(T_b<T_a\bigr)=\bigl(\varphi(x)-\varphi(a)\bigr)/\bigl(\varphi(b)-\varphi(a)\bigr)\,.\] Deduce that state \(0\) is recurrent iff \(\varphi(b)\to\infty\) as \(b\to\infty\).

  3. Now suppose that \(p_m\equiv p\), \(q_m\equiv q=1-p\), and \(r_m=0\) for \(m>0\), whereas \(p_0=p\) and \(r_0=q\). Show that in this case the result in part b) above becomes \[{\mathsf{P}}\bigl(T_b<T_a\bigr)=\bigl((q/p)^a-(q/p)^x\bigr)/\bigl((q/p)^a-(q/p)^b\bigr)\,.\]

  4. Find \({\mathsf{P}}\bigl(T_b<T_a\bigr)\) if in the setup of part c) one has \(p=q=1/2\).

Let \((\xi_k)_{k\ge1}\) be i.i.d. random variables with \({\mathsf{P}}(\xi=1)=p>\tfrac12\), \({\mathsf{P}}(\xi=-1)=q=1-p\), and \({\mathsf{E}}\xi>0\). Let \((S_n)_{n\ge0}\) be the generated random walk, \(S_n=x+\xi_1+\dots+\xi_n\), and let \(T_0=\min\{n\ge0:S_n=0\}\) be the hitting time of \(0\). Deduce that for all \(x>0\), \({\mathsf{P}}(T_0<\infty)=(q/p)^x\). Compare this to the result of [exmle:SRW-hitting-time].

Let \((Z_n)_{n\ge0}\) be a homogeneous branching process with \(Z_0=1\), \(m={\mathsf{E}}Z_1>0\) and finite variance \(\sigma^2={\mathsf{Var}}(Z_1)\). Show that \(M_n=Z_n/m^n\) is a martingale.

  1. Let \(m>1\). By using [exse:branching-process-two-moments] or otherwise show that \({\mathsf{E}}\bigl((M_n)^2\bigr)\) is uniformly bounded. Deduce that \(M_n\to M_\infty\) almost surely. What can you say about \({\mathsf{E}}M_\infty\)?

  2. What happens if \(m\le1\)? Compute \({\mathsf{E}}(M_\infty)\).

Hint: Recall [exse:uniform-integrability].

Let \((X_n)_{n\ge0}\) be a sequence of i.i.d. Bernoulli random variables with \({\mathsf{P}}(X=1)=p\) and \({\mathsf{P}}(X=-1)=1-p=q\). Let \((S_n)_{n\ge0}\) be the generated random walk with \(S_0=x>0\), and let \(T=\min\{n\ge0:S_n=0\}\) be the hitting time of the origin. [exmle:SRW-hitting-time] suggests that \({\mathsf{E}}(T)=x/(q-p)<\infty\) for \(q>p\).

  1. Use \(\eqref{eq:Wald-stopped-time}\) to deduce that \((q-p){\mathsf{E}}(n\land T)={\mathsf{E}}\bigl(S_0-S_{n\land T}\bigr)\le{\mathsf{E}}(S_0)=x\); then take \(n\to\infty\) to show that \({\mathsf{E}}(T)<\infty\);

  2. Use the Wald equation to deduce that indeed \({\mathsf{E}}(T)=\frac{x}{q-p}\). Can you give a heuristic explanation of this result?

  3. Argue, without using the Wald equation, that \({\mathsf{E}}(T)=cx\) for some constant \(c\).

  4. Use the Wald equation and an argument by contradiction to show that if \(p\ge q\), then \({\mathsf{E}}(T)=\infty\) for all \(x>0\).

Let a variable \(Y\) satisfy \({\mathsf{E}}(Y^2)<\infty\). Show that \({\mathsf{E}}|Y|<\infty\). Hint Notice that \({\mathsf{Var}}(|Y|)\ge0\).

Let \((X_n)_{n\ge0}\) be an irreducible Markov chain in \({\mathcal{S}}=\{0,1,\dots\}\) with bounded jumps, and let a function \(\varphi:{\mathcal{S}}\to{\mathbb{R}}^+\) satisfy \(\varphi(x)\to\infty\) as \(x\to\infty\). Let \(K\ge0\) be such that \[{\mathsf{E}}_x\varphi(X_1)\stackrel{{\sf def}}{=}{\mathsf{E}}\bigl[\varphi(X_1)|X_0=x\bigr]\le\varphi(x)\] for all \(x\ge K\).

  1. If the function \(\varphi(x)\) is monotone, show that the set of states \(\{0,1,\dots,K\}\), and thus the whole space \({\mathcal{S}}\), is recurrent for \((X_n)_{n\ge0}\).
    Hint If \(H_K=\min\bigl\{n\ge0:0\le X_n\le K\}\), show that \(\varphi(X_{n\land H_K})\) is a supermartingale. Deduce that if \(T_M=\min\bigl\{n\ge0:X_n\ge M\}\), then \(\varphi(x)\ge\varphi(M){\mathsf{P}}(T_M<H_K)\).

  2. Argue that the result holds for \(\varphi(x)\ge0\) not necessarily monotone, but only satisfying \(\varphi(x)\to\infty\) as \(x\to\infty\).
    Hint With \(T_M\) as above, show that \(\varphi_M^*\stackrel{{\sf def}}{=}\min\{\varphi(x):x\ge M\}\to\infty\) as \(M\to\infty\).

Let \((X_k)_{k\ge1}\) be independent variables with \({\mathsf{P}}(X=\pm1)=\frac12\). Show that the process \[M_n=\sum_{k=1}^n\frac1k\,X_k\] is a martingale w.r.t. \(({\mathcal{F}}^X_n)_{n\ge0}\) and that \({\mathsf{E}}\bigl[(M_n)^2\bigr]<K<\infty\) for some constant \(K\) and all \(n\ge0\). By using [exse:uniform-integrability] or otherwise, deduce that with probability one, \(M_n\to M_\infty\) for some random variable \(M_\infty\). In other words, the random sign harmonic series converges with probability one.

Thinking of a population of \(N\) haploid individuals who have one copy of each of their chromosomes, consider a fixed population of \(N\) genes that can be of two types \(A\) or \(a\). In the simplest version of this model the population at time \(n+1\) is obtained by sampling with replacement from the population at time \(n\). If we let \(X_n\) be the number of \(A\) alleles at time \(n\), then \(X_n\) is a Markov chain with transition probability \[p_{ij}=\binom{N}{j}\Bigl(\frac{i}{N}\Bigr)^j\Bigl(1-\frac{i}{N}\Bigr)^{N-j}\,.\] Starting from \(i\) of the \(A\) alleles and \(N-i\) of \(a\) alleles, what is the probability that the population fixates in the all \(A\) state? Hint You can use the heuristics of [exmle:gamblers-ruin] but need to justify your computation!

Let \((X_n)_{n\ge0}\) be a Markov chain with a (countable) state space \({\mathcal{S}}\) and the transition matrix \({\mathbf{P}}\), and let \(h(x,n)\) be a function of the state \(x\) and time \(n\) such that  11 \[h(x,n)=\sum_yp_{xy}h(y,n+1)\,.\] Show that \((M_n)_{n\ge0}\) with \(M_n=h(X_n,n)\) is a martingale w.r.t. \((X_n)_{n\ge0}\).

Let \((X_n)_{n\ge0}\) be a Markov chain with a (countable) state space \({\mathcal{S}}\) and the transition matrix \({\mathbf{P}}\). If \(\boldsymbol{\psi}\) is a right eigenvector of \({\mathbf{P}}\) corresponding to the eigenvalue \(\lambda>0\), ie., \({\mathbf{P}}\boldsymbol{\psi}=\lambda\boldsymbol{\psi}\), show that the process \(M_n=\lambda^{-n}\psi\bigl(X_n\bigr)\) is a martingale w.r.t. \((X_n)_{n\ge0}\).


  1. current version by Ellen Powell (ellen.g.powell@durham.ac.uk); based on previous notes by Nicholas Georgiou and Ostap Hyrniv↩︎

  2. If \(M_n\) traces your fortune, then “there is nothing super about a supermartingale”.↩︎

  3. Notice that we do not assume that all \(\xi_n\) have the same distribution!↩︎

  4. [general-measurability] In the general countable setting, if \({\mathcal{D}}=\{D_1,D_2,\dots\}\) forms a denumerable (ie., infinite countable) partition of \(\Omega\), then the generated \(\sigma\)-field \({\mathcal{G}}=\sigma({\mathcal{D}})\) consists of all possible subsets of \(\Omega\) which are obtained by taking countable unions of blocks of \({\mathcal{D}}\). Similarly, a variable \(Y\) is measurable w.r.t. \({\mathcal{G}}\), if for every real \(y\) the event \(\{\omega:Y(\omega)\le y\}\) belongs to \({\mathcal{G}}\) (equivalently, can be expressed as a countable union of blocks of \({\mathcal{D}}\)).↩︎

  5. Notice that the gambler’s ruin problem can be solved by using the methods of finite Markov chains, so we indeed know that the result above is correct.↩︎

  6. the equality is correct, because \({\mathsf{P}}(T<\infty)=1\) here!↩︎

  7. alternatively, recall the result in [exmle:SRW-hitting-time].↩︎

  8. Here we are really using the monotone convergence theorem: if \(0\le Z_n\le Z\) for every \(n\) and \(Z_n\uparrow Z\) as \(n\to \infty\), then \({\mathsf{E}}(Z_n)\to{\mathsf{E}}(Z)\) as \(n\to \infty\).↩︎

  9. If \(M_n\) does not converge, it must cross at least one of countably many strips \((a,b)\) with rational points infinitely many times.↩︎

  10. which is random and depends on the trajectory↩︎

  11. This result is useful, eg., if \(h(x,n)=x^2-cn\) or \(h(x,n)=\exp\{x-cn\}\) for a suitable \(c\).↩︎