Project IV (MATH4072) 2019-20


Random Walks and DNA (un)zipping

C. da Costa

Description

In 1967 Chernoff described DNA replication using Random Walks in Random Environments (RWRE). The original idea was that the RNA-polymerase (the agent responsible for reading the DNA strand during transcription) performs a random walk, whose transition probabilities depend on the local DNA sequence, traditionally encoded by four letters (G, C, A, T). The DNA sequences, being unknown to us, were modelled by a random environment. Remarkably, 40 years later the first actual data coming from DNA unzipping became available and the theory of RWRE turned out to be very useful.

The emergence of new experimental techniques allowing accurate mechanical unzipping of DNA motivates the use of RWRE theory to develop efficient methods of DNA sequencing. From a modelling perspective, one can interpret DNA unzipping data as independent outcomes of RWRE over different time periods.

The aim of the project is to obtain likelihood estimates on such samples, treating them as realisations of independent RWRE trajectories of different lengths. Combining RWRE-specific approaches with classical methods for sums of independent random variables makes it possible to detect the sensitivity of the data to the sample lengths. In this project the student will be encouraged to connect their results with statistical problems related to DNA sequencing.
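To make the likelihood setting concrete, here is a minimal Python sketch under simplifying assumptions that are not part of the project statement: a nearest-neighbour walk on a finite segment with reflecting ends, an i.i.d. uniform environment, and walks that all start at site 0. The function names (`sample_environment`, `simulate_walk`, `jump_counts`, `mle`, `log_likelihood`) are hypothetical. For a fixed environment the likelihood of a set of independent trajectories factorises over sites, so the maximum likelihood estimate of the right-jump probability at a site is the fraction of right jumps among all jumps made from that site.

```python
import math
import random


def sample_environment(n_sites, rng):
    # Assumption: i.i.d. right-jump probabilities, bounded away from 0 and 1.
    return [rng.uniform(0.2, 0.8) for _ in range(n_sites)]


def simulate_walk(omega, n_steps, rng):
    # Nearest-neighbour walk on {0, ..., n_sites - 1}, started at 0,
    # reflected deterministically at both ends (an illustrative choice).
    last = len(omega) - 1
    x = 0
    path = [x]
    for _ in range(n_steps):
        if x == 0:
            x = 1
        elif x == last:
            x = last - 1
        else:
            x += 1 if rng.random() < omega[x] else -1
        path.append(x)
    return path


def jump_counts(paths, n_sites):
    # Count right and left jumps from each interior site, pooled over all paths.
    right = [0] * n_sites
    left = [0] * n_sites
    for path in paths:
        for a, b in zip(path, path[1:]):
            if 0 < a < n_sites - 1:
                if b == a + 1:
                    right[a] += 1
                else:
                    left[a] += 1
    return right, left


def mle(right, left):
    # Site-wise estimate R_x / (R_x + L_x); None where the site was never left.
    return [r / (r + l) if r + l > 0 else None for r, l in zip(right, left)]


def log_likelihood(omega, right, left):
    # Sum over sites of R_x log(omega_x) + L_x log(1 - omega_x).
    total = 0.0
    for w, r, l in zip(omega, right, left):
        if r:
            total += r * math.log(w)
        if l:
            total += l * math.log(1.0 - w)
    return total


rng = random.Random(0)
omega = sample_environment(20, rng)
# Independent trajectories of different lengths in the same environment.
paths = [simulate_walk(omega, n, rng) for n in (50, 200, 1000)]
right, left = jump_counts(paths, len(omega))
estimate = mle(right, left)
```

Sites that the shorter trajectories never reach contribute nothing to the likelihood, which is one simple way to see why the estimates depend on the sample lengths.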

Prerequisites

Basic knowledge of discrete Markov chains and Probability II. Specific techniques of large deviation theory and RWRE will be developed along the way.

Resources

  • P. Andreoletti, R. Diel - DNA Unzipping via Stopped Birth and Death Processes with Unknown Transition Probabilities
  • L. Avena et al. - Random walk in cooling random environment: ergodic limits and concentration inequalities
  • F. den Hollander - Large Deviations - Chapter 7
  • email: Conrado da Costa