Frank Coolen: POST-GRADUATE INFORMATION

This page describes some research topics that I find very interesting, and on which I would be keen to supervise research students. For additional information see my homepage, in particular the link to more detailed descriptions of my recent and future research. If you are interested in research work in any of my areas of interest, or have any questions, please feel free to e-mail me or to contact me by other means.

Bayes (linear) methods for software testing

Good testing of software systems is a huge challenge, with obvious practical interest. It would be very valuable, not only where software is essential for safety but also in purely commercial terms, if modern statistical methods could support practical software testing.

Together with colleagues, both from our Statistics group and from our Computer Science Department, I am working on the development of Bayes (linear) methods for applied software testing. To stay in contact with "the real world", we have an ongoing collaboration with BT, looking at software systems that are frequently upgraded, with new versions being tested by a large group of testers, typically under high time pressure. The main aim is to make testing more effective, and the methodology that we have developed and applied so far clearly indicates that it could become a major support tool for the testers.

Our work in this area is developing into a considerable research project with more and more people involved, and enthusiastic students would be very welcome to take on research problems linked to the project. These could range from quite practical work to very theoretical problems, but there is definitely a great chance that results would be quickly implemented in practical environments. With the ever-growing importance of software, and the need for reliable software, research in this area also promises excellent career opportunities!

Just to give one example of such a project, Maha Rahrouh has recently started her PhD study considering the following problem. Managers often want numerical statements ('metrics') about the reliability of software before they agree to it being used. In the computer science literature, many such metrics have been suggested, but they have hardly ever been directly related to statistical methods that can support testing. As we think that Bayes (linear) methods provide an attractive approach for such statistical support, it is interesting to consider what would be good metrics for the reliability of software, that is, metrics that are practically useful and understandable, both during the development process and when deciding whether or not the software should be released. Clearly, a useful metric must be comparable with practical interests: some software needs to be highly reliable, whereas for other software some remaining bugs may hardly be relevant.

Further important challenges are easily found in the area of design of software suites, related to large graphical models for the particular software. Postgraduate work along these lines would ask for a good understanding of Bayes (linear) methods, which Durham is an excellent place to acquire; an interest in aspects of Computer Science would also be useful. The work would bring you into contact with more people than just statisticians!

Foundations of statistics

I have been working, for some time, on statistical inference based on low stochastic structure assumptions. Basically, in this work I attempt to arrive at statistical or decision theoretic results on the basis of available data, while adding only a few mathematical assumptions. The work has, so far, led to several interesting results, which could be considered as results in 'robust statistics', and several big challenges lie ahead, such as dealing with multinomial data or even multivariate data (where conditioning will be a major, yet very interesting, problem). As a PhD student, you could make substantial contributions to this research, and you would be very welcome! This work not only promises to be theoretically challenging, but would also let you develop good insights into more standard approaches to statistics, for example in studies that compare these new methods with existing methods.

More on 'foundations'... As I am also interested in the development of Bayes linear methods (BLM), a field in which our group are the 'world leaders', it would be great to have more PhD students working in this area. In particular, I would be interested in supervising postgraduate research work on: (1) the use of censored data (see also below, 'reliability theory and survival analysis'); (2) the comparison of BLM with maximum entropy Bayes methods (these are often used, e.g. in applied physics). Supervision could possibly be joint with Michael Goldstein or David Wooff, who are the main developers of BLM at Durham.

Bootstrap methods

If you are interested in modern calculation methods within statistics, one area to work in is bootstrap methods. Briefly, the bootstrap can be regarded as a method to derive inferences by calculating statistics for repeated samples drawn from an empirical distribution function, which is itself based on the available data. Banks ('Histospline smoothing the Bayesian bootstrap', Biometrika, 1988) analyzed a bootstrap method that uses not the empirical distribution function, but an 'A(n)'-type assumption together with the data. His results showed some improvement on the classical method. It is very interesting to develop such methods further, and more precisely in line with the low stochastic structure method. Typically, slower yet probably more realistic convergence of estimators would be achieved, as the (re-)sampled observations would not be independent. This work would require an interest in working on challenging problems in the theory of statistics, as well as the willingness to develop and implement algorithms to calculate statistics.
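
To make the classical method concrete, here is a minimal sketch in Python of resampling from the empirical distribution function. It is only an illustration of the standard bootstrap, not of the A(n)-based method analyzed by Banks; the function names, the example data, the number of resamples and the percentile interval are all assumptions made for this sketch.

```python
import random
import statistics


def bootstrap_statistic(data, statistic, n_resamples=2000, seed=0):
    """Classical bootstrap: recompute `statistic` on resamples of `data`.

    Each resample draws len(data) observations, independently and with
    replacement, from the empirical distribution of the data.
    The names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates


def percentile_interval(replicates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    alpha = (1.0 - level) / 2.0
    lower = ordered[int(alpha * (len(ordered) - 1))]
    upper = ordered[int((1.0 - alpha) * (len(ordered) - 1))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical data, invented purely for illustration.
    sample = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 6.1, 3.3, 4.0]
    reps = bootstrap_statistic(sample, statistics.mean)
    print("bootstrap standard error of the mean:", statistics.stdev(reps))
    print("95% percentile interval:", percentile_interval(reps))
```

Because every resampled value is drawn independently from the observed data, this sketch shows exactly the independence that the alternative methods described above would give up.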

Reliability theory and survival analysis

My personal interest in these topics goes back to the days when I worked on my MSc and PhD theses, and there are several possible projects closely related to my earlier work. In addition to the more detailed proposals described below, the use of Bayes linear methods (see above) in reliability and survival analysis provides challenging opportunities for research leading to a PhD. An important aspect of 'lifetime data' (also called 'time-to-event data'), which often appear in reliability or survival studies, is censoring. Examples are an observation that a machine has not failed before maintenance, or that a patient taking some prescribed drugs, in the effect of which we are interested, is still feeling fine one year after starting to take them. Statistical methods to deal with such data have been studied extensively, both from a classical (frequentist) and from a Bayesian point of view, but not yet (in detail) by use of Bayes linear methods. Work in this area would be of major practical interest.
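
As a small illustration of how right-censored lifetimes enter a classical analysis, the sketch below computes the standard Kaplan-Meier (product-limit) estimate of the survival function in Python. It is not a Bayes linear method; the function name and the example data are assumptions made for this sketch only.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier estimate of the survival function.

    times    -- observed times (failure time or censoring time)
    observed -- True if the failure was observed, False if right-censored
    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    """
    pairs = sorted(zip(times, observed))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        failures = 0
        leaving = 0
        # Collect everything recorded at time t; units censored at t
        # are still counted as at risk at t (the usual convention).
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical machine lifetimes; False marks a machine taken out
    # for maintenance before it failed (a right-censored observation).
    times = [1, 2, 3, 4]
    observed = [True, False, True, True]
    for t, s in kaplan_meier(times, observed):
        print(t, s)
```

The censored unit contributes no drop in the curve but does reduce the number at risk afterwards. Note that this estimator relies on the censoring being uninformative about the lifetimes.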

A second aspect of censored data in which I am very interested is that most of the currently available statistical results for such data rely on the assumption that the censoring mechanism is uninformative for the lifetimes. This assumption would, for example, not hold if the machine mentioned above is taken out of a production process for preventive maintenance because an experienced engineer thinks it likely that the machine would fail pretty soon. Although informative censoring of this kind has been studied, there is still a lot to be done, and my particular interest would be to look at this from a Bayesian point of view.

In addition to the topics described here, my page with recent and future research interests provides more suggestions of research areas and topics that you may find challenging to work on!

Last revision: 6/9/00