With complex simulation models, or when many simulation runs
are required, the necessary computational time may become
prohibitively large. Using a reasonably large number of model
outputs, a regression function might be established that
estimates the simulation model output from some of its
inputs. This regression equation is called a metamodel. It can be
used as a cheap (i.e. fast) approximation to the simulation
model. In this presentation the result of a (linear) regression
metamodel will be compared to a neural network approximation. The
data come from a model simulating the amount and concentrations
of nitrate in surface waters in the Netherlands, depending on,
among other factors, soil characteristics, land use, and ground
water level.
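As a rough illustration of the metamodel idea, the sketch below fits a first-order linear regression to the outputs of a toy simulator and then uses the fitted equation as a fast approximation. The function `simulate`, its two inputs, and the sample size are hypothetical stand-ins; the nitrate model and its inputs are not reproduced here.

```python
import random

def simulate(x1, x2):
    """Hypothetical stand-in for an expensive simulation model."""
    return 3.0 + 2.0 * x1 - 1.5 * x2 + 0.3 * x1 * x2

def fit_linear_metamodel(xs, ys):
    """Least-squares fit of y ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [(1.0, x1, x2) for x1, x2 in xs]
    n = 3
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    # Back substitution.
    beta = [0.0] * n
    for i in reversed(range(n)):
        beta[i] = (a[i][n] - sum(a[i][j] * beta[j] for j in range(i + 1, n))) / a[i][i]
    return beta

rng = random.Random(0)
inputs = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(200)]
outputs = [simulate(x1, x2) for x1, x2 in inputs]   # the "expensive" runs
beta = fit_linear_metamodel(inputs, outputs)

def metamodel(x1, x2):
    """Cheap approximation to simulate()."""
    return beta[0] + beta[1] * x1 + beta[2] * x2

# Largest absolute error of the metamodel over the training runs.
max_err = max(abs(metamodel(*x) - y) for x, y in zip(inputs, outputs))
```

The toy simulator contains an interaction term that a first-order regression cannot capture, so `max_err` stays above zero; this is the kind of misfit a comparison with a more flexible approximation would be expected to expose.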
Using fuzzy measures in sensitivity analysis
and uncertainty estimation of environmental models
Keith Beven, Institute of Environmental and Natural Sciences,
Lancaster University, Lancaster LA1 4YQ, UK (K.Beven@Lancaster.ac.uk)
In many applications of environmental models it is difficult
to make a proper estimation of a likelihood function expressing
the probability of predicting an observation given the model. In
fact, it is more often a matter of trying to assess the degree to
which a model can be considered a likely simulator of a system
or, from another point of view, which models can be rejected on
the basis of limited observational data. In such a framework,
different types of objective function can be useful and in this
presentation the use of fuzzy objective functions is explored
for both sensitivity analysis and uncertainty estimation. A range
of applications from the prediction of the possibility of flood
inundation in real time to assessing the spatial heterogeneity of
the landscape in controlling fluxes from the land surface to the
atmosphere will be demonstrated. An interpretation of the GLUE
methodology, consistent with this framework, as a landscape space
to model space mapping will be described.
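One way to picture a fuzzy objective function is sketched below: each candidate parameter value is scored by a trapezoidal membership function of its prediction error, the scores are combined across observations by the fuzzy intersection (minimum), and candidates with zero score are rejected. The toy model, the observations, and the trapezoid corner points are all assumptions made for illustration, not the functions used in the talk.

```python
import random

def trapezoid(x, a, b, c, d):
    """Fuzzy membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def toy_model(k, t):
    """Hypothetical one-parameter model: linear response k * t."""
    return k * t

# (time, observed value) pairs; made-up numbers.
observations = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

rng = random.Random(1)
candidates = [rng.uniform(0.0, 4.0) for _ in range(2000)]

def fuzzy_score(k):
    # Fuzzy intersection (minimum) of the error memberships over all observations.
    return min(trapezoid(toy_model(k, t) - obs, -1.0, -0.3, 0.3, 1.0)
               for t, obs in observations)

scores = {k: fuzzy_score(k) for k in candidates}
# Keep only candidates that are not rejected outright.
behavioural = {k: s for k, s in scores.items() if s > 0.0}

# Range of predictions at a new time from the retained candidates.
predictions = [toy_model(k, 4.0) for k in behavioural]
bounds = (min(predictions), max(predictions))
```

The retained scores can also serve as weights when summarising predictions, in place of a formal likelihood.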
For information about TOPMODEL or GLUE try the Web pages at http://www.es.lancs.ac.uk/es/research/hfdg/hfdg.html
Parameters in models are often unsuitable as elicitation
variables, either because they lack a clear operational meaning
or because experts have no experimental base for these
parameters. Examples abound. In such cases the analyst must
elicit uncertainty on observables which are predicted by the
models, then invert the models to pull this uncertainty back onto
the model parameters. This probabilistic inversion or
pull-back-operation typically introduces dependencies, and hence
constitutes one source of dependence. A second source is direct
elicitation. Methods for this have been under development in
recent years. The talk discusses examples of both sources of
dependence in recent applications.
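A minimal sketch of the pull-back step, under strong simplifying assumptions, is given below. It uses sample re-weighting: independent parameter samples are pushed through a toy model, and each sample is weighted so that the distribution of the predicted observable matches elicited quantiles. The model, the prior ranges, and the elicited 5%, 50% and 95% quantiles are invented for illustration, and re-weighting is only one of several possible inversion techniques.

```python
import random

def model(a, b):
    """Hypothetical model: the observable is the product of two parameters."""
    return a * b

rng = random.Random(2)
# Independent starting samples for the two parameters.
samples = [(rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)) for _ in range(5000)]
ys = [model(a, b) for a, b in samples]

# Elicited 5%, 50% and 95% quantiles of the observable (made-up numbers).
quantiles = [0.8, 1.5, 2.5]
bin_probs = [0.05, 0.45, 0.45, 0.05]   # probability mass between the quantiles

def bin_index(y):
    """Index of the inter-quantile bin that y falls into."""
    return sum(y > q for q in quantiles)

counts = [0] * len(bin_probs)
for y in ys:
    counts[bin_index(y)] += 1

# A sample in bin j gets weight p_j / n_j, so the weighted bins match the expert.
weights = [bin_probs[bin_index(y)] / counts[bin_index(y)] for y in ys]

# Resampling with these weights gives parameter pairs whose predicted
# observable follows the elicited quantiles.
pulled_back = rng.choices(samples, weights=weights, k=5000)
frac_below_median = sum(model(a, b) <= 1.5 for a, b in pulled_back) / len(pulled_back)
```

Although the starting samples are independent, the re-weighted pairs in `pulled_back` generally are not, which is the first source of dependence mentioned above.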
Comparison of observational data with a theoretical model
which is approximately implemented in a complex computer code
offers many challenges to the statistician. Classical
statistical approaches tend to ignore bias in data and model and
treat the two asymmetrically. Furthermore, the modeler is often
interested in accurate predictions of quantities which are not
directly measured. We consider an approach which treats both
model and data as realizations of random processes and utilizes a
generalized correlation analysis to assess agreement. The
methodology is being used in an ongoing assessment of an Ocean
Circulation Model which is a component of a Global Climate Model.
A main quantity of interest is heat transfer between the ocean
and atmosphere, which can only be indirectly inferred from a data
base of ocean temperature measurements. Numerous other
complications are discussed including comparisons with multiple
models, e.g. from the same theoretical model run at different
grid sizes or boundary conditions.
We describe a general Bayesian approach for using computer
codes for a complex physical system to assist in forecasting
system outcomes. Our approach is based on expert judgements and
experiments on fast versions of the computer code. These are
combined to construct models for the relationships between the
code's inputs and outputs, respecting the natural space/time
features of the physical system. The resulting beliefs are
systematically updated as we make evaluations of the code for
varying input sets and calibrate the input space against past
data on the system. The updated beliefs are then used to
construct forecasts for future system outcomes. While the
approach is quite general, it has been developed particularly to
handle problems with high-dimensional input and output spaces,
for which each run of the computer code is expensive. The
methodology will be applied to problems in uncertainty analysis
for hydrocarbon reservoirs.
A key issue in the consolidation process of the nuclear fuel
cycle is the safe disposal of radioactive waste. Deep geological
disposal based on a multibarrier concept is at present the most
actively investigated option (visualize a deep underground
facility within which radioactive materials such as spent fuel
rods or reprocessed waste, previously encapsulated, are emplaced,
surrounded by other man-made barriers). While the safety of this
concept ultimately relies on the safety of the mechanical,
chemical and physical barriers offered by the geological
formation itself, the physico-chemical behavior of such a
disposal system over geological time scales (hundreds or
thousands of years) is far from known with certainty.
From 1996 to 1999, with partners in Italy, Spain, and Sweden,
I was involved in a project for the European Commission, GESAMAC,
which aimed in part to capture all relevant sources of
uncertainty in predicting what would happen if the disposal
barriers were compromised in the future by processes such as
geological faulting, human intrusion, and/or climatic change. One
major goal of the project was the development of a methodology to
predict the radiologic dose for people in the biosphere as a
function of time, how far the disposal facility and the other
components of the multibarrier system are underground, and other
factors likely to be strongly related to dose. For this purpose
we developed a complex computer simulation environment called
GTMCHEM which "deterministically" models the one-dimensional
migration of radionuclides through the geosphere up to the
biosphere. In this talk I will describe the application of
Bayesian and non-Bayesian methods of functional data analysis to
explore the dependence of predicted radiologic dose curves as a
function of time on inputs to the computer simulations.
During the week we shall discuss many approaches to
uncertainty analysis and many application areas. But - pardon me
for noticing - all of us are technically trained, well able to
understand the methods and their results. Unfortunately - or
perhaps thankfully - we are not the ones who have to act on the
results. Others use the results to inform their response to
risks and guide their decisions. To do that they need to
understand the import of our analyses. How do we convey that
import? To do that we need also to understand the questions that
they are asking. Recently, a colleague used eight words to
describe good operational research, risk and decision analysis.
The process should: "create questions; question questions; answer
questions; question answers." Uncertainty analysis is mainly
about questioning answers and partly about answering questions.
Can it inform and help in the first two stages, building a shared
understanding of the first two stages?
The talk will not answer that question, but it will float some
ideas, hopefully sufficient ideas to keep discussion going late
into the evening in the bowels of Gregynog's bar!
Forest scientists are relying increasingly on projections from
mechanistic models to answer questions regarding the effect(s) of
global change on forest growth. Model outputs are also used to
assess the plausibility of hypotheses regarding physiological
processes.
We will discuss analysis of projections from the forest growth
model PIPESTEM (Valentine et al. 1997), given increasing
concentration of carbon dioxide in the atmosphere. Our analysis
is based on Raftery et al.'s (1995) Bayesian synthesis approach
and includes the construction of credible intervals, and
posterior distributions for model outputs. We will also
illustrate a Bayes factor approach for testing hypotheses
regarding assimilation rate (net carbon exchange per unit of
foliar dry matter).
Raftery, A.E., G.H. Givens, and J.E. Zeh. 1995. Inference from a
deterministic population dynamics model for Bowhead whales. JASA 90:402-442,
with discussion
Valentine, H.T., T.G. Gregoire, H.E. Burkhart, and D.Y. Hollinger. 1997.
Pipestem: A stand-level model of carbon allocation and growth, calibrated
for loblolly pine. Can. J. For. Res. 27:817-830.
Geophysics makes use of complex computer models of properties
of the Earth and of waves propagating through this, to estimate
characteristics of these models, which form images of the Earth's
structure. These images are used, for example, to make decisions
about oil exploration. A large amount of data and knowledge goes
into specifying and estimating the models. However, there remain
many uncertainties, particularly due to the differing scales of
the data, models and the physical system. The sources of
uncertainty and some simple methods for assessing their effect on
the images and derived measures will be discussed and illustrated
with examples from oil exploration.
Sampling-based sensitivity analysis involves the generation
and exploration of a mapping from analysis inputs to analysis
outcomes. The essence of this exploration is a search for
patterns involving analysis inputs and outcomes. Various
possibilities for this exploration will be discussed and
illustrated, including identification of linear patterns,
monotonic patterns, and nonmonotonic patterns. The proposition is
advanced that there are probably many pattern recognition
procedures that have been developed for various purposes that
could be productively applied in the sensitivity analysis of
complex models.
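The distinction between linear and monotonic patterns can be sketched with two familiar statistics: the Pearson correlation coefficient for linear patterns and the rank (Spearman) correlation for monotonic ones. The sampled input and the two toy outputs below are assumptions for illustration; they are not taken from the talk.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient: strength of a linear pattern."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Rank of each element (0 = smallest); ties are not handled specially."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(x, y):
    """Rank correlation: strength of a monotonic pattern."""
    return pearson(ranks(x), ranks(y))

rng = random.Random(3)
x = [rng.uniform(0, 1) for _ in range(500)]
y_monotonic = [math.exp(8 * v) for v in x]       # monotonic but strongly nonlinear
y_nonmonotonic = [(v - 0.5) ** 2 for v in x]     # neither linear nor monotonic

linear_on_monotonic = pearson(x, y_monotonic)
rank_on_monotonic = spearman(x, y_monotonic)
rank_on_nonmonotonic = spearman(x, y_nonmonotonic)
```

The rank correlation detects the monotonic pattern that the linear measure understates, while both measures miss the nonmonotonic one; that last case is where other pattern recognition procedures would be needed.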
In this talk I discuss several statistics for comparisons of
(1) a simulation model with its corresponding real system, and
(2) a metamodel (also known as a response surface) with its
underlying simulation model. These validation statistics are
inspired by practice and statistical analysis.
To derive the distributions of these statistics, I do not
assume particular distributions (no normality!); instead I use
bootstrapping. This bootstrapping requires different
formulations for situations of type (1) and (2) respectively.
To evaluate these many validation statistics and bootstrap
procedures, I use extensive Monte Carlo experiments. These
experiments quantify the type I (alpha) error and type II (beta)
error of the bootstrapped tests.
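As a hedged sketch of a distribution-free validation test of type (1), the code below bootstraps the difference between the mean simulated output and the mean real output and reports a percentile interval; the simulation would be rejected if the interval excludes zero. The statistic, the data, and the function name `bootstrap_mean_diff_ci` are illustrative assumptions and are not the statistics studied in the talk.

```python
import random

def bootstrap_mean_diff_ci(real, sim, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for mean(sim) - mean(real)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each data set with replacement, independently of the other.
        r = [rng.choice(real) for _ in real]
        s = [rng.choice(sim) for _ in sim]
        diffs.append(sum(s) / len(s) - sum(r) / len(r))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Made-up, skewed observations of the real system.
real = [2.1, 3.4, 1.9, 8.7, 2.8, 3.1, 2.2, 5.6, 2.5, 3.0]
# Made-up outputs of a reasonable simulation and of a clearly biased one.
sim_ok = [2.4, 3.0, 2.0, 7.9, 3.3, 2.7, 2.6, 5.1, 2.2, 3.5]
sim_biased = [v + 5.0 for v in real]

ci_ok = bootstrap_mean_diff_ci(real, sim_ok)
ci_biased = bootstrap_mean_diff_ci(real, sim_biased)
reject_ok = not (ci_ok[0] <= 0.0 <= ci_ok[1])
reject_biased = not (ci_biased[0] <= 0.0 <= ci_biased[1])
```

Repeating such a test over many synthetic data sets, with and without a true difference, is one way to estimate its type I and type II error rates by Monte Carlo.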
Pearson's correlation ratio can be used to measure the average
reduction in model output variance when a single model input
variable or subset of the inputs is assumed specified. In this
sense, the correlation ratio is a well defined measure of the
uncertainty importance of model inputs. When sample estimates of
the correlation ratio are used to measure importance, sampling
variability should be taken into account when distinguishing
among inputs. Naively applied procedures for tests of
significance are not appropriate: the underlying hypothesis of a
zero correlation ratio is almost sure to be invalid because all
inputs are used to calculate the output value. This paper
examines a procedure based on a hypothesis that underlying
correlation ratios of some individual inputs or subsets of inputs
are larger than others in some practical sense. The procedure
can be used to evaluate the adequacy of a set of computer runs to
distinguish or identify those inputs with significantly larger
correlation ratios.
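For reference, the correlation ratio of an output Y on an input X is Var(E[Y|X]) / Var(Y). The sketch below estimates it from a random sample by grouping the runs into equal-count bins of the input; the binning estimator, the toy model, and the sample size are assumptions for illustration rather than the procedure of the paper.

```python
import random

def correlation_ratio(x, y, n_bins=20):
    """Estimate Var(E[Y|X]) / Var(Y) using equal-count bins of x."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    mean_y = sum(v for _, v in pairs) / n
    total = sum((v - mean_y) ** 2 for _, v in pairs)
    between = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        m = sum(v for _, v in chunk) / len(chunk)
        # Between-bin sum of squares approximates the variance of E[Y|X].
        between += len(chunk) * (m - mean_y) ** 2
    return between / total

rng = random.Random(5)
x1 = [rng.uniform(-1, 1) for _ in range(4000)]
x2 = [rng.uniform(-1, 1) for _ in range(4000)]
# Hypothetical model: strongly nonlinear in x1, weakly linear in x2.
y = [a ** 2 + 0.1 * b for a, b in zip(x1, x2)]

eta1 = correlation_ratio(x1, y)
eta2 = correlation_ratio(x2, y)
```

Even for an input with no influence at all, this estimate is not exactly zero in a finite sample, which illustrates why sampling variability matters when ranking inputs.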
This talk will overview a Bayesian approach to analysis of
computer code outputs. It is based on Gaussian process modelling
of the code, and so relates closely to the pioneering work of
Sacks, Mitchell and co. The approach is extended to uncertainty
analysis and calibration problems. The latter is achieved by
modelling the relationship between the code output and the real
process that the computer model represents. This provides a
unified framework for problems associated with use of computer
codes, that allows all sources of uncertainty to be addressed.
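A bare-bones sketch of Gaussian process modelling of a code is given below: a zero-mean process with a squared-exponential covariance is conditioned on a handful of code runs, giving a posterior mean (the emulator) and a posterior variance at untried inputs. The one-dimensional `code` function, the design, and the fixed kernel settings are assumptions; estimating the kernel parameters and the calibration step are not shown.

```python
import math

def kernel(a, b, length=0.2, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def code(x):
    """Hypothetical stand-in for an expensive deterministic computer code."""
    return math.sin(2 * math.pi * x)

design = [i / 7 for i in range(8)]      # 8 code runs on [0, 1]
runs = [code(x) for x in design]
# Covariance matrix of the runs, with a tiny jitter for numerical stability.
K = [[kernel(a, b) + (1e-10 if i == j else 0.0) for j, b in enumerate(design)]
     for i, a in enumerate(design)]
alpha = solve(K, runs)

def emulate(x):
    """Posterior mean and variance of the process at input x."""
    k = [kernel(x, d) for d in design]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = kernel(x, x) - sum(ki * vi for ki, vi in zip(k, v))
    return mean, max(var, 0.0)

mean_new, var_new = emulate(0.25)       # an input where the code was not run
```

At the design points the emulator reproduces the code output with essentially zero variance; between them the variance grows, which is what allows uncertainty about the code itself to be carried into later analyses.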
The propagation of radiation through media or vacuum is
mathematically described by the Boltzmann transport
equation. Analytic solutions to this equation are limited to
simple classes of problems such as those in one space-dimension
with a mono-energetic point source of radiation and homogeneous
media. For more realistic problems one must turn to numerical
solutions using either Monte Carlo simulation or deterministic
methods. Monte Carlo simulation is the method of choice when
dealing with problems that have complex three-dimensional
boundaries, heterogeneous materials and sources with complicated
energy distributions. The method has, since the early 1940s, been
used to solve problems in the nuclear industry but it is now
increasingly used in all areas in which the transport of
radiation is important. Examples are: medical physics (radiation
protection, dosimetry and the planning of radiotherapy
treatment), the analysis and development of pulsed radiation
devices for oil and mineral exploration, the detection of illicit
materials and the propagation of light through the atmosphere for
remote sensing studies.
In order to solve a radiation transport problem one needs to
specify input data such as: material composition, atom fractions,
densities, volumes, areas, lengths, radiation energy spectra,
etc. Further, one needs as input values for the nuclear and
atomic cross-sections that describe the probability of different
events such as scatter, absorption and bifurcation of particles.
In all scientific endeavours, a determined value, be it
obtained from experimental measurement or via computation, must
have associated with it a degree of uncertainty and it is often
desirable or even necessary to estimate the level of this
uncertainty. This is especially true if the results are used in
critical areas, sensitivity studies or safety analysis.
Apart from systematic uncertainties associated with an
incomplete model, the uncertainties in any Monte Carlo
calculation arise mainly from three sources:
Monte Carlo (MC) practitioners solving radiation transport
problems will often report the uncertainty associated with
sampling statistics and will almost always ignore the effects of
uncertainty in the input data. Those engaged in deterministic
methods will hardly ever report any uncertainty on their final
calculation. It is clear, however, that uncertainties do exist
in the input data and that models used are generally incomplete
representations of reality. Evaluating the effect on the
calculated result of these uncertainties and discrepancies is
therefore mandatory if a full and meaningful solution to a
problem is to be produced.
In this paper, the various sources of uncertainty in a
computational model of radiation transport will be discussed.
Following this, some methods available to study the propagation
of uncertainties in input parameters during a Monte Carlo
simulation will be presented. In particular, we will look at the
so-called 'brute force' method, randomisation of input data,
correlated sampling, and the differential operator technique. A
central theme will be methods to assess the sensitivity of
computed results with respect to variances associated with the
input data.
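One reading of the 'brute force' approach is sketched below on a toy problem: the whole Monte Carlo calculation is repeated for many randomly perturbed values of an input, and the spread of the results is split into a sampling-statistics part and a part attributed to the input data. The purely absorbing slab, the cross-section value, and its assumed Gaussian uncertainty are illustrative inventions, far simpler than a production transport code.

```python
import math
import random

def transmission_mc(sigma, thickness, n_histories, rng):
    """Fraction of particles crossing a purely absorbing slab (toy 1-D model)."""
    crossed = 0
    for _ in range(n_histories):
        # Sample an exponentially distributed free path length.
        path = -math.log(1.0 - rng.random()) / sigma
        if path > thickness:
            crossed += 1
    return crossed / n_histories

rng = random.Random(4)
thickness = 2.0
sigma_mean, sigma_sd = 1.0, 0.05    # hypothetical cross-section and its uncertainty
n_runs, n_histories = 200, 2000

results, stat_vars = [], []
for _ in range(n_runs):
    sigma = rng.gauss(sigma_mean, sigma_sd)         # perturb the input data
    p = transmission_mc(sigma, thickness, n_histories, rng)
    results.append(p)
    stat_vars.append(p * (1.0 - p) / n_histories)   # binomial sampling variance

mean_p = sum(results) / n_runs
total_var = sum((p - mean_p) ** 2 for p in results) / (n_runs - 1)
stat_var = sum(stat_vars) / n_runs
# Variance left over once sampling statistics are accounted for.
input_var = max(total_var - stat_var, 0.0)
```

The cost is evident even here: every perturbed input requires a complete Monte Carlo run, which is what motivates the other techniques listed above.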
It will be concluded that all but the 'brute force' method are
partially suitable for the assessment of uncertainties in Monte
Carlo radiation transport calculations and that much more effort
is needed by code developers to incorporate uncertainty analysis
in production codes.
Deterministic simulation models are used in many areas,
including population projections, the investigation of social
scientific theories, the making of environmental and other policy
decisions, atmospheric science, engineering, and pharmaceutical
research. They tend to be complex, and to require the
specification of many inputs. This is often done in an ad-hoc
manner, and little attention has been given to taking proper
account of uncertainty and evidence about the inputs and outputs
to the model. Statisticians have only started to be involved in
the analysis of such models, although their skills have the
potential to contribute a great deal.
I got involved in this problem through my work for the
International Whaling Commission on determining if bowhead whales
could safely be subjected to aboriginal subsistence hunting by
the Inuit people of Alaska, and on setting the quota. This has
traditionally been done using deterministic population dynamics
models. Our first effort to take proper account of the
uncertainties involved was the Bayesian synthesis method of
Raftery, Givens and Zeh (1995, JASA). However, this suffers from
the Borel paradox, according to which the results may not be
invariant to reparameterizations of the model. I will describe
the Bayesian melding method, which overcomes this difficulty by
bringing together ideas from modeling, measure theory and the
pooling of expert opinions. This is joint work with David Poole.
Complex decision making problems introduce difficult
computational problems. We describe and compare various
computational schemes, with emphasis on MCMC related methods.
Evaluation of simulators faces basic questions:
We explore the tasks where sensitivity analysis (SA) can be
useful and try to assess its relevance within the modelling
process. We suggest that SA could considerably assist in the use
of models, by providing objective criteria of judgement for
different phases of the model building process: model
identification/discrimination, model calibration, model
corroboration. We review some new global quantitative SA
methods, and suggest that these might enlarge the scope for
sensitivity analysis in computational and statistical modelling
practice. Among the advantages of the new methods are their
robustness, model independence, and computational convenience.
The discussion is made on the basis of worked examples that
address the issues of model identification, calibration and
corroboration.
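The abstract does not name the methods it reviews. As one example of a global, quantitative, model-independent measure, the sketch below estimates first-order variance-based sensitivity indices with a "pick and freeze" sampling scheme. The choice of this estimator, the test function, and the uniform input distributions are all assumptions made for illustration.

```python
import random

def first_order_indices(f, dim, n, seed=0):
    """Estimate first-order variance-based indices for inputs uniform on [0, 1]."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fa = [f(row) for row in a]
    fb = [f(row) for row in b]
    mean = sum(fa + fb) / (2 * n)
    var = sum((v - mean) ** 2 for v in fa + fb) / (2 * n - 1)
    indices = []
    for i in range(dim):
        # Rows of A with input i replaced by the value from B, so that the
        # result shares only input i with the corresponding row of B.
        fab = [f(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Centring f(B) leaves the expectation unchanged and reduces noise.
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(fb, fab, fa)) / n
        indices.append(v_i / var)
    return indices

def toy(x):
    """Hypothetical model: x[2] has no influence on the output."""
    return x[0] + 2.0 * x[1] + 0.0 * x[2]

s = first_order_indices(toy, dim=3, n=20000)
```

For this additive toy function the indices can be worked out by hand as 0.2, 0.8 and 0, which gives a quick check on the estimator; the function `f` is treated as a black box, in keeping with the model independence mentioned above.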
Confidence distributions are the frequentist's analogues of
Bayesian priors and posteriors. Being a frequentist concept, the
possible bias in a confidence distribution is well
defined. Approximately unbiased posterior confidence
distributions are obtained by bootstrapping of the likelihood
based on the data and on possible prior confidence distributions
based on previous data. As distinct from Bayesian analysis,
information on the probability basis of a prior distribution is
necessary to combine it with the data likelihood. Being a
likelihood analysis, no problem arises if there are more or fewer
prior distributions (with corresponding likelihood components)
than there are free parameters in the model.
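The simplest textbook instance of a confidence distribution, for the mean of normal data with known standard deviation, is sketched below. It is meant only to fix ideas: the data and the known standard deviation are assumptions, and the bootstrapped-likelihood construction described above is not attempted.

```python
from statistics import NormalDist, mean

data = [4.8, 5.3, 5.1, 4.6, 5.4, 5.0, 4.9, 5.2]   # made-up measurements
sigma = 0.3                                         # assumed known standard deviation
se = sigma / len(data) ** 0.5                       # standard error of the mean

def confidence_cdf(mu):
    """C(mu): confidence that the true mean is at most mu."""
    return NormalDist(mean(data), se).cdf(mu)

def confidence_quantile(p):
    """Value of mu at which the confidence distribution reaches p."""
    return NormalDist(mean(data), se).inv_cdf(p)

# A 90% confidence interval read off from the 5% and 95% confidence quantiles.
interval_90 = (confidence_quantile(0.05), confidence_quantile(0.95))
```

Every confidence interval for the mean can be read off from `confidence_cdf` in this way, which is the sense in which the distribution plays the role of a posterior without requiring a prior.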
The terms 'good model' and 'better model' imply a value
judgement, based on the output of a model evaluation. The
evaluation is carried out, to a greater or lesser extent, against
a number of different criteria, not necessarily of equal
importance and not necessarily all easily quantified. It is a
process which has a role to play in every stage of model
development and construction when decisions need to be made about
how to proceed. Calibration and validation are two procedures
which can be linked to the evaluation process. In calibration,
we attempt to find parameter estimates which provide a good fit
between model predictions and a training data set while one
definition of validation is that we aim to demonstrate the
similarity between the model predictions and an independent test
data set. However, there continues to be some debate about the
concept, the terminology used and its meaningfulness.
Regardless of terminology, with complex models, immediate
difficulties become apparent in model evaluation even for these
two processes; the data requirements of complex computer models
are often large and the requirement for independent training and
test data sets is one which may not be realised. In many
application areas, the available data may not have been collected
with the specific model in mind, so that there may be immediate
mismatches between the spatial and temporal extent of the data
and model: the data is often observational rather than resulting
from a designed experiment and will be subject to variation and
uncertainty.
Issues which are raised by this consideration of calibration
and validation include the design of experiments, handling
uncertainties, assessing goodness of fit, missing data (often
paraphrased as data poor, process rich modelling), combining
different types of data (in a Bayesian framework) and model
complexity.
It is typical for large scale environmental models to bolt
together a number of deterministic pieces of software. The
internal structure of these component dynamic models is often
both highly technical and complex. In this paper we will discuss
when it is appropriate to handle the uncertainty associated with
each component separately. We explore a number of algorithms for
propagating uncertainty through the system when it is appropriate
to do so by adapting technologies originally developed for
somewhat simpler stochastic Bayes nets. The methodology will be
illustrated by two practical environmental applications.
One of the issues in previous discussions of the application
of HSSS techniques to environmental models has been the lack of
comparable application of HSSS techniques with those used in
traditional model sensitivity analyses. This talk will report on
an investigation of a range of techniques which may be applied to
the analysis and predictive modelling of three pollution
scenarios, each with a process-based physical/chemical model and
with monitored data.
The first scenario uses a set of 4 time series of hourly ozone
concentrations on a transect of about 40km from Edinburgh city
centre to a totally rural hilltop with very clean air. There is a
possible Bayesian analysis developed for Norwegian ozone data by
Gudmund Host and a number of alternative ideas (e.g. Anderson and
Smith) have been proposed for the UK ozone monitoring data.
The second model/data set is for wet deposition with
orographic enhancement in the UK. There are rain concentrations
collected at a number of sites (about 40) and these are
interpolated to provide a continuous concentration
field. Rainfall is provided from the UK Meteorological Office at
a 5km spatial scale. The two are multiplied together but there is
an adjustment to account for increased deposition at higher
altitudes. This is a fairly simple mechanistic model for which
there has been no rigorous attempt at an uncertainty analysis so
far. Schmidt and O'Hagan presented a poster at the HSSS
conference at Pavia which is relevant to both this work and to
the third example.
The third example is a complex atmospheric
chemical/meteorology model for predicting ammonia concentrations
from ammonia emissions. Each model run requires considerable
computing time and the options for a traditional sensitivity
analysis are restricted. In this case there are some runs already
available to fulfill requests for particular scenarios which may
provide sufficient additional information. This case study may
only result in identifying the scale of work required to
implement a fuller uncertainty analysis using the various
possible techniques.
These three example scenarios have an increasing degree of
mechanistic model complexity and provide a basis for the
comparisons of the two sets of techniques. They are simple enough
that such a comparison should not get distracted by more obscure
details of the science behind the models. Although using UK
data, the examples are all very general and the comparisons
should be applicable to many environmental models.
A European project which had as its aim the full-scale
application of the ideas of computer experiments to emulate
computer code and their use in multi-objective optimisation is
discussed. The main objective of the project is to be able to
optimise several objectives from the same or different simulators
simultaneously. Case studies come from the automotive and aero
industries via the European partnership. This is an example of
close collaboration between statisticians, engineers, software
developers and additional specialists in, for example,
optimisation. The project should result in a commercial product
of wide applicability. Enhancements are planned and will be
discussed. This represents an opportunity for fast technology
transfer of new statistical functions into a ready-made
environment.
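When several objectives are optimised simultaneously there is usually no single best design, and a common way to summarise the trade-off is the set of non-dominated (Pareto-optimal) designs. The sketch below filters such a set from a list of candidate designs; the objective values are made up, and in practice they would be predictions from emulators of the simulators. Pareto filtering is offered as a standard notion, not as the method used in the project.

```python
def pareto_front(points):
    """Return the points not dominated by any other (all objectives minimised)."""
    def dominates(p, q):
        # p dominates q if it is no worse in every objective and better in one.
        return (all(a <= b for a, b in zip(p, q))
                and any(a < b for a, b in zip(p, q)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective 1, objective 2) values for six candidate designs.
designs = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 3.0), (5.0, 4.0), (6.0, 1.0)]
front = pareto_front(designs)
```

Because each comparison uses only predicted objective values, a fast emulator lets many more candidate designs be screened than direct simulation would allow.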
This page is maintained by Jonathan Rougier.
It was last updated on 28.02.00.
Probabilistic Inversion and Dependence
Roger Cooke, Applications of Decision Theory, Department of
Mathematics Delft University of Technology
Comparison of an Ocean Circulation Model with
Observational Data
Dennis Cox, National Center for Atmospheric Research
Bayesian forecasting and calibration for
complex physical systems using multi-level computer codes
Peter Craig, Michael Goldstein, Jonathan Rougier and Allan
Seheult, Dept. Mathematical Sciences, Univ. of Durham (J.C.Rougier@durham.ac.uk)
Functional data analysis of complex computer
simulation output: a case study in nuclear waste disposal risk
assessment
David Draper, University of Bath, UK
OK we have done the analysis ... now what????
Simon French, Manchester Business School
Analysis of Projections from a Forest Growth Model
Edwin J. Green, Rutgers University, New Brunswick, NJ USA, and
Harry T. Valentine, USDA Forest Service, Durham, NH USA
Uncertainty in Geophysical Imaging for Oil
Exploration
Howard Grubb,
Dept. Applied Statistics, Univ. Reading, UK
Sensitivity Analysis as a Problem in Pattern
Recognition
Jon C. Helton
Validation In Simulation: Bootstrap Tests
Jack P.C. Kleijnen, Dept. Information Systems, Centre for Economic
Research, Tilburg University, Netherlands
Distinguishing Importance Among Model Input
Variables
Michael D. McKay and Todd L. Graves, Los Alamos National
Laboratory, Los Alamos, NM, USA
Bayesian calibration and uncertainty
analysis
Tony O'Hagan, University
of Sheffield
Attempting an overview
At roughly the mid-point of the workshop, it is appropriate to
try to form an overview of the various methodologies being
presented. Tony O'Hagan will rashly attempt to
do this on the Tuesday evening. Hopefully, the imbibing of a
certain amount of alcohol by some of the audience will foster an
atmosphere of bonhomie - "don't shoot the pianist, he's doing his
best"!
Uncertainty assessment in Monte Carlo
radiation transport simulation.
Robert Alan Price,
Physics Department,
Clatterbridge Centre for Oncology,
Bebington, Wirral, Merseyside CH63 4JY
(robertp@ccotrust.co.uk)
Statistical Inference for Deterministic
Simulation Models: The Bayesian Melding Approach
Adrian E. Raftery, University of Washington
Computational methods for decision
theoretical problems
David Rios Insua, Dept. Engineering, U. Rey Juan Carlos, Madrid,
Spain, and Mike Wiper Dept. Statistics, U. Carlos III, Madrid, Spain
Evaluating Complex Computer Models
Jerome Sacks, NISS
Through ongoing case studies of traffic simulators and
transportation models used for timing of traffic signals and
modeling traffic flows on urban networks, these questions will be
discussed in connection with specific issues of:
Sensitivity analysis as an ingredient of
modelling
Andrea Saltelli, Stefano Tarantola and Francesca Campolongo,
Institute for Systems,
Informatics and Safety, Joint Research Centre of the European
Commission, Ispra (I)
Confidence distributions and likelihood
Tore Schweder
Evaluation, calibration and validation of
complex computer models
Marian Scott, Dept of Statistics, University of Glasgow
(marian@stats.gla.ac.uk)
Bayes Meta-Nets for Large Scale Dynamic
Models: with Environmental Applications
Jim Q. Smith and K. Tze, University of Warwick
Comparison of techniques to assess the
uncertainty in output from environmental pollution models
Ron Smith (and others),
Institute of Terrestrial Ecology, Edinburgh (ris@ite.ac.uk)
CE2: Computer Experiments in Concurrent
Engineering, results of a European project
Henry Wynn, Dept. Statistics, Univ. Warwick, UK