Project IV


Evaluating Probabilistic Forecasts via Scoring Rules

Hailiang Du

Description

Forecast evaluation has a long history as a crucial topic for model development and decision support. The outputs from a stochastic model can be naturally interpreted as probabilistic forecasts. Even given a deterministic model, uncertainty in the initial state due to observational noise, limited computational power, and model discrepancy prevent one from making a perfect deterministic forecast of the future, or even from identifying the Truth in the past. To account for these sources of uncertainty, model outputs are often interpreted as probabilistic forecasts with the aim of providing useful information for decision support. Probabilistic forecasts have been widely adopted in fields including meteorology, social science, pharmacology, economics and finance, and have become common in operational forecasting over the last quarter century.

The evaluation of probabilistic forecasts plays a central role in the interpretation and use of forecast systems, as well as in their development. Such evaluation has not yet been standardized, and many different probabilistic scores are in use. As probabilistic forecasts become more common, there is a growing need to select probabilistic score(s) for constructing probabilistic forecasts, calibrating forecast systems, ranking competing forecast systems and quantifying forecast improvement.

The importance of using strictly proper scores has been noted in the literature, as only strictly proper scores are “internally consistent”: given a strictly proper scoring rule, the True forecast system is always preferred whenever it is included among those under consideration. Even when the discussion is restricted to strictly proper scores, however, considerable variability remains between scores (there are, in fact, infinitely many strictly proper scores), and strictly proper scores need not rank competing forecast systems in the same order when none of those systems is perfect.
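To make this concrete: a scoring rule S(q, y) assigns a penalty to the forecast distribution q when outcome y materialises, and it is strictly proper if its expected value is uniquely minimised by issuing the true distribution. The minimal Python sketch below is purely illustrative (the Bernoulli setting and the values p_true, q1 and q2 are assumptions chosen for this example, not part of the project materials); it computes the expected Brier and Ignorance scores, both strictly proper, for two imperfect forecasts of a binary event.

    # Illustrative sketch: a binary outcome Y ~ Bernoulli(p_true), with two
    # strictly proper scores ranking the same imperfect forecasts differently.
    import math

    def expected_brier(p_true, q):
        # Expected Brier score E[(q - Y)^2] of forecast probability q.
        return p_true * (q - 1.0) ** 2 + (1.0 - p_true) * q ** 2

    def expected_ignorance(p_true, q):
        # Expected Ignorance (logarithmic) score E[-log q(Y)], in nats.
        return -p_true * math.log(q) - (1.0 - p_true) * math.log(1.0 - q)

    p_true = 0.2        # true event probability (the "True" forecast system)
    q1, q2 = 0.01, 0.5  # two imperfect forecast systems, neither equal to p_true

    for name, score in [("Brier", expected_brier), ("Ignorance", expected_ignorance)]:
        s1, s2 = score(p_true, q1), score(p_true, q2)
        print(f"{name:9s}: q1 -> {s1:.4f}, q2 -> {s2:.4f}, prefers "
              + ("q1" if s1 < s2 else "q2"))

Both scores are uniquely minimised in expectation by q = p_true, yet here the Brier score prefers q1 (0.1961 versus 0.2500) while the Ignorance score prefers q2 (0.6931 versus 0.9291): the choice among strictly proper scores therefore matters whenever no candidate system is perfect.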

This project aims to explore a number of popular scoring rules in the literature, investigate their strengths and weaknesses and provide recommendations for decision support.

Prerequisites

Calculus and Probability I, Statistical Concepts II

References

  • J. Bröcker and L. A. Smith, Scoring probabilistic forecasts: the importance of being proper. Weather and Forecasting, 22(2): 382-388, (2007).
  • T. Gneiting and A. E. Raftery, Strictly proper scoring rules, prediction and estimation. Journal of the American Statistical Association, 102(477): 359-378, (2007).
  • H. Du, Beyond strictly proper scoring rules: the importance of being local. Weather and Forecasting, 36(2): 457-468, (2021).
  • M. S. Roulston and L. A. Smith, Evaluating probabilistic forecasts using information theory. Monthly Weather Review, 130(6): 1653-1660, (2002).
  • R. L. Winkler et al., Scoring rules and the evaluation of probabilities. Test, 5(1): 1-60, (1996).

email: Hailiang Du

