Advanced methods for nonparametric modelling
Carl Edward Rasmussen
Cyril Goutte
Roderick Murray-Smith
ed,cg,rod@eivind.imm.dtu.dk
IMM, Section for Digital Signal Processing
Preliminary program
In this course we will discuss various topics in nonlinear, nonparametric
modelling. We will discuss the fundamental issues involved in modelling
and describe some prominent methods, including both non-Bayesian and Bayesian
approaches to neural networks. The choice of topics is not exhaustive,
but rather governed by our experience and personal inclinations.
Place: The lectures will take place in building 305, room 205 (2nd floor).
The assignments will be performed in the terminal room in the basement of
building 305, using the section's Linux machines.
Time: The course will take place in week 3 (January 18-22). It will
consist of 11 lectures given in English, 1.5 hours each, and computer exercises
for 3 to 4 hours in the afternoons. The course is based on textbook material
as well as discussion of recent research papers.
Check also the administration page for the course.
Basic readings: These references are given as an indication of the material
covered last year; they will change in the coming weeks. This is meant
as a list of work you can refer to after the course if you want
more information about a given topic.

On numerical optimisation:
Chapters 2 to 4 (pp. 12-94) of: Fletcher (1987) Practical Methods of
Optimization, 2nd ed., John Wiley: New York.
Chapter 10 (pp. 394-455) of: Press et al. (1992) Numerical Recipes
in C, 2nd ed., Cambridge. online
Appendix B (pp. 121-125) of: Rasmussen, C.E. (1996) Evaluation of
Gaussian processes and other methods for nonlinear regression, PhD
thesis, U. of Toronto. postscript

On generalisation estimation:
Chapter 17 of: Efron, B. and Tibshirani, R. (1993) An Introduction
to the Bootstrap, Monographs on Statistics and Applied Probability 57, Chapman & Hall.
Chapter 3 "Hyperparameters" (pp. 41-56) of: Cyril Goutte (1997) Statistical
learning and regularisation, PhD thesis, U. Paris 6.
Chapters 3 and 5 (pp. 58-89 and 114-145) of: Wand, M.P. and Jones, M.C.
(1995) Kernel Smoothing, Monographs on Statistics and Applied Probability
60, Chapman & Hall.

On multivariate kernel regression:
Lowe, D. (1995) Similarity metric learning for a variable-kernel
classifier, Neural Computation 7:1, pp. 72-85. [HTML], [PostScript, 145K]
Goutte, C. and Larsen, J. (1998) Adaptive Metric Kernel Regression,
Neural Networks for Signal Processing VIII - Proceedings of the 1998 IEEE
Workshop (Cambridge, UK, Sept. 1998), pp. 184-193. [HTML]

On regularisation:
Chapter 2 "Regularisation" (pp. 19-40) of: Cyril Goutte (1997) Statistical
learning and regularisation, PhD thesis, U. Paris 6.
Chapter 4 of C. Bishop (1995) Neural Networks for Pattern Recognition,
Oxford U. Press.

On MCMC for neural networks:
Carl Edward Rasmussen (1996) A Practical Monte Carlo Implementation
of Bayesian Learning, Advances in Neural Information Processing Systems
8, eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press: postscript.
Chapters 3, 4 and 5 (pp. 30-86) of: Radford M. Neal (1993) Probabilistic
inference using Markov chain Monte Carlo methods, Technical Report
CRG-TR-93-1, Dept. of Computer Science, University of Toronto, 144 pages:
list of contents, postscript.
David J. C. MacKay (1997) Introduction to Monte Carlo Methods,
review paper to appear in the proceedings of an Erice summer school, ed.
M. Jordan: abstract, postscript.
Radford M. Neal (1996) Bayesian Learning for Neural Networks,
Lecture Notes in Statistics 118, Springer-Verlag New York: info.

On RBF nets & Mixture models:
Chapters 2, 5 & 9 of: Bishop (1995) Neural Networks for Pattern Recognition,
Oxford U. Press. Titterington et al. give a more in-depth statistical
treatment: D.M. Titterington, A.F.M. Smith and U.E. Makov (1985) Statistical
Analysis of Finite Mixture Distributions, John Wiley & Sons. Hierarchical
mixtures of experts are described in: M. I. Jordan and R. A. Jacobs,
Hierarchical mixtures of experts and the EM algorithm.

The multiple model approach is reviewed in:
T. A. Johansen and R. Murray-Smith, The Operating Regime Approach
to Nonlinear Modelling and Control, pp. 3-72, in: R. Murray-Smith and
T. A. Johansen, eds., Multiple Model Approaches to Modelling and Control,
Taylor and Francis, 1997.
Online version at ftp://eivind.imm.dtu.dk/pub/rodbookch1.ps.gz

On Gaussian Processes:
Christopher K. I. Williams & Carl Edward Rasmussen (1996) Gaussian
Processes for Regression, Advances in Neural Information Processing
Systems 8, eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press:
postscript.
Chapter 4 (pp. 49-67) of: Carl Edward Rasmussen (1996) Evaluation
of Gaussian processes and other methods for nonlinear regression,
PhD thesis, U. of Toronto: postscript.
Radford M. Neal (1997) Monte Carlo implementation of Gaussian process
models for Bayesian regression and classification, Technical Report
No. 9702, Dept. of Statistics (January 1997), 24 pages: abstract,
postscript.
The datasets, Matlab functions and assignments are available on a separate
webpage.
The topics covered will be:
probabilistic model fitting,
generalisation,
regularisation,
Bayesian learning,
Markov Chain methods.
We will focus our attention on the following nonparametric models:
Neural Networks,
Kernel methods,
Radial Basis Function networks,
Gaussian Processes,
Gaussian mixture models.
Monday 18, 9.00-9.30: Course introduction
Monday 18, 9.40-12.00: Numerical methods for model fitting
One-dimensional optimisation and line search; multidimensional optimisation:
steepest descent, Newton and quasi-Newton, conjugate gradient.
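As a small taste of the methods in this lecture, here is a minimal sketch in Python (the course exercises themselves use Matlab) of steepest descent with a backtracking line search; the test function, tolerance and step constants are illustrative only, not part of the course material:

```python
def steepest_descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Steepest descent with a backtracking (Armijo) line search."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq ** 0.5 < tol:          # stop when the gradient is tiny
            break
        step, fx = 1.0, f(x)
        # Backtracking: halve the step until a sufficient decrease holds.
        while True:
            x_new = [xi - step * gi for xi, gi in zip(x, g)]
            if f(x_new) <= fx - 1e-4 * step * g_sq:
                break
            step *= 0.5
        x = x_new
    return x

# Minimise the toy quadratic f(x, y) = (x - 1)^2 + 10 y^2.
f = lambda v: (v[0] - 1.0) ** 2 + 10.0 * v[1] ** 2
grad = lambda v: [2.0 * (v[0] - 1.0), 20.0 * v[1]]
x_min = steepest_descent(f, grad, [5.0, 3.0])
```

A quasi-Newton or conjugate gradient method would replace the negative gradient with a better search direction; the line-search structure stays the same.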
Monday 18, 14.00-15.20: Introduction to Neural Networks
The multilayer perceptron (MLP), training of MLPs, regularisation techniques,
parameter selection (pruning).
Monday 18, 15.30-17.00: Generalisation and Regularisation
Well-posed and ill-posed problems, regularisation, example on linear models.
Risk minimisation, generalisation, generalisation bounds, generalisation
estimators, resampling.
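One of the resampling-based generalisation estimators covered here, leave-one-out cross-validation, can be sketched in a few lines of Python; the `fit`/`loss` pair below (a training-set mean with squared error) is a stand-in for any model:

```python
def loo_estimate(data, fit, loss):
    """Leave-one-out estimate of generalisation error:
    refit on all-but-one point, score on the held-out point, average."""
    errors = []
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        model = fit(train)
        errors.append(loss(model, data[i]))
    return sum(errors) / len(errors)

# Toy example: the "model" is just the training mean, loss is squared error.
fit = lambda xs: sum(xs) / len(xs)
loss = lambda m, x: (x - m) ** 2
err = loo_estimate([1.0, 2.0, 3.0, 4.0], fit, loss)
```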
Tuesday 19, 9.00-13.00: Assignment 1: Optimisation and NN (in Matlab)
Comparison of optimisation methods on a simple problem.
Training a neural network.
Experiments on overfitting.
Tuesday 19, 14.00-15.20: Kernel methods
Kernel density estimation, kernel regression, local smoothing, bandwidth
estimation, multivariate regression, variable metric.
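The kernel regression estimate discussed in this lecture (in its Nadaraya-Watson form) can be sketched as follows, in Python rather than the course's Matlab; the Gaussian kernel and the bandwidth `h` are illustrative choices:

```python
import math

def nw_regress(x_train, y_train, x, h):
    """Nadaraya-Watson estimate: a kernel-weighted average of the targets,
    with a Gaussian kernel of bandwidth h."""
    w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in x_train]
    return sum(wi * yi for wi, yi in zip(w, y_train)) / sum(w)

# Toy data from y = x^2; predict between the two middle points.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
y_hat = nw_regress(xs, ys, 1.5, 0.5)
```

The bandwidth `h` plays the role discussed under bandwidth estimation: small `h` follows the data closely (low bias, high variance), large `h` smooths heavily.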
Tuesday 19, 15.30-17.00: Introduction to Radial Basis Function networks & Mixture models
Multiple component approaches, basis function models, mixture models,
Expectation Maximisation (EM).
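As a toy illustration of the EM algorithm for mixture models, here is a Python sketch for a two-component 1-D Gaussian mixture; the shared, fixed variance is an assumption made only to keep the example short (the lecture treats the general case):

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, mu=(-1.0, 1.0), var=1.0, pi1=0.5, n_iter=100):
    """EM for a two-component 1-D mixture with shared, fixed variance."""
    mu1, mu2 = mu
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each point.
        r = []
        for x in data:
            p1 = pi1 * normal_pdf(x, mu1, var)
            p2 = (1 - pi1) * normal_pdf(x, mu2, var)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate means and mixing proportion.
        n1 = sum(r)
        n2 = len(data) - n1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        pi1 = n1 / len(data)
    return mu1, mu2, pi1

# Two well-separated clusters around -2 and +2.
data = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
mu1, mu2, pi1 = em_two_gaussians(data)
```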
Wednesday 20, 9.00-13.00: Assignment 2: Kernel and RBF (in Matlab)
Multivariate kernel regression.
Mixture of Gaussians for probability density function estimation.
Mixture of linear models for regression.
RBF net for regression.
Wednesday 20, 14.00-15.20: Bayesian inference and MCMC
Wednesday 20, 15.30-17.00: Bayesian training of Neural Networks
Thursday 21, 9.00-13.00: Assignment 3: MCMC and NN
Bayesian training of neural networks using Radford Neal's flexible Bayesian
modelling software.
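At the core of the MCMC machinery exercised in this assignment is the Metropolis algorithm. A minimal random-walk version, here sampling a toy one-dimensional "posterior" (a standard normal; Neal's software implements far more sophisticated samplers such as hybrid Monte Carlo), looks like:

```python
import math
import random

def metropolis(log_post, x0, n_samples, step=1.0):
    """Random-walk Metropolis: propose a Gaussian jitter around the current
    state, accept with probability min(1, posterior ratio)."""
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        x_prop = x + random.gauss(0.0, step)
        lp_prop = log_post(x_prop)
        if math.log(random.random()) < lp_prop - lp:
            x, lp = x_prop, lp_prop   # accept the proposal
        samples.append(x)             # rejections repeat the current state
    return samples

random.seed(0)
# Toy "posterior": a standard normal, via its log density up to a constant.
samples = metropolis(lambda x: -0.5 * x * x, 0.0, 20000)
mean = sum(samples) / len(samples)
```

Note that only the log-posterior up to an additive constant is needed, which is what makes the method practical for Bayesian neural networks.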
Thursday 21, 14.00-15.20: Gaussian Processes
Thursday 21, 15.30-17.00: Bayesian training of Gaussian Processes
Friday 22, 9.00-13.00: Assignment 4: Gaussian Processes (in Matlab)
MAP training of Gaussian Processes.
Bayesian training of GPs using Radford Neal's fbm software.
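The heart of GP regression is linear algebra on the covariance matrix. Here is a Python sketch of the predictive mean with a squared-exponential covariance; the lengthscale `ell` and noise level `sigma_n` are illustrative hyperparameters (in the assignment these would be set by MAP or sampled):

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= fac * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(x_train, y_train, x_star, ell=1.0, sigma_n=0.1):
    """GP regression mean with a squared-exponential covariance."""
    k = lambda a, b: math.exp(-0.5 * ((a - b) / ell) ** 2)
    n = len(x_train)
    K = [[k(x_train[i], x_train[j]) + (sigma_n ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_train)      # alpha = (K + sigma_n^2 I)^{-1} y
    return sum(alpha[i] * k(x_star, x_train[i]) for i in range(n))

xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 0.0]
m = gp_predict(xs, ys, 1.0)
```

Predicting at a training input gives a value slightly shrunk towards zero by the noise term, as expected from the posterior mean.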
Friday 22, 14.00-15.20: Local models, effective degrees of freedom & equivalent kernels
Mixture models where each component has a local linear function instead of a constant weight.
Effective degrees of freedom.
Equivalent kernel interpretation of linear-in-the-parameters identification of basis function models.
Friday 22, 15.30-17.00: Infinite Gaussian mixture models
Last modified October 29, 1998
Write to the DSP, IMM webmaster at www@eivind.imm.dtu.dk
© Copyright 1998 by Section for DSP, IMM.