Advanced methods for non-parametric modelling for digital signal processing
Carl Edward Rasmussen
In this course we will discuss various topics in non-linear, non-parametric
modelling. We will cover the fundamental issues involved in modelling
and describe some prominent methods, including both non-Bayesian and Bayesian
approaches to neural networks. The choice of topics is not exhaustive,
but rather governed by our experience and personal inclinations.
Place: The lectures will take place in building 305, room 205 (2nd floor).
The assignments will be carried out in the terminal room in the basement of
building 305, using the section's Linux machines.
Time: The course will take place in week 3 (January 18-22). It will
comprise 11 lectures given in English, 1.5 hours each, and computer exercises
of 3 to 4 hours in the afternoons. The course is based on textbook material
as well as discussion of recent research papers.
See also the administration page for the course.
Basic readings: These references are given as an indication of the material
covered last year. They will change in the coming weeks. This is meant
as a list of works you can refer to after the course if you want
more information about a given topic.
On numerical optimisation:
Chapters 2 to 4 (pp. 12-94) of: Fletcher, R. (1987) Practical Methods of
Optimization, 2nd ed., John Wiley: New York.
Chapter 10 (pp. 394-455) of: Press et al. (1992) Numerical Recipes
in C, 2nd ed., Cambridge University Press (available online).
Appendix B (pp. 121-125) of: Rasmussen, C.E. (1996) Evaluation of
Gaussian processes and other methods for non-linear regression, PhD
thesis, U. of Toronto.
On generalisation estimation:
Chapter 17 of: Efron, B. and Tibshirani, R. (1993) An Introduction
to the Bootstrap, Monographs on Statistics and Applied Probability 57, Chapman & Hall.
Chapter 3 "Hyper-parameters" (pp. 41-56) of: Cyril Goutte (1997) Statistical
learning and regularisation, PhD thesis, U. Paris 6.
Chapters 3 and 5 (pp. 58-89 and 114-145) of: Wand, M.P. and Jones, M.C.
(1995) Kernel Smoothing, Monographs on Statistics and Applied Probability, Chapman & Hall.
On multivariate kernel regression:
Lowe, D. (1995) Similarity metric learning for a variable-kernel
classifier, Neural Computation 7:1, pp. 72-85.
Goutte, C. and Larsen, J. (1998) Adaptive Metric Kernel Regression,
Neural Networks for Signal Processing VIII: Proceedings of the 1998 IEEE
Workshop (Cambridge, UK, Sept. 1998), pp. 184-193.
Chapter 2 "Regularisation" (pp. 19-40) of: Cyril Goutte (1997) Statistical
learning and regularisation, PhD thesis, U. Paris 6.
Chapter 4 of C. Bishop (1995) Neural Networks for Pattern Recognition,
Oxford U. Press.
On MCMC for neural networks:
Carl Edward Rasmussen (1996) A Practical Monte Carlo Implementation
of Bayesian Learning, Advances in Neural Information Processing Systems
8, eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press.
Chapters 3, 4 and 5 (pp. 30-86) of: Radford M. Neal (1993) Probabilistic
inference using Markov chain Monte Carlo methods, Technical Report
CRG-TR-93-1, Dept. of Computer Science, University of Toronto, 144 pages.
David J. C. MacKay (1997) Introduction to Monte Carlo Methods,
review paper to appear in the proceedings of an Erice summer school, ed.
M. Jordan.
Radford M. Neal (1996) Bayesian Learning for Neural Networks,
Lecture Notes in Statistics 118, Springer-Verlag: New York.
On RBF nets & Mixture models:
Chapters 2, 5 & 9 of: Bishop (1995) Neural Networks for Pattern
Recognition, Oxford U. Press. Titterington et al. give a more in-depth
statistical treatment: D.M. Titterington, A.F.M. Smith and U.E. Makov
(1985) Statistical Analysis of Finite Mixture Distributions, John Wiley
& Sons. Hierarchical mixtures of experts are described in: M. I. Jordan
and R. A. Jacobs (1994) Hierarchical mixtures of experts and the EM
algorithm, Neural Computation 6.
The multiple model approach is reviewed in:
T. A. Johansen and R. Murray-Smith, The Operating Regime Approach
to Nonlinear Modelling and Control, pp. 3-72, in R. Murray-Smith and
T. A. Johansen, eds., Multiple
Model Approaches to Modelling and Control, Taylor and Francis, 1997.
Online version at ftp://eivind.imm.dtu.dk/pub/rod-bookch1.ps.gz
On Gaussian Processes:
Christopher K. I. Williams & Carl Edward Rasmussen (1996) Gaussian
Processes for Regression, Advances in Neural Information Processing
Systems 8, eds. D. S. Touretzky, M. C. Mozer, M. E. Hasselmo, MIT Press.
Chapter 4 (pp. 49-67) of: Carl Edward Rasmussen (1996) Evaluation
of Gaussian processes and other methods for non-linear regression,
PhD thesis, U. of Toronto.
Radford M. Neal (1997) Monte Carlo implementation of Gaussian process
models for Bayesian regression and classification, Technical Report
No. 9702, Dept. of Statistics, University of Toronto (January 1997), 24 pages.
Datasets, Matlab functions and assignments are available on a separate page.
The topics covered: we will focus our attention on the following
non-parametric models and methods:
probabilistic model fitting,
Markov chain Monte Carlo methods,
Radial Basis Function networks,
Gaussian mixture models.
Monday 18, 9.00-9.30: Course introduction
Monday 18, 9.40-12.00: Numerical methods for model fitting
One-dimensional optimisation and line search; multi-dimensional optimisation:
steepest descent, Newton and quasi-Newton, conjugate gradient.
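To give a feel for the methods above, here is a minimal steepest-descent sketch in Python/NumPy (the course exercises themselves use Matlab); the quadratic test function, step size and tolerances are illustrative choices, not part of the course material:

```python
import numpy as np

def steepest_descent(grad, x0, lr=0.1, tol=1e-8, max_iter=10000):
    # Fixed-step steepest descent: repeatedly step along the negative gradient.
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - lr * g
    return x

# Minimise the quadratic f(x) = 0.5 x^T A x - b^T x (gradient A x - b),
# whose unique minimiser solves the linear system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min = steepest_descent(lambda x: A @ x - b, np.zeros(2))
print(np.allclose(x_min, np.linalg.solve(A, b), atol=1e-4))  # True
```

On ill-conditioned problems this zig-zags badly, which is precisely what motivates the conjugate gradient and quasi-Newton methods covered in the lecture.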
Monday 18, 14.00-15.20: Introduction to Neural Networks
The multi-layer perceptron (MLP), training of MLP, regularisation techniques,
parameter selection (pruning).
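A toy sketch of MLP training by gradient descent with back-propagation and weight decay, in Python/NumPy rather than the Matlab used in the exercises; the network size, learning rate, decay constant and data are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy regression data for a one-hidden-layer MLP.
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X)

H = 8                        # hidden units (illustrative)
W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
lr, lam = 0.1, 1e-4          # learning rate and weight-decay regularisation

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

loss0 = np.mean((forward(X)[1] - y) ** 2)   # error before training

for _ in range(2000):
    h, out = forward(X)
    err = 2 * (out - y) / len(X)            # d(mean squared error)/d(out)
    gW2 = h.T @ err + lam * W2              # output-layer gradients
    gb2 = err.sum(0)
    dh = err @ W2.T * (1 - h ** 2)          # back-propagate through tanh
    gW1 = X.T @ dh + lam * W1
    gb1 = dh.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

loss1 = np.mean((forward(X)[1] - y) ** 2)
print(loss1 < loss0)  # True: training reduced the error
```

The weight-decay term `lam` is the simplest of the regularisation techniques mentioned above; pruning-based parameter selection operates on the same trained weights.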
Monday 18, 15.30-17.00: Generalisation and Regularisation
Well-posed and ill-posed problems, regularisation, example on linear models.
Risk minimisation, generalisation, generalisation bounds, generalisation
estimation.
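One common generalisation estimator covered in the readings is K-fold cross-validation; a minimal Python/NumPy sketch (the function names and the linear-model example are illustrative, not course code):

```python
import numpy as np

def kfold_cv_mse(X, y, fit, predict, K=5, seed=0):
    """Estimate generalisation error by K-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, K)
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        model = fit(X[train], y[train])           # fit on K-1 folds
        errs.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return np.mean(errs)                          # average held-out error

# Example: linear least squares on noisy linear data (noise variance 0.01).
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (100, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)
Xd = np.hstack([X, np.ones((100, 1))])            # add a bias column
fit = lambda A, t: np.linalg.lstsq(A, t, rcond=None)[0]
predict = lambda w, A: A @ w
cv = kfold_cv_mse(Xd, y, fit, predict)
print(cv < 0.05)  # True: the CV estimate is close to the noise level
```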
Tuesday 19, 9.00-13.00: Assignment 1: Optimisation and NN (in Matlab)
Comparison of optimisation methods on a simple problem.
Training a neural network.
Experiments on over-fitting.
Tuesday 19, 14.00-15.20: Kernel methods
Kernel density estimation, kernel regression, local smoothing, bandwidth
estimation, multivariate regression, variable metric.
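A minimal Nadaraya-Watson kernel regression sketch in Python/NumPy (the exercises use Matlab); the Gaussian kernel, bandwidth h and data are illustrative choices:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h=0.2):
    """Kernel regression: kernel-weighted average of the training targets."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2   # pairwise sq. distances
    w = np.exp(-0.5 * d2 / h ** 2)                    # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)              # normalised weighted mean

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)
xq = np.array([0.25, 0.75])
pred = nadaraya_watson(x, y, xq, h=0.05)
print(np.allclose(pred, np.sin(2 * np.pi * xq), atol=0.15))  # True
```

The bandwidth h controls the bias/variance trade-off, which is what the bandwidth-estimation part of the lecture addresses; the variable-metric extension replaces the scalar h by a learned distance metric.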
Tuesday 19, 15.30-17.00: Introduction to Radial Basis Function networks & Mixture models
Multiple component approaches, Basis function models, Mixture models, Expectation Maximisation (EM)
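The EM algorithm for a two-component one-dimensional Gaussian mixture can be sketched in a few lines of Python/NumPy (initial values, data and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 1-D Gaussian clusters.
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 0.5, 300)])

K = 2
mu = np.array([-1.0, 1.0])          # initial means
var = np.ones(K)                    # initial variances
pi = np.full(K, 1.0 / K)            # initial mixing proportions

for _ in range(50):
    # E-step: responsibilities r[n, k] proportional to pi_k N(x_n | mu_k, var_k)
    r = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixing proportions, means and variances.
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

print(np.allclose(np.sort(mu), [-2, 3], atol=0.2))  # True: means recovered
```

Each EM iteration is guaranteed not to decrease the likelihood, which is the key property discussed in the lecture.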
Wednesday 20, 9.00-13.00: Assignment 2: Kernel and RBF (in Matlab)
Multivariate kernel regression.
Mixture of Gaussians for probability density function estimation.
Mixture of linear models for regression.
RBF net for regression.
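In the simplest setting, an RBF net for regression uses fixed basis-function centres, with the output weights found by linear least squares; a Python/NumPy sketch (centres, width and data are illustrative):

```python
import numpy as np

def rbf_design(x, centres, width):
    """Gaussian basis function activations, plus a bias column."""
    phi = np.exp(-0.5 * ((x[:, None] - centres[None, :]) / width) ** 2)
    return np.hstack([phi, np.ones((len(x), 1))])

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=100)

centres = np.linspace(0, 1, 10)             # fixed, evenly spaced centres
Phi = rbf_design(x, centres, width=0.1)
w = np.linalg.lstsq(Phi, y, rcond=None)[0]  # output weights by least squares

pred = rbf_design(np.array([0.25]), centres, width=0.1) @ w
print(abs(pred[0] - 1.0) < 0.1)  # True: sin(2*pi*0.25) = 1
```

Because only the output layer is fitted, training reduces to one linear solve; adapting the centres and widths as well brings back a non-linear optimisation problem.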
Wednesday 20, 14.00-15.20: Bayesian inference
Wednesday 20, 15.30-17.00: Bayesian training of neural networks
Thursday 21, 9.00-13.00: Assignment 3: MCMC and Bayesian neural networks
Bayesian training of neural networks using Radford Neal's flexible Bayesian
modelling (fbm) software.
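The core MCMC ingredient behind such Bayesian training is easy to illustrate; a random-walk Metropolis sketch in Python/NumPy sampling a standard normal (the target, step size and sample count are illustrative, and fbm itself uses the more efficient hybrid Monte Carlo):

```python
import numpy as np

def metropolis(log_p, x0, n_samples=20000, step=1.0, seed=0):
    """Random-walk Metropolis sampling from an unnormalised log density."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_p(x0)
    samples = []
    for _ in range(n_samples):
        prop = x + step * rng.normal()            # symmetric random-walk proposal
        lp_prop = log_p(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept with prob min(1, ratio)
            x, lp = prop, lp_prop
        samples.append(x)                         # rejected moves repeat x
    return np.array(samples)

# Target: standard normal, log p(x) = -x^2/2 up to an additive constant.
s = metropolis(lambda x: -0.5 * x ** 2, x0=0.0)
print(abs(s.mean()) < 0.1 and abs(s.var() - 1.0) < 0.1)  # True
```

For a Bayesian network, log_p would be the log posterior over the weights, and predictions are averaged over the sampled weight vectors.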
Thursday 21, 14.00-15.20: Gaussian Processes
Thursday 21, 15.30-17.00: Bayesian training of Gaussian Processes
Friday 22, 9.00-13.00: Assignment 4: Gaussian Processes
MAP training of Gaussian Processes.
Bayesian training of GP using Radford Neal's fbm software.
Friday 22, 14.00-15.20: Local models, effective degrees of freedom & equivalent kernels
Mixture models where each mixture has a local linear function instead of a constant weight.
Effective degrees of freedom.
Equivalent kernel interpretation of linear-in-the-parameters identification of basis function models.
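For a linear-in-the-parameters basis function model with quadratic regularisation, the fitted values are a linear smoother of the targets: the rows of the smoother matrix are the equivalent kernels, and its trace gives the effective degrees of freedom. A Python/NumPy sketch (the basis and regularisation values are illustrative):

```python
import numpy as np

x = np.linspace(0, 1, 50)
centres = np.linspace(0, 1, 20)
# Design matrix of Gaussian basis functions.
Phi = np.exp(-0.5 * ((x[:, None] - centres[None, :]) / 0.1) ** 2)

dofs = []
for lam in [1e-6, 1.0]:
    # Smoother ("hat") matrix: fitted values = S @ y for any target vector y.
    S = Phi @ np.linalg.solve(Phi.T @ Phi + lam * np.eye(20), Phi.T)
    # Each row of S is an equivalent kernel; trace(S) = effective dof.
    dofs.append(np.trace(S))

print(dofs[0] > dofs[1])  # True: stronger regularisation, fewer effective dof
```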
Friday 22, 15.30-17.00: Infinite Gaussian mixture models
Last modified October 29, 1998
Write to the DSP, IMM webmaster at email@example.com
Copyright 1998 by Section for DSP, IMM.