TITLE: ADAPTIVE REGULARIZATION
AUTHORS: L. K. Hansen, C. E. Rasmussen, C. Svarer, and J. Larsen
ABSTRACT:
Regularization, e.g., in the form of weight decay, is important for
training and optimization of neural network architectures.
In this work we
provide a tool, based on asymptotic sampling theory, for iterative estimation
of weight decay parameters. The basic idea is to perform gradient descent
on the estimated generalization error with respect to the
regularization parameters. The scheme is implemented in
our {\it Designer Net\/} framework for network training and pruning, i.e.,
it is based on the diagonal Hessian approximation. The scheme
requires no substantial computational overhead beyond what is needed
for training and pruning.
and pruning. The viability of the approach is demonstrated in an
experiment concerning prediction of the chaotic Mackey-Glass series.
We find that the optimized weight decays are relatively large for
densely connected networks in the initial pruning phase, while they
decrease as pruning proceeds.
Accepted for the 4th IEEE Workshop on Neural Networks for Signal Processing,
Greece, September 1994.
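The core idea — gradient descent on an estimate of the generalization error
with respect to the weight decay parameter — can be sketched in a toy
setting. The sketch below is not the paper's Designer Net implementation:
it replaces the asymptotic generalization estimate with a held-out
validation error, uses a linear (ridge) model whose weight-decay solution
has a closed form, and all data and step sizes are hypothetical. It only
illustrates differentiating the estimated error through the regularized
solution and stepping the decay parameter downhill.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic regression data (not the Mackey-Glass task).
n, d = 40, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)
Xv = rng.standard_normal((200, d))
yv = Xv @ w_true + 0.5 * rng.standard_normal(200)

def fit(alpha):
    """Weight-decay (ridge) solution w(alpha) = (X'X + alpha*I)^{-1} X'y."""
    A = X.T @ X + alpha * np.eye(d)
    return np.linalg.solve(A, X.T @ y), A

def val_error(w):
    """Validation error as a stand-in for the estimated generalization error."""
    r = Xv @ w - yv
    return r @ r / len(yv)

alpha = 10.0   # initial (deliberately large) weight decay
lr = 0.2       # step size on log(alpha); keeps alpha positive
errs = []
for _ in range(100):
    w, A = fit(alpha)
    errs.append(val_error(w))
    # Differentiate the solution w.r.t. alpha: dw/dalpha = -A^{-1} w,
    # then chain through the validation error.
    dw = -np.linalg.solve(A, w)
    r = Xv @ w - yv
    dE = 2.0 * (r @ (Xv @ dw)) / len(yv)
    # Multiplicative update = gradient step in log(alpha).
    alpha *= np.exp(-lr * alpha * dE)

print(f"initial error {errs[0]:.4f}, final error {errs[-1]:.4f}, alpha {alpha:.4f}")
```

In this toy run the adapted weight decay shrinks from its large initial
value as the validation error falls, loosely mirroring the paper's
observation that optimized weight decays decrease as pruning proceeds.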