ABSTRACT OF Professor Bengio's TALK


Faking unlabeled data for geometric regularization

September 5, 2002, 8:30-9:20

Metric-based methods, which use unlabeled data to detect gross differences in behavior away from the training points, have recently been introduced for model selection, often yielding very significant improvements over alternatives (including cross-validation). We introduce extensions that take advantage of the particular case of time-series data in which the task involves prediction with a horizon h. The ideas are (i) to use at t the h unlabeled examples that precede t for model selection, and (ii) take advantage of the different error distributions of cross-validation and the metric methods. Experimental results establish the effectiveness of these extensions in the context of feature subset selection for time-series data.