TITLE: Imputing Missing Values in Diary Records of Sun-exposure Study

AUTHORS: A. Szymkowiak, P.A. Philipsen, J. Larsen, L.K. Hansen, E. Thieden, H.C. Wulf

Informatics and Mathematical Modelling, Building 321
Technical University of Denmark, DK-2800 Lyngby, Denmark
emails: asz,jl,lkhansen@imm.dtu.dk
www: http://eivind.imm.dtu.dk

Department of Dermatology, Bispebjerg Hospital
University of Copenhagen, Bispebjerg Bakke 23
DK-2400 Copenhagen, Denmark

ABSTRACT:

In a sun-exposure study, questionnaires concerning sun habits were collected from 195 subjects. This paper focuses on the general problem of missing data values, which occurs when some, or even all, of the questions in a questionnaire have not been answered. Only the case of a low concentration of missing values is investigated here. We consider and compare two different models for imputing missing values: the Gaussian model and the non-parametric $K$-Nearest Neighbor model.

To appear in Proc. of NNSP-2001, Falmouth, Massachusetts, Sept. 10-12, 2001.