Missing data incremental imputation through tree based methods / Conversano, C.; Cappelli, Carmela. - PRINT. - (2002), pp. 455-460. (Paper presented at the COMPSTAT 2002 conference, held in Berlin, 24-28 August).
Missing data incremental imputation through tree based methods
CAPPELLI, CARMELA
2002
Abstract
Conditional mean imputation is a common way to deal with missing data. Although very simple to implement, the method can suffer from model misspecification and gives unsatisfactory results for nonlinear data. We propose the iterative use of tree-based models for missing data imputation in large databases. The proposed procedure uses a lexicographic order to rank the missing values that occur in different variables and deals with them incrementally, i.e., by augmenting the data with the previously filled-in records according to the defined order.
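The incremental idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`fit_tree`, `predict`, `incremental_impute`) are hypothetical, the small regression tree is a stand-in for whatever tree method the paper uses, and ranking variables by their number of missing values is a simplifying assumption in place of the paper's lexicographic ordering. The sketch also assumes numeric variables, missing entries coded as `None`, and at least one observed value per variable.

```python
from statistics import mean


def fit_tree(rows, targets, depth=0, max_depth=3, min_leaf=2):
    """Fit a small regression tree; a leaf is the mean of its targets."""
    if depth >= max_depth or len(rows) < 2 * min_leaf or len(set(targets)) == 1:
        return mean(targets)
    best = None  # (sum of squared errors, feature index, threshold)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[j] <= thr]
            right = [t for r, t in zip(rows, targets) if r[j] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = (sum((t - ml) ** 2 for t in left)
                   + sum((t - mr) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr)
    if best is None:  # no admissible split, or no predictors at all
        return mean(targets)
    _, j, thr = best
    go_left = [r[j] <= thr for r in rows]
    return {
        "feature": j,
        "thr": thr,
        "left": fit_tree([r for r, g in zip(rows, go_left) if g],
                         [t for t, g in zip(targets, go_left) if g],
                         depth + 1, max_depth, min_leaf),
        "right": fit_tree([r for r, g in zip(rows, go_left) if not g],
                          [t for t, g in zip(targets, go_left) if not g],
                          depth + 1, max_depth, min_leaf),
    }


def predict(tree, row):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["thr"] else tree["right"]
    return tree


def incremental_impute(data):
    """Impute variables one at a time, reusing previously filled-in ones."""
    filled = [list(r) for r in data]  # work on a copy of the input
    n_cols = len(filled[0])
    n_missing = [sum(r[j] is None for r in filled) for j in range(n_cols)]
    # Predictors start as the fully observed variables.
    complete = [j for j in range(n_cols) if n_missing[j] == 0]
    # Assumed ordering: fewest missing values first, ties broken by index.
    order = sorted((j for j in range(n_cols) if n_missing[j] > 0),
                   key=lambda j: (n_missing[j], j))
    for target in order:
        observed = [r for r in filled if r[target] is not None]
        tree = fit_tree([[r[j] for j in complete] for r in observed],
                        [r[target] for r in observed])
        for r in filled:
            if r[target] is None:
                r[target] = predict(tree, [r[j] for j in complete])
        complete.append(target)  # the filled-in variable augments the data
    return filled
```

Calling `incremental_impute(rows)` on a list of equal-length numeric rows returns a completed copy and leaves the input unchanged. Each variable, once filled in, joins the predictor set for the variables that follow, which is the augmentation step the abstract describes.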