Robust Incremental Trees for Missing Data Imputation and Data Fusion

ARIA, MASSIMO;D'AMBROSIO, ANTONIO;SICILIANO, ROBERTA
2007

Abstract

Data fusion techniques typically aim to create a complete data file from different sources that do not contain the same units. This is generally achieved by exploiting the variables common to all files. In this paper, we propose an innovative methodology for data fusion based on an incremental imputation algorithm using tree-based methods. In addition, we consider robust tree validation by boosting procedures. As benchmarks, we consider a classical technique, multiple regression, as well as an implicit method based on principal component analysis. An extensive simulation study shows that the proposed method is more accurate than the others.
ISBN: 9788860560209

Use this identifier to cite or link to this document: http://hdl.handle.net/11588/203739