One of the main issues in machine learning is the quality of the data used to train statistical models for classification and regression tasks. In particular, missing values in data sets are especially likely to degrade the accuracy of learning methods. As a consequence, there is a strong need for software tools that help machine learning users "fill in" their data sets before passing them to training algorithms. This paper addresses that need by introducing a web-based tool for MIssing DAta imputation (MIDA) based on a novel supervised learning method, the Generalized Boosted Incremental Non Parametric Imputation algorithm (G-BINPI), which handles missing values in scenarios where a "missing at random" assumption holds. The proposed approach lets machine learning users impute their data sets remotely through an intuitive graphical user interface. As shown in the experimental section, the proposed approach outperforms conventional missing data imputation approaches on several benchmark data sets.

MIDA: A web tool for missing data imputation based on a boosted and incremental learning algorithm / Acampora, G.; Vitiello, A.; Siciliano, R. - 2020-:(2020), pp. 1-6. (Paper presented at the 2020 IEEE International Conference on Fuzzy Systems, FUZZ 2020, held in GBR in 2020) [10.1109/FUZZ48607.2020.9177644].

MIDA: A web tool for missing data imputation based on a boosted and incremental learning algorithm

Acampora G.; Vitiello A.; Siciliano R.
2020

978-1-7281-6932-3

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/838285