RENUVER: A Missing Value Imputation Algorithm based on Relaxed Functional Dependencies / Breve, Bernardo; Caruccio, Loredana; Deufemia, Vincenzo; Polese, Giuseppe. - (2022), pp. 52-64. (25th International Conference on Extending Database Technology (EDBT), Edinburgh, UK, March 29 - April 1, 2022) [10.5441/002/edbt.2022.05].

RENUVER: A Missing Value Imputation Algorithm based on Relaxed Functional Dependencies

Bernardo Breve; Loredana Caruccio; Vincenzo Deufemia; Giuseppe Polese
2022

Abstract

A missing value represents a piece of incomplete information that might appear in database instances. Data imputation is the problem of filling in missing values with data that is consistent with the semantics of the entire database instance they belong to. Because considering all possible candidates for each missing value is computationally expensive, heuristic methods have become popular for reducing execution times while maintaining high accuracy. This paper presents RENUVER, a new data imputation algorithm that relies on relaxed functional dependencies (RFDs) to identify the candidate values that best preserve data integrity. More specifically, the RENUVER imputation process focuses on the RFDs involving the attribute whose value is missing. These RFDs are used to guide the selection of the best candidate tuples from which to take values for imputing a missing value, and to evaluate the semantic consistency of the imputed values. Experimental results on real-world datasets highlight the effectiveness of RENUVER in terms of both filling accuracy and execution time, also in comparison with other well-known missing value imputation approaches.
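
The general idea of dependency-guided imputation can be illustrated with a minimal Python sketch. This is not the RENUVER implementation: the dictionary-based RFD representation (`"lhs"`, `"rhs"`), the toy `distance` function, the helper names, and the sample data are all assumptions made for illustration. It only shows how a dependency with distance thresholds might be used to pick donor tuples and to reject values that would make the instance inconsistent.

```python
from typing import Any, Optional

Row = dict[str, Any]

def distance(a: Any, b: Any) -> float:
    """Toy distance (assumption): absolute difference for numbers, 0/1 otherwise."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b)
    return 0.0 if a == b else 1.0

def lhs_distance(rfd: dict, t1: Row, t2: Row) -> Optional[float]:
    """Summed LHS distance if every LHS attribute is within its threshold, else None."""
    total = 0.0
    for attr, threshold in rfd["lhs"].items():
        if t1[attr] is None or t2[attr] is None:
            return None
        d = distance(t1[attr], t2[attr])
        if d > threshold:
            return None
        total += d
    return total

def is_consistent(rows: list[Row], rfd: dict) -> bool:
    """True if no pair of tuples is similar on the LHS but too distant on the RHS."""
    rhs_attr, rhs_threshold = rfd["rhs"]
    for i, t1 in enumerate(rows):
        for t2 in rows[i + 1:]:
            if t1[rhs_attr] is None or t2[rhs_attr] is None:
                continue
            if (lhs_distance(rfd, t1, t2) is not None
                    and distance(t1[rhs_attr], t2[rhs_attr]) > rhs_threshold):
                return False
    return True

def impute(rows: list[Row], rfds: list[dict]) -> list[Row]:
    """Fill None values using donors selected through the given dependencies."""
    rows = [dict(r) for r in rows]  # work on a copy, leave the input untouched
    for target in rows:
        for attr in list(target):
            if target[attr] is not None:
                continue
            # Collect candidate values from tuples similar to the target on the LHS.
            candidates = []
            for rfd in rfds:
                if rfd["rhs"][0] != attr:
                    continue
                for donor in rows:
                    if donor is target or donor[attr] is None:
                        continue
                    d = lhs_distance(rfd, target, donor)
                    if d is not None:
                        candidates.append((d, donor[attr]))
            # Try the closest donors first; keep the first value that stays consistent.
            for _, value in sorted(candidates, key=lambda c: c[0]):
                target[attr] = value
                if all(is_consistent(rows, r) for r in rfds):
                    break
                target[attr] = None
    return rows

# Hypothetical data: zip codes within distance 1 should share the same city.
rfds = [{"lhs": {"zip": 1}, "rhs": ("city", 0)}]
rows = [
    {"zip": 84084, "city": "Fisciano"},
    {"zip": 84085, "city": None},
    {"zip": 80126, "city": "Napoli"},
]
imputed = impute(rows, rfds)
print(imputed[1]["city"])
```

In this sketch the missing `city` is taken from the only tuple whose `zip` lies within the threshold, and the value is kept only because the resulting instance still satisfies the dependency. If no candidate passes the check, the value is left missing rather than guessed.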
Use this identifier to cite or link to this document: https://hdl.handle.net/11588/977627