Remote monitoring and collection of water consumption has gained pivotal importance in the field of demand understanding, modelling and prediction. However, most of the analyses that can be performed on such databases could be jeopardized by inconsistencies due to technological or behavioural issues causing significant amounts of missing or anomalous values. In the present paper, a nonparametric, unsupervised approach is presented to investigate the reliability of a consumption database, applied to the dataset of a district metering area in Naples (Italy) and focused on the detection of suspicious amounts of zero or outlying data. Results showed that the methodology is effective in identifying criticalities both in terms of unreliable time series, namely time series having huge amounts of invalid data, and in terms of unreliable data, namely data values suspiciously different from some suitable central parameters, irrespective of the source causing the anomaly. As such, the proposed approach is suitable for large databases when no prior information is known about the underlying probability distribution of data, and it can also be coupled with other nonparametric, pattern-based methods in order to guarantee that the database to be analysed is homogeneous in terms of water uses.

A nonparametric framework for water consumption data cleansing: An application to a smart water network in Naples (Italy) / Padulano, R.; Del Giudice, G. - In: JOURNAL OF HYDROINFORMATICS. - ISSN 1464-7141. - 22:4(2020), pp. 666-680. [10.2166/hydro.2020.133]

A nonparametric framework for water consumption data cleansing: An application to a smart water network in Naples (Italy)

Padulano R.; Del Giudice G.
2020

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/828833