Traditional supervised classification models aim to approximate the functional mapping between instance attributes and their class labels. These models, however, do not consider the interdependence between instances or the global characteristics of the data, and thus often yield poor classification results. In this work, we present a novel hybrid classification model, named HyCASTLE, designed to address the main shortcomings of existing hybrid models that use topological information obtained through clustering to improve classifier performance: they make assumptions about the underlying data distribution, and they do not consider the effect of noise. HyCASTLE utilises a non-parametric estimator to capture the underlying data distribution and creates entirely data-driven, shape-free clusters. It then refines this cluster configuration using both data topology and the available labels through an iterative cluster aggregation and separation process. We evaluated HyCASTLE's performance on 37 datasets and compared it with both traditional and hybrid classification models. Our results show that HyCASTLE performs comparably to or better than the other models and is more resilient to class noise.

HyCASTLE: A Hybrid ClAssification System based on Typicality, Labels and Entropy / Delli Veneri, M.; Cavuoti, S.; Abbruzzese, R.; Brescia, M.; Sperli, G.; Moscato, V.; Longo, G. - In: KNOWLEDGE-BASED SYSTEMS. - ISSN 0950-7051. - 244 (2022), p. 108566. [10.1016/j.knosys.2022.108566]

HyCASTLE: A Hybrid ClAssification System based on Typicality, Labels and Entropy

Delli Veneri M.; Cavuoti S.; Abbruzzese R.; Brescia M. (Member of the Collaboration Group); Sperli G.; Moscato V.; Longo G.
2022

Files in this item:
File: xx1-DelliVeneri-1-s2.0-S0950705122002507-main.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: Publisher's copyright
Size: 2.38 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/900164