Data Quality, Semantics, and Classification Features: Assessment and Optimization of Supervised ML-AI Classification Approaches for Historical Heritage

Valeria Cera; Giuseppe Antuono; Massimiliano Campi; Pierpaolo D’Agostino
2025

Abstract

In recent years, automatic segmentation and classification of digital survey data have assumed a central role in built heritage studies. However, applying Machine Learning (ML) and Deep Learning (DL) techniques to the semantic segmentation of point clouds is difficult in historic architecture, which is characterized by high geometric and semantic variability. Data quality, subjectivity in manual labeling, and the difficulty of defining consistent categories may compromise the effectiveness and reproducibility of the results. This study analyzes the influence of three key factors (annotator specialization, point cloud density, and sensor type) on the supervised classification of architectural elements, applying the Random Forest (RF) algorithm to datasets representing the architectural typology of the Franciscan cloister. The main innovation of the study is a feature selection technique based on multi-radius statistical analysis and on the p-value of each feature with respect to the target classes. The procedure identifies the optimal radius for each feature, maximizing separability between classes and reducing semantic ambiguities. The approach, implemented entirely in Python, automates feature extraction, selection, and application, improving semantic consistency and classification accuracy.
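The selection step described in the abstract (score each feature at several neighborhood radii, keep the radius with the best p-value, then classify with Random Forest) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The abstract does not name the statistical test, so the Kruskal-Wallis test is an assumption here, and the feature names, radii, class count, and synthetic data are all hypothetical.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.ensemble import RandomForestClassifier


def best_radius_per_feature(features_by_radius, labels):
    """Return {feature_name: (radius, p_value)}, keeping the lowest p-value.

    features_by_radius maps a feature name to {radius: per-point values}.
    The Kruskal-Wallis test is an assumed choice, not taken from the paper.
    """
    classes = np.unique(labels)
    best = {}
    for name, by_radius in features_by_radius.items():
        for radius, values in by_radius.items():
            # One sample of feature values per target class.
            groups = [values[labels == c] for c in classes]
            _, p_value = kruskal(*groups)
            if name not in best or p_value < best[name][1]:
                best[name] = (radius, p_value)
    return best


# Synthetic stand-in for per-point geometric features (hypothetical values).
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=600)          # three hypothetical classes
radii = (0.05, 0.10, 0.20)                     # hypothetical radii, in metres
informative = {"planarity": 0.10, "verticality": 0.20}

# Each feature separates the classes at exactly one radius; elsewhere it is noise.
features_by_radius = {
    name: {
        r: rng.normal(size=labels.size) + (labels if r == good_r else 0.0)
        for r in radii
    }
    for name, good_r in informative.items()
}

selected = best_radius_per_feature(features_by_radius, labels)

# Build the training matrix from each feature at its selected radius.
X = np.column_stack([features_by_radius[n][r] for n, (r, _) in selected.items()])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
```

On real data, `features_by_radius` would hold geometric features computed from the point cloud at each radius, and the classifier would be evaluated on held-out points rather than on the training set.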
Data Quality, Semantics, and Classification Features: Assessment and Optimization of Supervised ML-AI Classification Approaches for Historical Heritage / Cera, Valeria; Antuono, Giuseppe; Campi, Massimiliano; D’Agostino, Pierpaolo. - In: HERITAGE. - ISSN 2571-9408. - 8(7):265(2025), pp. 1-33. [10.3390/heritage8070265]
Files in this record:
File: Data Quality, Semantics, and Classification Features Assessment and Optimization of Supervised ML-AI Classification Approaches for Historical Heritage_compressed.pdf (authorized users only)
Type: Publisher's version (PDF)
License: Creative Commons
Size: 2.51 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1032356