Data clustering has a long history and covers a vast range of models and methods that exploit increasingly powerful numerical optimization algorithms to find homogeneous groups of observations in data. In this framework, the probabilistic distance clustering (PDC) family of methods offers a numerically effective alternative to model-based clustering and a more flexible option within geometric data clustering. Given n J-dimensional data vectors arranged in a data matrix and the number K of clusters, PDC maximizes the joint density function that is defined as the sum of the products between the distance and the probability, both of which are measured for each data vector from each center. This article shows the capabilities of the PDC family by illustrating the R package FPDclustering.

FPDclustering: a comprehensive R package for probabilistic distance clustering based methods / Tortora, C.; Palumbo, F. - In: COMPUTATIONAL STATISTICS. - ISSN 0943-4062. - (2024). [10.1007/s00180-024-01490-5]

FPDclustering: a comprehensive R package for probabilistic distance clustering based methods

Palumbo, Francesco
Second author
Member of the Collaboration Group
2024

Files in this item:
File: s00180-024-01490-5.pdf
Access: open access
Type: Publisher's version (PDF)
License: Publisher's copyright
Size: 2.17 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/962390
Citations
  • PubMed Central: not available
  • Scopus: 1
  • Web of Science: not available