Performance enhancement of a dynamic K-means algorithm through a parallel adaptive strategy on multicore CPUs / Laccetti, G.; Lapegna, M.; Mele, V.; Romano, D.; Szustak, L. - In: JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING. - ISSN 0743-7315. - 145 (November 2020), pp. 34-41. [10.1016/j.jpdc.2020.06.010]

Performance enhancement of a dynamic K-means algorithm through a parallel adaptive strategy on multicore CPUs

Laccetti G.; Lapegna M.; Mele V.; Romano D.; Szustak L.
2020

Abstract

The K-means algorithm is one of the most popular algorithms in data science. It aims to discover similarities among the elements of large datasets by partitioning them into K distinct groups called clusters. The main weakness of this technique is that, in real problems, the value of K often cannot be specified in advance as an input. Furthermore, the large amount of data needed for meaningful simulations makes it impractical to run the algorithm on traditional architectures. In this paper, we address both issues. On the one hand, we propose a method to define the value of K dynamically by optimizing a suitable quality index, with special attention to computational cost. On the other hand, to improve the performance and effectiveness of the algorithm, we propose a strategy for its parallel implementation on modern multicore CPUs.
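
To make the idea of choosing K dynamically concrete, here is a minimal Python sketch. It is not the method from the paper: the abstract does not say which quality index is optimized or how K is updated, so the sketch assumes the within-cluster sum of squares as a stand-in index and stops increasing K once the relative improvement drops below a threshold. The names `choose_k`, `wcss`, `farthest_point_init`, and the `tol` parameter are hypothetical and introduced only for illustration.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Standard Lloyd iterations for a fixed K; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sqdist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return centroids, labels


def wcss(points, centroids, labels):
    """Within-cluster sum of squares, used here as the stand-in quality index."""
    return sum(sqdist(p, centroids[lab]) for p, lab in zip(points, labels))


def choose_k(points, k_max=10, tol=0.2):
    """Increase K until the index improves by no more than tol (relative)."""
    best, prev_cost = None, None
    for k in range(1, min(k_max, len(points)) + 1):
        centroids, labels = kmeans(points, k)
        cost = wcss(points, centroids, labels)
        if prev_cost is not None and prev_cost - cost <= tol * prev_cost:
            break  # improvement too small: keep the previous K
        best, prev_cost = (k, centroids, labels), cost
    return best


# Three well-separated groups of four points each.
points = [(x + dx, y + dy)
          for x, y in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]]
k, centroids, labels = choose_k(points)
print(k, sorted(centroids))
```

In the assignment step each point is processed independently of the others, which is the kind of data-parallel work that lends itself to a multicore implementation; the sketch itself is sequential.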

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/812891