An Adaptive Strategy for Dynamic Data Clustering with the K-Means Algorithm

Lapegna, M.; Mele, V.; Romano, D.

doi:10.1007/978-3-030-43222-5_9

The K-means algorithm is one of the most widely used methods in data mining and statistical data analysis for partitioning a set of objects into K distinct groups, called clusters, on the basis of their similarities. Its main drawback is that it requires the number of clusters as input, whereas in real applications this value is very difficult to fix in advance. For this reason, several modified K-means algorithms have been proposed in which the number of clusters is determined at run time, increasing it in an iterative procedure until a given cluster quality metric is satisfied. To address the high computational cost of this approach, we propose an adaptive procedure in which, at each iteration, two new clusters are created by splitting only the cluster with the worst value of the quality metric.
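
The splitting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the quality metric, the stopping rule, or the splitting method. The sketch assumes the metric is the mean squared distance of a cluster's points to its centroid, that iteration stops when the worst cluster falls below a user-supplied `threshold` (or when `max_k` clusters exist), and that the worst cluster is split with a plain 2-means pass. The names `scatter`, `two_means`, and `adaptive_kmeans` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def scatter(points):
    """Assumed quality metric: mean squared distance to the centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points) / len(points)


def two_means(points, rng, iters=20):
    """Split one cluster in two with Lloyd-style 2-means iterations."""
    c0 = rng.choice(points)
    # Seed the second centre with the point farthest from the first.
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    a, b = list(points), []
    for _ in range(iters):
        new_a = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        new_b = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not new_a or not new_b or (new_a, new_b) == (a, b):
            break  # degenerate split or assignment unchanged
        a, b = new_a, new_b
        c0, c1 = centroid(a), centroid(b)
    return a, b


def adaptive_kmeans(points, threshold, max_k=10, seed=0):
    """Grow the number of clusters by splitting only the worst one."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < max_k:
        # Find the cluster with the worst (largest) value of the metric.
        worst = max(range(len(clusters)), key=lambda i: scatter(clusters[i]))
        if scatter(clusters[worst]) <= threshold:
            break  # every cluster already satisfies the assumed criterion
        a, b = two_means(clusters[worst], rng)
        if not b:
            break  # could not split further; stop rather than loop forever
        # Replace the worst cluster with its two halves; others are untouched.
        clusters[worst:worst + 1] = [a, b]
    return clusters
```

Calling `adaptive_kmeans(points, threshold=1.0)` on a list of coordinate tuples returns a list of clusters, each a list of the original points. Because only the worst cluster is revisited at each step, the points in all other clusters are never reassigned; this keeps each iteration cheap, but it is a simplification, and a fuller version might follow each split with a global K-means refinement.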

An Adaptive Strategy for Dynamic Data Clustering with the K-Means Algorithm / Lapegna, M.; Mele, V.; Romano, D. - 12044:(2020), pp. 101-110. (13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019, Poland, 2019) [10.1007/978-3-030-43222-5_9].