A fast k-means algorithm based on multi-granularity
22 April 2022
Proceedings Volume 12174, International Conference on Internet of Things and Machine Learning (IoTML 2021); 1217410 (2022) https://doi.org/10.1117/12.2628453
Event: International Conference on Internet of Things and Machine Learning (IoTML 2021), 2021, Shanghai, China
Abstract
The k-means algorithm has been widely used since it was proposed, but the standard algorithm is inefficient on large-scale data. To address this problem, we propose a fast k-means algorithm based on multiple granularities. First, from the coarse-grained perspective, we use the clustering distribution information to narrow the search range of each sample point, which makes the proposed algorithm particularly advantageous when k is large. Second, from the fine-grained perspective, we use upper- and lower-bound rules to reduce the number of sample points involved in distance calculations, thereby avoiding many unnecessary distance computations. Finally, we evaluate the proposed algorithm on several real-world datasets. The experimental results show that it converges hundreds of times faster than standard k-means on average, with the accuracy loss controlled at about three percent, and that the speedup is more pronounced for larger and higher-dimensional datasets.
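The abstract does not spell out the bound rules, so the following is only a minimal sketch of the general idea of bound-based pruning, not the authors' method. It assumes the classic triangle-inequality test used in accelerated k-means variants: if a point's distance to its currently assigned center is at most half the distance from that center to its nearest other center, no other center can be closer, and the remaining distance computations for that point can be skipped. The function name `assign_with_pruning` and its parameters are hypothetical.

```python
import math


def assign_with_pruning(points, centers, prev_labels):
    """Assign each point to its nearest center, skipping the full search
    whenever a triangle-inequality bound already settles the answer.

    Illustrative stand-in only; the paper's actual rules may differ.
    Returns (labels, number_of_points_whose_search_was_skipped).
    """
    k = len(centers)
    # half_gap[j]: half the distance from center j to its closest other center.
    half_gap = []
    for j in range(k):
        others = [math.dist(centers[j], centers[m]) for m in range(k) if m != j]
        half_gap.append(0.5 * min(others) if others else math.inf)

    labels, skipped = [], 0
    for x, a in zip(points, prev_labels):
        # Upper bound: exact distance to the previously assigned center.
        upper = math.dist(x, centers[a])
        if upper <= half_gap[a]:
            # For any other center c: d(x, c) >= d(centers[a], c) - upper >= upper,
            # so the previous assignment is still a nearest center.
            labels.append(a)
            skipped += 1
        else:
            # Bound is inconclusive: fall back to the full search over all centers.
            dists = [math.dist(x, c) for c in centers]
            labels.append(dists.index(min(dists)))
    return labels, skipped
```

Calling `assign_with_pruning(points, centers, prev_labels)` with lists of coordinate tuples returns the new labels together with the number of points that needed only one distance computation. In a full k-means loop, that count typically grows as the centers stabilize.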
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Qing Wen, Junkuan Wang, Yabin Shao, and Zizhong Chen "A fast k-means algorithm based on multi-granularity", Proc. SPIE 12174, International Conference on Internet of Things and Machine Learning (IoTML 2021), 1217410 (22 April 2022); https://doi.org/10.1117/12.2628453
KEYWORDS
Algorithm development
Computer science
Data centers
Distortion
Machine learning
