Abstract:
As a low-cost, general-purpose parallel computing system that is easy to use and scales well, the cluster system has become a popular platform in many fields. Cluster analysis is one of the important problems in data mining. Because it typically operates on large-scale databases or high-dimensional data, clustering demands substantial computing power, so developing parallel clustering algorithms for cluster systems deserves attention. This paper proposes PARCLE, a new parallel clustering algorithm for very large databases that is suited to cluster systems. The algorithm adopts data parallelism and asynchronous communication to reduce communication costs, and applies a new clustering algorithm derived from BIRCH[2] to improve clustering quality. Our implementation shows high speedups with negligible communication overhead and clustering results no worse than those of the linear clustering algorithm.
Source:
2003 International Conference on Machine Learning and Cybernetics, Vols 1-5, Proceedings
Year: 2003
Page: 4-8
Language: English
Cited Count:
WoS CC Cited Count: 1
ESI Highly Cited Papers on the List: 0