A Proposed Clustering Algorithm for Efficient Clustering of High-Dimensional Data

S. Gopinath; G. Kowsalya; K. Sakthivel; S. Arularasi

doi:10.48001/joitc.2023.1114-21

Authors

S. Gopinath, Gnanamani College of Technology
G. Kowsalya
K. Sakthivel
S. Arularasi

Keywords:

Algorithm, Clustering, High-dimensional data, Information, Subspace

Abstract

Clustering algorithms are used to partition transaction data values, and similarity measures are used to analyse the relationships between transactions. Vector-based similarity models perform well with low-dimensional data, but clustering high-dimensional data is difficult because of the curse of dimensionality. High-dimensional data values are therefore clustered using subspace clustering techniques. Projective clustering seeks out projected clusters in subsets of a data space's dimensions. In high-dimensional data space, a probability model represents the projected clusters. A model-based fuzzy projective clustering method is used to find clusters with overlapping boundaries in different projection subspaces. The system employs the Model Based Projective Clustering (MPC) method, which is a subspace clustering technique. The suggested system is intended to cluster objects in high-dimensional spaces. The similarity analysis uses non-axis subspaces. Anomaly transactions are segmented using projected clusters, and the clustering procedure validates anomaly data values with similarity. The subspace selection procedure has been optimized.
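As an illustration of what a "projected cluster" means, the following minimal Python sketch picks, for each cluster, the dimensions in which that cluster is much tighter than the data set as a whole. It is not the MPC method from the paper: it assumes the cluster memberships are already known, it considers only axis-parallel subspaces (the abstract also covers non-axis subspaces), and the function names, the variance-ratio rule, and the `ratio` threshold are all hypothetical choices made for this example.

```python
import random
from statistics import pvariance


def relevant_dimensions(cluster, dataset, ratio=0.25):
    """Return the indices of dimensions in which `cluster` is much tighter
    than the full `dataset` (local variance / global variance < ratio).

    Hypothetical selection rule for illustration only.
    """
    selected = []
    for j in range(len(dataset[0])):
        global_var = pvariance([p[j] for p in dataset])
        local_var = pvariance([p[j] for p in cluster])
        if global_var > 0 and local_var / global_var < ratio:
            selected.append(j)
    return selected


def make_demo_data(seed=0):
    """Build two synthetic 6-dimensional clusters of 100 points each."""
    rng = random.Random(seed)
    # Cluster A is concentrated in dimensions 0 and 1; the rest is uniform noise.
    cluster_a = [
        [rng.gauss(0.2, 0.01), rng.gauss(0.8, 0.01)]
        + [rng.random() for _ in range(4)]
        for _ in range(100)
    ]
    # Cluster B is concentrated in dimensions 2 and 3; the rest is uniform noise.
    cluster_b = [
        [rng.random(), rng.random(),
         rng.gauss(0.5, 0.01), rng.gauss(0.1, 0.01),
         rng.random(), rng.random()]
        for _ in range(100)
    ]
    return cluster_a, cluster_b


cluster_a, cluster_b = make_demo_data()
dataset = cluster_a + cluster_b
subspace_a = relevant_dimensions(cluster_a, dataset)
subspace_b = relevant_dimensions(cluster_b, dataset)
print("Cluster A subspace:", subspace_a)
print("Cluster B subspace:", subspace_b)
```

On this synthetic data the two clusters end up with different subspaces (dimensions 0 and 1 for the first, 2 and 3 for the second), which is the situation that full-dimensional distance measures handle poorly and projective clustering is meant to address.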
