Abstract

Semi-Supervised Clustering for Text Clustering

N. Saranya

This paper presents Seeds Affinity Propagation (SAP), a semi-supervised text clustering algorithm based on the Affinity Propagation (AP) clustering algorithm. The approach makes two main contributions: 1) a similarity metric that captures the structural information of texts, and 2) a seed construction method that improves the semi-supervised clustering process. To study the performance and efficiency of the new algorithm, I applied it to benchmark data and compared it with two state-of-the-art clustering algorithms, namely k-means and the original AP algorithm. I also analyzed the individual impact of each of the two contributions. The results show that the proposed similarity metric is more effective for text clustering and that the proposed semi-supervised strategy achieves both better clustering results and faster convergence. The complete SAP algorithm obtains a higher F-measure and lower entropy, significantly reduces clustering execution time (it is 25 times faster than k-means), and is more robust than all the other methods compared.
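
The abstract reports results in terms of F-measure and entropy but does not say which variants were used. As an illustration only, the sketch below computes the definitions commonly used in text clustering evaluation: the class-weighted best-match F-measure and the size-weighted cluster entropy (base-2 logarithm, not normalized). The function names `f_measure` and `entropy` and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def f_measure(labels, clusters):
    """Class-weighted best-match F-measure (assumed variant; higher is better)."""
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))  # documents per (class, cluster) pair
    total = 0.0
    for c, n_c in class_sizes.items():
        best = 0.0
        for k, n_k in cluster_sizes.items():
            n_ck = joint.get((c, k), 0)
            if n_ck == 0:
                continue  # no overlap, so F is 0 for this pair
            precision = n_ck / n_k
            recall = n_ck / n_c
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_c / n) * best  # weight each class by its share of documents
    return total


def entropy(labels, clusters):
    """Size-weighted cluster entropy in bits (assumed variant; lower is better)."""
    n = len(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))
    total = 0.0
    for (_c, k), n_ck in joint.items():
        p = n_ck / cluster_sizes[k]  # share of class c inside cluster k
        total -= (cluster_sizes[k] / n) * p * math.log2(p)
    return total


# Toy example: four documents, two true classes, two candidate clusterings.
true_labels = ["sports", "sports", "politics", "politics"]
perfect = [0, 0, 1, 1]   # each cluster holds exactly one class
merged = [0, 0, 0, 0]    # everything lumped into a single cluster
for name, assignment in [("perfect", perfect), ("merged", merged)]:
    print(name, f_measure(true_labels, assignment), entropy(true_labels, assignment))
```

Under these definitions a clustering that matches the true classes exactly scores the maximum F-measure and zero entropy, which is why the two measures are usually read together: one rewards recovering whole classes, the other penalizes mixed clusters.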


