UUM Electronic Theses and Dissertation

Adaptive firefly algorithm for hierarchical text clustering

Mohammed, Athraa Jasim (2016) Adaptive firefly algorithm for hierarchical text clustering. PhD. thesis, Universiti Utara Malaysia.

Text: s94734_02.pdf (1MB)
Text: s94734_01.pdf (4MB)

Abstract

Text clustering is essentially used by search engines to increase recall and precision in information retrieval. As search engines operate on Internet content that is constantly being updated, there is a need for a clustering algorithm that groups items automatically without prior knowledge of the collection. Existing clustering methods have problems in determining the optimal number of clusters and producing compact clusters. In this research, an adaptive hierarchical text clustering algorithm is proposed based on the Firefly Algorithm. The proposed Adaptive Firefly Algorithm (AFA) consists of three components: document clustering, cluster refining, and cluster merging. The first component introduces the Weight-based Firefly Algorithm (WFA), which automatically identifies initial centers and their clusters for any given text collection. To refine the obtained clusters, a second algorithm, termed Weight-based Firefly Algorithm with Relocate (WFAR), is proposed. This approach allows the relocation of a pre-assigned document into a newly created cluster. The third component, Weight-based Firefly Algorithm with Relocate and Merging (WFARM), aims to reduce the number of produced clusters by merging non-pure clusters into the pure ones. Experiments were conducted to compare the proposed algorithms against seven existing methods. The percentage of success in obtaining the optimal number of clusters by AFA is 100%, with purity and f-measure of 83% higher than the benchmarked methods. As for the entropy measure, AFA produced the lowest value (0.78) when compared to existing methods. The results indicate that the Adaptive Firefly Algorithm can produce compact clusters. This research contributes to the text mining domain, as hierarchical text clustering facilitates the indexing of documents and information retrieval processes.
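
The abstract reports purity and entropy without defining them. As a rough illustration only, the sketch below computes both measures using their common textbook definitions. The function names, the input layout (one list of true class labels per cluster), and the base-2 logarithm are assumptions made for this example; the thesis may normalize entropy differently, so this sketch is not expected to reproduce the reported values.

```python
from collections import Counter
from math import log2


def purity(clusters):
    """Fraction of documents that belong to the majority class of their cluster.

    `clusters` is a list of lists; each inner list holds the true class
    labels of the documents assigned to one cluster (assumed layout).
    """
    total = sum(len(c) for c in clusters)
    # Each non-empty cluster contributes the size of its most common class.
    majority = sum(Counter(c).most_common(1)[0][1] for c in clusters if c)
    return majority / total


def entropy(clusters):
    """Size-weighted average of per-cluster class entropy (base 2, assumed)."""
    total = sum(len(c) for c in clusters)
    weighted = 0.0
    for c in clusters:
        if not c:
            continue  # empty clusters carry no weight
        counts = Counter(c)
        h = -sum((n / len(c)) * log2(n / len(c)) for n in counts.values())
        weighted += (len(c) / total) * h
    return weighted


# Toy data: two clusters over two hypothetical classes.
example = [["sport", "sport", "sport", "politics"], ["politics", "politics"]]
print(purity(example), entropy(example))
```

Higher purity and lower entropy both indicate that each cluster is dominated by a single class, which is how the abstract uses the two measures when comparing AFA with the benchmarked methods.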

Item Type: Thesis (PhD.)
Supervisor: Yusof, Yuhanis and Husni, Husniza
Item ID: 5801
Uncontrolled Keywords: Text mining, Hierarchical text clustering, Swarm Intelligence, Firefly Algorithm
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Awang Had Salleh Graduate School of Arts & Sciences
Date Deposited: 01 Aug 2016 17:26
Last Modified: 06 Apr 2021 06:33
URI: https://etd.uum.edu.my/id/eprint/5801
