UUM Electronic Theses and Dissertation

An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents

Al-Dyani, Wafa Zubair Abdullah (2022) An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents. Doctoral thesis, Universiti Utara Malaysia.

Abstract

Event Detection (ED) identifies events in various types of data. Building an ED model for news text documents helps decision-makers in various disciplines improve their strategies. However, identifying and summarizing events from such data is a non-trivial task because of the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that affects the overall performance of the baseline methods in an ED model. To address this problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model, such as Feature Selection (FS), ED, and summarization. This work focuses on the FS problem by automatically detecting events through a novel wrapper FS method based on the Adapted Binary Bat Algorithm (ABBA) and the Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome the premature convergence of the Binary Bat Algorithm (BBA) and the fast convergence rate of the Markov Clustering Algorithm (MCL). Furthermore, this study proposes four summarization methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared with 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results showed that ABBA-AMCL surpassed the other methods on most datasets. The key representative features demonstrated that the ABBA-AMCL method successfully detects real-world events from the Facebook news datasets, with a Precision of 0.96 and a Recall of 1 for dataset 11, and a Precision of 1 and a Recall of 0.76 for dataset 12. To conclude, the novel ABBA-AMCL presented in this research has bridged the research gap and resolved the curse of dimensionality in the feature space of heterogeneous news text documents. Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making.
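
For readers unfamiliar with Markov clustering, the following is a minimal sketch of the standard (unadapted) MCL procedure, which alternates expansion (matrix squaring) and inflation (element-wise powering followed by column normalization) on a graph until the matrix stops changing. It is not the thesis's AMCL or ABBA-AMCL method. The function names, the inflation value of 2.0, the tolerance, and the six-document toy graph are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Textbook MCL on a square adjacency matrix (illustrative parameters)."""
    n = len(adjacency)
    # Add self-loops, as is customary in MCL, then normalize the columns.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:  # stop once the matrix no longer changes
            break
    # Each row that retains mass acts as an attractor; the columns with
    # non-zero entries in that row form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy graph: six documents, edges link documents judged similar.
# Documents 0-2 are mutually linked, as are documents 3-5.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(toy_graph))  # [[0, 1, 2], [3, 4, 5]]
```

In this toy case each group of linked documents comes out as one cluster, which is the sense in which a graph-based method can organize documents into distinct events. A higher inflation value generally yields finer-grained clusters.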

Item Type: Thesis (Doctoral)
Supervisor: Kabir Ahmad, Farzana and Kamaruddin, Siti Sakira
Item ID: 10228
Uncontrolled Keywords: Event detection, Feature selection, Heterogeneous news text documents, Binary bat algorithm, Markov clustering algorithm
Subjects: Q Science > QA Mathematics
Divisions: Awang Had Salleh Graduate School of Arts & Sciences
Date Deposited: 16 Jan 2023 03:51
Last Modified: 24 Aug 2025 06:32
URI: https://etd.uum.edu.my/id/eprint/10228
