Kongwan, Authapon (2024) Semantic-based question answering framework for fuzzy factoid answer from Thai texts. Doctoral thesis, Universiti Utara Malaysia.
depositpermission.pdf
Restricted to Repository staff only
s900995_01.pdf
Abstract
Text is an important source of human knowledge. A question-answering system can retrieve facts from a knowledge source and provide answers to the user. Translating text into a knowledge base is a challenging and complicated process. Thai text can take the form of a continuous character stream, written without punctuation or markers to separate the words and sentences in a paragraph. This research aims to develop a semantic-based question-answering framework that handles fuzzy factoid answers and targets Thai text as its knowledge source. In building a Thai question-answering system, Thai morphological analysis is an important component for processing Thai text. Ellipsis and anaphora resolution is also needed to construct complete facts from Thai text. The Thai semantic parser is the core component that constructs the knowledge base by extracting facts from Thai text into a semantic frame structure. The methodology of this research is divided into four steps. The first is to build accurate Thai morphological analysis: Thai word segmentation and Thai EDU (elementary discourse unit) segmentation. The second is to develop ellipsis and anaphora resolution for Thai text, with the goal of creating complete facts within the segmented Thai EDUs. The third is to develop the semantic parser, which builds the knowledge base by transforming Thai text into a semantic frame representation. The fourth is to develop answer extraction for the question-answering system, using fuzzy matching to handle fuzzy factoids. With this pipeline of processes, the semantic-based question-answering system achieves high precision and recall, at 0.9892 and 0.9484 respectively. In conclusion, anaphora and ellipsis resolution are crucial for achieving precise semantic construction, while fuzzy matching significantly enhances answer extraction recall. Together, these components are essential for building robust "What" and "How many" question-answering systems.
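As a rough illustration of the final step in the abstract (answer extraction with fuzzy matching over semantic frames), the following is a minimal sketch and not the thesis's implementation. The `Frame` fields, the `answer` function, the 0.8 threshold, and the use of Python's standard-library `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the record does not state which frame schema or fuzzy matching method the thesis uses. Romanized tokens stand in for segmented Thai words.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Frame:
    """Hypothetical semantic frame: one fact extracted from one EDU."""
    predicate: str
    agent: str
    obj: str
    quantity: str = ""


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed measure, not the thesis's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def answer(frames, predicate, agent, slot, threshold=0.8):
    """Return the requested slot of the best-matching frame, or None.

    A frame matches only if BOTH its predicate and its agent are close
    enough to the question terms, so the weaker of the two scores is used.
    """
    best_score, best_frame = 0.0, None
    for frame in frames:
        score = min(similarity(frame.predicate, predicate),
                    similarity(frame.agent, agent))
        if score > best_score:
            best_score, best_frame = score, frame
    if best_frame is None or best_score < threshold:
        return None
    return getattr(best_frame, slot)


# Toy knowledge base; real frames would come from the semantic parser.
frames = [
    Frame(predicate="buy", agent="Somchai", obj="mango", quantity="3"),
    Frame(predicate="sell", agent="Malee", obj="rice", quantity="10"),
]

# "What" question with inexact wording: the obj slot is requested.
what_answer = answer(frames, "buys", "Somchay", "obj")
# "How many" question: the quantity slot is requested.
how_many_answer = answer(frames, "buys", "Somchay", "quantity")
```

In this toy setup the "What" question returns `mango` and the "How many" question returns `3` even though neither question term matches the stored frame exactly, which is the kind of recall gain the abstract attributes to fuzzy matching. A question whose predicate matches no frame closely enough falls below the threshold and returns `None`.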
Item Type: | Thesis (Doctoral) |
---|---|
Supervisor: | Kamaruddin, Siti Sakira and Kabir Ahmad, Farzana |
Item ID: | 11490 |
Uncontrolled Keywords: | Semantic, Question answering system, Anaphora resolution, Word segmentation, Fuzzy |
Subjects: | P Language and Literature > P Philology. Linguistics |
Divisions: | Awang Had Salleh Graduate School of Arts & Sciences |
Date Deposited: | 06 Jan 2025 04:15 |
Last Modified: | 06 Jan 2025 04:15 |
Department: | Awang Had Salleh Graduate School of Arts And Sciences |
Name: | Kamaruddin, Siti Sakira and Kabir Ahmad, Farzana |
URI: | https://etd.uum.edu.my/id/eprint/11490 |