Gender dependent word-level emotion detection using global spectral speech features

Siddique, Haris (2015) Gender dependent word-level emotion detection using global spectral speech features. Masters thesis, Universiti Utara Malaysia.

Rights: Restricted

Abstract

In this study, global spectral features extracted at the word and sentence levels are studied for speech emotion recognition. MFCCs (Mel Frequency Cepstral Coefficients) were used as the spectral information for recognition. Global spectral features representing gross statistics, such as the mean of the MFCCs, are used. This study also examines words at different positions (initial, middle and end) in a sentence separately. Word-level feature extraction is used to analyze the emotion recognition performance of words at different positions. Word boundaries are manually identified. Gender-dependent and gender-independent models are also studied to analyze the impact of gender on emotion recognition performance. The Berlin Emo-DB (Emotional Database) was used as the emotional speech dataset. The performance of different classifiers has also been studied: NN (Neural Network), KNN (K-Nearest Neighbor) and LDA (Linear Discriminant Analysis). Anger and neutral emotions were studied. Results showed that using all 13 MFCC coefficients provides better classification results than other combinations of MFCC coefficients for the mentioned emotions. Words at the initial and ending positions provide more emotion-specific information than words at the middle position. Gender-dependent models are more efficient than gender-independent models. Moreover, the female model is more efficient than the male model, and females exhibit emotions better than males. In general, NN performs the worst compared to KNN and LDA in classifying anger and neutral. LDA performs better than KNN by almost 15% for the gender-independent model and almost 25% for the gender-dependent model.
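
The word-level global features described in the abstract can be illustrated with a minimal sketch. It assumes frame-level MFCCs have already been computed elsewhere and are available as a (frames x coefficients) array, and that word boundaries are given as manually marked frame index pairs. The function names (`global_mfcc_features`, `word_level_features`, `knn_predict`) and the toy data are hypothetical, and the plain NumPy KNN is only a stand-in for illustration, not the thesis's implementation.

```python
import numpy as np


def global_mfcc_features(mfcc_frames, n_coeffs=13):
    """Collapse frame-level MFCCs into one global vector (per-coefficient mean).

    mfcc_frames: array of shape (n_frames, n_total_coeffs), assumed precomputed.
    """
    frames = np.asarray(mfcc_frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (n_frames, n_coeffs) array")
    # Keep the first n_coeffs coefficients and average over time (frames).
    return frames[:, :n_coeffs].mean(axis=0)


def word_level_features(mfcc_frames, boundaries, n_coeffs=13):
    """Compute one global feature vector per word.

    boundaries: list of (start_frame, end_frame) pairs, end exclusive,
    assumed to come from manual word segmentation.
    """
    frames = np.asarray(mfcc_frames, dtype=float)
    return [global_mfcc_features(frames[s:e], n_coeffs) for s, e in boundaries]


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training vectors."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)
    dists = np.linalg.norm(train_X - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy utterance: 20 frames of 13 synthetic "MFCC" values (not real speech data).
utterance = np.arange(20 * 13, dtype=float).reshape(20, 13)

# Hypothetical manual boundaries for the initial, middle and end words.
words = word_level_features(utterance, [(0, 5), (5, 15), (15, 20)])
sentence = global_mfcc_features(utterance)
```

A gender-dependent setup in this sketch would simply mean calling `knn_predict` with training vectors drawn from speakers of one gender only, while a gender-independent setup would pool all speakers.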

Item Type:	Thesis (Masters)
Supervisor :	UNSPECIFIED
Item ID:	4518
Uncontrolled Keywords:	Mel Frequency Cepstral Coefficients, feature extraction, emotional speech recognition, simulated emotional speech corpus, classification models
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions:	Awang Had Salleh Graduate School of Arts & Sciences
Date Deposited:	10 May 2015 03:16
Last Modified:	05 Apr 2021 01:19
Department:	Awang Had Salleh Graduate School of Arts and Sciences
URI:	https://etd.uum.edu.my/id/eprint/4518
