UUM Electronic Theses and Dissertation

Robust correlation coefficient based on robust scale and location estimator

Nur Amira, Zakaria (2018) Robust correlation coefficient based on robust scale and location estimator. Masters thesis, Universiti Utara Malaysia.

The correlation coefficient is a common statistical tool for measuring the relationship between two variables, and the Pearson correlation coefficient is the most frequently used. It is powerful when the relationship between the two variables is linear and the data are normally distributed. However, it performs poorly in the presence of outliers because its calculation relies on the mean, which is known to be very sensitive to outliers. The Spearman rank correlation coefficient and Kendall’s Tau correlation coefficient are alternatives that address this problem, but because they are computed from ranks rather than the original observations, they lose useful information. For that reason, this study focuses on a robust correlation approach based on the median. The existing median-based correlation coefficient uses the Median Absolute Deviation (MAD) as its scale estimator. Nevertheless, the MAD has low efficiency under the Gaussian distribution and takes a symmetric view of dispersion, so it is suited only to symmetric distributions. Thus, this study modified the median-based correlation using two approaches. First, keeping the same median-based correlation, the study proposed alternative robust scale estimators, namely MADn, Sn, and Qn. Second, the study replaced the median-based correlation with a Hodges-Lehmann-based correlation and employed all the robust scale estimators, namely median, MAD, MADn, Sn, and Qn. The performance of the proposed procedures was evaluated under two conditions of simulated data: perfect and contaminated data. Three indicators were used in the evaluation: the correlation coefficient value, the average bias, and the standard error. The proposed procedures were then validated using a real dataset.
The simulation results show that the Qn correlation coefficient and the Hodges-Lehmann-Qn correlation coefficient performed better under contaminated data than the Pearson correlation coefficient and other existing robust correlation coefficients. In conclusion, the Qn correlation coefficient and the Hodges-Lehmann-Qn correlation coefficient are good alternatives to the Pearson correlation coefficient when outliers are present in the data.
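
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea of swapping the scale estimator inside a median-type correlation. It assumes the sum-and-difference form often used for robust correlation in the literature, r = (S(u)^2 - S(v)^2) / (S(u)^2 + S(v)^2), with u and v the sum and difference of the robustly standardized variables; this may differ from the thesis's exact definition. The function names (`madn`, `qn`, `hodges_lehmann`, `robust_corr`) are hypothetical, the constants 1.4826 and 2.2219 are the usual Gaussian consistency factors, and finite-sample corrections for Qn are omitted. The Hodges-Lehmann estimator is shown on its own because the abstract does not say how it is combined with the scale estimators.

```python
import numpy as np


def madn(x):
    """Normalized MAD: 1.4826 * median(|x - median(x)|)."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))


def qn(x):
    """Naive O(n^2) Qn scale: 2.2219 times the k-th smallest pairwise
    absolute difference, with k = C(h, 2) and h = n // 2 + 1.
    Finite-sample correction factors are omitted in this sketch."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)      # all pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    h = x.size // 2 + 1
    k = h * (h - 1) // 2
    return 2.2219 * diffs[k - 1]


def hodges_lehmann(x):
    """Hodges-Lehmann location: median of the Walsh averages
    (x_i + x_j) / 2 over all pairs with i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size)           # pairs with i <= j
    return np.median((x[i] + x[j]) / 2.0)


def robust_corr(x, y, scale=madn):
    """Sum-and-difference robust correlation (assumed form, see lead-in).
    `scale` is any robust scale estimator, e.g. madn or qn."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center by the median and rescale by the chosen robust scale.
    zx = (x - np.median(x)) / scale(x)
    zy = (y - np.median(y)) / scale(y)
    su2 = scale(zx + zy) ** 2
    sv2 = scale(zx - zy) ** 2
    return (su2 - sv2) / (su2 + sv2)
```

Under these assumptions, `robust_corr(x, y, scale=qn)` and `robust_corr(x, y, scale=madn)` can be compared with `np.corrcoef(x, y)[0, 1]` on clean data and on data with a few injected outliers, which mirrors the perfect versus contaminated comparison described in the abstract.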

Item Type: Thesis (Masters)
Supervisor: Abdullah, Suhaida and Ahad, Nor Aishah
Item ID: 9137
Uncontrolled Keywords: Pearson Correlation Coefficient, Outlier, Median, Robust Correlation Coefficient, Robust Scale Estimator
Subjects: Q Science > QA Mathematics > QA273-280 Probabilities. Mathematical statistics
Divisions: Awang Had Salleh Graduate School of Arts & Sciences
Date Deposited: 28 Mar 2022 00:41
Last Modified: 28 Mar 2022 00:41
URI: https://etd.uum.edu.my/id/eprint/9137
