UUM Electronic Theses and Dissertation
UUM ETD | Universiti Utara Malaysian Electronic Theses and Dissertation

Winsorize tree algorithm for handling outliers in classification problem

Ch’ng, Chee Keong (2016) Winsorize tree algorithm for handling outliers in classification problem. PhD. thesis, Universiti Utara Malaysia.


Abstract

Classification and Regression Tree (CART) is designed to predict or classify objects into predetermined classes from a set of predictors. However, outliers can affect the structure of CART, node purity, and predictive accuracy in classification. Some researchers opt to perform pre-pruning or post-pruning of the CART to handle the outliers. This study proposes a modified classification tree algorithm called Winsorize tree, based on the distribution of classes in the training dataset. The Winsorize tree investigates all possible outliers from node to node before checking the potential splitting point, in order to obtain the node with the highest purity. The upper fence and lower fence of a boxplot are used to detect potential outliers, that is, values that fall below Q1 − (1.5 × interquartile range) or above Q3 + (1.5 × interquartile range). The identified outliers are neutralized using the Winsorize method, and the Winsorize Gini index is then used to compute the divergences among probability distributions of the target predictor’s values until stopping criteria are met. This study uses three stopping rules: node achieved the minimum 10% of total training set,
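
The outlier-handling step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the use of `statistics.quantiles` with the "inclusive" method to estimate quartiles, and the multiplier argument `k` are assumptions made for illustration. The abstract does not define the Winsorize Gini index, so the ordinary Gini impurity is shown here as a stand-in, applied after the predictor values are clamped to the boxplot fences.

```python
from statistics import quantiles


def boxplot_fences(values, k=1.5):
    """Return (lower, upper) boxplot fences: Q1 - k*IQR and Q3 + k*IQR.

    Assumption: quartiles are estimated with the 'inclusive' method;
    the thesis may use a different quartile definition.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def winsorize(values, k=1.5):
    """Clamp every value lying outside the fences to the nearest fence."""
    lower, upper = boxplot_fences(values, k)
    return [min(max(v, lower), upper) for v in values]


def gini_index(labels):
    """Ordinary Gini impurity of a node: 1 - sum of squared class shares.

    Stand-in only: the thesis's Winsorize Gini index is not specified
    in the abstract.
    """
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical predictor values with one extreme observation.
predictor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
cleaned = winsorize(predictor)
```

In this sketch the extreme value 100 is pulled back to the upper fence instead of being deleted, so the node keeps all of its observations when candidate split points are evaluated.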

Item Type: Thesis (PhD.)
Supervisor: Ismail, Wan Rosmanira and Mahat, Nor Idayu
Item ID: 5780
Uncontrolled Keywords: Classification tree, Outliers, Winsorize Gini index, Winsorize tree algorithm
Subjects: Q Science > QA Mathematics > QA273-280 Probabilities. Mathematical statistics
Divisions: Awang Had Salleh Graduate School of Arts & Sciences
Date Deposited: 08 Aug 2016 10:50
Last Modified: 10 Apr 2022 23:45
URI: https://etd.uum.edu.my/id/eprint/5780
