Missing Values Treatment and Feature Reduction Analysis to Enhance Classification

D. Muralidharan; K. Renuka; Mulagala Jaswant; J. Karthikeyan; G.R. Brindha

doi:10.3844/jcssp.2020.211.216

Research Article Open Access

D. Muralidharan¹, K. Renuka¹, Mulagala Jaswant¹, J. Karthikeyan¹ and G.R. Brindha¹

¹ SASTRA Deemed University, India

Abstract

Datasets may contain a large number of features, which makes classification difficult and time-consuming. They may also include irrelevant and noisy features, along with missing values. Missing values should be treated properly so that classifier accuracy can be improved. There is also a need to reduce the feature set and select only the features the classifier needs. Principal Component Analysis (PCA) is commonly used to reduce the number of features in a dataset, and the resulting components can be supplied as input to the classifiers. In this study, standard datasets are checked for missing values and classified using Support Vector Machines (SVM) and Naive Bayes, both with and without PCA-based feature reduction. The proposed missing value imputation algorithm is then applied to the datasets and the same analysis is repeated. Accuracy is evaluated using a confusion matrix. The results are discussed in terms of the nature of the features and missing values, and of how different datasets behave when used with machine learning algorithms.
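The workflow summarized in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. The abstract does not describe the proposed imputation algorithm, so mean imputation (`SimpleImputer`) is swapped in purely as a stand-in. The Iris dataset, the 10% missing rate, the RBF kernel, the 70/30 split, the two retained components, and the helper name `evaluate` are likewise illustrative assumptions and are not taken from the study.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate(X, y, classifier, n_components=None, seed=0):
    """Impute, scale, optionally reduce with PCA, then classify.

    Hypothetical helper: returns (confusion matrix, accuracy) on a held-out split.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    # Mean imputation is a placeholder, NOT the algorithm proposed in the paper.
    steps = [SimpleImputer(strategy="mean"), StandardScaler()]
    if n_components is not None:
        steps.append(PCA(n_components=n_components))
    steps.append(classifier)
    model = make_pipeline(*steps)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return confusion_matrix(y_test, y_pred), accuracy_score(y_test, y_pred)


# Illustrative data (assumption): Iris with roughly 10% of entries removed at random.
X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan

# Step 1: check the dataset for missing values.
print("Missing values per feature:", np.isnan(X_missing).sum(axis=0))

# Steps 2-3: classify with SVM and Naive Bayes, with and without PCA.
results = {}
classifiers = [("SVM", lambda: SVC(kernel="rbf")), ("Naive Bayes", GaussianNB)]
for name, make_classifier in classifiers:
    for k in (None, 2):
        cm, acc = evaluate(X_missing, y, make_classifier(), n_components=k)
        results[(name, k)] = (cm, acc)
        label = "all features" if k is None else f"PCA ({k} components)"
        print(f"{name}, {label}: accuracy = {acc:.3f}")
        print(cm)
```

Here accuracy is the trace of the confusion matrix divided by the number of test samples, so the two reported quantities carry the same information at different levels of detail. Placing the imputer and PCA inside the pipeline means both are fitted on the training split only, which avoids leaking test data into the preprocessing.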

Journal of Computer Science

Volume 16 No. 2, 2020, 211-216

DOI: https://doi.org/10.3844/jcssp.2020.211.216

Submitted On: 7 July 2019 Published On: 20 February 2020

How to Cite: Muralidharan, D., Renuka, K., Jaswant, M., Karthikeyan, J. & Brindha, G. (2020). Missing Values Treatment and Feature Reduction Analysis to Enhance Classification. Journal of Computer Science, 16(2), 211-216. https://doi.org/10.3844/jcssp.2020.211.216

Copyright: © 2020 D. Muralidharan, K. Renuka, Mulagala Jaswant, J. Karthikeyan and G.R. Brindha. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

PCA
SVM
Naive Bayes
Missing Value Treatment