Adaptive Synthetic Oversampling Algorithm for Handling Class Imbalance in Multi-Class Data Stream Classification
- 1 Department of Computer Science and Engineering, SRM Institute of Science and Technology, India
- 2 Department of Computational Intelligence, SRM Institute of Science and Technology, India
Abstract
Concept drift and class imbalance are two major challenges in modern streaming data classification. When combined with other difficulty factors such as noise and overlapping class distributions, they can considerably degrade classifier performance. Several of these challenges also reduce the effectiveness of existing oversampling schemes such as SMOTE and its derivatives. Moreover, most existing models address data imbalance in binary classification problems, whereas the more complex multi-class setting remains largely unexplored. Motivated by this gap, this study develops an Adaptive Synthetic Oversampling Algorithm (ASYNO) based Multiclass Streaming Data Classification (ASYNO-MCSDC) model for handling class imbalance and concept drift. The ASYNO-MCSDC method first performs several preprocessing stages: label encoding, data normalization, and data splitting. The ASYNO technique is then applied to handle class imbalance. An online bagging ensemble classifier performs the classification, with the Hoeffding Tree (HT) as the base classifier and the number of estimators set to 10. Two learning modes are used in the experiments: batch learning and incremental learning. The ASYNO-MCSDC model is validated on two datasets, a stationary imbalanced stream and a dynamic imbalanced stream. The experimental results indicate that the ASYNO-MCSDC model achieves promising results compared with other models.
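As a rough illustration of the adaptive oversampling step, the sketch below grows each minority class to the size of the largest class and places more synthetic points near samples whose neighbourhoods are dominated by other classes. It is an ADASYN-style approximation written for this summary, not the authors' ASYNO implementation: the function name `adaptive_oversample`, the parameters `k` and `seed`, and the choice to draw seed points by weighted sampling (rather than rounding a per-point quota) are all assumptions.

```python
import math
import random
from collections import Counter


def adaptive_oversample(X, y, k=5, seed=0):
    """ADASYN-style multi-class oversampling (illustrative sketch only).

    X is a list of numeric feature lists, y a list of class labels.
    Returns the original data followed by the synthetic samples.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class
    X_new, y_new = list(X), list(y)

    for label, n in counts.items():
        need = target - n
        members = [i for i, lab in enumerate(y) if lab == label]
        if need == 0 or len(members) < 2:
            continue  # nothing to add, or no same-class pair to interpolate

        # Hardness of each member: share of other-class points among
        # its k nearest neighbours in the whole data set.
        ratios = []
        for i in members:
            nbrs = sorted(
                (j for j in range(len(X)) if j != i),
                key=lambda j: math.dist(X[i], X[j]),
            )[:k]
            ratios.append(sum(y[j] != label for j in nbrs) / len(nbrs))

        total = sum(ratios)
        if total > 0:
            weights = [r / total for r in ratios]
        else:
            # No member borders another class: fall back to uniform weights.
            weights = [1 / len(members)] * len(members)

        # Harder points are chosen as seeds more often (assumed variation:
        # weighted sampling instead of a rounded per-point quota).
        for i in rng.choices(members, weights=weights, k=need):
            same = sorted(
                (j for j in members if j != i),
                key=lambda j: math.dist(X[i], X[j]),
            )[:k]
            j = rng.choice(same)
            gap = rng.random()
            # Interpolate between the seed and a same-class neighbour.
            X_new.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
            y_new.append(label)

    return X_new, y_new
```

In a streaming setting such a routine would typically be applied per batch or window before the samples are passed to the ensemble; how ASYNO-MCSDC schedules this step is not stated in the abstract.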
DOI: https://doi.org/10.3844/jcssp.2022.650.664
Copyright: © 2022 Priya S. and Annie Uthra. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Machine Learning
- Class Imbalance
- Concept Drift
- Data Classification
- Oversample
- Streaming Data