Comparative Study: Algorithms for Short Message Service Classification

Evaristus Didik Madyatmadja; Aldi; Fiona Fheren; Helen Angelica; Hanny Juwitasary; David Jumpa Malem Sembiring

doi:10.3844/jcssp.2023.1333.1344

Research Article Open Access

Evaristus Didik Madyatmadja¹, Aldi¹, Fiona Fheren¹, Helen Angelica¹, Hanny Juwitasary¹ and David Jumpa Malem Sembiring²

¹ Department of Information Systems, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
² Teknik Informatika, Institut Teknologi Dan Bisnis Indonesia, Medan, Indonesia

Abstract

This research classifies Short Message Service (SMS) messages as spam or ham using classification models trained on SMS data. The models are built with two data mining algorithms: Naive Bayes and Support Vector Machine. Before either algorithm is applied, the SMS data passes through a text preprocessing stage that includes data cleaning (removal of whitespace, punctuation, and numbers), case folding, stemming, tokenizing, and stop word removal. The accuracy of the two algorithms is then compared to identify the better classifier. The researchers also ran several experiments that compare test splits of 20% and 30% of the data, and that compare preprocessing with and without stemming. The study found that the Support Vector Machine algorithm, using a 20% test split with stemming applied, achieved the highest accuracy, 97.5%.
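The preprocessing stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop word list and the suffix-stripping `toy_stem` function are hypothetical stand-ins for a real stop word list and stemmer, and the order of the steps (stop word removal before stemming) is an assumption made so the toy stemmer does not alter stop words. The `use_stemming` flag mirrors the with-stemming and without-stemming experiments.

```python
import re
import string

# Hypothetical, deliberately tiny stop word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "you", "your", "for", "in", "on"}

# Toy suffix list standing in for a real stemmer (assumption, not from the paper).
SUFFIXES = ("ing", "ed", "es", "s")


def clean(text: str) -> str:
    """Data cleaning: remove numbers and punctuation, collapse whitespace."""
    text = re.sub(r"\d+", " ", text)
    # Map every punctuation character to a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    return " ".join(text.split())


def toy_stem(token: str) -> str:
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str, use_stemming: bool = True) -> list[str]:
    """Clean, case fold, tokenize, drop stop words, and optionally stem."""
    text = clean(text).lower()            # case folding
    tokens = text.split()                 # tokenizing on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [toy_stem(t) for t in tokens]
    return tokens


if __name__ == "__main__":
    sms = "WINNER!! You have won 1000 prizes, claiming is easy."
    print(preprocess(sms, use_stemming=True))
    print(preprocess(sms, use_stemming=False))
```

The resulting token lists would then be turned into feature vectors and passed to the Naive Bayes and Support Vector Machine classifiers; that step is omitted here because the abstract does not specify the feature representation.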

Journal of Computer Science

Volume 19 No. 11, 2023, 1333-1344

DOI: https://doi.org/10.3844/jcssp.2023.1333.1344

Submitted On: 9 February 2023 Published On: 29 September 2023

How to Cite: Madyatmadja, E. D., Aldi, Fheren, F., Angelica, H., Juwitasary, H. & Sembiring, D. J. M. (2023). Comparative Study: Algorithms for Short Message Service Classification. Journal of Computer Science, 19(11), 1333-1344. https://doi.org/10.3844/jcssp.2023.1333.1344

Copyright: © 2023 Evaristus Didik Madyatmadja, Aldi, Fiona Fheren, Helen Angelica, Hanny Juwitasary and David Jumpa Malem Sembiring. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

SMS Spam
SMS HAM
Naive Bayes
Support Vector Machine
Classification
Data Mining
Text Mining