Big Data Framework for Predicting Infectious Diseases to Improve Healthcare by Discovering New Symptom Patterns
- 1 Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Egypt
Abstract
Big data offers a compelling opportunity for infectious disease control, because these new data streams can make preventive measures more timely. Healthcare providers in both the public and private sectors generate, store, and analyse extensive datasets to improve the quality of the services they deliver. The recent outbreak of the novel coronavirus disease, COVID-19, has posed significant threats to human health, life, production, social connections, and international relations. Consequently, big data technologies have played a pivotal role in the response to the pandemic. Infectious diseases arise when a person contracts a pathogen transmitted by another person, and they pose challenges at both the individual and macro scales. Furthermore, the unknown patterns of infectious illnesses complicate prediction. This study aims to establish a big data framework for predicting infectious diseases by uncovering new symptom patterns, ultimately improving healthcare infection prevention and control. To achieve this objective, machine-learning algorithms such as K-Nearest Neighbors (KNN) and Random Forest (RF) were employed for cleaning and maintaining extensive datasets collected from December 2019 to June 2020. Additionally, the FP-growth and Park, Chen, and Yu (PCY) algorithms were applied to identify new patterns. Among the classifiers, Support Vector Machines (SVM) achieved the highest accuracy (98.2%) and the highest F1 score (94.80%), while RF achieved the highest precision (92.80%). Similarly, the PCY algorithm outperformed FP-growth, achieving an accuracy of 98.5%. These findings underscore the potential of big data and machine learning in pattern recognition and infectious disease prediction, ultimately contributing to improved public health outcomes.
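To make the pattern-mining step more concrete, the following is a minimal sketch of how the Park, Chen, and Yu (PCY) algorithm finds frequent pairs of items, here pairs of symptoms. It is an illustration under stated assumptions, not the implementation used in the study: the function name `pcy_frequent_pairs`, the CRC32-based bucket hash, the bucket count, the support threshold, and the toy symptom records are all hypothetical.

```python
import zlib
from collections import Counter
from itertools import combinations


def pcy_frequent_pairs(baskets, min_support, n_buckets=101):
    """Return {pair: count} for item pairs found in >= min_support baskets.

    Illustrative sketch only; items are assumed to be strings.
    """
    # Deduplicate and sort each basket so every pair has one canonical order.
    baskets = [sorted(set(b)) for b in baskets]

    def bucket(pair):
        # Hypothetical hash choice: any deterministic hash of the pair works.
        return zlib.crc32("|".join(pair).encode("utf-8")) % n_buckets

    # Pass 1: count single items and hash every pair into a bucket counter.
    item_counts = Counter()
    bucket_counts = [0] * n_buckets
    for basket in baskets:
        item_counts.update(basket)
        for pair in combinations(basket, 2):
            bucket_counts[bucket(pair)] += 1

    # Between passes: keep frequent items and reduce buckets to a bitmap.
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    bitmap = [c >= min_support for c in bucket_counts]

    # Pass 2: count only pairs of frequent items that fall in a frequent bucket.
    pair_counts = Counter()
    for basket in baskets:
        candidates = [i for i in basket if i in frequent_items]
        for pair in combinations(candidates, 2):
            if bitmap[bucket(pair)]:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Made-up symptom records for illustration; not data from the study.
records = [
    ["fever", "cough", "fatigue"],
    ["fever", "cough"],
    ["cough", "headache"],
    ["fever", "cough", "headache"],
    ["fatigue", "headache"],
]
result = pcy_frequent_pairs(records, min_support=3)
print(result)  # {('cough', 'fever'): 3}
```

The bucket bitmap only prunes candidate pairs before the second pass, so the set of frequent pairs does not depend on the hash function or the number of buckets; those choices affect memory use and running time only.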
DOI: https://doi.org/10.3844/jcssp.2024.1251.1262
                                            
Copyright: © 2024 Amal Mohamed Mounir, Mohamed Ibrahim Marie and Laila Abd-Elhamid. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Big Data
- Healthcare
- Association Rule Mining
- Random Forest
- Infectious Diseases
- PCY Algorithm
