Research Article Open Access

A framework to Deal with Missing Data in Data Sets

Luai A. Shalabi, Mohannad Najjar and Ahmad A. Kayed

Abstract

Most information systems contain some missing values because data are unavailable. Missing values reduce the quality of the classification rules generated by a data mining system. They also affect the quantity of classification rules the system produces, and they can influence the coverage percentage and the number of reducts generated. As a result, extracting useful information from the data set becomes difficult. Solving the problem of missing data is therefore a high priority in the field of data mining and knowledge discovery. Replacing missing values with a specific value should not affect the quality of the data. Four different models for dealing with missing data were studied. A framework is established that removes inconsistencies before and after filling the attributes that have missing values with the new expected value generated by one of the four models. Comparative results are discussed and recommendations are given.
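
The pipeline outlined in the abstract (remove inconsistencies, fill the missing values, remove inconsistencies again) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not name the four models, so most-common-value filling stands in as a placeholder; an inconsistency is taken to mean records that agree on every condition attribute but disagree on the decision attribute; and the names `MISSING`, `remove_inconsistencies`, `fill_most_common`, `clean`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing value


def remove_inconsistencies(rows, conditions, decision):
    """Drop records that share all condition values but disagree on the decision."""
    seen = defaultdict(set)
    for row in rows:
        seen[tuple(row[a] for a in conditions)].add(row[decision])
    return [r for r in rows if len(seen[tuple(r[a] for a in conditions)]) == 1]


def fill_most_common(rows, conditions):
    """Placeholder fill model (an assumption): use each attribute's most common known value."""
    filled = [dict(r) for r in rows]  # copy so the input records are not modified
    for a in conditions:
        known = [r[a] for r in rows if r[a] is not MISSING]
        if not known:
            continue  # nothing to infer from; leave the gaps as they are
        mode = Counter(known).most_common(1)[0][0]
        for r in filled:
            if r[a] is MISSING:
                r[a] = mode
    return filled


def clean(rows, conditions, decision):
    """Remove inconsistencies, fill missing values, then remove inconsistencies again."""
    rows = remove_inconsistencies(rows, conditions, decision)
    rows = fill_most_common(rows, conditions)
    return remove_inconsistencies(rows, conditions, decision)


# Hypothetical toy data set; "play" is the decision attribute.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},   # conflicts with the record above
    {"outlook": "rain", "windy": MISSING, "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
result = clean(rows, ["outlook", "windy"], "play")
print(result)
```

In this toy run the two "sunny" records are dropped in the first pass. Filling the missing "windy" value then makes the two "rain" records conflict, so the second pass drops them as well, which is one reason a check after filling can matter.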

Journal of Computer Science
Volume 2 No. 9, 2006, 740-745

DOI: https://doi.org/10.3844/jcssp.2006.740.745

Submitted On: 31 May 2006 Published On: 30 September 2006

How to Cite: Shalabi, L. A., Najjar, M. & Kayed, A. A. (2006). A framework to Deal with Missing Data in Data Sets. Journal of Computer Science, 2(9), 740-745. https://doi.org/10.3844/jcssp.2006.740.745

Keywords

  • Data mining
  • missing data
  • rules
  • reducts
  • coverage