Fast Real Time Analysis of Web Server Massive Log Files using an Improved Web Mining Architecture
- 1 Govt. Arts College, India
- 2 Sona College of Technology, India
Abstract
The web plays a vital role in locating information and in understanding how a system should be organized. As the number of web sites has grown, web server log files have grown with it, driven by web searching. The challenge is to reduce these log files and classify the results that best serve the analysis task. To overcome the problem of overabundant data in web mining, this study proposes path extraction using a Euclidean distance based algorithm combined with a sequential pattern clustering mining algorithm. First, we construct the Relational Information System from the original data sets. Second, we cluster the data with the Sequential Pattern Clustering Method to produce the Core of the Information System. The web mining core data is the most important and necessary information, which cannot be reduced further from the original Information System. It therefore has the same effect as the original data sets in data analysis and can be used to construct classification models. Third, we apply the Sequential Pattern Clustering Method with the help of Path Extraction. The experiment shows that the proposed algorithm achieves high efficiency and avoids overabundant data in follow-up data processing.
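The general idea of extracting travel paths and grouping them by Euclidean distance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record format, the function names (`extract_paths`, `path_vector`, `cluster_paths`), the visit-count representation of a path, and the greedy threshold clustering are all assumptions made for illustration, and the clustering step is a simple stand-in rather than the Sequential Pattern Clustering Method itself.

```python
import math
from collections import defaultdict


def extract_paths(records):
    """Group time-ordered (visitor, page) records into one travel path per visitor."""
    paths = defaultdict(list)
    for visitor, page in records:
        # Skip immediate repeats (e.g. reloads) as a simple cleanup step.
        if not paths[visitor] or paths[visitor][-1] != page:
            paths[visitor].append(page)
    return dict(paths)


def path_vector(path, pages):
    """Represent a path as visit counts over a fixed page ordering (assumed encoding)."""
    return [path.count(p) for p in pages]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_paths(paths, threshold):
    """Greedy single-pass clustering: a path joins the first cluster whose
    seed vector lies within `threshold`, otherwise it starts a new cluster."""
    pages = sorted({p for path in paths.values() for p in path})
    clusters = []  # list of (seed_vector, [visitor, ...])
    for visitor, path in paths.items():
        vec = path_vector(path, pages)
        for seed, members in clusters:
            if euclidean(seed, vec) <= threshold:
                members.append(visitor)
                break
        else:
            clusters.append((vec, [visitor]))
    return [members for _, members in clusters]


# Hypothetical, already-parsed log records in time order.
records = [
    ("10.0.0.1", "/home"), ("10.0.0.1", "/home"), ("10.0.0.1", "/products"),
    ("10.0.0.2", "/home"), ("10.0.0.2", "/products"),
    ("10.0.0.3", "/blog"), ("10.0.0.3", "/about"),
]
paths = extract_paths(records)
clusters = cluster_paths(paths, threshold=1.0)
print(clusters)
```

Visit-count vectors ignore the order in which pages were visited, so a faithful sequential pattern approach would need an order-aware representation; the sketch only shows how a distance threshold can collapse similar paths into a smaller set of groups.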
DOI: https://doi.org/10.3844/jcssp.2013.771.779
Copyright: © 2013 C. Kavitha and Ramesh Rajamanickam. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Path Completion
- Cleanup the Data
- Data Preprocessor
- Travel Path Extraction
- Sequential Pattern Clustering Method