Digital Library

Title:      LOG DATA PREPARATION FOR MINING WEB USAGE PATTERNS
Author(s):      G. Castellano, A. M. Fanelli, M. A. Torsello
ISBN:      978-972-8924-30-0
Editors:      Nuno Guimarães and Pedro Isaías
Year:      2007
Edition:      Single
Keywords:      Data cleaning, data filtering, data preprocessing, user sessions identification, Web usage mining.
Type:      Full Paper
First Page:      371
Last Page:      378
Language:      English
Paper Abstract:      In this paper we focus on log data preprocessing, the first step of a common Web Usage Mining process. In particular, we present LODAP (LOg DAta Preprocessor), a software tool that we designed and implemented to preprocess log data. LODAP works in several steps. First, log files are cleaned by removing irrelevant data. Then, the remaining requests are structured into user sessions that encode the browsing behavior of users. Next, uninteresting sessions and the least visited pages are removed to reduce the size of the data in the extracted user sessions. In addition, LODAP can generate reports containing the results obtained in each step and summary information mined from the analyzed log files. During preprocessing, the analyst is guided by a sequence of panels that make up the tool's wizard-based interface. Each panel is a graphical window that offers a basic function of the preprocessor. Preliminary results on the log files of a specific Web site show that the tool can effectively reduce the size of the log data and identify user sessions that meaningfully encode user browsing behavior.
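
The three preprocessing steps named in the abstract (cleaning, session identification, filtering) can be illustrated with a minimal Python sketch. This is not LODAP's implementation, which the abstract does not detail. The Common Log Format input, the list of irrelevant file suffixes, the 30-minute inactivity timeout, the grouping of requests by host, and the two thresholds are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumption: logs are in Common Log Format; the abstract does not name a format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: requests for these resource types count as "irrelevant data".
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Turn one log line into a dict, or return None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return {
        "host": m["host"],
        "time": datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "method": m["method"],
        "url": m["url"],
        "status": int(m["status"]),
    }


def clean(requests):
    """Step 1: keep successful GET requests for pages, drop embedded resources."""
    return [
        r for r in requests
        if r["method"] == "GET"
        and 200 <= r["status"] < 300
        and not r["url"].split("?")[0].lower().endswith(IRRELEVANT_SUFFIXES)
    ]


def sessionize(requests, timeout=timedelta(minutes=30)):
    """Step 2: group requests by host, splitting on gaps longer than `timeout`."""
    by_host = defaultdict(list)
    for r in sorted(requests, key=lambda r: r["time"]):
        by_host[r["host"]].append(r)
    sessions = []
    for reqs in by_host.values():
        current = [reqs[0]]
        for prev, cur in zip(reqs, reqs[1:]):
            if cur["time"] - prev["time"] > timeout:
                sessions.append(current)
                current = []
            current.append(cur)
        sessions.append(current)
    return sessions


def filter_sessions(sessions, min_requests=2, min_page_visits=2):
    """Step 3: drop short sessions, then drop rarely visited pages."""
    kept = [s for s in sessions if len(s) >= min_requests]
    visits = Counter(r["url"] for s in kept for r in s)
    pruned = [[r for r in s if visits[r["url"]] >= min_page_visits] for s in kept]
    return [s for s in pruned if s]


def preprocess(lines, min_requests=2, min_page_visits=2):
    """Run the three steps in order on an iterable of raw log lines."""
    parsed = [r for r in map(parse_line, lines) if r is not None]
    return filter_sessions(sessionize(clean(parsed)), min_requests, min_page_visits)
```

Calling `preprocess(open("access.log"))` on a hypothetical log file would return a list of sessions, each a list of request dicts. Identifying users by host alone is a simplification, since proxies and shared addresses can merge distinct users into one session.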
   
