Title:
|
WEB CRAWLER PROFILING AND CONTAINMENT THROUGH NAVIGATION PATTERN MINING |
Author(s):
|
Anália Lourenço, Orlando Belo |
ISBN:
|
978-972-8924-93-5 |
Editors:
|
Pedro Isaías, Bebo White and Miguel Baptista Nunes |
Year:
|
2009 |
Edition:
|
1 |
Keywords:
|
Web Usage Mining, Crawling Profiling, Navigation Patterns, Data Webhousing, Clickstream Processing. |
Type:
|
Full Paper |
First Page:
|
583 |
Last Page:
|
590 |
Language:
|
English |
Paper Abstract:
|
Web profiles may support the analysis of Web site popularity as well as the detection of unwanted and illegitimate
activities such as fraud. Yet profiling techniques often fail to account for different kinds of usage, processing regular
sessions, crawler sessions and proxy sessions in the same way. This paper proposes an integrated approach to Web crawler
profiling and containment. A data webhousing system incorporating standard crawler detection techniques supplies the
profiles to be further analysed through navigation pattern mining. The main strengths of this approach are the ability to
adapt crawler identification to particular Web scenarios, the incremental analysis of navigation patterns, and the capacity
to monitor server performance and prevent crawler-related hazards. Experiments over six months of Web server logs from a
non-commercial Web site demonstrate the benefits of focused Web profiling and, in particular, of this approach. |