Digital Library

Title:      A FRAMEWORK FOR DEVELOPING CONTEXT-BASED BLOG CRAWLERS
Author(s):      Rafael Ferreira, Rinaldo J. Lima, Ig Ibert Bittencourt, Dimas Melo Filho, Olavo Holanda, Evandro Costa, Fred Freitas, Lídia Melo
ISBN:      978-972-8939-25-0
Editors:      Bebo White, Pedro Isaías and Diana Andone
Year:      2010
Edition:      Single
Keywords:      Blog Crawler, Framework, Blog, Information Retrieval, Blogosphere
Type:      Full Paper
First Page:      120
Last Page:      126
Language:      English
Paper Abstract:      The development of the Web has produced an interactive environment in which a growing number of users share their knowledge and opinions, and blogs are a growing part of it. Given the rate at which content is created daily in the blogosphere, this information could serve many applications. However, extracting it manually is infeasible, which calls for new computational approaches. This need motivates blog crawlers: software programs capable of searching for and extracting information from blogs. This paper proposes a framework that helps users build blog crawlers. The framework also provides access to several tools, simplifying the overall development of new applications. We also present an algorithm for locating the textual content within blogs. Finally, we demonstrate the feasibility of the framework through an instantiation that achieved a precision of 73.46% and a recall of 71.92%.
   
