Digital Library

Title:      WEB WRAPPER SPECIFICATION USING COMPOUND FILTER LEARNING
Author(s):      Julien Carme, Michal Ceresna, Max Goebel
ISBN:      972-8924-19-4
Editors:      Pedro Isaías, Miguel Baptista Nunes and Inmaculada J. Martínez
Year:      2006
Edition:      V I, 2
Keywords:      wrapper induction, interactive learning, information extraction
Type:      Full Paper
First Page:      187
Last Page:      194
Language:      English
Paper Abstract:      Information available on the Internet is made to be read by humans, not to be processed by machines. To automatically access this information, there is a need for intelligent services that convert HTML documents into more suitable formats like XML. This can be achieved through generation of Web wrappers, programs designed to process pages of a given Web site. To generate such Web wrappers, an efficient approach is to learn them from examples provided by the user. We present such a system, which is based on the generation, selection and combination of elementary extraction operators that we call filters. What makes this approach innovative is that generated wrappers can be easily read, interpreted and modified by the user.
   
