Digital Library

Title:      INTERACTIVE WRAPPER LEARNING FOR WEB DOCUMENTS USING TREE ALIGNMENT
Author(s):      Max Goebel, Michal Ceresna
ISBN:      978-972-8924-30-0
Editors:      Nuno Guimarães and Pedro Isaías
Year:      2007
Edition:      Single
Keywords:      Information Extraction, Data mining, Tree alignment, Classification.
Type:      Full Paper
First Page:      363
Last Page:      370
Language:      English
Paper Abstract:      This paper proposes an interactive wrapper learning approach to Web information extraction for semi-automatic wrapper generation. In particular, we present an algorithm that uses tree alignment techniques to learn patterns from the structure of training instances. It does so by generating structural template models for both positive and negative examples. We evaluate our system on standard benchmarks, and the results show great potential for structure learning across a variety of extraction tasks.
   
