Digital Library

Title:      SUBJECT CLASSIFICATION OF WEB PAGES
Author(s):      Ludger Martin
ISBN:      978-989-8533-09-8
Editors:      Bebo White and Pedro Isaías
Year:      2012
Edition:      Single
Keywords:      Subject Classification, Web Content Mining
Type:      Full Paper
First Page:      298
Last Page:      306
Language:      English
Paper Abstract:      Subject classification is the task of automatically determining what a text is about. For example, a text can refer to biology or to economic science. This paper discusses how the subject of a web page can be determined. This is done in several steps. First, the main content of the page is extracted. The extracted content is then analyzed using frequency classes and Wikipedia categories to determine the class of the subject. A case study shows the suitability of the procedure, which depends on certain parameters. The choice of these parameters is also motivated.
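The abstract gives no implementation details, so the following is only a minimal sketch of how frequency classes and a category lookup might be combined. It makes several assumptions not stated in the record. The frequency class uses the common corpus-linguistics definition (class 0 for the most frequent word, each higher class roughly half as frequent). `reference_counts`, `term_categories`, and the `min_class` threshold are hypothetical stand-ins for a reference corpus, a Wikipedia category lookup, and one of the procedure's parameters. The main-content extraction step is assumed to have already produced `tokens`.

```python
import math
from collections import Counter


def frequency_class(count, top_count):
    """Frequency class of a word relative to the most frequent word.

    Class 0 is the most frequent word; each step up is roughly half as frequent.
    This is the common corpus-linguistics definition, assumed here for illustration.
    """
    return math.floor(0.5 + math.log2(top_count / count))


def classify(tokens, reference_counts, term_categories, min_class=3):
    """Pick the category with the most votes from sufficiently rare terms.

    min_class is a hypothetical threshold: only words that are rare in the
    reference corpus (high frequency class) are treated as subject-bearing.
    """
    top = max(reference_counts.values())
    votes = Counter()
    for word in tokens:
        count = reference_counts.get(word)
        if count is None:
            continue  # words missing from the reference corpus are skipped in this sketch
        if frequency_class(count, top) >= min_class:
            votes.update(term_categories.get(word, []))
    return votes.most_common(1)[0][0] if votes else None


# Toy data, invented for illustration only (not taken from the paper).
reference_counts = {"the": 1000, "of": 600, "cell": 40, "enzyme": 10, "market": 30}
term_categories = {"cell": ["Biology"], "enzyme": ["Biology"], "market": ["Economics"]}
tokens = "the enzyme of the cell the market of the cell".split()

result = classify(tokens, reference_counts, term_categories)
print(result)  # Biology
```

In this toy run the function words fall into classes 0 and 1 and are ignored, while "cell", "enzyme", and "market" pass the threshold and vote for their categories. A real system would need a large reference corpus and an actual mapping from terms to Wikipedia categories.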
   
