Digital Library

Title:      FEATURE REDUCTION FOR DOCUMENT CLUSTERING WITH NZIPF METHOD
Author(s):      José Luis Castillo Sequera, José R. Fernández Del Castillo, León González Sotos
ISBN:      978-972-8924-78-2
Editors:      Piet Kommers and Pedro Isaías
Year:      2009
Edition:      2
Keywords:      Clustering, Information Management, Information Search and Retrieval, Data Mining.
Type:      Short Paper
First Page:      205
Last Page:      209
Language:      English
Paper Abstract:      In this paper, we discuss a feature reduction technique and its application to document clustering, showing that feature reduction improves efficiency as well as accuracy. We select terms starting from the Goffman point, using Zipf's law to choose a suitable transition zone (our method is called NZIPF). The experiments are carried out on the Reuters-21578 collection, and the results are compared with other methods to validate the method's efficiency. Finally, we demonstrate experimentally that, for a supervised clustering algorithm, the transition zone that gives the best results consists of the 40 terms starting from the Goffman point.
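The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the commonly cited form of the Goffman transition point, T = (sqrt(1 + 8 * I1) - 1) / 2, where I1 is the number of terms that occur exactly once, and it assumes that "starting from the Goffman point" means taking consecutive terms, ranked by descending frequency, beginning with the first term whose frequency is at or below T. The function names, the tie-breaking rule, and the `width` parameter are hypothetical.

```python
import math


def goffman_point(term_freqs):
    """Transition frequency T = (sqrt(1 + 8*I1) - 1) / 2.

    I1 is the number of terms with frequency exactly 1
    (assumed formula; the abstract does not state it).
    """
    i1 = sum(1 for f in term_freqs.values() if f == 1)
    return (math.sqrt(1 + 8 * i1) - 1) / 2


def select_transition_terms(term_freqs, width=40):
    """Return up to `width` terms starting at the transition point.

    Assumption: terms are ranked by descending frequency (ties broken
    alphabetically), and the zone starts at the first term whose
    frequency is at or below T.
    """
    t = goffman_point(term_freqs)
    ranked = sorted(term_freqs, key=lambda w: (-term_freqs[w], w))
    start = next(
        (i for i, w in enumerate(ranked) if term_freqs[w] <= t),
        len(ranked),  # no term at or below T: empty selection
    )
    return ranked[start:start + width]


# Toy vocabulary: three terms occur once, so T = (sqrt(25) - 1) / 2 = 2.
toy_freqs = {"a": 5, "b": 3, "c": 2, "d": 1, "e": 1, "f": 1}
toy_zone = select_transition_terms(toy_freqs, width=2)
```

With the toy counts above, the zone starts at "c", the first term whose frequency does not exceed T. On a real collection, `term_freqs` would be the corpus-wide term frequencies after tokenization, and `width=40` would correspond to the zone size reported in the abstract.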
   
