Title:
|
METHODOLOGY OF PREPROCESSING OF DOCUMENTS FOR SYSTEMS OF RECOVERY OF INFORMATION |
Author(s):
|
José Luis Castillo Sequera, León González Sotos |
ISBN:
|
978-972-8924-57-7 |
Editors:
|
Miguel Baptista Nunes, Pedro Isaías and Philip Powell |
Year:
|
2008 |
Edition:
|
Single |
Keywords:
|
Clustering, Information Retrieval, Optimization methods, Data mining. |
Type:
|
Poster/Demonstration |
First Page:
|
324 |
Last Page:
|
326 |
Language:
|
English |
Paper Abstract:
|
This article describes a working methodology for obtaining the best characteristic vectors for a group of
documents in an information retrieval system (IRS), so that the documents can later be clustered with a well-known method. We
propose a document preprocessing methodology based on a series of steps and methodological approaches that are
applied to a document collection (in our case the Reuters 21578 collection and predefined document collections)
and that allow the most representative values of each document to be optimized. In this work we detail the
methodology followed in the experiments and enumerate the variables used. |