Title: REFINEMENT OF A GENETIC ALGORITHM FOR DOCUMENT CLUSTERING
Author(s): José Luis Castillo Sequera, José R. Fernández del Castillo Díez, León González Sotos
ISBN: 978-972-8939-30-4
Editors: Hans Weghorn, Pedro Isaías and Radu Vasiu
Year: 2010
Edition: Single
Keywords: Data Mining, Genetic Algorithm, Information Retrieval, Documentation, Optimization Methods
Type: Full Paper
First Page: 191
Last Page: 198
Language: English
Paper Abstract:
This paper presents the strategies used to refine the parameters of a genetic algorithm applied to the field of documentation. Assigning these parameters properly improves the solution and allows optimization problems in the evolutionary field to be addressed successfully. The paper begins with an introduction to the topic and describes the techniques and strategies implemented in an algorithm designed to cluster documents in an unsupervised manner. The algorithm seeks a balance between diversification, the ability to visit many different regions of the search space, and intensification, the ability to obtain high-quality solutions within those regions. The criterion used for document clustering is based on a fitness function that uses both the similarity and the distance between documents to measure their degree of affinity and closeness. We show the strategies used to refine the integration of the algorithm parameters, together with the results obtained by varying those parameters to improve performance. The aim is to obtain an acceptable document clustering at the end of the evolution, providing at least two possible groups among all documents and placing documents by affinity. The proposal is presented as an alternative to traditional methods in Information Retrieval.