Title:
|
ANALYSIS OF CLUSTERS PERFORMANCE USING SEPARATION ACTIVES/INACTIVES AND MEAN INTERCLUSTER DISSIMILARITY |
Author(s):
|
M. Rosmayati, A. B. Zuriana, C. A. Arifah, C. M. Noor Azliza |
ISBN:
|
972-8924-16-X |
Editors:
|
Pedro Isaías, Maggie McPherson and Frank Bannister |
Year:
|
2006 |
Edition:
|
1 |
Keywords:
|
Ward's clustering, genetic algorithm, chemoinformatics. |
Type:
|
Full Paper |
First Page:
|
441 |
Last Page:
|
448 |
Language:
|
English |
Paper Abstract:
|
Lead identification in drug discovery is a long and complicated process. The increasing number of molecules in
chemical databases makes them harder to screen. One solution is to use compound selection methods, in which
only a small portion of the compounds is selected to represent the whole dataset. Cluster-based selection is a widely
used compound selection method in which the chemical compounds are allocated to clusters. In this paper,
Ward's clustering algorithm is chosen to cluster 2D fragment bit-strings. A genetic algorithm (GA) is applied to each
cluster produced by Ward's method in order to optimise cluster performance in terms of separating active molecules from
inactive ones, which is the main prerequisite of a compound selection method. A second optimisation criterion is mean
inter-cluster dissimilarity. The clusters obtained by optimising Ward's clusters with the GA are compared with the
clusters produced by Ward's method alone. The GA-optimised clusters give better results on inter-cluster dissimilarity,
whilst Ward's method alone is slightly better at separating active molecules from inactive ones. These results suggest
that applying a GA on top of Ward's clustering has the potential to produce a diverse dataset of compounds. |
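The two evaluation criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption made for illustration: fingerprints are represented as sets of on-bit indices, dissimilarity is taken as 1 minus the Tanimoto coefficient, mean inter-cluster dissimilarity is averaged over all molecule pairs that fall in different clusters, and separation is scored as the fraction of actives among the members of clusters that contain at least one active. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def tanimoto_dissimilarity(a, b):
    """Return 1 - Tanimoto similarity for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def mean_intercluster_dissimilarity(fingerprints, labels):
    """Average dissimilarity over molecule pairs in different clusters.

    Assumed definition; the paper may define this measure differently.
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(fingerprints)), 2):
        if labels[i] != labels[j]:
            total += tanimoto_dissimilarity(fingerprints[i], fingerprints[j])
            count += 1
    return total / count if count else 0.0


def actives_in_active_clusters(labels, is_active):
    """Fraction of actives among members of clusters holding >= 1 active.

    Assumed separation score; higher means actives are less diluted
    by inactives within the clusters they occupy.
    """
    active_clusters = {c for c, a in zip(labels, is_active) if a}
    members = [a for c, a in zip(labels, is_active) if c in active_clusters]
    return sum(members) / len(members) if members else 0.0


# Hypothetical toy data: four molecules, two of them active.
fingerprints = [{0, 1, 2}, {0, 1, 3}, {4, 5}, {4, 5, 6}]
is_active = [True, True, False, False]

clean_labels = [0, 0, 1, 1]  # actives and inactives in separate clusters
mixed_labels = [0, 1, 0, 1]  # each cluster mixes one active and one inactive

for name, labels in (("clean", clean_labels), ("mixed", mixed_labels)):
    print(name,
          mean_intercluster_dissimilarity(fingerprints, labels),
          actives_in_active_clusters(labels, is_active))
```

Under these assumed definitions, a GA of the kind the abstract describes could use either score, or a weighted combination of both, as its fitness function when rearranging the clusters produced by Ward's method.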