Title:
|
A TWO-STAGE APPROACH FOR RELEVANT GENE SELECTION FOR CANCER CLASSIFICATION |
Author(s):
|
Rajni Bala, R. K. Agrawal |
ISBN:
|
978-972-8924-88-1 |
Editors:
|
Ajith P. Abraham |
Year:
|
2009 |
Edition:
|
Single |
Keywords:
|
Gene Selection, Microarray Datasets, Filter Methods, Data Mining, Probabilistic Distance Measures. |
Type:
|
Short Paper |
First Page:
|
127 |
Last Page:
|
132 |
Language:
|
English |
Paper Abstract:
|
Gene expression data can be used to identify whether a person is suffering from cancer. Such data
usually come with only dozens of tissue samples but thousands of genes. This extreme sparseness is believed to
degrade the performance of a classifier significantly. Hence, extracting a subset of informative genes and removing
irrelevant or redundant genes is crucial for accurate classification. In this paper, a novel two-stage ensemble approach is
proposed to determine a subset of relevant genes for reliable cancer classification. Since different gene ranking
methods may give diverse subsets of informative genes, the first stage takes the union of the informative genes selected by
different gene ranking methods. This reduces the chance of missing informative genes. The resulting set may
contain redundant features, as ranking methods do not take into account the relationships between genes. In
the second stage, forward feature selection is used with a measure that selects relevant and non-redundant genes. The
proposed method is experimentally assessed on four well-known datasets, namely Leukemia, SRBCT, Lung Cancer, and
Colon Cancer. The experimental results are significantly better than those of other methods. |
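
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the ranking methods or the second-stage measure, so the two rankers below (a Welch-style t statistic and a signal-to-noise ratio) and the "relevance minus mean absolute correlation" criterion are assumed stand-ins. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def t_like_score(X, y):
    """Assumed ranker 1: absolute Welch-style t statistic per gene (labels 0/1)."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def snr_score(X, y):
    """Assumed ranker 2: signal-to-noise ratio |mu0 - mu1| / (sd0 + sd1)."""
    a, b = X[y == 0], X[y == 1]
    spread = a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1)
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (spread + 1e-12)

def stage_one_union(X, y, rankers, top_k):
    """Stage 1: union of the top_k genes chosen by each ranking method."""
    pool = set()
    for rank in rankers:
        scores = rank(X, y)
        pool.update(np.argsort(scores)[::-1][:top_k].tolist())
    return sorted(pool)

def stage_two_forward(X, y, pool, n_select):
    """Stage 2: greedy forward selection over the pooled genes.

    Stand-in criterion (an assumption, not the paper's measure):
    |corr(gene, label)| minus mean |corr(gene, already selected genes)|.
    """
    relevance = {g: abs(np.corrcoef(X[:, g], y)[0, 1]) for g in pool}
    selected, remaining = [], list(pool)

    def merit(g):
        if not selected:
            return relevance[g]
        redundancy = np.mean(
            [abs(np.corrcoef(X[:, g], X[:, s])[0, 1]) for s in selected]
        )
        return relevance[g] - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=merit)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data: 40 samples, 200 genes (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 0] += 3.0                                # gene 0: informative
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=40)     # gene 1: near-copy of gene 0
X[y == 1, 2] += 3.0                                # gene 2: informative

pool = stage_one_union(X, y, [t_like_score, snr_score], top_k=5)
selected = stage_two_forward(X, y, pool, n_select=2)
print("stage 1 pool:", pool)
print("stage 2 selection:", selected)
```

In this toy setup, genes 0 and 1 carry almost the same signal, so stage 1 pools both while the redundancy penalty in stage 2 discourages keeping both. The choice of `top_k` and `n_select` is arbitrary here.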