Title:
|
ACCURATELY RANKING OUTLIERS IN DATA WITH MIXTURE OF VARIANCES AND NOISE |
Author(s):
|
Minh Quoc Nguyen, Edward Omiecinski, Leo Mark |
ISBN:
|
978-972-8924-88-1 |
Editors:
|
Ajith P. Abraham |
Year:
|
2009 |
Edition:
|
Single |
Keywords:
|
Data Mining, Outlier Detection |
Type:
|
Full Paper |
First Page:
|
83 |
Last Page:
|
94 |
Language:
|
English |
Paper Abstract:
|
In this paper, we introduce a bottom-up approach to discover outliers and clusters of outliers in data with a mixture of
variances and noise. First, we propose a method to split the outlier score into dimensional scores. We show that if a point
is an outlier in a subspace, the score must be high for that point in each dimension of the subspace. We then aggregate the
scores to compute the final outlier score for each point in the dataset. We introduce a filter threshold that eliminates small
scores during aggregation. Our experiments show that filtering improves the outlier detection rate. We
also introduce a method to detect clusters of outliers using our outlier score function. In addition, our approach makes the
outliers easy to visualize. |
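The pipeline the abstract describes (per-dimension scores, a filter threshold, then aggregation) can be illustrated with a minimal sketch. The abstract does not give the paper's score function, so the per-dimension score below is a stand-in (absolute deviation from the median, scaled by the median absolute deviation), and the L2-norm aggregation, the function names, and the default threshold of 3.0 are all assumptions made for illustration only.

```python
import math
from statistics import median


def dimensional_scores(points):
    """Return one score per point per dimension.

    ASSUMPTION: the score is |x - median| / (1.4826 * MAD), a generic robust
    deviation used here as a stand-in for the paper's dimensional score.
    """
    scores_by_dim = []
    for col in zip(*points):  # iterate over dimensions (columns)
        med = median(col)
        mad = median(abs(v - med) for v in col)
        scale = 1.4826 * mad if mad > 0 else 1.0  # avoid division by zero
        scores_by_dim.append([abs(v - med) / scale for v in col])
    # Transpose back so each row holds the scores of one point.
    return [list(row) for row in zip(*scores_by_dim)]


def aggregate(scores, threshold):
    """Drop dimensional scores below the filter threshold, then combine.

    ASSUMPTION: the surviving scores are combined with an L2 norm.
    """
    return [math.sqrt(sum(s * s for s in row if s >= threshold)) for row in scores]


def rank_outliers(points, threshold=3.0):
    """Return point indices sorted by decreasing final score, plus the scores."""
    totals = aggregate(dimensional_scores(points), threshold)
    order = sorted(range(len(points)), key=lambda i: totals[i], reverse=True)
    return order, totals


if __name__ == "__main__":
    data = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (100, 2)]
    order, totals = rank_outliers(data)
    print(order[0], totals)
```

In this toy dataset the last point deviates only in the first dimension. The filter discards the small per-dimension scores of the other points, so their final scores are not inflated by accumulated noise across dimensions, which is the effect the abstract attributes to filtering.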