Title: ADAPTING HASH TABLE DESIGN TO REAL-LIFE DATASETS
Author(s): Sándor Juhász, Ákos Dudás
ISBN: 978-972-8924-86-7
Editors: Hans Weghorn, Jörg Roth and Pedro Isaías
Year: 2009
Edition: Single
Keywords: Data transformation, hash tables, open hashing, bucket hashing, pre-processing.
Type: Full Paper
First Page: 3
Last Page: 10
Language: English
Paper Abstract:
The process of data mining often includes preprocessing steps such as filtering and compression. Compressing unique
identifiers is usually based on look-up tables, which can be implemented efficiently with hash tables. Such hash tables need
to be in-memory and compact, and must provide fast retrieval of the data. Finding optimal hash functions for real-life
datasets is hard, because a uniform distribution of hash values cannot be guaranteed when the input data has unknown
characteristics. This paper seeks a hash table implementation that can compensate for these fluctuations and thus allows
the data to be handled even with simple, commonly available hash functions. We analyze hash table structures designed to
be fast and robust, and test their performance on real-life data transformation.
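
The look-up idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names IdCompressor, encode, longest_bucket and num_buckets are hypothetical, and the table is a plain chained structure in which each bucket holds a short list of entries. It assumes Python's built-in hash() as the "simple, commonly available" hash function.

```python
class IdCompressor:
    """Maps arbitrary unique identifiers to dense integer codes 0, 1, 2, ...

    Illustrative only; the class and method names are assumptions.
    """

    def __init__(self, num_buckets=1024):
        # Each bucket is a small list of (identifier, code) pairs.
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0

    def _bucket(self, identifier):
        # Deliberately simple hash: built-in hash() modulo the table size.
        return self.buckets[hash(identifier) % len(self.buckets)]

    def encode(self, identifier):
        """Return the code of a known identifier, or assign the next free one."""
        bucket = self._bucket(identifier)
        for key, code in bucket:
            if key == identifier:
                return code
        code = self.size
        bucket.append((identifier, code))
        self.size += 1
        return code

    def longest_bucket(self):
        # A rough indicator of how unevenly the hash spreads the input.
        return max(len(bucket) for bucket in self.buckets)


compressor = IdCompressor(num_buckets=8)
codes = [compressor.encode(x) for x in ["user-17", "user-42", "user-17", "user-99"]]
print(codes)  # [0, 1, 0, 2]
```

Codes depend only on the order in which identifiers first appear, so a repeated identifier always maps to the same small integer. When the hash distributes the input poorly, longest_bucket() grows and look-ups slow down, which is the kind of fluctuation the abstract describes.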