Title: APPROACHES FOR EFFICIENT HANDLING OF LARGE DATASETS
Author(s): Renáta Iváncsy, Sándor Juhász
ISBN: 978-972-8924-88-1
Editors: Ajith P. Abraham
Year: 2009
Edition: Single
Keywords: Out-of-core data processing, efficient data handling, cache, parallelism
Type: Short Paper
First Page: 143
Last Page: 147
Language: English
Paper Abstract:
Efficient handling of large datasets is challenging because in most cases the data to be processed does not fit into
memory, so the large number of slow I/O operations dominates performance. Several methods exist for making data
handling more efficient: compressing, partitioning, or transforming the input data, using more compact storage
structures, or increasing cache friendliness. Our paper investigates and categorizes these approaches and gives an
overview of their benefits and drawbacks when they are used in different stages of the data processing pipeline of a
general information system. The performance demands of long-running data processing are often coupled with operability
requirements (such as fault tolerance and monitoring).