Title:
|
OUT-OF-CORE DATA HANDLING WITH PERIODIC PARTIAL RESULT MERGING |
Author(s):
|
Sándor Juhász, Renáta Iváncsy |
ISBN:
|
978-972-8924-88-1 |
Editors:
|
Ajith P. Abraham |
Year:
|
2009 |
Edition:
|
Single |
Keywords:
|
Out-of-core data processing, partitioning, efficient data handling with checkpoints |
Type:
|
Full Paper |
First Page:
|
50 |
Last Page:
|
58 |
Language:
|
English |
Paper Abstract:
|
Efficient handling of large amounts of data is hindered when the data and the data structures used during
processing do not fit into main memory. A widely used solution to this problem is the partitioning approach,
where the data set to be processed is split into smaller parts, each of which can be processed on its own in main memory.
The results created from the smaller parts are combined in a subsequent step. In this paper we give a brief
overview of the different aspects of the partitioning approach and seek the most suitable variant for aggregating web
log data. Based on this overview we suggest and analyze a method that splits the original data set into blocks of equal
size and processes these blocks one after another. After each processing step, main memory contains the local result
for the currently processed block, which is then merged with the global result of the blocks processed so far. Through
complexity analysis and experimental results we show that this approach is both fault tolerant and efficient for
record-based data processing, provided that the results are significantly smaller than the original data and a linear-time
algorithm is available for merging the partial results. We also suggest a method for adjusting the block sizes dynamically
in order to achieve the best performance. |
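
The block-wise scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (read_blocks, process_block, aggregate), the record format (a key followed by a space and the rest of the line), and the choice of counting hits per key are all assumptions made for illustration. The global result is kept in memory here for brevity, whereas an out-of-core setting might store it on disk and write a checkpoint after each merge.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List


def read_blocks(records: Iterable, block_size: int) -> Iterator[List]:
    """Yield consecutive blocks holding at most block_size records each."""
    it = iter(records)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def process_block(block: List[str]) -> Counter:
    """Build the local result for one block.

    Assumption: each record starts with a key (e.g. a client address)
    followed by a space; the local result counts records per key.
    """
    local_result = Counter()
    for record in block:
        key = record.split(" ", 1)[0]
        local_result[key] += 1
    return local_result


def aggregate(records: Iterable[str], block_size: int) -> Counter:
    """Process equal-sized blocks one after another and merge each
    local result into the global result of the blocks seen so far."""
    global_result = Counter()
    for block in read_blocks(records, block_size):
        local_result = process_block(block)
        # Merge step: one pass over the local result.
        global_result.update(local_result)
        # A checkpoint of global_result could be written here, so that
        # a failure costs at most the work done on the current block.
    return global_result
```

In this sketch the final result does not depend on the block size, so the block size can be tuned for performance alone, for example `aggregate(open("access.log"), 100_000)` with a hypothetical log file name.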