Title:
|
FIRST SYCL IMPLEMENTATION OF THE THREE-DIMENSIONAL SUBSURFACE XCA-FLOW CELLULAR AUTOMATON AND PERFORMANCE COMPARISON AGAINST CUDA |
Author(s):
|
Donato D'Ambrosio, Giovanni Terremoto, Alessio De Rango, Luca Furnari, Alfonso Senatore and Giuseppe Mendicino |
ISBN:
|
978-989-8704-44-3 |
Editors:
|
Hans Weghorn and Pedro Isaias |
Year:
|
2022 |
Edition:
|
Single |
Keywords:
|
XCA-Flow 3D Subsurface Model, Extended Cellular Automata, SYCL vs CUDA, Performance Assessment |
Type:
|
Full Paper |
First Page:
|
47 |
Last Page:
|
54 |
Language:
|
English |
Paper Abstract:
|
We present the results of a first SYCL vs CUDA performance assessment of the three-dimensional XCA-Flow subsurface Extended Cellular Automata model. A grid domain of ?????? ?? ?????? ?? ???? cubic cells of 0.3 m side was considered, on which two different heterogeneous hydraulic conductivity fields were imposed, resulting in different computational loads. For each conductivity field, a 10-day test simulation, with a constant infiltration rate over the central part of the domain's upper interface and no-flow conditions at the other boundaries, was designed as a benchmark for performance assessment. The stencil-based kernels of the XCA-Flow model were implemented using a one-thread-one-cell mapping and global memory accesses. A global reduction, required by the algorithm at each computational step to find the minimum of a state variable over the domain, exploited the device's on-chip local memory (shared memory in CUDA nomenclature). The SYCL compiler adopted was Intel DPC++, which features a CUDA back-end. The experiments were performed on an Nvidia Titan Xp GPU with different configurations of SYCL/CUDA thread blocks. Each simulation was re-executed four times, and the minimum elapsed time was selected. As expected, the CUDA implementation performed slightly better. Nevertheless, SYCL provided satisfactory results, with a limited mean gap of approximately 8%. |
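The minimum reduction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' kernel: it is a sequential, host-side C++ emulation of a two-stage tree reduction, in which each work-group (thread block) reduces its chunk of cells in a scratch buffer standing in for local/shared memory, and the per-group minima are then combined. The function name `block_min_reduce`, the `double` cell type, and the power-of-two group size are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the paper): sequential emulation of a
// work-group tree reduction that finds the minimum over all cells.
// Assumption: group_size is a power of two, as is typical for GPU blocks.
double block_min_reduce(const std::vector<double>& cells, std::size_t group_size) {
    assert(group_size > 0 && (group_size & (group_size - 1)) == 0);

    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> partial;            // one partial minimum per group
    std::vector<double> local(group_size);  // stands in for on-chip local/shared memory

    for (std::size_t base = 0; base < cells.size(); base += group_size) {
        // Each emulated thread loads one cell from "global memory";
        // lanes past the end of the domain are padded with +infinity.
        for (std::size_t t = 0; t < group_size; ++t)
            local[t] = (base + t < cells.size()) ? cells[base + t] : inf;

        // Tree reduction: the active range halves at every step.
        // On a GPU, a work-group barrier would separate these steps.
        for (std::size_t stride = group_size / 2; stride > 0; stride /= 2)
            for (std::size_t t = 0; t < stride; ++t)
                local[t] = std::min(local[t], local[t + stride]);

        partial.push_back(local[0]);
    }

    // Second stage: combine the per-group minima into the global minimum.
    double result = inf;
    for (double p : partial) result = std::min(result, p);
    return result;
}
```

On a GPU, the inner loop over `t` would run in parallel across the threads of a block, which is why the scratch buffer benefits from fast on-chip memory. An empty input returns +infinity in this sketch.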