Accelerating DynEarthSol3D on tightly coupled CPU-GPU heterogeneous processors


DynEarthSol3D (Dynamic Earth Solver in Three Dimensions) is a flexible, open-source finite element solver that models the momentum balance and heat transfer of elasto-visco-plastic materials in Lagrangian form using unstructured meshes. It provides a platform for studying the long-term deformation of the Earth's lithosphere and various problems in civil and geotechnical engineering. However, continuously computing and updating a very large mesh imposes an intolerably high computational burden on developers and users in practice. For example, simulating a small input mesh of around 3000 elements for 20 million time steps takes more than 10 days on a high-end desktop CPU. In this paper, we explore tightly coupled CPU-GPU heterogeneous processors to address this computational cost by leveraging their new features and developing hardware-architecture-aware optimizations. Our key optimization techniques are threefold: memory access pattern improvement, data transfer elimination, and kernel launch overhead minimization. Experimental results show that our implementation on a tightly coupled heterogeneous processor outperforms all other alternatives, namely a traditional discrete GPU, a quad-core CPU using OpenMP, and a serial implementation, by 67%, 50%, and 154%, respectively, even though the embedded GPU in the heterogeneous processor has significantly fewer cores than a high-end discrete GPU.

Publication Title

Computers and Geosciences