Context: High-fidelity simulations of turbulent compressible flows in aerodynamics usually involve the numerical analysis of three-dimensional flows around complex geometries. As the physical phenomena being modelled grow in complexity and industry demands more accurate solutions, the resulting linear systems are often large, sparse and very ill-conditioned. Iterative inner-outer Krylov solvers have proven to be robust and efficient strategies for such systems. The original domain is divided into partitions for MPI parallelization. The flexible preconditioning operator of the solver is defined by an inner GMRES solver that uses the Block-ILU(0) algorithm as a first-level preconditioner within each partition and the Restricted Additive Schwarz (RAS) method as a second-level preconditioner for the coupling between partitions. Although effective, these strategies still account for a large part of the CPU time of Computational Fluid Dynamics (CFD) simulations.
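The inner-outer structure described above can be illustrated with a minimal serial sketch. It is not the Aghora implementation, and several parts are assumptions made for illustration: the matrix is a 2D Laplacian rather than a CFD Jacobian, partitions are contiguous index ranges rather than mesh partitions, there is no MPI, and SciPy's threshold-based `spilu` stands in for Block-ILU(0). The function names (`model_matrix`, `make_ras_preconditioner`, `fgmres`, `inner_outer_solve`) are hypothetical.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def model_matrix(m):
    """2D Laplacian on an m x m grid (assumed stand-in for a CFD Jacobian)."""
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
    I = sp.identity(m)
    return (sp.kron(I, T) + sp.kron(T, I)).tocsr()


def make_ras_preconditioner(A, n_parts, overlap=2):
    """Return r -> z: approximate solves on overlapping index ranges, with
    each result written back only to the non-overlapping owned range (RAS)."""
    n = A.shape[0]
    bounds = np.linspace(0, n, n_parts + 1).astype(int)
    subdomains = []
    for p in range(n_parts):
        lo, hi = bounds[p], bounds[p + 1]
        ext_lo, ext_hi = max(lo - overlap, 0), min(hi + overlap, n)
        local = A[ext_lo:ext_hi, ext_lo:ext_hi].tocsc()
        # Assumption: spilu (threshold ILU) replaces Block-ILU(0) here.
        subdomains.append((lo, hi, ext_lo, ext_hi, spla.spilu(local)))

    def apply(r):
        z = np.zeros_like(r)
        for lo, hi, ext_lo, ext_hi, ilu in subdomains:
            local_z = ilu.solve(r[ext_lo:ext_hi])
            z[lo:hi] = local_z[lo - ext_lo:hi - ext_lo]  # restriction step
        return z

    return apply


def fgmres(A, b, precond, tol=1e-8, max_iter=50):
    """Flexible GMRES without restart; precond may change at each iteration."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    if beta == 0.0:
        return np.zeros(n), 0
    V = np.zeros((n, max_iter + 1))  # orthonormal basis of residual space
    Z = np.zeros((n, max_iter))      # preconditioned directions
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = b / beta
    for j in range(max_iter):
        Z[:, j] = precond(V[:, j])
        w = A @ Z[:, j]
        for i in range(j + 1):       # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        # Small least-squares problem: min || beta * e1 - H y ||
        e1 = np.zeros(j + 2)
        e1[0] = beta
        Hj = H[:j + 2, :j + 1]
        y = np.linalg.lstsq(Hj, e1, rcond=None)[0]
        if np.linalg.norm(Hj @ y - e1) <= tol * beta or H[j + 1, j] == 0.0:
            break
        V[:, j + 1] = w / H[j + 1, j]
    return Z[:, :j + 1] @ y, j + 1


def inner_outer_solve(A, b, n_parts=4):
    """Outer FGMRES whose preconditioner is a short inner GMRES with RAS."""
    ras = make_ras_preconditioner(A, n_parts)

    def inner(r):
        # A few loosely converged inner iterations per outer iteration.
        return fgmres(A, r, ras, tol=1e-2, max_iter=5)[0]

    return fgmres(A, b, inner, tol=1e-8, max_iter=50)
```

A call such as `inner_outer_solve(model_matrix(20), np.ones(400))` returns the approximate solution and the number of outer iterations. Because the inner solve stops at a loose tolerance, its action differs from one outer iteration to the next, which is why the outer method has to be flexible and store the preconditioned directions `Z` explicitly.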
Objectives: Matrices arising from elliptic Partial Differential Equations (PDEs) have been shown to possess a low-rank property that is now efficiently exploited within the direct multifrontal solver MUMPS [1] to substantially reduce its complexity in terms of floating-point operations and memory requirements. Among the possible low-rank formats, MUMPS uses the Block Low-Rank (BLR) format. Changing the preconditioning operator to exploit a BLR multifrontal factorization per partition has brought promising gains on first CFD applications, where the matrices now arise from hyperbolic PDEs. Global convergence of the direct-iterative strategy will be evaluated for a fixed memory budget imposed by the CFD simulation. Several axes will be investigated: partition size, impact of BLR data compression, and cost and accuracy of the approximate direct solver. Numerical experiments will be conducted with the high-order CFD code Aghora on representative test cases. This internship will be carried out in collaboration with the LIP6 laboratory of Sorbonne University.
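To make the idea of BLR data compression concrete, the sketch below compresses the off-diagonal blocks of a dense matrix with a truncated SVD at a relative threshold `eps`, keeping diagonal blocks dense. It is only an illustration under stated assumptions: MUMPS applies BLR to the frontal matrices inside the multifrontal factorization, whereas this example compresses a standalone kernel matrix chosen because its off-diagonal blocks are numerically low-rank. The function names and the test matrix are hypothetical.

```python
import numpy as np


def kernel_matrix(n):
    """Assumed test matrix: smooth kernel sampled on n points in [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    return 1.0 / (1.0 + np.abs(x[:, None] - x[None, :]))


def compress_block(block, eps):
    """Truncated SVD keeping singular values above eps times the largest."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    rank = int(np.sum(s > eps * s[0])) if s[0] > 0 else 0
    return U[:, :rank] * s[:rank], Vt[:rank, :]


def blr_compress(A, block_size, eps):
    """Store diagonal blocks densely and off-diagonal blocks as (X, Y)
    with block ~ X @ Y whenever that representation uses less storage."""
    n = A.shape[0]
    blocks = {}
    for i in range(0, n, block_size):
        for j in range(0, n, block_size):
            blk = A[i:i + block_size, j:j + block_size]
            if i == j:
                blocks[(i, j)] = blk.copy()
                continue
            X, Y = compress_block(blk, eps)
            if X.size + Y.size < blk.size:
                blocks[(i, j)] = (X, Y)
            else:
                blocks[(i, j)] = blk.copy()
    return blocks


def blr_matvec(blocks, x, block_size):
    """Matrix-vector product using the compressed representation."""
    y = np.zeros_like(x)
    for (i, j), blk in blocks.items():
        xj = x[j:j + block_size]
        if isinstance(blk, tuple):
            X, Y = blk
            y[i:i + block_size] += X @ (Y @ xj)
        else:
            y[i:i + block_size] += blk @ xj
    return y


def blr_storage(blocks):
    """Number of stored floating-point entries."""
    return sum(
        b[0].size + b[1].size if isinstance(b, tuple) else b.size
        for b in blocks.values()
    )
```

Comparing `blr_storage(blr_compress(A, 64, eps))` with `A.size` for several values of `eps` gives a rough picture of the trade-off between memory and accuracy that the study would quantify on actual CFD matrices.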
[1] P. Amestoy, A. Buttari, J.-Y. L’Excellent, T. Mary, Performance and Scalability of the BLR Multifrontal Factorization on Multicore Architectures, ACM Trans. Math. Softw., 2019.
Keywords: Flexible inner-outer GMRES, data compression, BLR, mixed precision, preconditioner, performance.