3. A few tips for use
If you wish [2] to change the linear solver or to adapt its settings, several questions must be addressed: What type of problem do you want to solve? What are the numerical properties of the linear systems encountered? And so on.
We list below, in a non-exhaustive way, several questions that are worth asking yourself when trying to optimize the linear solver aspects of a calculation. Of course, some questions (and answers) are cumulative and can therefore be applied simultaneously.
In short:
The default method remains the internal multifrontal solver MULT_FRONT. But to fully benefit from the CPU and RAM gains provided by parallelism, or to solve a numerically difficult problem (X-FEM, incompressibility, THM…), we recommend the external product MUMPS. The bigger the problem, the more we recommend using MPI parallelism and acceleration/compression techniques (ACCELERATION/LOW_RANK_SEUIL).
If, despite everything, the problem does not fit in memory on a given computing platform, the iterative solvers of PETSC can be a solution (if their functional scope allows it). Note that by default they quietly use MUMPS as a preconditioner (option PRE_COND='LDLT_SP').
To go further in memory savings, you can also degrade the preconditioning (the calculation will probably take longer) by choosing one of the other PETSC preconditioners ('LDLT_INC'…).
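As an illustration, selecting a PETSC iterative solver with a cheaper preconditioner might look as follows in a command file. This is a sketch based on the keyword names cited above and in [U4.50.01]; `model`, `mater` and `load` are placeholder concepts, and exact keyword spellings and values should be checked against your code_aster version.

```python
# Hypothetical command-file extract: iterative solver to save memory.
resu = MECA_STATIQUE(
    MODELE=model,              # placeholder: model defined earlier
    CHAM_MATER=mater,          # placeholder: material field
    EXCIT=_F(CHARGE=load),     # placeholder: loading
    SOLVEUR=_F(
        METHODE='PETSC',
        PRE_COND='LDLT_SP',    # default: single-precision MUMPS factorization
                               # used as preconditioner
        # PRE_COND='LDLT_INC', # cheaper incomplete factorization: less RAM,
        #                      # but convergence will likely be slower
    ),
)
```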
In non-linear analyses, to save time, you can also play on several relaxation parameters (SYME in the non-symmetric case) or on "non-linear solver/linear solver" interactions (REAC_PRECOND, NEWTON_KRYLOV…).
For well-conditioned non-linear problems (thermal…), a "relaxed" MUMPS (MIXER_PRECISION/FILTRAGE_MATRICE) can bring very significant memory gains. Likewise, in linear as in non-linear analyses, PETSC without a preconditioner (PRE_COND='SANS').
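A "relaxed" MUMPS configuration for a well-conditioned non-linear problem might be sketched as follows. The keywords are those cited above; the threshold value and the surrounding operator arguments are purely illustrative assumptions.

```python
# Hypothetical sketch: "relaxed" MUMPS in a non-linear calculation.
resu = STAT_NON_LINE(
    MODELE=model,                 # placeholder
    CHAM_MATER=mater,             # placeholder
    EXCIT=_F(CHARGE=load),        # placeholder
    SOLVEUR=_F(
        METHODE='MUMPS',
        FILTRAGE_MATRICE=1.e-8,   # drop near-zero matrix terms (illustrative value)
        MIXER_PRECISION='OUI',    # factorize in single precision: roughly half the RAM
        # SYME='OUI',             # symmetrize an almost-symmetric matrix
    ),
)
```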
For more details and advice on the use of linear solvers, you can consult the user manual [U4.50.01] and the associated reference documentation [R6…]. The related questions of improving the performance (RAM/CPU) of a calculation and of using parallelism are also the subject of detailed instructions: [U1.03.03] and [U2.08.06].
What type of problem do you need to solve?
Solving many systems with the same matrix (multiple right-hand-side problem [3]) ⇒ solvers MULT_FRONT or MUMPS (if possible disabling RESI_RELA and with GESTION_MEMOIRE='IN_CORE').
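For such a multiple right-hand-side problem, the idea is to factorize once and reuse the factors for every solve. A hedged sketch, using the keywords cited above (the convention that a negative RESI_RELA disables the post-solve check is taken from [U4.50.01] and should be verified for your version):

```python
# Hypothetical sketch: MUMPS settings for many solves with one matrix.
SOLVEUR = _F(
    METHODE='MUMPS',
    GESTION_MEMOIRE='IN_CORE',  # keep the factors in RAM between solves
    RESI_RELA=-1.0,             # assumed convention: negative value disables
                                # the (costly) solution-quality check
)
```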
Standard linear parametric calculation ⇒ a first run with MUMPS keeping the default settings, then all the other runs with RESI_RELA deactivated or with POSTTRAITEMENTS='MINI'.
Non-linear parametric calculation with a well-conditioned matrix ⇒ a first run with MUMPS playing on the relaxation parameters (FILTRAGE_MATRICE/MIXER_PRECISION, or SYME in the non-symmetric case). Then, once an optimized operating point (CPU/RAM consumption) has been identified, use it for all the other runs.
We can also try MUMPS but, this time, as a simple preconditioner for PETSC/GCPC (LDLT_SP). The advantage is that it then only needs to be updated periodically (REAC_PRECOND).
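Such a periodically refreshed preconditioner might be requested as follows. This is a sketch assuming the keyword conventions of [U4.50.01]; the refresh period shown is illustrative, and the operator arguments are placeholders.

```python
# Hypothetical sketch: MUMPS used only as a preconditioner for PETSC,
# rebuilt periodically rather than at every Newton iteration.
resu = STAT_NON_LINE(
    MODELE=model, CHAM_MATER=mater, EXCIT=_F(CHARGE=load),  # placeholders
    SOLVEUR=_F(
        METHODE='PETSC',
        PRE_COND='LDLT_SP',  # single-precision MUMPS factorization as preconditioner
        REAC_PRECOND=30,     # refresh it only every 30 solves (illustrative value)
    ),
)
```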
What are its numerical properties?
Well-conditioned linear system [4] (conditioning < 10⁴) ⇒ solvers MUMPS + MIXER_PRECISION or GCPC/PETSC.
Difficult linear system (poor conditioning, mixed finite elements, preponderance of Lagrange multipliers…) ⇒ solver MUMPS.
Do we need a very specific solution?
We can settle for a **very approximate solution** [5] ⇒ solvers GCPC or PETSC with RESI_RELA = 10⁻³, or MUMPS with LOW_RANK_SEUIL < 10⁻⁹ and POSTTRAITEMENTS='SANS'.
We want a **precise solution** or, at least, a diagnosis of its quality and of the numerical difficulties of the system to be solved ⇒ solver MUMPS activating NPREC and RESI_RELA (with INFO=2).
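To obtain such a diagnosis, the solver block might be set up as follows. A sketch using the keywords cited above; the NPREC value shown is the usual default per [U4.50.01] and should be checked for your version.

```python
# Hypothetical sketch: ask MUMPS for a quality diagnosis of the solve.
SOLVEUR = _F(
    METHODE='MUMPS',
    NPREC=8,          # flag a quasi-singular matrix if too many digits
                      # are lost during pivoting (assumed default value)
    RESI_RELA=1.e-6,  # check the relative residual of the computed solution
)
# Combined with INFO=2 on the calling operator, the message file then
# reports quality indicators for each factorization.
```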
How to optimize the time and RAM consumption of the linear solver?
Time ⇒ solver MUMPS, deactivating OOC (GESTION_MEMOIRE='IN_CORE') or even RESI_RELA. Hybrid MPI/OpenMP parallel computing (cf. doc. U2 on parallelism). On large problems, use low-rank compressions and accelerations via the ACCELERATION/LOW_RANK_SEUIL keywords.
Memory ⇒ solver MUMPS, activating GESTION_MEMOIRE='OUT_OF_CORE'/MATR_DISTRIBUE and in distributed parallel mode. Or a Krylov-type iterative solver: GCPC/PETSC + LDLT_SP.
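A memory-oriented MUMPS configuration for a distributed MPI run might be sketched as follows, using the keywords cited above (spellings as quoted in this section; verify them against your code_aster version):

```python
# Hypothetical sketch: memory-saving MUMPS settings for a distributed MPI run.
SOLVEUR = _F(
    METHODE='MUMPS',
    GESTION_MEMOIRE='OUT_OF_CORE',  # spill the factors to disk
    MATR_DISTRIBUE='OUI',           # each MPI process keeps only its own
                                    # blocks of the assembled matrix
)
```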
In non-linear analyses, if the matrix is well conditioned and/or non-symmetric, it is also possible to play on the relaxation parameters of MUMPS (FILTRAGE_MATRICE, MIXER_PRECISION and SYME) or to use an iterative solver while reducing the cost of the preconditioning. More details can be found in the discussion in § 7.2.6.
Memory ⇒ solver PETSC, activating MATR_DISTRIBUE and in distributed parallel mode.
Is it a very large problem (> 5·10⁶ degrees of freedom)?
Robust solver ⇒ solver MUMPS with previous memory optimizations and low-rank compressions (ACCELERATION/LOW_RANK_SEUIL).
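Combining the memory optimizations with low-rank compression might look as follows. A sketch only: the ACCELERATION value and the compression threshold are illustrative assumptions to be checked against [U4.50.01] for your version.

```python
# Hypothetical sketch: MUMPS with low-rank (BLR) compression for a very
# large problem (> 5e6 degrees of freedom), plus memory optimizations.
SOLVEUR = _F(
    METHODE='MUMPS',
    ACCELERATION='LR',              # assumed value selecting the low-rank variant
    LOW_RANK_SEUIL=1.e-9,           # compression threshold (illustrative value)
    GESTION_MEMOIRE='OUT_OF_CORE',  # spill the factors to disk
    MATR_DISTRIBUE='OUI',           # distribute the matrix across MPI processes
)
```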
Last-resort solvers ⇒ iterative solvers (with a low level of preconditioning).
How can I optimize the overall performance of my calculation?
User manual [U1.03.03].
How to perform, calibrate, and optimize a parallel calculation?
User manual [U2.08.06].