15. Performance of calculations: parallelism in space or time?
By default, when MPI parallelism is enabled [1] (cf. [U2.08.06]), the calculation options generally benefit from so-called « in space » parallelism: the calculations are distributed over the parallel processes in packs of mesh cells. An MPI process handles, for all time steps, the elementary calculations required by a given group of elements.
But, in addition to the fact that this parallelism is inefficient for options involving few or no elementary calculations (such as the options REAC/FORC_NODA and xxx_NOEU), it also gives rise to significant and repeated MPI communications [2].
For the options mentioned above, we therefore prefer so-called « in time » parallelism, which distributes, over the whole mesh, the calculations (elementary or not) required by a set of time steps. Communications are thus cheaper [3], and parallelism extends to portions of code beyond the elementary calculations.
With this parallelism, speed-ups can range from ×5 to ×50 depending on the configuration of the calculations. This parallelism is activated within a given CALC_CHAMP call [4], and it applies to all the calculation options likely to benefit from it: REAC/FORC_NODA, the xxx_NOEU options, and all their underlying calculation options [5].
For an expensive CALC_CHAMP, including REAC/FORC_NODA or xxx_NOEU calculations on long transients and/or on large models, we therefore recommend activating this time parallelism: PARALLELISME_TEMPS='OUI'. And, unlike space parallelism, it does not need a minimum granularity [6] to be effective: from about ten time steps per MPI process, the acceleration provided by time parallelism can already be significant [7].
The type of active parallelism (in space, in time, or none) is reported, option by option, in the message file. For more details, consult the instructions for using parallelism [U2.08.06].
- Notes on parallelism
Time parallelism, when activated in the operator, applies to all processing likely to benefit from it. To do so, space parallelism is temporarily « unplugged », option by option, and then reactivated, « as if nothing had happened », for the following options when these are not themselves concerned by time parallelism. Thus the call
CALC_CHAMP(EFGE_ELGA, REAC_NODA, ETOT_ELGA, EPSI_NOEU, PARALLELISME_TEMPS='OUI')
will activate space parallelism for the first and third options (EFGE_ELGA, ETOT_ELGA) and time parallelism for the others. From a parallelism point of view, it is equivalent to the sequence of calls:
CALC_CHAMP(EFGE_ELGA, ETOT_ELGA)
CALC_CHAMP(REAC_NODA, EPSI_NOEU, PARALLELISME_TEMPS='OUI')
To simplify the programming [8] and avoid chasing marginal gains, when the number of time steps is not a multiple of the number of MPI processes, the remaining time steps are carried out by all the processes. And of course, no communication is organized between them for these steps.
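The remainder rule described above can be sketched as follows (a hypothetical helper, not the actual code_aster routine): each rank receives an equal block of steps, and any leftover steps are computed redundantly by every rank:

```python
# Hypothetical sketch of the remainder rule (NOT code_aster's actual routine).
def distribute_steps(n_steps, n_ranks):
    """Equal blocks per rank; leftover steps are done by ALL ranks (no MPI exchange)."""
    block = n_steps // n_ranks
    per_rank = {r: list(range(r * block, (r + 1) * block)) for r in range(n_ranks)}
    shared = list(range(block * n_ranks, n_steps))  # remainder, computed by every rank
    return per_rank, shared

per_rank, shared = distribute_steps(10, 4)
print(per_rank)  # {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
print(shared)    # [8, 9] -> all four ranks redundantly compute these two steps
```

Since every rank computes the shared steps itself, the results are available everywhere without any MPI exchange, at the cost of a small amount of redundant work.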
In some situations, time parallelism is not available. Depending on the case, the calculation stops or issues an alarm: option not yet covered (CHAM_UTIL), calculation context varying during the transient (MODELE…), calculation impossible even sequentially, not enough time steps…