4. Other tools#

In this part, we describe some tools that also make it possible to find bugs or to identify abnormal behaviors:

« -check bounds » compiler option

Comparison of 2 different versions of Code_Aster

4.1. Array overflows (-check bounds)#

Compilers can instrument code to detect overflows of statically sized arrays (one of the kinds of bugs that are difficult to find in Fortran).

To use this functionality, recompile the suspicious routines with the options below (to do this, modify the config.txt file and put it in Data in the Overload tab), then execute the code. If an overflow occurs, execution stops with a fatal error and a message.

Syntax:

Gcc (g77): -fbounds-check

Intel (ifort): -CB

Notes:

Routines using the JEVEUX COMMON blocks (ZI, ZR, …) cannot be compiled with -CB, because execution then stops very quickly on an overflow of the ZI(1) array.

Using -CB is therefore a bit complicated: you have to juggle 2 config.txt files and keep the .o files.

The benefit of -CB is limited, because this mechanism does not detect all array overwrites. For an overwrite to be detected, the array must be local (and therefore « hard » sized), or else it must be an argument array declared with its exact length (and not as TAB(*)). The other advantage of -CB is the detection of character string overwrites, because in Fortran the length of a string is attached to the string. That is why you can call len(string) on a string received as an argument (whereas you cannot call len(TAB)).

Gcc (g77): -fbounds-check

-fbounds-check

-ffortran-bounds-check

Enable generation of run-time checks for array subscripts and substring start and end points against the (locally) declared minimum and maximum values.

The current implementation uses the « libf2c » library routine « s_rnge » to print the diagnostic.

However, whereas f2c generates a single check per reference for a multi-dimensional array, of the computed offset against the valid offset range (0 through the size of the array), g77 generates a single check per subscript expression. This catches some cases of potential bugs that f2c does not, such as references to below the beginning of an assumed size array.

g77 also generates checks for « CHARACTER » substring references, something f2c currently does not do.

Use the new -ffortran-bounds-check option to specify bounds-checking for only the Fortran code you are compiling, not necessarily for code written in other languages.

Note: To provide more detailed information on the offending subscript, g77 provides the « libg2c » run-time library routine « s_rnge » with somewhat differently-formatted information. Here’s a sample diagnostic:

Subscript out of range on file line 4, procedure RNGE.f90/bf.

Attempt to access the -6-th element of variable b [subscript-2-of-2].

Aborted

The above message indicates that the offending source line is line 4 of the file RNGE.f90, within the program unit (or statement function) named bf. The offended array is named b. The offended array dimension is the second for a two-dimensional array, and the offending, computed subscript expression was -6.

For a « CHARACTER » substring reference, the second line has this appearance:

Attempt to access the 11-th element of variable a [start-substring].

This indicates that the offended « CHARACTER » variable or array is named a, the offended substring position is the starting (leftmost) position, and the offending substring expression is 11.

(Although the wording of « s_rnge » is not ideal for the purpose of the g77 compiler, the above information should provide adequate diagnostic abilities to its users.)


Intel (ifort): -CB

-CB Performs run-time checks on whether array subscript and substring references are within declared bounds (same as the -check bounds option).

Error detection example:

forrtl: severe (408): fort: (2): Subscript #1 of the array RESU has value 4 which is greater than the upper bound of 3

Image PC Routine Line Source

asteru_jpl 0000000001F9E956 Unknown Unknown Unknown

asteru_jpl 0000000001F9DB56 Unknown Unknown Unknown

asteru_jpl 0000000001F11232 Unknown Unknown Unknown

asteru_jpl 0000000001EDA0E2 Unknown Unknown Unknown

asteru_jpl 0000000001ED9068 Unknown Unknown Unknown

asteru_jpl 0000000004933AE mkkvec_ 52 mkkvec.F90

asteru_jpl 000000493AD0 mmmab2_ 40 mmmab2_jpl.f90

asteru_jpl 0000000000E6A639 te0364_ 370 te0364.f90

asteru_jpl 0000000000910304 te0000_ 1261 te0000.f90

asteru_jpl 0000000000613998 calcul_ 472 calcul.F90

asteru_jpl 00000000EE74A0 mmcmat_ 149 mmcmat.F90

asteru_jpl 0000000000D9F12F mmcmem_ 69 mmcmem.F90

asteru_jpl 0000000009D308F nmdepl_ 300 nmdepl.F90

asteru_jpl 00000000007A4EA8 op0070_ 304 op0070.f90

asteru_jpl 0000000000599772 ex0000_ 258 ex0000.F90

asteru_jpl 00000000004E6750 execop_ 90 execop.f90

asteru_jpl 00000000004D26EE expass_ 82 expass.f90

asteru_jpl 000000499BD7 aster_oper 2635 astermodule.c
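To read such a traceback, one usually looks for the first frame that carries source information, since it names the routine and line where the overflow was detected. The following Python sketch illustrates that search. It assumes whitespace-separated columns in the order image, PC, routine, line, source; the function name first_known_frame is hypothetical, and the sample rows are copied from the example above.

```python
def first_known_frame(traceback_lines):
    """Return (routine, line, source) for the first frame with source info, or None."""
    for row in traceback_lines:
        fields = row.split()
        # Assumed layout: image, PC, routine, line, source (5 columns).
        if len(fields) != 5:
            continue
        _image, _pc, routine, line, source = fields
        # Frames without debug information show "Unknown" in the last columns.
        if routine == "Unknown" or not line.isdigit():
            continue
        return routine, int(line), source
    return None


# Rows copied from the traceback example (addresses are not interpreted).
sample = [
    "asteru_jpl 0000000001F9E956 Unknown Unknown Unknown",
    "asteru_jpl 0000000001F11232 Unknown Unknown Unknown",
    "asteru_jpl 0000000004933AE mkkvec_ 52 mkkvec.F90",
    "asteru_jpl 0000000000910304 te0000_ 1261 te0000.f90",
]

print(first_known_frame(sample))
```

With these rows, the sketch points to line 52 of mkkvec.F90, the routine where the out-of-bounds access to RESU was detected.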

4.2. Comparison of 2 different versions of Code_Aster#

Sometimes two different executions of Code_Aster lead to different results.

This can happen:

With the same version of the code on two different platforms.
With two different versions (N and N+1) on the same platform.
With the same version but with the two executables « debug » and « nodebug ».

…

The problem to be solved is then to identify the piece of code that behaves differently in the two executions. To locate it, you can trigger intermediate printouts at a few « strategic » places in the code:

at each call to the elementary calculations (routine calcul.F90)
at each call to the linear system resolution routine (routine resoud.F90)

By doing a diff (or a tkdiff) on the 2 message files produced, you can locate the place where the 2 versions diverge.
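As an alternative to a manual diff, the comparison can be scripted. The Python sketch below keeps only the debug lines of two message files and reports the first pair that differs. The function names, the line prefixes used as a filter, and the sample excerpts are assumptions made for illustration; they are not part of Code_Aster.

```python
from itertools import zip_longest


def debug_lines(message_text,
                prefixes=("&&CALCUL", "&&RESOUD"),
                ignore=("PMATERC", ".TITR")):
    """Keep only the debug printouts, skipping objects expected to differ."""
    kept = []
    for raw in message_text.splitlines():
        line = raw.strip()
        if not line.startswith(prefixes):
            continue
        if any(token in line for token in ignore):
            continue
        kept.append(line)
    return kept


def first_divergence(text_a, text_b):
    """Return (index, line_a, line_b) for the first differing debug line, or None."""
    pairs = zip_longest(debug_lines(text_a), debug_lines(text_b))
    for index, (line_a, line_b) in enumerate(pairs):
        if line_a != line_b:
            return index, line_a, line_b
    return None


# Made-up excerpts of two message files (values are illustrative only).
run_a = """\
some unrelated output
&&CALCUL|IN  |PGEOMER | SOMMR= 0.58898033E+03
&&CALCUL|OUTF|PMATTTC | SOMMR= 0.74828831E-04
"""
run_b = """\
&&CALCUL|IN  |PGEOMER | SOMMR= 0.58898033E+03
&&CALCUL|OUTF|PMATTTC | SOMMR= 0.74828899E-04
"""

print(first_divergence(run_a, run_b))
```

The ignore list reflects the remark made further down in this section: the PMATERC field and the .TITR objects almost always differ between two runs, so they are filtered out here before comparing.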

Implementation

To trigger these printouts, you must overload the calcul.F90 and/or resoud.F90 routine, modifying the source to force the variable: DBG = .TRUE.

This then produces additional printouts in the message file.

Routine calcul.F90

For example, the printouts of the calcul.F90 routine when calculating option AMOR_ACOU are:

&&CALCUL|IN | PGEOMER | MAIL.COORDINO.VALE | LONMAX=... | SOMMR= 0.58898033E+03

&&CALCUL|IN | PIMPEDC | IMPEACOU.CHAC.IMPED.VALE | LONMAX=... | SOMMR= 0.13370000E+04

&&CALCUL|IN | PMATERC | FIELDMAT.MATE_CODE.VALE | LONMAX=... | SOMMI= 743107436

&&CALCUL OPTION = AMOR_ACOU ACOU_FACE8 182

&&CALCUL|OUTG|PMATTTC | _9000024.ME001.RESL | LONMAX=... | SOMMR= 0.74828831E-04

&&CALCUL|OUTF|PMATTTC | _9000024.ME001.RESL | LONMAX=... | SOMMR= 0.74828831E-04

Lines 1, 2, 3 correspond to the 3 « in » parameters of this option. For each parameter, information about the associated field is printed: the field name, the LONMAX of the object containing the field values, … and a « summary » of the field values (column SOMMR or SOMMI).

Line 4 indicates that the ligrel on which the calculation is done contains a grel of elements of type ACOU_FACE8 and that the te00ij.F90 routine called is te0182.F90.

Line 5 provides information on the « out » field PMATTTC after the elementary calculations of grel ACOU_FACE8 (therefore te0182.F90).

Lines 4 and 5 can be repeated if there are multiple grel in the ligrel.

Line 6 provides information on the « out » field after calculating all the grel values.

Sometimes the printouts show that although the « in » fields of an elementary calculation are the same, the « out » fields differ. We then know that the problem lies in one precise elementary calculation, identified by its option, its element type and its te00ij.F90 routine number.

Notes:

When a field is integer, real, or complex, the number summarizing the field (SOMMI or SOMMR) is obtained by « summing » the values in the field. In reality, a slight « bias » is introduced to allow the detection of a permutation of values: the vector (1 2 3 4) will generally lead to a SOMMI different from that of (2 3 1 4).

For fields of type CHARACTER, we compute an integer sum (SOMMI) by converting each character into an integer (function ICHAR).

Attention: one « in » field is almost always different between two executions: the « coded material » field (PMATERC). It contains JEVEUX addresses, which have no reason to be the same. Other JEVEUX objects also almost always have different contents from one run to the next: the .TITR objects, which generally contain the date of the execution.

Detail: for each « summarized » JEVEUX object, we print: its name, its « sum » (SOMMI or SOMMR), its LONMAX, its LONUTI, its TYPE (R/C/I/K8/…), a return code IRET (if IRET /= 0, the JEVEUX object is in a doubtful state), and a number IGNORE that counts the values « ignored » in the sum (SOMMI or SOMMR). The ignored values are NaN or values that cannot meaningfully be summed: R8MAEM(), R8VIDE(), ISMAEM(), …
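The idea behind these notes can be illustrated with a short Python sketch. It is only an illustration: the actual bias used by Code_Aster is not described here, so the position-dependent weight below (1 + i modulo 7) is a hypothetical choice, as are the function names summarize and summarize_text. Only NaN is treated as an ignored value in this sketch.

```python
import math


def summarize(values, period=7):
    """Return (biased_sum, ignored_count) for a list of numbers.

    The weight (1 + i % period) is a hypothetical bias, chosen only so that
    permuting the values usually changes the result.
    """
    total = 0
    ignored = 0
    for i, v in enumerate(values):
        if isinstance(v, float) and math.isnan(v):
            ignored += 1  # NaN values are skipped and counted
            continue
        total += v * (1 + i % period)
    return total, ignored


def summarize_text(strings):
    """Integer summary of character data; ord() plays the role of ICHAR."""
    codes = [ord(ch) for s in strings for ch in s]
    return summarize(codes)


print(sum([1, 2, 3, 4]), sum([2, 3, 1, 4]))   # plain sums: 10 10
print(summarize([1, 2, 3, 4]))                # (30, 0)
print(summarize([2, 3, 1, 4]))                # (27, 0)
print(summarize([1.0, float("nan"), 3.0]))    # (10.0, 1)
print(summarize_text(["AB"]))                 # (197, 0)
```

A plain sum cannot distinguish the two permuted vectors, whereas the weighted sum does. The second element of the returned pair corresponds to the IGNORE counter described above.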

Routine resoud.F90

The printouts of the resoud.F90 routine are:

&&RESOUD 2ND MEMBER | &&MESTAT.2NDMBR_ASS.VALE | LONMAX=... | SOMMR= -0.20000000000E+06

&&RESOUD STRING | &&ASCAVC.VCI.VCI .VALE | LONMAX=... | SOMMR= 0.00000000000E+00

&&RESOUD MATR.VALM | &&MESTAT_MATR_ASSEM.VALM | LONMAX=... | SOMMR= 0.13926619473E+13

&&RESOUD MATR.VALF | &&MESTAT_MATR_ASSEM.VALF | LONMAX=... | SOMMR= 0.12301036782E+13

&&RESOUD MATR.CONL | &&MESTAT_MATR_ASSEM.CONL | LONMAX=... | SOMMR= 0.55076923235E+12

&&RESOUD SOLU | &&MERESO_SOLUTION.VALE | LONMAX=... | SOMMR= -0.37488024534E+06

Line 1: second member (right-hand side) of the linear system

Line 2: values of the eliminated imposed degrees of freedom (char_cine)

Line 3: values of the initial matrix (before factorization)

Line 4: values of the factored matrix

Line 5: value of the conditioning coefficient of the Lagrange multipliers (dualized imposed degrees of freedom)

Line 6: solution values

If line 1 differs, the problem comes from the construction of the second member of the system.

If only line 4 differs, this indicates a factorization problem (routine preres.F90).

If only line 6 differs, the problem comes from the resolution (routine resoud.F90).
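The three rules above can be written as a small decision function. This is only a sketch of the reasoning in Python: the function name diagnose and the returned messages are hypothetical, and combinations of differing lines that the text does not mention are reported as not covered.

```python
def diagnose(differing_lines):
    """Map the set of differing printout lines (1 to 6) to a likely cause."""
    differing = set(differing_lines)
    if not differing:
        return "no difference in the linear solve"
    if 1 in differing:
        return "construction of the second member"
    if differing == {4}:
        return "factorization (preres)"
    if differing == {6}:
        return "resolution (resoud)"
    return "not covered by the rules above"


print(diagnose({1, 6}))
print(diagnose({4}))
print(diagnose({6}))
```

The set passed to diagnose would typically come from comparing, line by line, the printouts of the two executions.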