7. Algorithm IRA (SOLVEUR_MODAL=_F(METHODE="SORENSEN"))#
7.1. Introduction#
We have seen that one of the crucial problems of the Lanczos method is the inevitable loss of orthogonality of its base vectors. A generalization of this algorithm to the nonhermitian case imagined by W.E.Arnoldi [Arn51] in 1951 partially solves this problem. It was brought up to date by Y.Saad [Saa80] in 1980 and an extensive literature covers the subject. Apart from the seminal article and the papers by Y.Saad and M.Sadkane [Sad93], we recommend the updated and exhaustive synthesis by J.L.Vaudescal [Vau00].
This method is the basis of the IRAM algorithm, known as Sorensen's method (IRAM for "Implicit Restarted Arnoldi Method"). We will first detail its operation, its behavior and its limitations. Next, we will identify the challenges that IRAM must meet (and it does so in most standard cases!) and we will look at its theoretical and numerical subtleties. We will conclude with a summary of its effective configuration in Code_Aster and with an example of a message file.
7.2. Arnoldi algorithm#
7.2.1. Principle#
Its scope of application covers all « work operator / (pseudo-)scalar product » pairs. The counterpart of this generality is the fill-in of the Rayleigh matrix, which becomes upper Hessenberg. This is not very harmful, because the QR algorithm can then be applied to it directly: the first step of a good QR implementation, apart from balancing, is precisely the reduction of the working matrix to Hessenberg form, which saves an order of magnitude during the resolution itself (see Appendix 1).
The algorithm is very similar to that of Lanczos: it progressively builds a family of Arnoldi vectors \({\mathrm{q}}_{\mathrm{1,}}{\mathrm{q}}_{\mathrm{2,}}\mathrm{...},{\mathrm{q}}_{m}\) by orthogonally projecting, at iteration \(k\), the vector \({\mathrm{A}}_{\sigma }{\mathrm{q}}_{k}\) onto the previous \(k\) vectors. The new vector becomes \({\mathrm{q}}_{k+1}\) and thus, step by step, the orthogonality of this family of vectors is ensured.
Unlike Lanczos, the orthogonality of the new vector with respect to all the previous ones is therefore ensured explicitly rather than implicitly. This is handled by the Modified Gram-Schmidt algorithm (GSM, see Appendix 2), which is sufficiently robust in most cases.
Denoting by \({\mathrm{e}}_{m}\) the \(m\)-th vector of the canonical basis, the residual vector of the Arnoldi factorization is written \({\mathrm{R}}_{m}\mathrm{=}{b}_{m+\mathrm{1,}m}{\mathrm{q}}_{m+1}{\mathrm{e}}_{m}^{T}\).
Figure 7.2-1. Arnoldi factorization.
The iterative process is summarized as follows:
\(\begin{array}{c}\text{Compute }{\mathrm{q}}_{1}\text{ such that }\mathrm{\parallel }{\mathrm{q}}_{1}\mathrm{\parallel }\mathrm{=}1\text{.}\\ \text{For }\text{k}\mathrm{=}\text{1, m}\text{ do}\\ \mathrm{z}\mathrm{=}{\mathrm{A}}_{\sigma }{\mathrm{q}}_{k},\\ \begin{array}{c}\text{For }\text{l}\mathrm{=}\text{1, k}\text{ do }(\text{GSM})\\ {b}_{\text{lk}}\mathrm{=}(\mathrm{z},{\mathrm{q}}_{l}),\\ \mathrm{z}\mathrm{=}\mathrm{z}\mathrm{-}{b}_{\text{lk}}{\mathrm{q}}_{l},\end{array}\\ \text{End loop;}\\ \begin{array}{c}{b}_{k+\text{1,k}}\mathrm{=}\mathrm{\parallel }\mathrm{z}\mathrm{\parallel },\\ \text{If }{b}_{k+\text{1,k}}\mathrm{\ne }0\text{ then}\end{array}\\ \begin{array}{c}{\mathrm{q}}_{\text{k+1}}\mathrm{=}\frac{\mathrm{z}}{{b}_{k+\text{1,k}}},\\ \text{Else}\\ \text{Deflation};\\ \text{End if}\text{.}\end{array}\\ \text{End loop}\text{.}\end{array}\)
Algorithm 13. Theoretical Arnoldi.
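Algorithm 13 can be sketched in a few lines of NumPy. The following is an illustrative implementation of ours (the deflation branch is reduced to a simple stop), not the code's own routine:

```python
import numpy as np

def arnoldi(A, q1, m):
    """Arnoldi factorization A Q_m = Q_m B_m + b_{m+1,m} q_{m+1} e_m^T,
    with modified Gram-Schmidt (GSM) orthogonalization."""
    n = A.shape[0]
    Q = np.zeros((n, m + 1))
    B = np.zeros((m + 1, m))              # upper Hessenberg Rayleigh matrix
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        z = A @ Q[:, k]
        for l in range(k + 1):            # project onto the previous vectors
            B[l, k] = z @ Q[:, l]
            z -= B[l, k] * Q[:, l]
        B[k + 1, k] = np.linalg.norm(z)
        if B[k + 1, k] == 0.0:            # invariant subspace found: deflation
            break
        Q[:, k + 1] = z / B[k + 1, k]
    return Q, B

# quick check on a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
Q, B = arnoldi(A, rng.standard_normal(20), 6)
```

One can verify on this example both the orthonormality of the Arnoldi vectors and the factorization identity, with the residual carried entirely by the last column.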
The Rayleigh matrix is then written:
\({\mathrm{B}}_{m}\mathrm{=}\left[\begin{array}{cccc}{b}_{\text{11}}& {b}_{\text{12}}& \text{.}\text{.}\text{.}& {b}_{\mathrm{1m}}\\ {b}_{\text{21}}& {b}_{\text{22}}& \text{.}\text{.}\text{.}& {b}_{\mathrm{2m}}\\ 0& \text{.}\text{.}\text{.}& \text{.}\text{.}\text{.}& {b}_{m\mathrm{-}\mathrm{1,}m}\\ 0& 0& {b}_{m,m\mathrm{-}1}& {b}_{\text{mm}}\end{array}\right]\)
Apart from the shape of this matrix and a lower sensitivity to orthogonality problems, this algorithm assures us the same theoretical and numerical properties as Lanczos. However, if the size of the subspace is allowed to increase indefinitely until the desired eigenvalues converge, the rounding effects will still take over and we will run into big trouble. Hence the need, as for Lanczos, to make this process iterative through restarts.
Notes:
This method can be seen as an implicit variant of the Lanczos algorithm with total reorthogonalization.
The orthogonalization step of the algorithm can be driven by more expensive but more robust algorithms such as QR or IGSM. This is particularly necessary when the work operator is strongly non-normal.
Because of the structure of \({\mathrm{B}}_{m}\), the memory cost is higher than with Lanczos; on the other hand, the computational complexity remains of the same order of magnitude, \(O(\mathrm{nm}(c+3+m))\) (with \(c\) the average number of non-zero terms per row of the working matrix).
To improve this first point, Y.Saad [Saa80] showed that the outermost superdiagonals of the upper Hessenberg structure can be cancelled if the reorthogonalization is only partially performed.
Since single-vector algorithms have a natural tendency to miss multiplicities, a block version is often preferred (M.Sadkane, 1993). But the block size affects the quality of the results, which is why the vector or block versions of IRAM are preferred.
The choice of the initial Arnoldi vector is made in the same way as for Lanczos.
7.2.2. Error and convergence estimates#
With regard to the evaluation of the approximation quality of the obtained natural modes, we have a criterion as simple and effective as for Lanczos.
Property 5
The Euclidean residue norm for Ritz element \((\tilde{\lambda },\tilde{\mathrm{u}}\mathrm{=}{\mathrm{Q}}_{m}\mathrm{x})\) is equal to
\({\mathrm{\parallel }\mathrm{r}\mathrm{\parallel }}_{2}\mathrm{=}{\mathrm{\parallel }({\mathrm{A}}_{\sigma }\mathrm{-}\tilde{\lambda }\mathrm{I})\tilde{\mathrm{u}}\mathrm{\parallel }}_{2}\mathrm{=}\mathrm{\mid }{b}_{m+\mathrm{1,}m}\mathrm{\mid }\mathrm{\mid }{\mathrm{e}}_{m}^{T}\mathrm{x}\mathrm{\mid }\)
Proof:
Trivial taking the Euclidean norm of Arnoldi factorization \({\mathrm{A}}_{\sigma }{\mathrm{Q}}_{m}\mathrm{x}\mathrm{=}{\mathrm{Q}}_{m}{\mathrm{B}}_{m}{\mathrm{x}+b}_{m+\mathrm{1,}m}{\mathrm{q}}_{m+1}{\mathrm{e}}_{m}^{T}\mathrm{x}\) and as \({\mathrm{q}}_{m+1}\) is normalized to unity.
Still by focusing on the norm of the complementary projector \({\mathrm{\parallel }(\mathrm{I}\mathrm{-}{\mathrm{P}}_{m})\mathrm{u}\mathrm{\parallel }}_{2}\), we can generalize convergence theorem 3 to the non-Hermitian case.
Theorem 6
Let \(({\lambda }_{1},{\mathrm{u}}_{1})\) be the first (in the classical ordering, by decreasing modulus) dominant eigenmode of \({\mathrm{A}}_{\sigma }\), assumed diagonalizable, and let \({\mathrm{q}}_{1}\mathrm{=}\mathrm{\sum }_{k\mathrm{=}1}^{n}{\alpha }_{k}{\mathrm{u}}_{k}\) be the initial Arnoldi vector decomposed on the basis of eigenvectors; there then exists a Ritz mode \(({\tilde{\lambda }}_{1},{\tilde{u}}_{1})\) such that: \(\mathrm{\mid }{\lambda }_{1}\mathrm{-}{\tilde{\lambda }}_{1}\mathrm{\mid }\mathrm{\le }\frac{{\alpha }^{2}}{\beta }{\xi }_{1}{\delta }_{1}^{m}\)
with \(\alpha \mathrm{=}\frac{\beta }{{\mathrm{\parallel }{\mathrm{P}}_{m}{\mathrm{u}}_{1}\mathrm{\parallel }}_{2}}\), \(\beta\) the constant of theorem 1, \({\xi }_{1}\mathrm{=}\mathrm{\sum }_{k\mathrm{=}\mathrm{1,}k\mathrm{\ne }1}^{n}\mathrm{\mid }\frac{{\alpha }_{k}}{{\alpha }_{1}}\mathrm{\mid }\) and \({\delta }_{1}^{m}\mathrm{=}{(\mathrm{\sum }_{j\mathrm{=}2}^{m+1}\mathrm{\prod }_{k\mathrm{=}\mathrm{2,}k\mathrm{\ne }j}^{m+1}\mathrm{\mid }\frac{{\lambda }_{k}\mathrm{-}{\lambda }_{1}}{{\lambda }_{k}\mathrm{-}{\lambda }_{j}}\mathrm{\mid })}^{\mathrm{-}1}\). This result applies in the same way to the other modes.
Proof:
By using Y.Saad’s proof ([Saa91] pp209-210) and the result of Theorem 1.
These bounds, although very different from those obtained with Lanczos, reveal the same phenomena:
If the initial vector has no contribution along the desired eigenvectors, they cannot be captured (\({\xi }_{i}\to +\infty\)).
Convergence occurs **primarily for the peripheral modes** of the spectrum, all the better as it is well separated (property of \({\delta }_{i}^{m}\)).
The error decreases as \(m\) increases (property of \({\delta }_{i}^{m}\)).
Notes:
When an eigenvalue is poorly conditioned (\({\xi }_{i}\to +\infty\)), \(m\) must be increased in order for \({\delta }_{i}^{m}\) to decrease.
Similar results exist in the case of a defective operator (see [Zia94]).
Based on these lessons, we will now summarize the challenges that IRAM must address.
7.3. The challenges#
Algorithm IRA attempts to provide an elegant remedy to the recurring problems raised by other approaches:
Minimization of the projection space: it requires only \(m>p+1\) instead of \(m=4p\) for Lanczos.
Optimal management of the orthogonalization overhead, establishing a **tradeoff** between the **size of the subspace** and the **frequency of restarts**.
Transparent, dynamic and efficient management of these restarts.
Automatic exploitation of **spectral information**.
Control of the **memory requirements** and the **quality** of the **results**.
We therefore no longer have to worry about the reorthogonalization strategy, the frequency of restarts, their locations, or the criteria for detecting possible phantom modes… « super-IRAM » takes care of everything!
In short, it provides:
Better overall **robustness**.
Improved computational complexity \(O(\mathrm{4nm}(m-p))\) and memory complexity \(O(\mathrm{2np}+{p}^{2})\) (especially compared with a simple Lanczos such as that of Newmann & Pipano) for a fixed precision.
A more rigorous capture of multiplicities, clusters, and rigid body modes (hence **fewer parasitic modes**!).
Notes:
On this last point, only a block-based version of IRAM or a version incorporating « purge and lock » (capture and filtering techniques, cf. D.Sorensen & R.B.Lehoucq, 1997) can guarantee correct detection of the spectrum of a standard operator (i.e. not too poorly conditioned).
It seems that in practice, in *Code_Aster*, the ratio in computational complexity between IRAM and Lanczos/Bathe & Wilson is at least of order 2 in favor of the former. With reasonable problem sizes (a few tens of thousands of DOFs and \(p=O(100)\)) it can go up to 10 (without the encapsulation of CALC_MODES over several search sub-bands). In some semi-industrial cases, it made it possible to complete a spectrum search that had failed with Jacobi and was unaffordable with Lanczos (given the time frame).
A class of algorithms called « Jacobi-Davidson » (cf. R.B.Morgan 1990) seems even more promising for treating pathological cases. It uses a Lanczos algorithm that it preconditions via a Rayleigh-type method.
In the next paragraph we will explain how IRAM works.
7.4. “Implicit Restarted Arnoldi” Algorithm (IRA)#
This algorithm was initiated by D.C.Sorensen [Sor92] in 1992 and is experiencing a real boom for the resolution of large modal systems on parallel supercomputers. Its framework of application is quite general: it treats real as well as complex problems, Hermitian or not. It consists of a succession of Arnoldi factorizations whose results automatically drive implicit restarts, via polynomial filters implemented by implicit QR steps.
First of all, it performs an Arnoldi factorization of order \(m=p+q\) of the work matrix (in theory \(q=2\) is sufficient; in practice \(q\mathrm{=}p\) is preferable, and it is this latter value that was retained as the default, see §7.5). Then, once this pre-processing has been carried out, it iterates a process of filtering out the undesirable part of the spectrum (numerically and methodologically, it is easier to exclude than to incorporate).
It starts by determining the spectrum of the Rayleigh matrix (via the indispensable QR) and separates out the unwanted part (according to the criteria set by the user), which it then uses as shifts to set up a series of \(q\) implicit QR steps with simple shift (see Appendix 1). The factorization is then written as:
\({\mathrm{A}}_{\sigma }{\mathrm{Q}}_{m}^{\text{+}}\mathrm{=}{\mathrm{Q}}_{m}^{\text{+}}{\mathrm{B}}_{m}^{\text{+}}+{\mathrm{R}}_{m}\mathrm{Q}\)
where \({\mathrm{Q}}_{m}^{\text{+}}\mathrm{=}{\mathrm{Q}}_{m}\mathrm{Q}\), \({\mathrm{B}}_{m}^{\text{+}}\mathrm{=}{\mathrm{Q}}^{T}{\mathrm{B}}_{m}\mathrm{Q}\), and \(\mathrm{Q}\mathrm{=}{\mathrm{Q}}_{1}{\mathrm{Q}}_{2}\mathrm{....}{\mathrm{Q}}_{q}\) the orthogonal matrix associated with the QR steps. After updating the matrices, we truncate them to order \(p\):
\({\mathrm{A}}_{\sigma }{\mathrm{Q}}_{p}^{\text{+}}\mathrm{=}{Q}_{p}^{\text{+}}{\mathrm{B}}_{p}^{\text{+}}+{\mathrm{R}}_{p}^{\text{+}}\)
and so, at the cost of \(q\) new Arnoldi iterations, we recover a viable order-\(m\) Arnoldi factorization. All the subtlety of the process rests on this last sequence. Let us note:
\(\Phi ({\mathrm{A}}_{\sigma })\mathrm{=}\mathrm{\prod }_{i\mathrm{=}1}^{q}({\mathrm{A}}_{\sigma }\mathrm{-}{\tilde{\lambda }}_{p+i}\text{Id})\),
the matrix polynomial of order \(q\) generated by the work operator. In fact, the implicit QR steps act on the first \(p\) columns of the Arnoldi matrix, the Rayleigh matrix and the residual in such a way that the \(q\) complementary factorization steps produce the same effect as an order-\(m\) factorization initiated by the vector
\({\tilde{\mathrm{q}}}_{1}\mathrm{=}\Phi ({\mathrm{A}}_{\sigma }){\mathrm{q}}_{1}\).
The initial vector has been implicitly modified (the polynomial is never explicitly constructed or applied) so that it preferentially generates the desired modes, by subtracting the components deemed « unsuitable ». As already noted, this type of restart makes it possible to reduce the residual by reducing the components of the initial vector along the undesirable modes. In exact arithmetic, we would immediately have \({\mathrm{R}}_{m}\mathrm{=}0\) and the process would end there!
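The equivalence between the \(q\) implicitly shifted QR steps and the explicit filter \({\tilde{\mathrm{q}}}_{1}\mathrm{=}\Phi ({\mathrm{A}}_{\sigma }){\mathrm{q}}_{1}\) can be observed numerically. The sketch below is our own (on a small symmetric matrix so that the Ritz values are real): it applies the shifts both implicitly and explicitly and compares the two resulting starting vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 12, 3, 3
m = p + q
S = rng.standard_normal((n, n))
A = S + S.T                              # symmetric, so the Ritz values are real

# order-m Arnoldi factorization with modified Gram-Schmidt
q1 = rng.standard_normal(n)
q1 /= np.linalg.norm(q1)
Q = np.zeros((n, m + 1))
B = np.zeros((m + 1, m))
Q[:, 0] = q1
for k in range(m):
    z = A @ Q[:, k]
    for l in range(k + 1):
        B[l, k] = z @ Q[:, l]
        z -= B[l, k] * Q[:, l]
    B[k + 1, k] = np.linalg.norm(z)
    Q[:, k + 1] = z / B[k + 1, k]

# the q Ritz values smallest in modulus play the role of unwanted shifts
Bm = B[:m, :m].copy()
shifts = sorted(np.linalg.eigvals(Bm).real, key=abs)[:q]

# q implicit QR steps with simple shift, accumulated in V
V = np.eye(m)
for mu in shifts:
    Qj, _ = np.linalg.qr(Bm - mu * np.eye(m))
    Bm = Qj.T @ Bm @ Qj
    V = V @ Qj

q1_implicit = Q[:, :m] @ V[:, 0]         # updated starting vector Q_m Q e_1

# explicit polynomial filter Phi(A) q1 = prod_i (A - mu_i I) q1, normalized
v = q1.copy()
for mu in shifts:
    v = A @ v - mu * v
v /= np.linalg.norm(v)

print(abs(q1_implicit @ v))              # close to 1: collinear up to sign
```

The two vectors coincide up to sign and rounding, which is precisely Sorensen's implicit-restart property.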
Iteration after iteration, the quality of the modes sought is therefore improved by relying on auxiliary modes. Of course, this quality is estimated at each stage, which determines whether or not another restart is carried out. The process can be summed up (very macroscopically!) as follows:
\(\begin{array}{c}\text{Arnoldi factorization of order }m\mathrm{:}{\text{AQ}}_{m}\mathrm{=}{\mathrm{Q}}_{m}{\mathrm{B}}_{m}+{\mathrm{R}}_{m}\text{.}\\ \text{For k = 1, NMAX\_ITER\_SOREN do}\\ \text{Compute }\underset{\begin{array}{c}\text{Kept for}\\ \text{improvement}\end{array}}{\underset{\underbrace{}}{\tilde{{\lambda }_{1}},\tilde{{\lambda }_{2}}\text{.}\text{.}\text{.}\tilde{{\lambda }_{p}}}},\underset{\begin{array}{c}\text{Used as}\\ \text{shifts}\end{array}}{\underset{\underbrace{}}{{\tilde{\lambda }}_{\text{p+1}}\text{.}\text{.}\text{.}{\tilde{\lambda }}_{\text{p+q}},}}\\ \text{QR with implicit shifts,}\\ \text{Update }{\mathrm{Q}}_{m},{\mathrm{B}}_{m}\text{ and }{\mathrm{R}}_{m},\\ \text{Truncate these matrices to order }p,\\ \begin{array}{c}{\text{AQ}}_{p}\mathrm{=}{\mathrm{Q}}_{p}{\mathrm{B}}_{p}+{\mathrm{R}}_{p}\mathrm{\Rightarrow }{\text{AQ}}_{m}\mathrm{=}{\mathrm{Q}}_{m}{\mathrm{B}}_{m}+{\mathrm{R}}_{m},\\ \text{Quality estimate of the }p\text{ modes}\text{.}\end{array}\\ \text{End loop}\text{.}\end{array}\)
Algorithm 14. Method IRA (known as Sorensen) .
The maximum number of iterations is driven by the keyword NMAX_ITER_SOREN of the keyword factor CALC_FREQ. It should be noted that the Arnoldi algorithm is carefully complemented by a total reorthogonalization (triggered only if necessary). This additional cost is all the more acceptable since it is implemented via the Kahan-Parlett algorithm IGSM (see Appendix 2) which is particularly effective. All this makes it possible to ensure that the projection holds up well with respect to the initial spectrum.
On the other hand, evaluating the quality of the modes cannot be done simply by building the \(p\) residuals \({\mathrm{\parallel }{\mathrm{r}}_{i}\mathrm{\parallel }}_{2}\mathrm{=}{\mathrm{\parallel }({\mathrm{A}}_{\sigma }\mathrm{-}\tilde{\lambda }\mathrm{I}){\tilde{\mathrm{u}}}_{i}\mathrm{\parallel }}_{2}\) via property 5. It has already been mentioned that in the non-Hermitian case they are not sufficient for this task, especially in the case of a strong lack of normality. To remedy this, without using other a priori information [23]_, we use the previous property supplemented by a criterion due to Z.Bai et al [BDK93]:
\(\mathrm{\mid }{b}_{m+\mathrm{1,}m}\mathrm{\mid }\mathrm{\mid }{\mathrm{e}}_{m}^{T}\mathrm{x}\mathrm{\mid }<\text{max}(\varepsilon \mathrm{\parallel }{\mathrm{B}}_{m}\mathrm{\parallel },\text{PREC\_SOREN}\mathrm{\mid }\tilde{\lambda }\mathrm{\mid })\)
where \(\varepsilon\) is machine precision and PREC_SOREN is a keyword initialized under CALC_FREQ. The user therefore has (partial) control over the quality of the modes, which he did not have with the other methods implemented in the code. Given the different norms used, this error differs from the one resulting from the global post-processing (see §3.6), which is displayed in the results columns.
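The acceptance test can be written directly (a sketch; the function name is ours, not an ARPACK or Code_Aster routine):

```python
import numpy as np

def ritz_accepted(b_next, em_x, B_norm, lam, prec_soren):
    """Acceptance test for a Ritz pair, following the criterion of
    Z. Bai et al. [BDK93]:
        |b_{m+1,m}| |e_m^T x| < max(eps * ||B_m||, PREC_SOREN * |lambda|)."""
    eps = np.finfo(float).eps
    return abs(b_next) * abs(em_x) < max(eps * B_norm, prec_soren * abs(lam))
```

For example, a residual estimate of `1e-12 * 0.3` is accepted against `max(eps * 10, 1e-8 * 2.5)`, while `1e-3 * 0.5` is rejected.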
Notes:
The polynomial acceleration technique used is more efficient than that of Chebyshev, since the latter is explicit and requires \(m\) matrix-vector products.
To avoid the deterioration of Ritz vectors (and therefore of approximate eigenvectors) by eigenvalues of very large modulus (associated with the kernel of \(\mathrm{B}\) in « shift and invert »), Ericsson & Ruhe filtering [ER80] has been implemented. More robust techniques exist, but they require prior information concerning in particular the Jordan blocks associated with the operator's kernel (cf. Meerbergen & Spence, 1996).
This algorithm can be seen as a truncated form of the implicit QR algorithm of J.G.F.Francis (see Appendix 1).
The following paragraph will explain the choices that led to the variant put in place in the code.
7.5. Implementation in Code_Aster#
7.5.1. ARPACK#
Package ARPACK (ARPACK for ARnoldi PACKage) implements the IRA method. It is a public product that can be downloaded on the internet [Arp]. Its designers, D.Sorensen, R.Lehoucq, and C.Yang of Rice University in Houston, wanted it to be at once:
Easy to access: Fortran/C/C++ interfacing via the « reverse communication » model.
Modular: based on libraries LINPACK, LAPACK, and BLAS.
Rich in numerical features: Schur decomposition, modular shifts, numerous spectral transformations, configurable stopping criteria.
Covering a wide range of use: simple/double precision, real/complex, GEP with symmetric/non-symmetric matrices…
High-performance: to save CPU time and memory, the package is available in a parallelized version, PARPACK, via MPI and BLACS. Moreover, even with sequential ARPACK, the linear system resolutions required by the method can be performed by a parallel linear solver. The memory management of the large objects required by ARPACK is left to the host code.

Figure 7.5-1. The home page of the ARPACK [Arp] website.
Its effectiveness is greatly increased by the use of highly optimized level 2 and 3 BLAS and by the implementation of « reverse communication ». The user therefore remains in control of his data structures and of the processing procedures concerning the operator and the work scalar product: it is he who provides the ARPACK routines with this type of information. This makes it possible to manage matrix-vector products and linear system resolutions as well as possible with *Code_Aster* tools and procedures.
For now, the memory management of the large objects required by ARPACK is entrusted to JEVEUX. It therefore benefits from its « out-of-core » abilities [24]_. On the other hand, since Aster modal operators have access to the same linear solvers as the static ones, they can also benefit from the gains in time and memory that parallelism provides (SOLVEUR/METHODE="MUMPS" or "MULT_FRONT", cf. [U4.50.01] [U2.08.06]). But be careful: these gains are limited to the linear solver part of the modal calculation. The achievable speed-ups are therefore less promising, since this part can represent, in some cases, only 50% of the calculation time [25]_.
The ARPACK site offers documentation (theory and use), usage examples and a download page. The version of ARPACK (v2.5 of 27/08/96) used in *Code_Aster* seems to be the last one. Despite thousands of uses around the world (academic and industrial), proven referencing [HRTV07] and a certain recognition (see the « ARPACK applications » tab of the site and the links with other libraries such as PETSc and TRILINOS), this software development project was stopped in 1997. It continued a few more years via the C++ version (ARPACK++) by D.Sorensen and F.Gomez.

Figure 7.5-2. Logo for the C++ version of ARPACK.
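Although *Code_Aster* drives ARPACK through its own « reverse communication » loop, the same library can be exercised directly from SciPy, whose sparse eigensolver wraps ARPACK. Below is a minimal sketch of ours of a shift-and-invert GEP solve in that spirit (the test matrices are illustrative):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigs

# 1-D Laplacian "stiffness" K and identity "mass" M: K u = mu M u
n = 50
K = sparse.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
M = sparse.identity(n, format="csc")

# shift-and-invert around sigma = 0: ARPACK iterates on (K - sigma M)^{-1} M,
# and which="LM" then returns the eigenvalues mu closest to sigma
mu, u = eigs(K, k=4, M=M, sigma=0.0, which="LM")
mu = np.sort(mu.real)

# analytic spectrum of this Laplacian: 2 - 2 cos(k pi / (n+1)), k = 1..n
exact = 2.0 - 2.0 * np.cos(np.arange(1, 5) * np.pi / (n + 1))
```

The four computed eigenvalues match the analytic ones to machine precision on this small example.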
7.5.2. Adaptations of the Sorensen algorithm#
To deal with generalized modal problems, this package offers a whole series of spectral transformations, in real or complex arithmetic, Hermitian or not. In the Hermitian case, Sorensen's algorithm is based on the Lanczos-QL couple (IRLM for Implicit Restarted Lanczos Method). Moreover, neither approach (Hermitian or not) is intended to deal with pseudo-scalar products linked to indefinite matrices.
Precisely, in order both to avoid the numerical problems related to the fairly heterogeneous properties of the matrices in the code and to ensure better overall robustness, we chose to work in a non-symmetric setting (IRAM with **Arnoldi** and **QR**), on the following « work operator / scalar product » pair:
\(\begin{array}{c}\underset{{\mathrm{A}}_{\sigma }}{\underset{\underbrace{}}{{(\mathrm{A}\mathrm{-}\sigma \mathrm{B})}^{\mathrm{-}1}\mathrm{B}}}\mathrm{u}\mathrm{=}\underset{\lambda }{\underset{\underbrace{}}{\frac{1}{\mu \mathrm{-}\sigma }}}\mathrm{u}\\ (\mathrm{x},\mathrm{y})\mathrm{=}{\mathrm{y}}^{\mathrm{T}}\mathrm{x}\end{array}\)
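The back-transformation \(\lambda \mathrm{=}1\mathrm{/}(\mu \mathrm{-}\sigma )\) of this work operator can be checked on a small dense example. This is an illustrative sketch of ours (random matrices, dense solves), not the code's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 8, 0.5
A = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
B = C @ C.T + n * np.eye(n)          # symmetric positive definite "mass" matrix

# work operator of the spectral transformation: A_sigma = (A - sigma B)^{-1} B
A_sigma = np.linalg.solve(A - sigma * B, B)
lam = np.linalg.eigvals(A_sigma)

# back-transformation: lambda = 1/(mu - sigma)  <=>  mu = sigma + 1/lambda
mu = sigma + 1.0 / lam

# reference: eigenvalues of the pencil A u = mu B u, via B^{-1} A
mu_ref = np.linalg.eigvals(np.linalg.solve(B, A))
```

Each recovered `mu` matches an eigenvalue of the pencil, which is the whole point of the « shift and invert » transformation: eigenvalues near \(\sigma\) become the dominant ones of \({\mathrm{A}}_{\sigma }\).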
We could have treated buckling problems in « buckling mode » using the same « shift and invert » and the pseudo-scalar product induced by \(A\). But due to the almost systematic introduction of Lagrange multipliers, this matrix becomes indefinite or even singular, which greatly disrupts the process. The same causes produce the same effects when, for a dynamics calculation, we use the \(B\)-dot product. Instead of modifying the whole package by introducing a pseudo-scalar product, we therefore opted for a simple « Euclidean » scalar product, more robust and much less expensive.
It was therefore necessary to modify the « reverse communication » procedures of the package, because it did not provide for this option (with standard matrices, one traditionally prefers to enrich the components with a matrix dot product, even a non-symmetric one). Unlike the Newmann & Pipano variant introduced for Lanczos, we deliberately placed ourselves in a non-symmetric configuration. But in order to avoid recurring orthogonality problems as much as possible, even in the symmetric case, we opted for the version of IRAM using Arnoldi.
The disadvantage of this approach is that, during IRAM's post-processing, we must \(B\)-orthonormalize the approximate eigenvectors in order to numerically recover property 2, exploited by modal recombinations. This step does not disturb the computed modal basis and is very efficiently achieved via Kahan-Parlett's IGSM.
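This \(B\)-orthonormalization can be sketched as a Gram-Schmidt with a Kahan-Parlett-type re-orthogonalization test. The function below is our illustration of the idea (the name and the exact test are ours; see Appendix 2 for the actual IGSM, whose norm-drop parameter plays the role of PARA_ORTHO_SOREN):

```python
import numpy as np

def igsm(Qb, z, dot, alpha=0.717, max_passes=3):
    """Orthogonalize z against the columns of Qb for the inner product `dot`,
    re-orthogonalizing while the norm of z drops by more than the factor
    alpha (Kahan-Parlett-type "twice is enough" test)."""
    for _ in range(max_passes):
        norm_before = np.sqrt(dot(z, z))
        for l in range(Qb.shape[1]):
            z = z - dot(z, Qb[:, l]) * Qb[:, l]
        if np.sqrt(dot(z, z)) > alpha * norm_before:
            break                      # orthogonality judged sufficient
    return z

# B-orthonormalize three random vectors for an SPD matrix Bmat
rng = np.random.default_rng(3)
n = 10
C = rng.standard_normal((n, n))
Bmat = C @ C.T + n * np.eye(n)
dot = lambda x, y: x @ (Bmat @ y)
Qb = np.zeros((n, 3))
for j in range(3):
    z = igsm(Qb[:, :j], rng.standard_normal(n), dot)
    Qb[:, j] = z / np.sqrt(dot(z, z))
```

With a symmetric positive definite \(B\), the resulting family satisfies \({\mathrm{Q}}^{T}\mathrm{B}\mathrm{Q}\mathrm{=}\mathrm{I}\) to rounding error.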
Taking into account the boundary conditions and, in particular, the double dualizations is carried out as for Lanczos, according to the procedure described in §3.2. In particular, once the initial vector is in the admissible space, the work operator is applied to it. This conventional method makes it possible to purge the Arnoldi vectors (and therefore the Ritz vectors) of the kernel components.
On the other hand, in some configurations where the number of requested frequencies \(p\) is equal to the number of active degrees of freedom, we had to trick the algorithm, which otherwise stopped with a fatal error! Indeed, it generally detected the expected invariant subspace (of size \(p\)) correctly, but due to the particular structure of the eigenvectors associated with the Lagrange multipliers (cf. proof of property 4, §3.5) it had a lot of trouble generating an initial vector proportional to them.
Special treatments taking into account the numbering of these Lagrange multipliers would have been needed, all the more expensive since they are not fundamentally necessary to solve the requested problem! All the spectral information is already present deep in the algorithm, so it is not worth completing the two remaining iterations (when the user asks for \(p={n}_{\mathrm{ddl}-\mathrm{actifs}}\), \(m=p+2\) is automatically imposed). All we have to do is short-circuit the natural flow of the algorithm, extract the Ritz modes of interest and post-process them to return to the initial space.
Note:
This type of scenario, in which the number of natural modes sought is very close to the number of degrees of freedom, is outside the « ideal » scope of use of this type of algorithm. A good QR would certainly be more effective, but it is a good way to test the algorithm.
7.5.3. Parameterization#
To activate this method, the METHODE keyword must be initialized to "SORENSEN". The size of the projection subspace is determined either by the user or empirically from the formula:
\(m=\mathrm{min}(\mathrm{max}(2p,p+2),{n}_{{\mathrm{ddl}}_{\mathrm{actifs}}})\)
where \(p\) is the number of eigenvalues to be calculated and \({n}_{\mathrm{ddl}-\mathrm{actifs}}\) is the number of active degrees of freedom of the system (cf. §3.2).
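The default formula can be written directly (the helper name is ours, not a Code_Aster routine):

```python
def default_subspace_dim(p, n_active_dof):
    """Empirical default size of the projection subspace:
    m = min(max(2p, p+2), n_active_dof)."""
    return min(max(2 * p, p + 2), n_active_dof)

print(default_subspace_dim(10, 47))   # 20
print(default_subspace_dim(1, 47))    # 3
print(default_subspace_dim(30, 47))   # 47 (capped by the active DOFs)
```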
The user can always impose the dimension himself via the keyword DIM_SOUS_ESPACE of the keyword factor CALC_FREQ. Instead of a default value, a multiplicative coefficient applied to the number of frequencies contained in the study interval may be preferred; this feature is provided by the COEF_DIM_ESPACE keyword.
The Kahan-Parlett IGSM parameter (see Appendix 2), PARA_ORTHO_SOREN, the maximum number of iterations of the global process, NMAX_ITER_SOREN, and the mode quality control criterion, PREC_SOREN, are accessible to the user under the keyword factor CALC_FREQ. When this last keyword is null, the algorithm initializes it to machine precision.
7.6. Scope of use#
GEP with arbitrary real matrices (symmetric or not), or with a complex matrix \(A\) and a real symmetric matrix \(B\).
The user can specify the class to which his calculation belongs (dynamics, or buckling if the matrices are real symmetric) by initializing the TYPE_RESU keyword. Depending on this value, the FREQ or CHAR_CRIT vector is filled in.
7.7. Display in the message file#
The example below, taken from the code's list of test cases (ssll103b), summarizes all the traces managed by the algorithm. In particular, for each critical load (or frequency), we find the estimate of its quality via the error norm.
Here, IRAM iterated only once and used 30 IGSM re-orthogonalizations (in its first phase). The global resolution consumed 91 matrix-vector products (in fact fewer, due to the « implicit » introduction of the Euclidean dot product) and 31 system resolutions (forward/backward substitutions only, since the work operator is already factorized).
THE NUMBER OF DOF
TOTAL IS: 68
OF LAGRANGE IS: 14
THE NUMBER OF ACTIVE DOF IS: 47
THE CHOSEN OPTION IS: PLUS_PETITE
THE CRITICAL LOAD SHIFT VALUE IS: 0.00000E+00
INFORMATION ON THE REQUESTED CALCULATION:
NUMBER OF REQUESTED MODES: 10
THE DIMENSION OF THE REDUCED SPACE IS: 30
= SORENSEN METHOD (ARPACK CODE) =
= VERSION: 2.4 =
= DATE: 07/31/96 =
NUMBER OF RESTARTS = 1
NUMBER OF PRODUCTS OP*X = 31
NUMBER OF PRODUCTS B*X = 91
NUMBER OF REORTHOGONALIZATIONS (STEP 1) = 30
NUMBER OF REORTHOGONALIZATIONS (STEP 2) = 0
NUMBER OF RESTARTS DUE TO A NULL V0 = 0
THE COMPUTED LOWER AND UPPER CRITICAL LOADS ARE:
CHARGE_CRITIQUE_INF: -9.96796E+06
CHARGE_CRITIQUE_SUP: -6.80007E+05
MODAL CALCULATION: SIMULTANEOUS ITERATION METHOD
SORENSEN METHOD
NUMBER    CRITICAL LOAD    ERROR NORM
1    -6.80007E+05    5.88791E-12
2    -7.04572E+05    1.53647E-12
3    -7.09004E+05    1.16735E-12
…
10    -9.96796E+06    3.55014E-12
A POSTERIORI VERIFICATION OF THE MODES
IN THE INTERVAL (-1.00178E+07, -6.76607E+05)
THERE ARE INDEED 10 CRITICAL LOAD(S)
Example 6. Trace of CALC_MODES with OPTION among ["BANDE", "CENTRE", "PLUS_*"] in the message file, with SOLVEUR_MODAL=_F(METHODE="SORENSEN").
Note:
The introduction of this method has made it possible to resolve numerous anomaly sheets related to multiplicities, clusters or searches for eigenvalues of very different orders of magnitude on which Lanczos and Bathe & Wilson were struggling.
7.8. Summary of the settings#
Let's now summarize the settings available for the operator CALC_MODES with OPTION among ["BANDE", "CENTRE", "PLUS_*"] with this algorithm.
| Keyword factor | Keyword | Default value | References |
|---|---|---|---|
|  | TYPE_RESU = "DYNAMIQUE" | "DYNAMIQUE" | §3.1 |
|  | = "MODE_FLAMB" |  | §3.1 |
|  | OPTION = "PLUS_PETITE" |  | §5.4 |
|  | = "BANDE" |  | §5.4 |
|  | = "CENTRE" |  | §5.4 |
| SOLVEUR_MODAL | METHODE = "SORENSEN" | "SORENSEN" | §7.4 |
|  | PREC_SOREN |  | §7.4, §7.5 |
|  | NMAX_ITER_SOREN | 20 | §7.4, §7.5 |
|  | PARA_ORTHO_SOREN | 0.717 | §7.5, Appendix 2 |
|  | DIM_SOUS_ESPACE / COEF_DIM_ESPACE | computed | §7.5 |
| CALC_FREQ or CALC_CHAR_CRIT | FREQ or CHAR_CRIT (if "CENTRE") |  | §5.4 |
|  | NMAX_FREQ or NMAX_CHAR_CRIT (if "PLUS_*") | 10 | §5.4 |
|  | AMOR_REDUIT (if "CENTRE" and FREQ) |  | §5.4 |
|  | NMAX_ITER_SHIFT | 3 | §3.2 |
|  | PREC_SHIFT | 0.05 | §3.2 |
|  | SEUIL_FREQ or SEUIL_CHAR_CRIT | 1.E-02 | §3.7 |
| VERI_MODE | STOP_ERREUR = "OUI" | "OUI" | §3.7 |
|  | = "NON" |  | §3.7 |
|  | PREC_SHIFT | 5.E-03 | §3.7 |
|  | SEUIL | 1.E-06 | §3.7 |
|  | STURM = "OUI" | "OUI" | §3.7 |
|  | = "NON" |  | §3.7 |

Table 7.8-1. Summary of the settings of CALC_MODES with OPTION among ["BANDE", "CENTRE", "PLUS_*"] (GEP)
with SOLVEUR_MODAL=_F(METHODE="SORENSEN").
Notes:
We find the whole « trilogy » of parameters related to the pre-processing of the Sturm test (NMAX_ITER_SHIFT, PREC_SHIFT) and to the verification post-processing (SEUIL_FREQ/SEUIL_CHAR_CRIT, VERI_MODE).
During the first runs, it is strongly recommended to change only the main parameters. The others concern the inner workings of the algorithm and have been initialized empirically to standard values.
In particular, to improve the quality of a mode, the fundamental parameter is the dimension of the subspace (DIM_SOUS_ESPACE/COEF_DIM_ESPACE).