Advanced Features of Matrices and Solvers#

This chapter introduces additional features of the PETSc matrices and solvers.

Extracting Submatrices#

One can extract a (parallel) submatrix from a given (parallel) using

MatCreateSubMatrix(Mat A,IS rows,IS cols,MatReuse call,Mat *B);

This extracts the rows and cols of the matrix A into B. If call is MAT_INITIAL_MATRIX it will create the matrix B. If call is MAT_REUSE_MATRIX it will reuse the B created with a previous call. This function is used internally by PCFIELDSPLIT.

One can also extract one or more submatrices per MPI process with

MatCreateSubMatrices(Mat A,PetscInt n,IS rows[],IS cols[],MatReuse call,Mat *B[]);

This extracts n (zero or more) matrices with the rows[k] and cols[k] of the matrix A into an array of sequential matrices B[k] on this process. If call is MAT_INITIAL_MATRIX it will create the array of matrices B. If call is MAT_REUSE_MATRIX it will reuse the B created with a previous call. The IS arguments are sequential. The array of matrices should be destroyed with MatDestroySubMatrices(). This function is used by PCBJACOBI and PCASM.

Each submatrix may be parallel, existing on a MPI_Comm associated with each pair of IS rows[k] and cols[k], using

MatCreateSubMatricesMPI(Mat A,PetscInt n,IS rows[],IS cols[],MatReuse call,Mat *B[]);

Finally this version has a specialization

MatGetMultiProcBlock(Mat A, MPI_Comm subComm, MatReuse scall,Mat *subMat);

where collections of non-overlapping MPI processes share a single parallel matrix on their sub-communicator. This function is used by PCBJACOBI and PCASM.

The routine

MatCreateRedundantMatrix(Mat A,PetscInt nsubcomm,MPI_Comm subcomm,MatReuse reuse,Mat *matredundant);

where nsubcomm copies of the entire matrix are stored, one on each subcomm. The routine PetscSubcommCreate() and its PetscSubcomm object may, but need not be, used to construct the subcomm.

The routine

MatMPIAdjToSeq(Mat A,Mat *B);

is a specialization that duplicates an entire MATMPIADJ matrix on each MPI process.

Matrix Factorization#

Normally, PETSc users will access the matrix solvers through the KSP interface, as discussed in KSP: Linear System Solvers, but the underlying factorization and triangular solve routines are also directly accessible to the user.

The ILU, LU, ICC, Cholesky, and QR matrix factorizations are split into two or three stages depending on the user’s needs. The first stage is to calculate an ordering for the matrix. The ordering generally is done to reduce fill in a sparse factorization; it does not make much sense for a dense matrix.

MatGetOrdering(Mat matrix,MatOrderingType type,IS* rowperm,IS* colperm);

The currently available alternatives for the ordering type are

MATORDERINGNATURAL - Natural
MATORDERINGND - Nested Dissection
MATORDERING1WD - One-way Dissection
MATORDERINGRCM - Reverse Cuthill-McKee
MATORDERINGQMD - Quotient Minimum Degree

These orderings can also be set through the options database.

Certain matrix formats may support only a subset of these. All of these orderings are symmetric at the moment; ordering routines that are not symmetric may be added. Currently we support orderings only for sequential matrices.

Users can add their own ordering routines by providing a function with the calling sequence

int reorder(Mat A,MatOrderingType type,IS* rowperm,IS* colperm);

Here A is the matrix for which we wish to generate a new ordering, type may be ignored and rowperm and colperm are the row and column permutations generated by the ordering routine. The user registers the ordering routine with the command

MatOrderingRegister(MatOrderingType ordname,char *path,char *sname,PetscErrorCode (*reorder)(Mat,MatOrderingType,IS*,IS*)));

The input argument ordname is a string of the user’s choice, either an ordering defined in petscmat.h or the name of a new ordering introduced by the user. See the code in src/mat/impls/order/sorder.c and other files in that directory for examples on how the reordering routines may be written.

Once the reordering routine has been registered, it can be selected for use at runtime with the command line option -pc_factor_mat_ordering_type ordname. If reordering from the API, the user should provide the ordname as the second input argument of MatGetOrdering().

PETSc matrices interface to a variety of external factorization/solver packages via the MatSolverType which can be MATSOLVERSUPERLU_DIST, MATSOLVERMUMPS, MATSOLVERPASTIX, MATSOLVERMKL_PARDISO, MATSOLVERMKL_CPARDISO, MATSOLVERUMFPACK, MATSOLVERCHOLMOD, MATSOLVERKLU, MATSOLVERCUSPARSE, and MATSOLVERCUDA. The last three of which can run on GPUs, while MATSOLVERSUPERLU_DIST can partially run on GPUs. See Summary of Sparse Linear Solvers Available In PETSc for a table of the factorization based solvers in PETSc.

Most of these packages compute their own orderings and cannot use ones provided so calls to the following routines with those packages can pass NULL as the IS permutations.

The following routines perform incomplete and complete, in-place, symbolic, and numerical factorizations for symmetric and nonsymmetric matrices:

MatICCFactor(Mat matrix,IS permutation,const MatFactorInfo *info);
MatCholeskyFactor(Mat matrix,IS permutation,const MatFactorInfo *info);
MatILUFactor(Mat matrix,IS rowpermutation,IS columnpermutation,const MatFactorInfo *info);
MatLUFactor(Mat matrix,IS rowpermutation,IS columnpermutation,const MatFactorInfo *info);
MatQRFactor(Mat matrix, IS columnpermutation, const MatFactorInfo *info);

The argument info->fill > 1 is the predicted fill expected in the factored matrix, as a ratio of the original fill. For example, info->fill = 2.0 would indicate that one expects the factored matrix to have twice as many nonzeros as the original.

For sparse matrices it is very unlikely that the factorization is actually done in-place. More likely, new space is allocated for the factored matrix and the old space deallocated, but to the user it appears in-place because the factored matrix replaces the unfactored matrix.

The two factorization stages can also be performed separately, by using the preferred out-of-place mode, first one obtains that matrix object that will hold the factor using

MatGetFactor(Mat matrix,MatSolverType package,MatFactorType ftype,Mat *factor);

and then performs the factorization

MatICCFactorSymbolic(Mat factor,Mat matrix,IS perm,const MatFactorInfo *info);
MatCholeskyFactorSymbolic(Mat factor,Mat matrix,IS perm,const MatFactorInfo *info);
MatCholeskyFactorNumeric(Mat factor,Mat matrix,const MatFactorInfo);

MatILUFactorSymbolic(Mat factor,Mat matrix,IS rowperm,IS colperm,const MatFactorInfo *info);
MatLUFactorSymbolic(Mat factor,Mat matrix,IS rowperm,IS colperm,const MatFactorInfo *info);
MatLUFactorNumeric(Mat factor,Mat matrix,const MatFactorInfo *info);

MatQRFactorSymbolic(Mat factor,Mat matrix,IS perm,const MatFactorInfo *info);
MatQRFactorNumeric(Mat factor,Mat matrix,const MatFactorInfo *info);

In this case, the contents of the matrix factor is undefined between the symbolic and numeric factorization stages. It is possible to reuse the symbolic factorization. For the second and succeeding factorizations, one simply calls the numerical factorization with a new input matrix and the same factored factor matrix. It is essential that the new input matrix have exactly the same nonzero structure as the original factored matrix. (The numerical factorization merely overwrites the numerical values in the factored matrix and does not disturb the symbolic portion, thus enabling reuse of the symbolic phase.) In general, calling XXXFactorSymbolic with a dense matrix will do nothing except allocate the new matrix; the XXXFactorNumeric routines will do all of the work.

Why provide the plain XXXFactor routines when one could simply call the two-stage routines? The answer is that if one desires in-place factorization of a sparse matrix, the intermediate stage between the symbolic and numeric phases cannot be stored in a factor matrix, and it does not make sense to store the intermediate values inside the original matrix that is being transformed. We originally made the combined factor routines do either in-place or out-of-place factorization, but then decided that this approach was not needed and could easily lead to confusion.

We do not provide our own sparse matrix factorization with pivoting for numerical stability. This is because trying to both reduce fill and do pivoting can become quite complicated. Instead, we provide a poor stepchild substitute. After one has obtained a reordering, with MatGetOrdering(Mat A,MatOrdering type,IS *row,IS *col) one may call

MatReorderForNonzeroDiagonal(Mat A,PetscReal tol,IS row, IS col);

which will try to reorder the columns to ensure that no values along the diagonal are smaller than tol in a absolute value. If small values are detected and corrected for, a nonsymmetric permutation of the rows and columns will result. This is not guaranteed to work, but may help if one was simply unlucky in the original ordering. When using the KSP solver interface the option -pc_factor_nonzeros_along_diagonal <tol> may be used. Here, tol is an optional tolerance to decide if a value is nonzero; by default it is 1.e-10.

The external MatSolverType’s MATSOLVERSUPERLU_DIST and MATSOLVERMUMPS do manage numerical pivoting internal to their API.

The external factorization packages each provide a wide number of options to chose from, details on these may be found by consulting the manual page for the solver package, such as, MATSOLVERSUPERLU_DIST. Most of the options can be easily set via the options database even when the factorization solvers are accessed via KSP.

Once a matrix has been factored, it is natural to solve linear systems. The following four routines enable this process:

MatSolve(Mat A,Vec x, Vec y);
MatSolveTranspose(Mat A, Vec x, Vec y);
MatSolveAdd(Mat A,Vec x, Vec y, Vec w);
MatSolveTransposeAdd(Mat A, Vec x, Vec y, Vec w);

matrix A of these routines must have been obtained from a factorization routine; otherwise, an error will be generated. In general, the user should use the KSP solvers introduced in the next chapter rather than using these factorization and solve routines directly.

Some of the factorizations also support solves with multiple right-hand sides stored in a Mat using

MatMatSolve(Mat A,Mat B,Mat X);

and

MatMatSolveTranspose(Mat A,Mat B,Mat X);

Finally, MATSOLVERMUMPS, provides access to Schur complements obtained after partial factorizations as well as the inertia of a matrix via MatGetInertia().

Matrix-Matrix Products#

PETSc matrices provide code for computing various matrix-matrix products. This section will introduce the two sets of routines available. For now consult MatCreateProduct() and MatMatMult().

Creating PC’s Directly#

Users obtain their preconditioner contexts from the KSP context with the command KSPGetPC(). It is possible to create, manipulate, and destroy PC contexts directly, although this capability should rarely be needed. To create a PC context, one uses the command

PCCreate(MPI_Comm comm,PC *pc);

The routine

PCSetType(PC pc,PCType method);

sets the preconditioner method to be used. The routine

PCSetOperators(PC pc,Mat Amat,Mat Pmat);

set the matrices that are to be used with the preconditioner. The routine

PCGetOperators(PC pc,Mat *Amat,Mat *Pmat);

returns the values set with PCSetOperators().

The preconditioners in PETSc can be used in several ways. The two most basic routines simply apply the preconditioner or its transpose and are given, respectively, by

PCApply(PC pc,Vec x,Vec y);
PCApplyTranspose(PC pc,Vec x,Vec y);

In particular, for a matrix, B, that has been set via PCSetOperators(pc,Amat,Pmat), the routine PCApply(pc,x,y) computes \(y = B^{-1} x\) by solving the linear system \(By = x\) with the specified preconditioner method.

Additional preconditioner routines are

PCApplyBAorAB(PC pc,PCSide right,Vec x,Vec y,Vec work);
PCApplyBAorABTranspose(PC pc,PCSide right,Vec x,Vec y,Vec work);
PCApplyRichardson(PC pc,Vec x,Vec y,Vec work,PetscReal rtol,PetscReal atol, PetscReal dtol,PetscInt maxits,PetscBool zeroguess,PetscInt *outits,PCRichardsonConvergedReason*);

The first two routines apply the action of the matrix followed by the preconditioner or the preconditioner followed by the matrix depending on whether the right is PC_LEFT or PC_RIGHT. The final routine applies its iterations of Richardson’s method. The last three routines are provided to improve efficiency for certain Krylov subspace methods.

A PC context that is no longer needed can be destroyed with the command

PCDestroy(PC *pc);