DMSTAG: Staggered, Structured Grid#

For structured (aka “regular”) grids with staggered data (living on elements, faces, edges, and/or vertices), the DMSTAG object is available. This can be useful for problems in many domains, including fluid flow, MHD, and seismology.

It is possible, though cumbersome, to implement a staggered-grid code using multiple DMDA objects, or a single multi-component DMDA object where some degrees of freedom are unused. DMSTAG was developed for two main purposes:

  1. To help manage some of the burden of choosing and adhering to the complex indexing conventions needed for staggered grids (in parallel)

  2. To provide a uniform abstraction for which scalable solvers and preconditioners may be developed (in particular, using PCFIELDSPLIT and PCMG).

DMSTAG is design to behave much like DMDA, with a couple of important distinctions, and borrows some terminology from DMPLEX.


Like a DMPLEX object, a DMSTAG represents a cell complex, distributed in parallel over the ranks of an MPI_Comm. It is, however, a very regular complex, consisting of a structured grid of \(d\)-dimensional cells, with \(d \in \{1,2,3\}\), which are referred to as elements, \(d-1\) dimensional cells defining boundaries between these elements, and the boundaries of the domain, and in 2 or more dimensions, boundaries of these cells, all the way down to 0 dimensional cells referred to as vertices. In 2 dimensions, the 1-dimensional element boundaries are referred to as edges or faces. In 3 dimensions, the 2-dimensional element boundaries are referred to as faces and the 1-dimensional boundaries between faces are referred to as edges The set of cells of a given dimension is referred to as a stratum (which one can think of as a level in DAG representation of the mesh); a DMSTAG object of dimension \(d\) represents a complete cell complex with \(d+1\) strata (levels).

In the description of any:ch_unstructured the cells at each level are referred to as points. Thus we adopt that terminology uniformly in PETSc and so furthermore in this document, point will refer to a cell.

Each stratum has a constant number of unknowns (which may be zero) associated with each point (cell) on that level. The distinct unknowns associated with each point are referred to as components.

The structured grid, is like with DMDA, decomposed via a Cartesian product of decompositions in each dimension, giving a rectangular local subdomain on each rank. This is extended by an element-wise stencil width of ghost elements to create an atlas of overlapping patches.

Working with vectors and operators (matrices)#

DMSTAG allows the user to reason almost entirely about a global indexing of elements. Element indices are simply 1-3 PetscInt values, starting at \(0\), in the back, bottom, left corner of the domain. For instance, element \((1,2,3)\), in 3D, is the element second from the left, third from the bottom, and fourth from the back (regardless of how many MPI ranks are used).

To refer to points (elements, faces, edges, and vertices), a value of DMStagStencilLocation is used, relative to the element index. The element itself is referred to with DMSTAG_ELEMENT, the top right vertex (in 2D) or the top right edge (in 3D) with DMSTAG_UP_RIGHT, the back bottom left corner in 3D with DMSTAG_BACK_DOWN_LEFT, and so on.

Fig. 10 gives a few examples in 2D.


Fig. 10 Locations in DMSTAG are indexed according to global element indices (here, two in 2D) and a location name. Elements have unique names but other locations can be referred to in more than one way. Element colors correspond to a parallel decomposition, but locations on the grid have names which are invariant to this. Note that the face on the top right can be referred to as being to the left of a “dummy” element \((3,3)\) outside the physical domain.#

Crucially, this global indexing scheme does not include any “ghost” or “padding” unknowns outside the physical domain. This is useful for higher-level operations such as computing norms or developing physics-based solvers. However (unlike DMDA), this implies that the global Vec do not have a natural block structure, as different strata have different numbers of points (e.g. in 1D there is an “extra” vertex on the right). This regular block structure is, however, very useful for the local representation of the data, so in that case dummy DOF are included, drawn as grey in Fig. 11.


Fig. 11 Local and global representations for a 2D DMSTAG object, 3 by 4 elements, with one degree of freedom on each of the three strata: element (squares), faces (triangles), and vertices (circles). The cell complex is parallelized across 4 MPI ranks. In the global representation, the colors correspond to which rank holds the native representation of the unknown. The 4 local representations are shown, with an (elementwise) stencil “box” stencil width of 1. Unknownd are color by their native rank. Dummy unknowns, which correspond to no global degree of freedom, are colored grey. Note that the local representations have a natural block size of 4, and the global representation has no natural block size.#

For working with Vec data, this approach is used to allow direct access to a multi-dimensional, regular-blocked array. To avoid the user having to know about the internal numbering conventions used, helper functions are used to produce the proper final integer index for a given location and component, referred to as a “slot”. Similarly to DMDAVecGetArrayDOF(), this uses a \(d+1\) dimensional array in \(d\) dimensions. The following snippet give an example of this usage.

  /* Set the second component of all vertex dof to 2.0 */
  PetscCall(DMStagGetCorners(dm, &s_x, &s_y, &s_z, &n_x, &n_y, &n_z, &n_e_x, &n_e_y, &n_e_z));
  PetscCall(DMStagGetLocationSlot(dm, location_vertex, 1, &slot_vertex_2));
  PetscCall(DMStagVecGetArray(dm, x, &x_array));
  for (PetscInt k = s_z; k < s_z + n_z + n_e_z; ++k) {
    for (PetscInt j = s_y; j < s_y + n_y + n_e_y; ++j) {
      for (PetscInt i = s_x; i < s_x + n_x + n_e_x; ++i) x_array[k][j][i][slot_vertex_2] = 2.0;
  PetscCall(DMStagVecRestoreArray(dm, x, &x_array));

DMSTAG provides a stencil-based method for getting and setting entries of Mat and Vec objects. The follow excerpt from DMSTAG Tutorial ex1 demonstrates the idea. For more, see the manual page for DMStagMatSetValuesStencil().

    /* Velocity is either a BC or an interior point */
    if (isFirstRank && e == start) {
      DMStagStencil row;
      PetscScalar   val;

      row.i   = e;
      row.loc = LEFT;
      row.c   = 0;
      val     = 1.0;
      PetscCall(DMStagMatSetValuesStencil(dmSol, A, 1, &row, 1, &row, &val, INSERT_VALUES));

The array-based approach for Vec is likely to be more efficient than the stencil-based method just introduced above.


DMSTAG, unlike DMDA, supports two approaches to defining coordinates. This is captured by which type of DM is used to represent the coordinates. No default is imposed, so the user must directly or indirectly call DMStagSetCoordinateDMType().

If a second DMSTAG object is used to represent coordinates in “explicit” form, behavior is much like with DMDA - the coordinate DM has \(d\) DOF on each stratum corresponding to coordinates associated with each point.

If DMPRODUCT is used instead, coordinates are represented by a DMPRODUCT object referring to a Cartesian product of 1D DMSTAG objects, each of which features explicit coordinates as just mentioned.

Navigating these nested DM in DMPRODUCT can be tedious, but note the existence of helper functions like DMStagSetUniformCoordinatesProduct() and DMStagGetProductCoordinateArrays().

Numberings and internal data layout#

While DMSTAG aims to hide the details of its internal data layout, for debugging, optimization, and customization purposes, it can be important to know how DMSTAG internally numbers unknowns.

Internally, each point is canonically associated with an element (top-level point (cell)). For purposes of local, regular-blocked storage, an element is grouped with lower-dimensional points left of, below (“down”), and behind (“back”) it. This means that “canonical” values of DMStagStencilLocation are DMSTAG_ELEMENT, plus all entries consisting only of “LEFT”, “DOWN”, and “BACK”. In general, these are the most efficient values to use, unless convenience dictates otherwise, as they are the ones used internally.

When creating the decomposition of the domain to local ranks, and extending these local domains to handle overlapping halo regions and boundary ghost unknowns, this same per-element association is used. This has the advantage of maintaining a regular blocking, but may not be optimal in some situations in terms of data movement.

Numberings are, like DMDA, based on a local “x-fastest, z-slowest” or “PETSc” ordering of elements (see Application Orderings), with ordering of locations canonically associated with each element decided by considering unknowns on each point to be located at the center of their point, and using a nested ordering of the same style. Thus, in 3-D, the ordering of the 8 canonical DMStagStencilLocation values associated with an element is

Multiple DOF associated with a given point are stored sequentially (as with DMDA).

For local Vecs, this gives a regular-blocked numbering, with the same number of unknowns associated with each element, including some “dummy” unknowns which to not correspond to any (local or global) unknown in the global representation. See Fig. 13 for an example.

In the global representation, only physical unknowns are numbered (using the same “Z” ordering for unknowns which are present), giving irregular numbers of unknowns, depending on whether a domain boundary is present. See Fig. 12 for an example.


Fig. 12 Global numbering scheme for a 2D DMSTAG object with one DOF per stratum. Note that the numbering depends on the parallel decomposition (over 4 ranks, here).#


Fig. 13 Local numbering scheme on rank 1 (Cf. Fig. 11) for a 2D DMSTAG object with one DOF per stratum. Note that dummy locations (grey) are used to give a regular block size (here, 4).#

It should be noted that this is an interlaced (AoS) representation. If a segregated (SoA) representation is required, one should use DMCOMPOSITE collecting several DMSTAG objects, perhaps using DMStagCreateCompatibleDMStag() to quickly create additional DMSTAG objects from an initial one.