The PETSc Kernel

PETSc provides a variety of basic services for writing scalable, component-based libraries; these are referred to as the PETSc kernel [BGMS98]. The source code for the kernel is in src/sys. It contains systematic support for

  • managing PETSc types,

  • error handling,

  • memory management,

  • profiling,

  • object management,

  • Fortran interfaces (see [BBK+15])

  • mechanism for generating appropriate citations for algorithms and software used in PETSc (see [KBMS13])

  • file I/O,

  • an options database, and

  • objects and code for viewing, drawing, and displaying data and solver objects.

Each of these is discussed in a section below.

PETSc Types

For maximum flexibility, the basic data types int, double, and so on are not used in PETSc source code. Rather, it has

PetscInt can be set using configure to be either int (32 bit, the default) or long long (64 bit, with configure –with-64-bit-indices) to allow indexing into very large arrays. PetscMPIInt is used for integers passed to MPI as counts and sizes. These are always int since that is what the MPI standard uses. Similarly, PetscBLASInt is for counts, and so on passed to BLAS and LAPACK routines. These are almost always int unless one is using a special “64-bit integer” BLAS/LAPACK (this is available, for example, with Intel’s MKL and OpenBLAS).

In addition, there are special types:

These are currently always int, but their use clarifies the code.

Implementation of Error Handling

PETSc uses a “call error handler; then (depending on result) return error code” model when problems are detected in the running code. The public include file for error handling is include/petscerror.h, and the source code for the PETSc error handling is in src/sys/error/.

Simplified Interface

The simplified macro-based interface consists of the following two calls:

The macro SETERRQ() is given by

#define SETERRQ(comm,ierr,s) return PetscError(comm,__LINE__,PETSC_FUNCTION_NAME,__FILE__,ierr,PETSC_ERROR_INITIAL,s)

It calls the error handler with the current function name and location: line number, and file, plus an error code and an error message. Normally comm is PETSC_COMM_SELF; it can be another communicator only if one is absolutely sure the same error will be generated on all processes in the communicator. This feature is to prevent the same error message from being printed by many processes.

The macro CHKERRQ() is defined by

#define CHKERRQ(ierr)          do {PetscErrorCode ierr__ = (ierr); if (PetscUnlikely(ierr__)) return PetscError(PETSC_COMM_SELF,__LINE__,PETSC_FUNCTION_NAME,__FILE__,ierr__,PETSC_ERROR_REPEAT," ");} while (0)

In addition to SETERRQ(), the macros SETERRQ1(), SETERRQ2(), SETERRQ3(), and SETERRQ4() allow one to provide additional arguments to a formatted message string, for example,

SETERRQ2(comm,PETSC_ERR,"Iteration overflow: its %D norm %g",its,(double)norm);

The reason for the numbered format is that C89 CPP macros cannot handle a variable number of arguments.

Error Handlers

The error-handling function PetscError() calls the “current” error handler with the code

PetscErrorCode PetscError(MPI_Comm comm,int line,const char *func,const char *file,PetscErrorCode n,PetscErrorType p,const char *mess,...)
{
  va_list        Argp;
  size_t         fullLength;
  char           buf[2048],*lbuf = NULL;
  PetscBool      ismain;
  PetscErrorCode ierr;

  if (!func) func = "User provided function";
  if (!file) file = "User file";
  if (comm == MPI_COMM_NULL) comm = PETSC_COMM_SELF;

  /* Compose the message evaluating the print format */
  if (mess) {
    va_start(Argp,mess);
    PetscVSNPrintf(buf,2048,mess,&fullLength,Argp);
    va_end(Argp);
    lbuf = buf;
    if (p == PETSC_ERROR_INITIAL) PetscStrncpy(PetscErrorBaseMessage,lbuf,1023);
  }

  if (p == PETSC_ERROR_INITIAL && n != PETSC_ERR_MEMC) PetscMallocValidate(__LINE__,PETSC_FUNCTION_NAME,__FILE__);

  if (!eh) ierr = PetscTraceBackErrorHandler(comm,line,func,file,n,p,lbuf,NULL);
  else ierr = (*eh->handler)(comm,line,func,file,n,p,lbuf,eh->ctx);
  PetscStackClearTop;

  /*
      If this is called from the main() routine we call MPI_Abort() instead of
    return to allow the parallel program to be properly shutdown.

    Does not call PETSCABORT() since that would provide the wrong source file and line number information
  */
  PetscStrncmp(func,"main",4,&ismain);
  if (ismain) {
    PetscMPIInt errcode;
    errcode = (PetscMPIInt)(0 + 0*line*1000 + ierr);
    if (petscwaitonerrorflg) { PetscSleep(1000); }
    MPI_Abort(MPI_COMM_WORLD,errcode);
  }

#if defined(PETSC_CLANGUAGE_CXX)
  if (p == PETSC_ERROR_IN_CXX) {
    PetscCxxErrorThrow();
  }
#endif
  return ierr;
}

/* -------------------------------------------------------------------------*/

/*@C
    PetscIntView - Prints an array of integers; useful for debugging.

    Collective on PetscViewer

    Input Parameters:
+   N - number of integers in array
.   idx - array of integers
-   viewer - location to print array,  PETSC_VIEWER_STDOUT_WORLD, PETSC_VIEWER_STDOUT_SELF or 0

  Level: intermediate

    Developer Notes:
    idx cannot be const because may be passed to binary viewer where byte swapping is done

.seealso: PetscRealView()
@*/
PetscErrorCode  PetscIntView(PetscInt N,const PetscInt idx[],PetscViewer viewer)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank,size;
  PetscInt       j,i,n = N/20,p = N % 20;
  PetscBool      iascii,isbinary;
  MPI_Comm       comm;

  PetscFunctionBegin;
  if (!viewer) viewer = PETSC_VIEWER_STDOUT_SELF;
  PetscValidHeaderSpecific(viewer,PETSC_VIEWER_CLASSID,3);
  if (N) PetscValidIntPointer(idx,2);
  ierr = PetscObjectGetComm((PetscObject)viewer,&comm);CHKERRQ(ierr);
  ierr = MPI_Comm_size(comm,&size);CHKERRMPI(ierr);
  ierr = MPI_Comm_rank(comm,&rank);CHKERRMPI(ierr);

  ierr = PetscObjectTypeCompare((PetscObject)viewer,PETSCVIEWERASCII,&iascii);CHKERRQ(ierr);
  ierr = PetscObjectTypeCompare((PetscObject)viewer,PETSCVIEWERBINARY,&isbinary);CHKERRQ(ierr);
  if (iascii) {
    ierr = PetscViewerASCIIPushSynchronized(viewer);CHKERRQ(ierr);
    for (i=0; i<n; i++) {
      if (size > 1) {
        ierr = PetscViewerASCIISynchronizedPrintf(viewer,"[%d] %D:", rank, 20*i);CHKERRQ(ierr);
      } else {
        ierr = PetscViewerASCIISynchronizedPrintf(viewer,"%D:",20*i);CHKERRQ(ierr);
      }
      for (j=0; j<20; j++) {
        ierr = PetscViewerASCIISynchronizedPrintf(viewer," %D",idx[i*20+j]);CHKERRQ(ierr);
      }
      ierr = PetscViewerASCIISynchronizedPrintf(viewer,"\n");CHKERRQ(ierr);
    }
    if (p) {
      if (size > 1) {
        ierr = PetscViewerASCIISynchronizedPrintf(viewer,"[%d] %D:",rank ,20*n);CHKERRQ(ierr);
      } else {
        ierr = PetscViewerASCIISynchronizedPrintf(viewer,"%D:",20*n);CHKERRQ(ierr);
      }
      for (i=0; i<p; i++) { ierr = PetscViewerASCIISynchronizedPrintf(viewer," %D",idx[20*n+i]);CHKERRQ(ierr);}
      ierr = PetscViewerASCIISynchronizedPrintf(viewer,"\n");CHKERRQ(ierr);
    }
    ierr = PetscViewerFlush(viewer);CHKERRQ(ierr);
    ierr = PetscViewerASCIIPopSynchronized(viewer);CHKERRQ(ierr);
  } else if (isbinary) {
    PetscMPIInt *sizes,Ntotal,*displs,NN;
    PetscInt    *array;

    ierr = PetscMPIIntCast(N,&NN);CHKERRQ(ierr);

    if (size > 1) {
      if (rank) {
        ierr = MPI_Gather(&NN,1,MPI_INT,NULL,0,MPI_INT,0,comm);CHKERRMPI(ierr);
        ierr = MPI_Gatherv((void*)idx,NN,MPIU_INT,NULL,NULL,NULL,MPIU_INT,0,comm);CHKERRMPI(ierr);
      } else {
        ierr      = PetscMalloc1(size,&sizes);CHKERRQ(ierr);
        ierr      = MPI_Gather(&NN,1,MPI_INT,sizes,1,MPI_INT,0,comm);CHKERRMPI(ierr);
        Ntotal    = sizes[0];
        ierr      = PetscMalloc1(size,&displs);CHKERRQ(ierr);
        displs[0] = 0;
        for (i=1; i<size; i++) {
          Ntotal   += sizes[i];
          displs[i] =  displs[i-1] + sizes[i-1];
        }
        ierr = PetscMalloc1(Ntotal,&array);CHKERRQ(ierr);
        ierr = MPI_Gatherv((void*)idx,NN,MPIU_INT,array,sizes,displs,MPIU_INT,0,comm);CHKERRMPI(ierr);
        ierr = PetscViewerBinaryWrite(viewer,array,Ntotal,PETSC_INT);CHKERRQ(ierr);
        ierr = PetscFree(sizes);CHKERRQ(ierr);
        ierr = PetscFree(displs);CHKERRQ(ierr);
        ierr = PetscFree(array);CHKERRQ(ierr);
      }
    } else {
      ierr = PetscViewerBinaryWrite(viewer,idx,N,PETSC_INT);CHKERRQ(ierr);
    }
  } else {
    const char *tname;
    ierr = PetscObjectGetName((PetscObject)viewer,&tname);CHKERRQ(ierr);
    SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_SUP,"Cannot handle that PetscViewer of type %s",tname);
  }
  PetscFunctionReturn(0);
}

You can set a new error handler with the command PetscPushErrorHandler(), which maintains a linked list of error handlers. The most recent error handler is removed via PetscPopErrorHandler().

PETSc provides several default error handlers:

Error Codes

The PETSc error handler takes an error code. The generic error codes are defined in include/petscerror.h. The same error code is used many times in the libraries. For example, the error code PETSCERRMEM is used whenever a requested memory allocation is not available.

Detailed Error Messages

In a modern parallel component-oriented application code, it does not always make sense to simply print error messages to the terminal (and more than likely there is no “terminal”, for example, with Microsoft Windows or Apple iPad applications). PETSc provides the replaceable function pointer

(*PetscErrorPrintf)("Format",...);

which, by default, prints to standard out. Thus, error messages should not be printed with printf() or fprintf(). Rather, they should be printed with (*PetscErrorPrintf)(). You can direct all error messages to stderr, instead of the default stdout, with the command line option -erroroutputstderr.

C++ Exceptions

In PETSc code, when one calls C++ functions that do not return with an error code but might instead throw C++ exceptions, one can use CHKERRCXX(func), which catches the exceptions in func and then calls SETERRQ1(). The macro CHKERRCXX(func) is given by

#define CHKERRCXX(func) do {try {func;} catch (const std::exception& e) { SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_LIB,"%s", e.what()); }} while (0)

Memory Management

PETSc provides simple wrappers for the system malloc(), calloc(), and free() routines. The public interface for these is provided in petscsys.h, while the implementation code is in src/sys/memory. The most basic interfaces are

#define PetscMalloc(a,b)  ((*PetscTrMalloc)((a),PETSC_FALSE,__LINE__,PETSC_FUNCTION_NAME,__FILE__,(void**)(b)))
#define PetscFree(a)   ((*PetscTrFree)((void*)(a),__LINE__,PETSC_FUNCTION_NAME,__FILE__) || ((a) = NULL,0))
PetscErrorCode PetscMallocA(int n,PetscBool clear,int lineno,const char *function,const char *filename,size_t bytes0,void *ptr0,...)
{
  PetscErrorCode ierr;
  va_list        Argp;
  size_t         bytes[8],sumbytes;
  void           **ptr[8];
  int            i;

  PetscFunctionBegin;
  if (n > 8) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"Attempt to allocate %d objects but only 8 supported",n);
  bytes[0] = bytes0;
  ptr[0] = (void**)ptr0;
  sumbytes = (bytes0 + PETSC_MEMALIGN-1) & ~(PETSC_MEMALIGN-1);
  va_start(Argp,ptr0);
  for (i=1; i<n; i++) {
    bytes[i] = va_arg(Argp,size_t);
    ptr[i] = va_arg(Argp,void**);
    sumbytes += (bytes[i] + PETSC_MEMALIGN-1) & ~(PETSC_MEMALIGN-1);
  }
  va_end(Argp);
  if (petscmalloccoalesce) {
    char *p;
    ierr = (*PetscTrMalloc)(sumbytes,clear,lineno,function,filename,(void**)&p);CHKERRQ(ierr);
    for (i=0; i<n; i++) {
      *ptr[i] = bytes[i] ? p : NULL;
      p = (char*)PetscAddrAlign(p + bytes[i]);
    }
  } else {
    for (i=0; i<n; i++) {
      ierr = (*PetscTrMalloc)(bytes[i],clear,lineno,function,filename,(void**)ptr[i]);CHKERRQ(ierr);
    }
  }
  PetscFunctionReturn(0);
}
PetscErrorCode PetscFreeA(int n,int lineno,const char *function,const char *filename,void *ptr0,...)
{
  PetscErrorCode ierr;
  va_list        Argp;
  void           **ptr[8];
  int            i;

  PetscFunctionBegin;
  if (n > 8) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"Attempt to allocate %d objects but only up to 8 supported",n);
  ptr[0] = (void**)ptr0;
  va_start(Argp,ptr0);
  for (i=1; i<n; i++) {
    ptr[i] = va_arg(Argp,void**);
  }
  va_end(Argp);
  if (petscmalloccoalesce) {
    for (i=0; i<n; i++) {       /* Find first nonempty allocation */
      if (*ptr[i]) break;
    }
    while (--n > i) {
      *ptr[n] = NULL;
    }
    ierr = (*PetscTrFree)(*ptr[n],lineno,function,filename);CHKERRQ(ierr);
    *ptr[n] = NULL;
  } else {
    while (--n >= 0) {
      ierr = (*PetscTrFree)(*ptr[n],lineno,function,filename);CHKERRQ(ierr);
      *ptr[n] = NULL;
    }
  }
  PetscFunctionReturn(0);
}

which allow the use of any number of profiling and error-checking wrappers for malloc(), calloc(), and free(). Both PetscMallocA() and PetscFreeA() call the function pointer values (*PetscTrMalloc) and (*PetscTrFree). PetscMallocSet() is used to set these function pointers. The functions are guaranteed to support requests for zero bytes of memory correctly. Freeing memory locations also sets the pointer value to zero, preventing later code from accidentally using memory that has been freed. All PETSc memory allocation calls are memory aligned on at least double-precision boundaries; the macro generated by configure PETSCMEMALIGN indicates in bytes what alignment all allocations have. This can be controlled at configure time with the option -with-memalign=<4,8,16,32,64>.

PetscMallocA() supports a request for up to 7 distinct memory locations of possibly different types. This serves two purposes: it reduces the number of system malloc() calls, thus potentially increasing performance, and it clarifies in the code related memory allocations that should be freed together.

The following macros are the preferred way to obtain and release memory in the PETSc source code. They automatically manage calling PetscMallocA() and PetscFreeA() with the appropriate location information.

#define PetscMalloc1(m1,r1) PetscMallocA(1,PETSC_FALSE,__LINE__,PETSC_FUNCTION_NAME,__FILE__,(size_t)(m1)*sizeof(**(r1)),(r1))
#define PetscMalloc2(m1,r1,m2,r2) PetscMallocA(2,PETSC_FALSE,__LINE__,PETSC_FUNCTION_NAME,__FILE__,(size_t)(m1)*sizeof(**(r1)),(r1),(size_t)(m2)*sizeof(**(r2)),(r2))

#define PetscMalloc7(m1,r1,m2,r2,m3,r3,m4,r4,m5,r5,m6,r6,m7,r7) PetscMallocA(7,PETSC_FALSE,__LINE__,PETSC_FUNCTION_NAME,__FILE__,(size_t)(m1)*sizeof(**(r1)),(r1),(size_t)(m2)*sizeof(**(r2)),(r2),(size_t)(m3)*sizeof(**(r3)),(r3),(size_t)(m4)*sizeof(**(r4)),(r4),(size_t)(m5)*sizeof(**(r5)),(r5),(size_t)(m6)*sizeof(**(r6)),(r6),(size_t)(m7)*sizeof(**(r7)),(r7))
#define PetscFree(a)   ((*PetscTrFree)((void*)(a),__LINE__,PETSC_FUNCTION_NAME,__FILE__) || ((a) = NULL,0))
#define PetscFree2(m1,m2)   PetscFreeA(2,__LINE__,PETSC_FUNCTION_NAME,__FILE__,&(m1),&(m2))

#define PetscFree7(m1,m2,m3,m4,m5,m6,m7)   PetscFreeA(7,__LINE__,PETSC_FUNCTION_NAME,__FILE__,&(m1),&(m2),&(m3),&(m4),&(m5),&(m6),&(m7))

Similar routines, PetscCalloc1() to PetscCalloc7(), provide memory initialized to zero. The size requests for these macros are in number of data items requested, not in bytes. This decreases the number of errors in the code since the compiler determines their sizes from the object type instead of requiring the user to provide the correct value with sizeof().

The routines PetscTrMallocDefault() and PetscTrFreeDefault(), which are set with the routine PetscSetUseTrMallocPrivate() (and are used by default for the debug version of PETSc), provide simple logging and error checking versions of memory allocation.

Implementation of Profiling

This section provides details about the implementation of event logging and profiling within the PETSc kernel. The interface for profiling in PETSc is contained in the file include/petsclog.h. The source code for the profile logging is in src/sys/plog/.

Profiling Object Creation and Destruction

The creation of objects is profiled with the command PetscLogObjectCreate()

PetscLogObjectCreate(PetscObject h);

which logs the creation of any PETSc object. Just before an object is destroyed, it should be logged with PetscLogObjectDestroy()

PetscLogObjectDestroy(PetscObject h);

These are called automatically by PetscHeaderCreate() and PetscHeaderDestroy(), which are used in creating all objects inherited from the basic object. Thus, these logging routines need never be called directly.

If an object has a clearly defined parent object (for instance, when a work vector is generated for use in a Krylov solver), this information is logged with the command PetscLogObjectParent().

PetscLogObjectParent(PetscObject parent,PetscObject child);

It is also useful to log information about the state of an object, as can be done with the command

PetscLogObjectState(PetscObject h,const char *format,...);

For example, for sparse matrices we usually log the matrix dimensions and number of nonzeros.

Profiling Events

Events are logged by using the pair PetscLogEventBegin() and PetscLogEventEnd().

This logging is usually done in the abstract interface file for the operations, for example, src/mat/interface/matrix.c.

Controlling Profiling

Routines that control the default profiling available in PETSc include the following

These routines are normally called by the PetscInitialize() and PetscFinalize() routines when the option -logview is given.

References

BBK+15

Satish Balay, Jed Brown, Matthew G. Knepley, Lois McInnes, and Barry Smith. Software Engineering for Science, chapter Providing Mixed Language and Legacy Support within a Library. Taylor & Francis, 2015.

BGMS98

Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith. A microkernel design for component-based parallel numerical software systems. In Proceedings of the SIAM Workshop on Object Oriented Methods for Inter-operable Scientific and Engineering Computing, 58–67. SIAM, 1998. URL: ftp://info.mcs.anl.gov/pub/tech_reports/reports/P727.ps.Z.

KBMS13

Matthew G. Knepley, Jed Brown, Lois Curfman McInnes, and Barry F. Smith. Accurately citing software and algorithms used in publications. In First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE), held at SC13. 2013. doi:10.6084/m9.figshare.785731.