PetscPrefetchBlock#

Prefetches a block of memory

Synopsis#

#include <petscsys.h>
void PetscPrefetchBlock(const anytype *a,size_t n,int rw,int t)

Not Collective

Input Parameters#

  • a - pointer to first element to fetch (any type but usually PetscInt or PetscScalar)

  • n - number of elements to fetch

  • rw - 1 if the memory will be written to, otherwise 0 (ignored by many processors)

  • t - temporal locality (PETSC_PREFETCH_HINT_{NTA,T0,T1,T2}), see note

Notes#

The last two arguments (rw and t) must be compile-time constants.

Adopting Intel’s x86/x86-64 conventions, there are four levels of temporal locality. Not all architectures offer equivalent locality hints, but the following macros are always defined to their closest analogue.

  • PETSC_PREFETCH_HINT_NTA - Non-temporal. Prefetches directly to L1, evicts to memory (skips higher level cache unless it was already there when prefetched).

  • PETSC_PREFETCH_HINT_T0 - Fetch to all levels of cache and evict to the closest level. Use this when the memory will be reused regularly despite necessary eviction from L1.

  • PETSC_PREFETCH_HINT_T1 - Fetch to level 2 and higher (not L1).

  • PETSC_PREFETCH_HINT_T2 - Fetch to high-level cache only. (On many systems, T0 and T1 are equivalent.)

This function does nothing on architectures that do not support prefetch and never errors (even if passed an invalid address).

Level#

developer

Location#

include/petscsys.h


Index of all Sys routines
Table of Contents for all manual pages
Index of all manual pages