PetscDeviceContextGetStreamHandle#
Return a handle to the underlying stream of the given device context
Synopsis#
#include <petscdevice.h>
PetscErrorCode PetscDeviceContextGetStreamHandle(PetscDeviceContext dctx, void **handle)
Input Parameter#
dctx - The PetscDeviceContext to get the stream from
Output Parameter#
handle - A pointer to the handle to the stream
Note#
This routine is dangerous. It exists only for the most experienced users and internal PETSc development.
There is no way for PETSc’s auto-dependency system to track what the caller does with the stream.
If the user uses the stream to copy memory that was previously modified by PETSc, or launches
kernels that modify memory with the stream, it is the user's responsibility to inform PETSc of
their actions via PetscDeviceContextMarkIntentFromID(). Failure to do so may introduce a
race condition. This race condition may manifest in nondeterministic ways.
Alternatively, the user may synchronize the stream immediately before and after use. This is the safest option.
Example Usage#
PetscDeviceContext dctx;
PetscDeviceType    type;
void              *handle;

PetscCall(PetscDeviceContextGetCurrentContext(&dctx));
PetscCall(PetscDeviceContextGetStreamHandle(dctx, &handle));
PetscCall(PetscDeviceContextGetDeviceType(dctx, &type));
if (type == PETSC_DEVICE_CUDA) {
  // handle points to the stream, so cast and dereference
  cudaStream_t stream = *(cudaStream_t *)handle;

  my_cuda_kernel<<<1, 2, 3, stream>>>();
}
Alternatively, if the type of the PetscDeviceContext is known (for example PETSC_DEVICE_HIP), the
user may pass in a pointer to the stream handle directly:
hipStream_t *stream;
// note the cast to void **
PetscCall(PetscDeviceContextGetStreamHandle(dctx, (void **)&stream));
// note the dereference
my_hip_kernel<<<1, 2, 3, *stream>>>();
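Both usage styles above rely on the same detail: the value written to handle is a pointer to the stream, not the stream itself. The following is a minimal sketch of that pattern in plain C, with no PETSc or GPU runtime involved. Every name in it (mock_stream_t, mock_dctx_t, mock_get_stream_handle and the two helpers) is hypothetical and exists only for illustration. The mock assumes the handle is the address of a stream stored in the context, which is inferred from the examples above rather than from the PETSc implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT PETSc, CUDA, or HIP types. */
typedef struct mock_stream_s { int id; } *mock_stream_t; /* plays the role of cudaStream_t / hipStream_t */

typedef struct {
  mock_stream_t stream; /* the mock context stores the stream */
} mock_dctx_t;

/* Mimics the out-parameter shape of PetscDeviceContextGetStreamHandle():
   *handle receives the ADDRESS of the stored stream (an assumption of this mock). */
static int mock_get_stream_handle(mock_dctx_t *dctx, void **handle)
{
  assert(dctx != NULL && handle != NULL);
  *handle = &dctx->stream;
  return 0;
}

/* Style 1: receive a void *, then cast and dereference. */
static mock_stream_t stream_via_void_ptr(mock_dctx_t *dctx)
{
  void *handle = NULL;

  mock_get_stream_handle(dctx, &handle);
  return *(mock_stream_t *)handle;
}

/* Style 2: pass a typed pointer cast to void **, then dereference. */
static mock_stream_t stream_via_typed_ptr(mock_dctx_t *dctx)
{
  mock_stream_t *stream = NULL;

  mock_get_stream_handle(dctx, (void **)&stream);
  return *stream;
}
```

In this sketch both helpers return the same stream for a given context. Forgetting the dereference in either style would pass the address of the stream where the stream itself is expected.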
Asynchronous API Notes#
This routine is explicitly marked as exhibiting asynchronous behavior. Asynchronous
behavior implies that routines launching operations on (or associated with) a
PetscDeviceContext may return to the caller before the operation has completed.
Sequential Consistency:
Operations using the same PetscDeviceContext which access objects or memory regions
are ordered per the language specification.
Operations using separate PetscDeviceContexts which access the same object or
memory region are strongly write-ordered. That is, the following operations:
write-write
write-read
read-write
are strongly ordered. Formally:
Given an operation A-B (e.g. A = write, B = read) on an object or memory
region M such that A “happens-before” B, where A uses PetscDeviceContext X
and B uses PetscDeviceContext Y, then B shall not begin before A completes.
This implies that any side-effects resulting from A are also observed by B.
Note the omission of read-read; there is no implied ordering between separate
PetscDeviceContexts for consecutive reads.
Operations using separate PetscDeviceContexts which access separate objects or
memory regions may execute in an arbitrary order and offer no guarantee of sequential
consistency.
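The ordering rules above can be summarized as a small decision function. This is only a sketch of the rules as stated on this page, written in plain C; it does not call PETSc, and the names access_t, is_ordered and check_ordering_rules are hypothetical. It assumes that "ordered per the language specification" means operations on a single context are always ordered with respect to each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is PETSc API. */
typedef enum { ACCESS_READ, ACCESS_WRITE } access_t;

/* Returns true when the second operation is guaranteed not to begin
   before the first completes, under the rules listed above. */
static bool is_ordered(bool same_context, bool same_memory, access_t first, access_t second)
{
  if (same_context) return true;  /* one context: ordered per the language specification */
  if (!same_memory) return false; /* separate contexts, separate memory: arbitrary order */
  /* separate contexts, same memory: ordered unless both operations are reads */
  return first == ACCESS_WRITE || second == ACCESS_WRITE;
}

/* Self-check covering the four cross-context, same-memory cases. */
static void check_ordering_rules(void)
{
  assert(is_ordered(false, true, ACCESS_WRITE, ACCESS_WRITE));  /* write-write */
  assert(is_ordered(false, true, ACCESS_WRITE, ACCESS_READ));   /* write-read  */
  assert(is_ordered(false, true, ACCESS_READ, ACCESS_WRITE));   /* read-write  */
  assert(!is_ordered(false, true, ACCESS_READ, ACCESS_READ));   /* read-read   */
}
```

A stream obtained through PetscDeviceContextGetStreamHandle() sits outside this bookkeeping unless the caller reports its accesses or synchronizes, which is why the Note section asks for one or the other.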
Memory Consistency:
If this routine modifies the participating object(s) then – unless otherwise stated –
the contents of any externally held references to internal data structures should be
considered to be in an undefined state. A well-defined state can only be restored by
re-acquiring these references through the appropriate API or by calling
PetscDeviceContextSynchronize().
Unless otherwise stated, exceptions to this rule are:
References returned by the routine itself. If a routine returns a pointer, the value of the top-most pointer is guaranteed to always be valid. For example, given a routine which asynchronously allocates memory and returns a pointer to the memory, the value of said pointer is immediately valid but dereferencing the pointer may not be.
References to structures. If a routine returns a PetscFoo, or array thereof, then the objects themselves are always valid (though their member variables, such as PetscFoo->data, may not be).
See Also#
Level#
developer
Location#
src/sys/objects/device/interface/dcontext.cxx