PetscDA: Data Assimilation#

PETSc’s PetscDA object coordinates data assimilation (DA) workflows.

This is new code, please independently verify all results you obtain using it.

Some planned work for PetscDA is available as GitLab Issue #1882

Ensemble-based Data Assimilation#

Currently PetscDA only supports ensemble-based data assimilation with two PetscDAType: PETSCDAETKF and PETSCDALETKF. These focus on ensemble transform Kalman filter (ETKF)-style updates but are extensible to other assimilation techniques.

ETKF
LETKF

These centralize ensemble storage, observational metadata, and user-defined forecast/analysis operators so that algorithms can run independently of the MPI layout or the vector/matrix backends.

Lifecycle overview#

A typical assimilation cycle alternates between forecast propagation and statistical analysis:

Initialize a PetscDA context and configure ensemble sizes and data structures.
Populate the ensemble state vectors and optional observation-error descriptions.
Advance each ensemble member with a model operator supplied by the application.
Combine forecasts with observations through PetscDAEnsembleAnalysis() to produce the posterior ensemble.
Repeat until the desired simulation horizon is complete, optionally extracting diagnostics after each phase.

Throughout this loop the PetscDA object abstracts the global vectors, scatters, and reductions needed to compute ensemble means, anomalies, and square-root transforms.

Here we introduce a simple example to demonstrate PetscDA usage with the PETSCDALETKF on the Lorenz-96 model. Please read Implementations for more in-depth discussion of the available implementations.

Listing: src/ml/da/tutorials/ex1.c

int main(int argc, char **argv)
{
  /* Configuration parameters */
  PetscInt  n                 = DEFAULT_N;
  PetscInt  steps             = DEFAULT_STEPS;
  PetscInt  burn              = DEFAULT_BURN;
  PetscInt  obs_freq          = DEFAULT_OBS_FREQ;
  PetscInt  random_seed       = DEFAULT_RANDOM_SEED;
  PetscInt  ensemble_size     = DEFAULT_ENSEMBLE_SIZE;
  PetscReal F                 = DEFAULT_F;
  PetscReal dt                = DEFAULT_DT;
  PetscReal obs_error_std     = DEFAULT_OBS_ERROR_STD;
  PetscReal ensemble_init_std = 1; /* Initial ensemble spread */

  /* PETSc objects */
  Lorenz96Ctx *l95_ctx = NULL, *truth_ctx = NULL;
  DM           da_state;
  PetscDA      da;
  Vec          x0, x_mean, x_forecast;
  Vec          truth_state, rmse_work;
  Vec          observation, obs_noise, obs_error_var;
  PetscRandom  rng;
  Mat          H = NULL; /* Observation operator matrix */

  /* Statistics tracking */
  PetscReal rmse_forecast = 0.0, rmse_analysis = 0.0;
  PetscReal sum_rmse_forecast = 0.0, sum_rmse_analysis = 0.0;
  PetscInt  n_stat_steps = 0;
  PetscInt  obs_count    = 0;
  PetscInt  step, progress_interval;

  PetscFunctionBeginUser;
  PetscCall(PetscInitialize(&argc, &argv, NULL, help));

  /* Parse command-line options */
  PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "Lorenz-96 ETKF Quick Example", NULL);
  PetscCall(PetscOptionsInt("-n", "State dimension", "", n, &n, NULL));
  PetscCall(PetscOptionsInt("-steps", "Number of time steps", "", steps, &steps, NULL));
  PetscCall(PetscOptionsInt("-burn", "Burn-in steps excluded from statistics", "", burn, &burn, NULL));
  PetscCall(PetscOptionsInt("-obs_freq", "Observation frequency", "", obs_freq, &obs_freq, NULL));
  PetscCall(PetscOptionsReal("-F", "Forcing parameter", "", F, &F, NULL));
  PetscCall(PetscOptionsReal("-dt", "Time step size", "", dt, &dt, NULL));
  PetscCall(PetscOptionsReal("-obs_error", "Observation error standard deviation", "", obs_error_std, &obs_error_std, NULL));
  PetscCall(PetscOptionsReal("-ensemble_init_std", "Initial ensemble spread standard deviation", "", ensemble_init_std, &ensemble_init_std, NULL));
  PetscCall(PetscOptionsInt("-random_seed", "Random seed for ensemble perturbations", "", random_seed, &random_seed, NULL));
  PetscOptionsEnd();

  /* Validate and constrain parameters */
  PetscCall(ValidateParameters(&n, &steps, &burn, &obs_freq, &ensemble_size, &dt, &F, &obs_error_std));

  /* Calculate progress reporting interval (avoid division by zero) */
  progress_interval = (steps >= PROGRESS_INTERVALS) ? (steps / PROGRESS_INTERVALS) : 1;

  /* Create 1D periodic DM for state space */
  PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_PERIODIC, n, 1, 2, NULL, &da_state));
  PetscCall(DMSetFromOptions(da_state));
  PetscCall(DMSetUp(da_state));
  PetscCall(DMDASetUniformCoordinates(da_state, 0.0, (PetscReal)n, 0.0, 0.0, 0.0, 0.0));

  /* Create Lorenz96 context with reusable TS object */
  PetscCall(Lorenz96ContextCreate(da_state, n, F, dt, &l95_ctx));
  PetscCall(Lorenz96ContextCreate(da_state, n, F, dt, &truth_ctx));

  /* Initialize random number generator */
  PetscCall(PetscRandomCreate(PETSC_COMM_WORLD, &rng));
  PetscCall(PetscRandomSetSeed(rng, (unsigned long)random_seed));
  PetscCall(PetscRandomSetFromOptions(rng));
  PetscCall(PetscRandomSeed(rng));

  /* Initialize state vectors */
  PetscCall(DMCreateGlobalVector(da_state, &x0));
  PetscCall(PetscRandomSetInterval(rng, -.1 * F, .1 * F));
  PetscCall(VecSetRandom(x0, rng));
  PetscCall(PetscRandomSetInterval(rng, 0, 1));

  /* Initialize truth trajectory */
  PetscCall(VecDuplicate(x0, &truth_state));
  PetscCall(VecCopy(x0, truth_state));
  PetscCall(VecDuplicate(x0, &rmse_work));

  /* Spin up truth to get onto attractor */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Spinning up truth for %" PetscInt_FMT " steps...\n", (PetscInt)SPINUP_STEPS));
  for (PetscInt k = 0; k < SPINUP_STEPS; k++) PetscCall(Lorenz96Step(truth_state, truth_state, truth_ctx));

  /* Initialize observation vectors */
  PetscCall(VecDuplicate(x0, &observation));
  PetscCall(VecDuplicate(x0, &obs_noise));
  PetscCall(VecDuplicate(x0, &obs_error_var));
  PetscCall(VecSet(obs_error_var, obs_error_std * obs_error_std));

  /* Initialize ensemble statistics vectors */
  PetscCall(VecDuplicate(x0, &x_mean));
  PetscCall(VecDuplicate(x0, &x_forecast));

  /* Create identity observation matrix H */
  PetscCall(CreateIdentityObservationMatrix(n, &H));

  /* Create and configure PetscDA for ensemble data assimilation */
  PetscCall(PetscDACreate(PETSC_COMM_WORLD, &da));
  PetscCall(PetscDASetType(da, PETSCDAETKF)); /* Set ETKF type */
  PetscCall(PetscDASetSizes(da, n, n));
  PetscCall(PetscDAEnsembleSetSize(da, ensemble_size));
  PetscCall(PetscDASetFromOptions(da));
  PetscCall(PetscDAEnsembleGetSize(da, &ensemble_size));
  PetscCall(PetscDASetUp(da));
  PetscCall(PetscDAViewFromOptions(da, NULL, "-petscda_view"));
  PetscCall(PetscDASetObsErrorVariance(da, obs_error_var));

  /* Initialize ensemble members from spun-up truth state with appropriate spread */
  PetscCall(PetscDAEnsembleInitialize(da, truth_state, ensemble_init_std, rng));

  /* Print configuration summary */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Lorenz-96 ETKF Example\n"));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "======================\n"));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD,
                        "  State dimension       : %" PetscInt_FMT "\n"
                        "  Ensemble size         : %" PetscInt_FMT "\n"
                        "  Forcing parameter (F) : %.4f\n"
                        "  Time step (dt)        : %.4f\n"
                        "  Total steps           : %" PetscInt_FMT "\n"
                        "  Burn-in steps         : %" PetscInt_FMT "\n"
                        "  Observation frequency : %" PetscInt_FMT "\n"
                        "  Observation noise std : %.3f\n"
                        "  Ensemble init std     : %.3f\n"
                        "  Random seed           : %" PetscInt_FMT "\n\n",
                        n, ensemble_size, (double)F, (double)dt, steps, burn, obs_freq, (double)obs_error_std, (double)ensemble_init_std, random_seed));

  /* Main assimilation cycle: forecast and analysis steps */
  for (step = 0; step <= steps; step++) {
    PetscReal time = step * dt;

    /* Forecast step: compute ensemble mean and forecast RMSE */
    PetscCall(PetscDAEnsembleComputeMean(da, x_mean));
    PetscCall(VecCopy(x_mean, x_forecast));
    PetscCall(ComputeRMSE(x_forecast, truth_state, rmse_work, n, &rmse_forecast));
    rmse_analysis = rmse_forecast; /* Default to forecast RMSE if no analysis */

    /* Analysis step: assimilate observations when available */
    if (step % obs_freq == 0 && step > 0) {
      /* Generate synthetic noisy observations from truth */
      PetscCall(VecSetRandomGaussian(obs_noise, rng, 0.0, obs_error_std));
      PetscCall(VecWAXPY(observation, 1.0, obs_noise, truth_state));

      /* Perform ETKF analysis with observation matrix H */
      PetscCall(PetscDAEnsembleAnalysis(da, observation, H));

      /* Compute analysis RMSE */
      PetscCall(PetscDAEnsembleComputeMean(da, x_mean));
      PetscCall(ComputeRMSE(x_mean, truth_state, rmse_work, n, &rmse_analysis));
      obs_count++;
    }

    /* Accumulate statistics after burn-in period */
    if (step >= burn) {
      sum_rmse_forecast += rmse_forecast;
      sum_rmse_analysis += rmse_analysis;
      n_stat_steps++;
    }

    /* Progress reporting */
    if ((step % progress_interval == 0) || (step == steps) || (step == 0)) {
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Step %4" PetscInt_FMT ", time %6.3f  RMSE_forecast %.5f  RMSE_analysis %.5f%s\n", step, (double)time, (double)rmse_forecast, (double)rmse_analysis, (step < burn) ? " [burn-in]" : ""));
    }

    /* Propagate ensemble and truth trajectory */
    if (step < steps) {
      PetscCall(PetscDAEnsembleForecast(da, Lorenz96Step, l95_ctx));
      PetscCall(Lorenz96Step(truth_state, truth_state, truth_ctx));
    }
  }

  /* Report final statistics */
  if (n_stat_steps > 0) {
    PetscReal avg_rmse_forecast = sum_rmse_forecast / n_stat_steps;
    PetscReal avg_rmse_analysis = sum_rmse_analysis / n_stat_steps;
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nStatistics (%" PetscInt_FMT " post-burn-in steps):\n", n_stat_steps));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "==================================================\n"));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Mean RMSE (forecast) : %.5f\n", (double)avg_rmse_forecast));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Mean RMSE (analysis) : %.5f\n", (double)avg_rmse_analysis));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Observations used    : %" PetscInt_FMT "\n\n", obs_count));
  } else {
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nWarning: No post-burn-in statistics collected (burn >= steps)\n\n"));
  }

  /* Test VecSetRandomGaussian to verify Gaussian distribution */
  {
    Vec          test_vec;
    PetscInt     test_size = 10000; /* Large sample for statistical testing */
    PetscScalar *array;
    PetscReal    mean_target = 2.0, std_target = 1.5;
    PetscReal    sample_mean = 0.0, sample_variance = 0.0, sample_std;
    PetscReal    skewness = 0.0, kurtosis = 0.0;
    PetscInt     i;
    PetscBool    test_gaussian = PETSC_FALSE;

    PetscCall(PetscOptionsGetBool(NULL, NULL, "-test_gaussian", &test_gaussian, NULL));

    if (test_gaussian) {
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\n==============================================\n"));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Testing VecSetRandomGaussian\n"));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "==============================================\n"));

      /* Create test vector */
      PetscCall(VecCreate(PETSC_COMM_WORLD, &test_vec));
      PetscCall(VecSetSizes(test_vec, PETSC_DECIDE, test_size));
      PetscCall(VecSetFromOptions(test_vec));

      /* Generate Gaussian random numbers */
      PetscCall(VecSetRandomGaussian(test_vec, rng, mean_target, std_target));

      /* Get array for statistical analysis */
      PetscCall(VecGetArray(test_vec, &array));

      /* Compute sample mean */
      for (i = 0; i < test_size; i++) sample_mean += PetscRealPart(array[i]);
      sample_mean /= test_size;

      /* Compute sample variance and higher moments */
      for (i = 0; i < test_size; i++) {
        PetscReal diff  = PetscRealPart(array[i]) - sample_mean;
        PetscReal diff2 = diff * diff;
        sample_variance += diff2;
        skewness += diff * diff2;
        kurtosis += diff2 * diff2;
      }
      sample_variance /= (test_size - 1);
      sample_std = PetscSqrtReal(sample_variance);

      /* Normalize skewness and kurtosis */
      skewness = (skewness / test_size) / PetscPowReal(sample_std, 3.0);
      kurtosis = (kurtosis / test_size) / PetscPowReal(sample_std, 4.0) - 3.0; /* Excess kurtosis */

      PetscCall(VecRestoreArray(test_vec, &array));

      /* Report results */
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nTarget parameters:\n"));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Mean      : %.6f\n", (double)mean_target));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Std Dev   : %.6f\n", (double)std_target));

      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nSample statistics (n=%" PetscInt_FMT "):\n", (PetscInt)test_size));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Mean      : %.6f (error: %.6f)\n", (double)sample_mean, (double)PetscAbsReal(sample_mean - mean_target)));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Std Dev   : %.6f (error: %.6f)\n", (double)sample_std, (double)PetscAbsReal(sample_std - std_target)));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Skewness  : %.6f (expected ~0 for Gaussian)\n", (double)skewness));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Kurtosis  : %.6f (expected ~0 for Gaussian)\n", (double)kurtosis));

      /* Statistical tests with reasonable tolerances for finite samples */
      PetscReal mean_error     = PetscAbsReal(sample_mean - mean_target);
      PetscReal std_error      = PetscAbsReal(sample_std - std_target);
      PetscReal mean_tolerance = 3.0 * std_target / PetscSqrtReal((PetscReal)test_size); /* 3-sigma rule */
      PetscReal std_tolerance  = 0.1 * std_target;                                       /* 10% tolerance for std dev */
      PetscReal skew_tolerance = 0.1;                                                    /* Skewness should be near 0 */
      PetscReal kurt_tolerance = 0.5;                                                    /* Excess kurtosis should be near 0 */

      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nStatistical tests:\n"));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Mean test     : %s (error %.6f < tolerance %.6f)\n", mean_error < mean_tolerance ? "PASS" : "FAIL", (double)mean_error, (double)mean_tolerance));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Std dev test  : %s (error %.6f < tolerance %.6f)\n", std_error < std_tolerance ? "PASS" : "FAIL", (double)std_error, (double)std_tolerance));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Skewness test : %s (|skewness| %.6f < tolerance %.6f)\n", PetscAbsReal(skewness) < skew_tolerance ? "PASS" : "FAIL", (double)PetscAbsReal(skewness), (double)skew_tolerance));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "  Kurtosis test : %s (|kurtosis| %.6f < tolerance %.6f)\n", PetscAbsReal(kurtosis) < kurt_tolerance ? "PASS" : "FAIL", (double)PetscAbsReal(kurtosis), (double)kurt_tolerance));

      /* Overall test result */
      PetscBool all_pass = (mean_error < mean_tolerance) && (std_error < std_tolerance) && (PetscAbsReal(skewness) < skew_tolerance) && (PetscAbsReal(kurtosis) < kurt_tolerance);
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "\nOverall result: %s\n", all_pass ? "PASS - Distribution is Gaussian" : "FAIL - Distribution may not be Gaussian"));
      PetscCall(PetscPrintf(PETSC_COMM_WORLD, "==============================================\n\n"));

      PetscCall(VecDestroy(&test_vec));
    }
  }

  /* Cleanup */
  PetscCall(MatDestroy(&H));
  PetscCall(VecDestroy(&x_forecast));
  PetscCall(VecDestroy(&x_mean));
  PetscCall(VecDestroy(&obs_error_var));
  PetscCall(VecDestroy(&obs_noise));
  PetscCall(VecDestroy(&observation));
  PetscCall(VecDestroy(&rmse_work));
  PetscCall(VecDestroy(&truth_state));
  PetscCall(VecDestroy(&x0));
  PetscCall(PetscDADestroy(&da));
  PetscCall(DMDestroy(&da_state));
  PetscCall(Lorenz96ContextDestroy(&l95_ctx));
  PetscCall(Lorenz96ContextDestroy(&truth_ctx));
  PetscCall(PetscRandomDestroy(&rng));

  PetscCall(PetscFinalize());
return 0;}

Creating a PetscDA context#

Create, configure, and destroy a PetscDA object with the standard PETSc object lifecycle:

PetscDA da;
PetscCall(PetscDACreate(PETSC_COMM_WORLD, &da));
PetscCall(PetscDASetType(da, PETSCDAETKF));
PetscCall(PetscDASetSizes(da, state_size, obs_size));
PetscCall(PetscDAEnsembleSetSize(da, ensemble_size));
PetscCall(PetscDASetFromOptions(da));
PetscCall(PetscDASetUp(da));

/* ... data assimilation loop ... */

PetscCall(PetscDADestroy(&da));

To create a PetscDA instance, call PetscDACreate():

PetscDACreate(MPI_Comm comm, PetscDA *da);

To choose an implementation type, call

PetscDASetType(PetscDA da, PetscDAType type);

or use the command-line option -petscda_type <name>; details regarding the available implementations are presented in Implementations.

PetscDASetSizes() records the global state dimension and the number of observations:

PetscDASetSizes(PetscDA da, PetscInt state_size, PetscInt obs_size)

For MPI-parallel runs where the state or observation vectors are distributed, the local partition sizes can be set explicitly with

PetscDASetLocalSizes(PetscDA da, PetscInt local_state_size, PetscInt local_obs_size);

Pass PETSC_DECIDE for either argument to let PETSc choose the partition automatically.

PetscDAEnsembleSetSize() records the number of ensemble members requested:

PetscDAEnsembleSetSize(PetscDA da, PetscInt ensemble_size)

After having set these options, call PetscDASetFromOptions() to apply any command-line overrides, then PetscDASetUp() to allocate internal storage:

PetscDASetFromOptions(PetscDA da);
PetscDASetUp(PetscDA da);

Finally, after the user is done using the DA object, destroy it with

PetscDADestroy(PetscDA *da);

Degrees of freedom per grid point#

When the state vector is laid out on a structured grid, the number of physical degrees of freedom per grid point can be recorded with

PetscDASetNDOF(PetscDA da, PetscInt ndof);
PetscDAGetNDOF(PetscDA da, PetscInt *ndof);

This metadata is used by some implementations and viewers; the default is 1.

Managing ensembles#

PetscDA stores ensemble members as PETSc Vec objects and exposes convenience helpers to access them safely.

Individual ensemble members can be read or overwritten using a get/restore ownership pattern analogous to MatDenseGetColumnVec:

/* Read-only view of member i; must be paired with Restore */
PetscDAEnsembleGetMember(PetscDA da, PetscInt i, Vec *member);
PetscDAEnsembleRestoreMember(PetscDA da, PetscInt i, Vec *member);

/* Inject an externally created vector into slot i */
PetscDAEnsembleSetMember(PetscDA da, PetscInt i, Vec member);

PetscDAEnsembleGetMember() / PetscDAEnsembleRestoreMember() map ensemble indices to Vec handles that participate in PETSc’s reference counting. PetscDAEnsembleSetMember() lets applications inject externally created vectors into specific slots, which is useful when importing state snapshots from disk or another solver component.

Ensemble statistics are computed with:

/* Form the sample mean across all members */
PetscDAEnsembleComputeMean(PetscDA da, Vec mean);

/* Return a tall-and-skinny Mat whose columns are the mean-subtracted,
   normalized anomalies X = (E - mean) / sqrt(m-1) */
PetscDAEnsembleComputeAnomalies(PetscDA da, Vec mean, Mat *anomalies);

PetscDAEnsembleComputeAnomalies() returns a dense Mat whose columns are the mean-subtracted ensemble anomalies. Many square-root filters use this matrix to construct low-rank covariance factorizations. Pass NULL for mean to have the function compute it internally.

Observation error#

Observation-error variances (or more general descriptions) are supplied through PetscDASetObsErrorVariance(). The associated vector is assumed to follow the global observation ordering; PetscDAGetObsErrorVariance() returns the stored object for later inspection or reuse.

PetscDASetObsErrorVariance(PetscDA da, Vec obs_error_var);
PetscDAGetObsErrorVariance(PetscDA da, Vec *obs_error_var);

Analysis step#

PetscDAEnsembleAnalysis() performs the assimilation step given an observation vector and a linear observation operator matrix:

/* observation - Vec of length P (number of observations) */
/* H           - Mat of size P x N (observation operator mapping state to observation space) */
PetscCall(PetscDAEnsembleAnalysis(da, observation, H));

H is a Mat (typically sparse AIJ) that maps the N-dimensional state vector to the P-dimensional observation space: y ≈ H*x. PetscDAAnalysis() handles all ensemble reductions, gain computations, and posterior updates.

Forecast step#

PetscDAEnsembleForecast() wraps the forecast step. The user supplies a function that advances a single ensemble member:

Calling sequence for model:

input - the vector to be evolved, forecasted, time-stepped, or otherwise advanced
output - the forecast, evolved, or time-stepped result
ctx - the context for the model function

/* Prototype for model forecast M(x) */
PetscErrorCode ModelForecast(Vec input, Vec output, PetscCtx ctx) {
  /* Advance input by dt to produce output */
  /* (e.g., step a TS object) */
  return PETSC_SUCCESS;
}

PetscCall(PetscDAEnsembleForecast(da, ModelForecast, ctx));

ModelForecast can call PETSc time integrators (TS: Scalable ODE and DAE Solvers), nonlinear solvers (SNES: Nonlinear Solvers), or bespoke device kernels. The PetscDA layer orchestrates calls across the entire ensemble, issuing them in rank-local loops while ensuring that ownership and recycling semantics remain correct.

Inflation#

Covariance inflation counteracts ensemble collapse by artificially widening the prior spread before the analysis step. The inflation factor \(\rho \ge 1\) scales each anomaly so that the effective prior covariance becomes \(\rho^2 P^f\). Set and retrieve the factor with

PetscDAEnsembleSetInflation(PetscDA da, PetscReal inflation);
PetscDAEnsembleGetInflation(PetscDA da, PetscReal *inflation);

or at runtime with -petscda_ensemble_inflation <value> (default: 1.0, i.e. no inflation).

Implementations#

The available PetscDA implementations are listed in Table PETSc Data Assimilation Methods. Custom types can be registered with PetscDARegister() and selected at runtime with -petscda_type <name>.

Table 22 PETSc Data Assimilation Methods#
Method	PetscDAType	Options Name
Ensemble Transform Kalman Filter	`PETSCDAETKF`	`etkf`
Local Ensemble Transform Kalman Filter	`PETSCDALETKF`	`letkf`

ETKF#

The built-in square-root ETKF (PETSCDAETKF, -petscda_type etkf) is the default implementation. It implements Algorithm 6.4 in [ABN16] using a deterministic square-root update that avoids stochastic perturbations.

The ETKF supports two factorization strategies for the reduced-space T-matrix:

PetscDAEnsembleSetSqrtType(PetscDA da, PetscDASqrtType type);
PetscDAEnsembleGetSqrtType(PetscDA da, PetscDASqrtType *type);

Table 23 ETKF square-root types#
`PetscDASqrtType`	Options string	Notes
`PETSCDA_SQRT_CHOLESKY`	`cholesky`	O(n³/3); preferred when the reduced-space matrix is positive definite
`PETSCDA_SQRT_EIGEN`	`eigen`	More robust for semi-definite matrices; handles small negative eigenvalues from round-off

Select at runtime with -petscda_ensemble_sqrt_type {cholesky,eigen} (default: eigen).

LETKF#

The Local ETKF (PETSCDALETKF, -petscda_type letkf) performs the analysis update locally around each grid point, enabling scalable assimilation on large domains by avoiding the global ensemble covariance matrix. LETKF-specific configuration:

/* Number of observations associated with each grid vertex (default: 9) */
PetscDALETKFSetObsPerVertex(PetscDA da, PetscInt n_obs_vertex);
PetscDALETKFGetObsPerVertex(PetscDA da, PetscInt *n_obs_vertex);

/* Localization weight matrix Q (N x P) and observation operator matrix H (P x N) */
PetscDALETKFSetLocalization(PetscDA da, Mat Q, Mat H);

Set the observation count at runtime with -petscda_letkf_obs_per_vertex <n> (default: 9).

Command-line options#

The PetscDA object obeys standard PETSc options parsing. Commonly used switches include:

-petscda_type <name> – select a registered PetscDA implementation (etkf, letkf).
-petscda_ensemble_inflation <value> – set the covariance inflation factor (default: 1.0).
-petscda_ensemble_sqrt_type {cholesky,eigen} – select the T-matrix square-root algorithm for ETKF (default: eigen).
-petscda_letkf_obs_per_vertex <n> – set the number of observations per grid vertex for LETKF (default: 9).
-petscda_view – inspect ensemble metadata and internal sizes.

Because PetscDA participates in the PETSc object registry, any prefix applied with PetscDASetOptionsPrefix() scopes these options.

Viewing and monitoring#

PetscDAView() and PetscDAViewFromOptions() expose ensemble sizing, observation dimensions, and implementation-specific diagnostics . Views can be directed to ASCII, HDF5, or custom PetscViewer targets, enabling lightweight instrumentation of assimilation experiments. For advanced profiling, the PetscDA package registers with PETSc’s logging infrastructure via PetscDAInitializePackage()/PetscDAFinalizePackage(), so standard -log_view outputs will include assimilation breakdown.