PetscRegressor: Regression Solvers#

The PetscRegressor library provides a framework for the scalable solution of regression and classification problems. Methods are currently available for linear regression (ordinary least squares, lasso, and ridge).

Note that by regressor, we mean an algorithm or implementation used to fit a regression model, following the usage of the machine-learning community. Regressor here does NOT mean an independent (or predictor) variable, as it often does in the statistics community.

Basic Regressor Usage#

Here, we introduce a simple example to demonstrate PetscRegressor usage. Please read Regression Solvers for a more in-depth discussion. The code presented below solves an ordinary linear regression problem with various regularization options.

In the simplest usage of the regressor solver, the user simply needs to provide a matrix of training data (Mat) and a target vector (Vec) to fit the regressor against. With a fitted regressor, the user can then obtain a vector of predicted values.

PETSc’s default method for solving regression problems is ordinary least squares, REGRESSOR_LINEAR_OLS, which is a sub-type of the linear regressor, PETSCREGRESSORLINEAR.

Note that the data creation, option parsing, and cleanup stages are omitted for display purposes. The complete code is available in ex3.c.

Listing: src/ml/regressor/tests/ex3.c

#include <petscregressor.h>
int main(int argc, char **args)
{
  AppCtx         ctx;
  PetscRegressor regressor;
  PetscScalar    intercept;

  /* Initialize PETSc */
  PetscCall(PetscInitialize(&argc, &args, (char *)0, help));

  /* Initialize problem parameters and data */
  PetscCall(PetscNew(&ctx));
  PetscCall(ConfigureContext(ctx));
  PetscCall(CreateData(ctx));

  /* Create Regressor solver with desired type and options */
  PetscCall(PetscRegressorCreate(PETSC_COMM_WORLD, &regressor));
  PetscCall(PetscRegressorSetType(regressor, PETSCREGRESSORLINEAR));
  PetscCall(PetscRegressorLinearSetType(regressor, REGRESSOR_LINEAR_OLS));
  PetscCall(PetscRegressorLinearSetFitIntercept(regressor, PETSC_FALSE));
  /* Testing prefix functions for Regressor */
  PetscCall(TestPrefixRegressor(regressor, ctx));
  /* Check for command line options */
  PetscCall(PetscRegressorSetFromOptions(regressor));
  /* Fit the regressor */
  PetscCall(PetscRegressorFit(regressor, ctx->X, ctx->y));
  /* Predict data with fitted regressor */
  PetscCall(PetscRegressorPredict(regressor, ctx->X, ctx->y_predicted));
  /* Get other desired output data */
  PetscCall(PetscRegressorLinearGetIntercept(regressor, &intercept));
  PetscCall(PetscRegressorLinearGetCoefficients(regressor, &ctx->coefficients));

  /* Testing Views, and GetTypes */
  PetscCall(TestRegressorViews(regressor, ctx));
  PetscCall(PetscRegressorDestroy(&regressor));
  PetscCall(DestroyCtx(&ctx));
  PetscCall(PetscFinalize());
  return 0;
}

To create a PetscRegressor solver, one must first call PetscRegressorCreate() as follows:

PetscRegressorCreate(MPI_Comm comm, PetscRegressor *regressor);

To choose a solver type, the user can either call

PetscRegressorSetType(PetscRegressor regressor, PetscRegressorType type);

or use the option -regressor_type <method>, where details regarding the available methods are presented in Regression Solvers. The application code can take complete control of the linear and nonlinear techniques used by the regressor by calling

PetscRegressorSetFromOptions(regressor);

This routine provides an interface to the PETSc options database, so that at runtime the user can select a particular regression solver, set various parameters, and register customized routines. With this routine the user can also control all inner solver options of the KSP and Tao modules, as discussed in KSP: Linear System Solvers and TAO: Optimization Solvers.
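For illustration, options can also be supplied to the options database programmatically before PetscRegressorSetFromOptions() is called. The following minimal sketch uses only option names that appear elsewhere in this section; passing the same options on the command line (e.g., -regressor_type linear) is equivalent:

PetscCall(PetscOptionsSetValue(NULL, "-regressor_type", "linear"));            /* select the linear regressor type */
PetscCall(PetscOptionsSetValue(NULL, "-regressor_regularizer_weight", "0.5")); /* illustrative regularizer weight */
PetscCall(PetscRegressorSetFromOptions(regressor));                            /* apply the options set above */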

After having set these routines and options, the user can fit the problem by calling

PetscRegressorFit(PetscRegressor regressor, Mat X, Vec y);

where X is the matrix of training data and y is the vector of target values. After fitting the regressor, the user can compute predictions (that is, perform inference) with the fitted regressor by calling

PetscRegressorPredict(PetscRegressor regressor, Mat X, Vec y_predicted);
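As an illustration, the fit/predict sequence might look like the following minimal sketch, which assumes the training matrix X and target vector y have already been assembled and uses MatCreateVecs() to create a prediction vector whose layout matches the rows of X:

Vec y_predicted;

PetscCall(PetscRegressorFit(regressor, X, y));               /* fit the model to training data X and targets y */
PetscCall(MatCreateVecs(X, NULL, &y_predicted));             /* create a vector compatible with the row layout of X */
PetscCall(PetscRegressorPredict(regressor, X, y_predicted)); /* compute predictions for the rows of X */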

Finally, after the user is done using the regressor solver, the user should destroy the PetscRegressor context with

PetscRegressorDestroy(PetscRegressor *regressor);

Note that the user should not destroy y_predicted from the example above, as this is handled internally.

Regression Solvers#

The list of regressor solver types is given in Table PETSc Regressor. Currently, we support only one type, PETSCREGRESSORLINEAR.

Table 20 PETSc Regressor#

| Method | PetscRegressorType   | Options Name |
|--------|----------------------|--------------|
| Linear | PETSCREGRESSORLINEAR | linear       |

If the particular method that the user is using supports a regularizer, the user can set the regularizer's weight via

PetscRegressorSetRegularizerWeight(PetscRegressor regressor, PetscReal weight);

or with the option -regressor_regularizer_weight <weight>.
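For example, a lasso regression (a sub-type of the linear regressor described below) with a particular regularizer weight might be configured as in the following sketch; the weight value 0.1 is purely illustrative:

PetscCall(PetscRegressorSetType(regressor, PETSCREGRESSORLINEAR));         /* linear regressor */
PetscCall(PetscRegressorLinearSetType(regressor, REGRESSOR_LINEAR_LASSO)); /* lasso (L1-regularized) sub-type */
PetscCall(PetscRegressorSetRegularizerWeight(regressor, 0.1));             /* illustrative regularizer weight */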

Linear regressor#

The method PETSCREGRESSORLINEAR (-regressor_type linear) constructs a linear model that minimizes the sum of squared differences between the actual target values in the dataset and the target values estimated by the linear approximation. By default, this method uses the bound-constrained regularized Gauss-Newton solver TAOBRGN to solve the regression problem.

Currently, the linear regressor has three types, which are described in Table Linear Regressor types.

Table 21 Linear Regressor types#

| Linear method | PetscRegressorLinearType | Options Name |
|---------------|--------------------------|--------------|
| Ordinary      | REGRESSOR_LINEAR_OLS     | ols          |
| Lasso         | REGRESSOR_LINEAR_LASSO   | lasso        |
| Ridge         | REGRESSOR_LINEAR_RIDGE   | ridge        |

When appropriate, the user can choose to use KSP instead of Tao to solve the problem, via

PetscRegressorLinearSetUseKSP(PetscRegressor regressor, PetscBool flg);

or with the option -regressor_linear_use_ksp <true,false>.
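As a minimal sketch, fitting an ordinary least-squares model with the KSP path enabled might look like the following; the choice of OLS here is only illustrative:

PetscCall(PetscRegressorSetType(regressor, PETSCREGRESSORLINEAR));       /* linear regressor */
PetscCall(PetscRegressorLinearSetType(regressor, REGRESSOR_LINEAR_OLS)); /* ordinary least squares */
PetscCall(PetscRegressorLinearSetUseKSP(regressor, PETSC_TRUE));         /* solve with KSP instead of Tao */
PetscCall(PetscRegressorFit(regressor, X, y));                           /* fit using the KSP-based solver */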

The user can also choose to fit the intercept (also known as the bias or offset) via

PetscRegressorLinearSetFitIntercept(PetscRegressor regressor, PetscBool flg);

or with the option -regressor_linear_fit_intercept <true,false>.

After the regressor has been fitted, one can obtain the intercept and a vector of the fitted coefficients of the linear regression model via

PetscRegressorLinearGetCoefficients(PetscRegressor regressor, Vec *coefficients);
PetscRegressorLinearGetIntercept(PetscRegressor regressor, PetscScalar *intercept);
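For instance, the fitted coefficients and intercept might be inspected with standard PETSc viewing and printing routines, as in the following sketch:

Vec         coefficients;
PetscScalar intercept;

PetscCall(PetscRegressorLinearGetCoefficients(regressor, &coefficients)); /* access the fitted coefficients */
PetscCall(PetscRegressorLinearGetIntercept(regressor, &intercept));       /* access the fitted intercept */
PetscCall(VecView(coefficients, PETSC_VIEWER_STDOUT_WORLD));              /* print the coefficient vector */
PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Intercept: %g\n", (double)PetscRealPart(intercept)));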