PetscProbComputeKSStatisticWeighted#

Compute the Kolmogorov-Smirnov statistic for the weighted empirical distribution for an input vector, compared to an analytic CDF.

Synopsis#

#include "petscdt.h" 
PetscErrorCode PetscProbComputeKSStatisticWeighted(Vec v, Vec w, PetscProbFunc cdf, PetscReal *alpha)

Collective

Input Parameters#

  • v - The data vector, blocksize is the sample dimension

  • w - The vector of weights for each sample, instead of the default 1/n

  • cdf - The analytic CDF

Output Parameter#

  • alpha - The KS statistic

Notes#

The Kolmogorov-Smirnov statistic for a given cumulative distribution function $F(x)$ is

\[ D_n = \sup_x \left| F_n(x) - F(x) \right| \]

where $\sup_x$ is the supremum of the set of distances, and the empirical distribution function $F_n(x)$ is discrete, and given by

\[ F_n(x) = \frac{\text{number of samples} \leq x}{n} \]

The empirical distribution function $F_n(x)$ is discrete, and thus has a "stairstep" cumulative distribution, with $n$ the number of stairs. Intuitively, the statistic takes the largest absolute difference between the two distribution functions across all $x$ values.

The goodness-of-fit test, or Kolmogorov-Smirnov test, is constructed using the Kolmogorov distribution. It rejects the null hypothesis at level $\alpha$ if

\[ \sqrt{n} D_n > K_\alpha, \]

where $K_\alpha$ is found from

\[ \operatorname{Pr}(K \leq K_\alpha) = 1 - \alpha. \]

This means that a small $\alpha$ corresponds to high confidence that the data did not come from the input distribution, in which case we say that the test rejects the null hypothesis.

See Also#

PetscProbComputeKSStatistic(), PetscProbComputeKSStatisticMagnitude(), PetscProbFunc

Level#

advanced

Location#

src/dm/dt/interface/dtprob.c

