Automated consistency tests of a Distribution

Hi!
I just found a bug in a specific distribution where the logPDF is wrongly implemented ("*" instead of "+"). The proper way would be to check the computed value against a reference value, obtained for example from a computer algebra system (such as Maple). However, we sometimes do not have the time to do so, hence the bug.

So I implemented a basic consistency checker. In my case, the logarithm of the PDF is obviously inconsistent with the logPDF. This is easy to implement since OT is object-oriented. I did it because I feel that it could save tens of hours of debugging for most distributions. A loss of accuracy in extreme cases cannot be found this way, but it can detect gross bugs.
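To illustrate the kind of bug such a check catches, here is a toy example (numpy only, not the OT API) with a hypothetical buggy log-PDF for the exponential distribution, using "*" where "+" was intended:

```python
import numpy as np

def pdf(x, rate=2.0):
    """Exponential PDF with rate 2."""
    return rate * np.exp(-rate * x)

def logpdf_buggy(x, rate=2.0):
    """Hypothetical buggy log-PDF: '*' where '+' was intended."""
    return np.log(rate) * (-rate * x)  # should be np.log(rate) + (-rate * x)

x = np.linspace(0.1, 3.0, 10)
# The consistency check log(pdf(x)) == logpdf(x) exposes the bug immediately:
mismatch = np.max(np.abs(np.log(pdf(x)) - logpdf_buggy(x)))
```

Here `mismatch` is large, so `np.testing.assert_almost_equal` would fail loudly, while each function looks plausible in isolation.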

import numpy as np
import openturns as ot


class UnivariateDistributionChecker:
    def __init__(self, distribution, sample_size=10, verbose=False, decimal=7):
        self.distribution = distribution
        self.verbose = verbose
        self.sample_size = sample_size
        self.sample = distribution.getSample(self.sample_size)
        self.decimal = decimal
        if self.verbose:
            print("sample=")
            print(self.sample)

    def check_logPDF(self):
        """Check the consistency of logPDF against the log of the PDF."""
        if self.verbose:
            print("check_logPDF")
        logPDF1 = np.array(self.distribution.computeLogPDF(self.sample))
        logPDF2 = np.log(np.array(self.distribution.computePDF(self.sample)))
        np.testing.assert_almost_equal(logPDF1, logPDF2, decimal=self.decimal)

    def check_PDF(self):
        """Check the consistency of PDF against a finite difference of the CDF."""
        if self.verbose:
            print("check_PDF")
        PDF1 = self.distribution.computePDF(self.sample)
        epsilon = ot.ResourceMap.GetAsScalar(
            "CenteredFiniteDifferenceGradient-DefaultEpsilon"
        )
        CDF1 = self.distribution.computeCDF(self.sample + epsilon)
        CDF2 = self.distribution.computeCDF(self.sample - epsilon)
        PDF2 = (np.array(CDF1) - np.array(CDF2)) / (2.0 * epsilon)
        np.testing.assert_almost_equal(np.array(PDF1), PDF2, decimal=self.decimal)

    def check_DDF(self):
        """Check the consistency of DDF against a finite difference of the PDF."""
        if self.verbose:
            print("check_DDF")
        DDF1 = self.distribution.computeDDF(self.sample)
        epsilon = ot.ResourceMap.GetAsScalar(
            "CenteredFiniteDifferenceGradient-DefaultEpsilon"
        )
        PDF1 = self.distribution.computePDF(self.sample + epsilon)
        PDF2 = self.distribution.computePDF(self.sample - epsilon)
        DDF2 = (np.array(PDF1) - np.array(PDF2)) / (2.0 * epsilon)
        np.testing.assert_almost_equal(np.array(DDF1), DDF2, decimal=self.decimal)

    def check_ComplementaryCDF(self):
        """Check the consistency of the complementary CDF against the CDF."""
        if self.verbose:
            print("check_ComplementaryCDF")
        CCDF1 = self.distribution.computeComplementaryCDF(self.sample)
        CCDF2 = 1.0 - np.array(self.distribution.computeCDF(self.sample))
        np.testing.assert_almost_equal(np.array(CCDF1), CCDF2, decimal=self.decimal)

    def check_MinimumVolumeIntervalWithMarginalProbability(self, probability=0.9):
        """Check the consistency of MinimumVolumeIntervalWithMarginalProbability against the CDF."""
        if self.verbose:
            print("check_MinimumVolumeIntervalWithMarginalProbability")
        interval = self.distribution.computeMinimumVolumeIntervalWithMarginalProbability(probability)[0]
        lower = interval.getLowerBound()[0]
        upper = interval.getUpperBound()[0]
        computed_probability = self.distribution.computeCDF(upper) - self.distribution.computeCDF(lower)
        np.testing.assert_almost_equal(probability, computed_probability, decimal=self.decimal)

    def check_MinimumVolumeLevelSetWithThreshold(self, probability=0.9):
        """Check the consistency of MinimumVolumeLevelSetWithThreshold against the PDF at the quantile."""
        if self.verbose:
            print("check_MinimumVolumeLevelSetWithThreshold")
        levelSet, threshold = self.distribution.computeMinimumVolumeLevelSetWithThreshold(probability)
        x = self.distribution.computeQuantile(1.0 - (1.0 - probability) / 2.0)
        computed_PDF = self.distribution.computePDF(x)
        np.testing.assert_almost_equal(threshold, computed_PDF, decimal=self.decimal)

    def check_all(self):
        self.check_PDF()
        self.check_logPDF()
        self.check_DDF()
        self.check_ComplementaryCDF()
        self.check_MinimumVolumeIntervalWithMarginalProbability()
        self.check_MinimumVolumeLevelSetWithThreshold()

From there, it is easy to loop over all continuous univariate distributions:

factory_list = ot.DistributionFactory_GetUniVariateFactories()

for i in range(len(factory_list)):
    distribution = factory_list[i].build()
    name = distribution.getName()
    if distribution.isContinuous():
        checker = UnivariateDistributionChecker(distribution, decimal=3)
        try:
            checker.check_all()
            print(i, name, ": OK")
        except AssertionError:
            print(i, name, ": error")

PS
The corresponding issue is: The log-PDF of the Pareto is wrong · Issue #1714 · openturns/openturns · GitHub

Here is the output:

0 Arcsine : error
1 Beta : OK
2 Burr : error
3 Chi : error
4 ChiSquare : error
5 Dirichlet : OK
6 Exponential : error
7 FisherSnedecor : error
8 Frechet : error
9 Gamma : error
10 GeneralizedPareto : error
11 Gumbel : error
12 Histogram : OK
13 InverseNormal : error
14 Laplace : OK
15 Logistic : OK
16 LogNormal : error
17 LogUniform : error
18 MeixnerDistribution : OK
19 Normal : OK
20 Pareto : error
21 Rayleigh : error
22 Rice : error
23 Student : OK
24 Trapezoidal : OK
25 Triangular : OK
26 TruncatedNormal : OK
27 Uniform : error
28 WeibullMax : error
29 WeibullMin : error

I am not 100% sure that all these warnings are real bugs (the MinimumVolumeLevelSet check is suspicious to me). However, Arcsine and FisherSnedecor are suspects, and GeneralizedPareto has a bug: its PDF is not consistent with the gradient of its CDF.

Continuing this way, we could implement additional checkers for PDFs (including discrete distributions) and perhaps find other, yet undetected, bugs.
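As a sketch of what a discrete checker could look like, here is the same PDF/CDF consistency idea applied to a hand-rolled geometric PMF (numpy only; the PMF, support, and truncation are illustrative, not the OT API):

```python
import numpy as np

# Toy discrete distribution: geometric on {1, 2, ...} with success probability p,
# truncated to the first 49 atoms for the check.
p = 0.3
support = np.arange(1, 50)

def pmf(k):
    return p * (1.0 - p) ** (k - 1)

def cdf(k):
    return 1.0 - (1.0 - p) ** k

# Discrete analogue of the PDF/CDF consistency check:
# PMF(k) must equal CDF(k) - CDF(k - 1) on the support.
np.testing.assert_almost_equal(pmf(support), cdf(support) - cdf(support - 1), decimal=12)

# And the PMF must sum to (almost) 1 over the truncated support.
total = pmf(support).sum()
```

The same two checks (PMF versus CDF increments, and total mass) would transpose directly to `computePDF`/`computeCDF` of an OT discrete distribution.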

Best regards,

Michaël

Hi Michaël,

Nice work! A few comments:

  • some of your tests are already implemented in the distribution tests (e.g. checking the PDF against a finite difference of the CDF, and the same for the DDF), but not in a systematic way
  • your test of the minimum volume level set is correct only for symmetric distributions. Remember the discussion we had some time ago about the many ways this level set is computed in OT: even in the 1D case, you have to take care of the symmetry, the unimodality, and so on.

I will have a look at the failed cases ASAP.

Cheers

Régis

Hi Régis,

I am quite happy that this is satisfactory to you, because I have had this idea in mind for quite some time, but I was not sure it could provide the quality I expect in general, so I avoided it as much as possible. This "Checker" will not detect the tiny accuracy limitations we like so much!

You are certainly right about the LevelSet check; I was not sure how to get it right in the general, asymmetric case. The algorithms which return intervals are easier to check, since it suffices to compute the CDF difference and check that the mass corresponds to the required probability. When only the PDF value is returned… well, I do not know how to test this easily. We can always perform a Monte Carlo simulation and check the integral on the domain

A_\alpha^\star = \{\mathbf{x} \in \mathbb{R}^d \; | \; p(\mathbf{x}) > p_\alpha\}

where p_\alpha is the value of the threshold, but it seems to me that this almost replicates the internal code. Perhaps we could combine several LevelSet algorithms?
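For a symmetric, unimodal case, this Monte Carlo check can be sketched with numpy alone (standard normal; the quantile is obtained by bisection rather than by any OT call, so the check is fully independent of the code under test):

```python
import math
import numpy as np

def norm_pdf(x):
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_quantile(q, lo=-10.0, hi=10.0):
    """Quantile by bisection on the CDF (illustrative, not an OT call)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.9
# For N(0, 1) the minimum volume level set is [-q, q] with q the 95% quantile,
# and the PDF threshold is p_alpha = pdf(q).
q = norm_quantile(1.0 - (1.0 - alpha) / 2.0)
p_alpha = norm_pdf(q)

# Monte Carlo estimate of P(p(X) > p_alpha), which should be close to alpha.
rng = np.random.default_rng(0)
x = rng.standard_normal(200000)
p_hat = np.mean(norm_pdf(x) > p_alpha)
```

With 200000 draws the standard error is below 1e-3, so `p_hat` matches `alpha` to two decimals; in the asymmetric case only the Monte Carlo part survives, since the quantile-based threshold is no longer valid.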

Notice that this algorithm must fail for the Uniform distribution, which has a flat PDF, so a failure for this distribution is expected and should not be considered an issue.

Would the UnivariateDistributionChecker class be for development only, or should this class be made publicly available? I guess that this might be handy for those of us who use the PythonDistribution class.

Regards,

Michaël

Hi Michaël,

I played with your code, with the following modification:

    def check_MinimumVolumeLevelSetWithThreshold(self, probability=0.9):
        """Check MinimumVolumeLevelSetWithThreshold against a Monte Carlo estimate of its probability content."""
        if self.verbose:
            print("check_MinimumVolumeLevelSetWithThreshold")
        levelSet, threshold = self.distribution.computeMinimumVolumeLevelSetWithThreshold(probability)
        if self.verbose:
            print("levelSet=", levelSet)
            print("threshold=", threshold)
        event = ot.DomainEvent(ot.RandomVector(self.distribution), levelSet)
        algo = ot.ProbabilitySimulationAlgorithm(event)
        HUGE = 10000000
        algo.setBlockSize(HUGE)
        algo.setMaximumOuterSampling(1)
        algo.run()
        p = algo.getResult().getProbabilityEstimate()
        if self.verbose:
            print("p=", p, "probability=", probability)
        np.testing.assert_almost_equal(p, probability, decimal=self.decimal)

and the remaining failing distributions are those with a flat PDF: Dirichlet, Histogram… and not the Uniform distribution, for which a dedicated algorithm provides an interval centered around the mean with the desired probability content! IMO this method should be tested this way, as it is the most straightforward validation: does the level set contain the requested probability content? The algorithm is different from the one used to compute the minimum volume level set, as it uses crude Monte Carlo instead of one of the many specific algorithms (Uniform, Normal…) or generic algorithms. For example, the default code for univariate distributions is here:

LevelSet DistributionImplementation::computeMinimumVolumeLevelSetWithThreshold(const Scalar prob,
    Scalar & threshold) const
{
  if (!isContinuous()) throw NotYetImplementedException(HERE) << "In DistributionImplementation::computeMinimumVolumeLevelSet()";
  // 1D special case here to avoid a double construction of minimumVolumeLevelSetFunction
  if ((dimension_ == 1) && (ResourceMap::GetAsBool("Distribution-MinimumVolumeLevelSetBySampling")))
  {
    LOGINFO("Compute the minimum volume level set by sampling (QMC)");
    const LevelSet result(computeUnivariateMinimumVolumeLevelSetByQMC(prob, threshold));
    return result;
  }
  Function minimumVolumeLevelSetFunction(MinimumVolumeLevelSetEvaluation(clone()).clone());
  minimumVolumeLevelSetFunction.setGradient(MinimumVolumeLevelSetGradient(clone()).clone());
  // If dimension_ == 1 the threshold can be computed analyticaly
  Scalar minusLogPDFThreshold;
  if (dimension_ == 1)
  {
    const CompositeDistribution composite(minimumVolumeLevelSetFunction, *this);
    minusLogPDFThreshold = composite.computeQuantile(prob)[0];
    LOGINFO("Compute the minimum volume level set by using a composite distribution quantile (univariate general case)");
  } // dimension == 1
  threshold = std::exp(-minusLogPDFThreshold);

  return LevelSet(minimumVolumeLevelSetFunction, LessOrEqual(), minusLogPDFThreshold);
}

So you see that no Monte Carlo sampling is used here: even when you force the use of a sampling method, it is done by QMC.

Your code should definitely come with OT, and we should call it in our tests.

A last remark: in several cases, the default distribution built by the corresponding factory is very special. For example, the Dirichlet and Histogram distributions do not have a flat PDF in general, so the test should not fail in these cases. This is why the systematic test done the way you did it, while simple and already very useful (it already gives me enough input for a good debug session), is not enough.

Many thanks for this good work.

Cheers

Régis


Hi guys,

Nice work! This is something I have had in mind for a while, but I had not found the time to move on to the implementation.

Some comments:

  • This should be implemented in C++ so that it can be exercised by the tests in the same way
  • We might start by testing the CDF (0 on the lower bound, 1 on the upper bound), as it is the key element, plus testing computeProbability
  • Derive the test for discrete distributions

Hi,

  • C++ implementation: +1. But I guess that a Python prototype will help us get an overview of what exactly the scope of the feature is; for the moment, I cannot see where it ends.
  • discrete: +1.
  • CDF: 0 on the lower bound, 1 on the upper bound. What do you mean?

Regards,

Michaël

Hi,
In fact, if you have a continuous distribution, you only need to define the computeCDF method. All the other methods (computePDF, computeDDF, computeQuantile, computeRange…) can rely on this one.
To make sure there is no bad definition, we can add the following method:

def check_cdf_range(self):
    """Check that the CDF is 0 on the lower bound and 1 on the upper bound of the range."""
    interval = self.distribution.getRange()
    epsilon = np.sqrt(ot.SpecFunc.ScalarEpsilon)
    np.testing.assert_almost_equal(self.distribution.computeCDF(interval.getLowerBound()) - epsilon, 0.0, decimal=self.decimal)
    np.testing.assert_almost_equal(self.distribution.computeCDF(interval.getUpperBound()) + epsilon, 1.0, decimal=self.decimal)
    np.testing.assert_almost_equal(self.distribution.computeProbability(interval), 1.0, decimal=self.decimal)

In fact, we should define a test for each method of the distribution class.
