Automated consistency tests of a Distribution

Hi!
I just found a bug in a specific distribution where the logPDF is wrongly implemented ("*" instead of "+"). The proper way would be to check the computed value against a reference value, obtained for example from a computer algebra system (such as Maple). However, we sometimes do not have the time to do so, hence the bug.

So I implemented a basic consistency checker. In my case, the logarithm of the PDF is obviously inconsistent with the logPDF. This is easy to implement since OT is object-oriented. I did it because I feel that it could save tens of hours of debugging for most distributions. A loss of accuracy in extreme cases cannot be found this way, but it can detect gross bugs.
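To illustrate the kind of bug such a check catches, here is a toy example (numpy only, not the OT API) with a hypothetical buggy log-PDF for the exponential distribution, using "*" where "+" was intended:

```python
import numpy as np

def pdf(x, rate=2.0):
    """Exponential PDF with rate 2."""
    return rate * np.exp(-rate * x)

def logpdf_buggy(x, rate=2.0):
    """Hypothetical buggy log-PDF: '*' where '+' was intended."""
    return np.log(rate) * (-rate * x)  # should be np.log(rate) + (-rate * x)

x = np.linspace(0.1, 3.0, 10)
# The consistency check log(pdf(x)) == logpdf(x) exposes the bug immediately:
mismatch = np.max(np.abs(np.log(pdf(x)) - logpdf_buggy(x)))
```

Here `mismatch` is large, so `np.testing.assert_almost_equal` would fail loudly, while each function looks plausible in isolation.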

import numpy as np
import openturns as ot


class UnivariateDistributionChecker:
    def __init__(self, distribution, sample_size=10, verbose=False, decimal=7):
        self.distribution = distribution
        self.verbose = verbose
        self.sample_size = sample_size
        self.sample = distribution.getSample(self.sample_size)
        self.decimal = decimal
        if self.verbose:
            print("sample=")
            print(self.sample)

    def check_logPDF(self):
        """Check the consistency of logPDF against the log of the PDF."""
        if self.verbose:
            print("check_logPDF")
        logPDF1 = np.array(self.distribution.computeLogPDF(self.sample))
        logPDF2 = np.log(np.array(self.distribution.computePDF(self.sample)))
        np.testing.assert_almost_equal(logPDF1, logPDF2, decimal=self.decimal)

    def check_PDF(self):
        """Check the consistency of PDF against a finite difference of the CDF."""
        if self.verbose:
            print("check_PDF")
        PDF1 = self.distribution.computePDF(self.sample)
        epsilon = ot.ResourceMap.GetAsScalar(
            "CenteredFiniteDifferenceGradient-DefaultEpsilon"
        )
        CDF1 = self.distribution.computeCDF(self.sample + epsilon)
        CDF2 = self.distribution.computeCDF(self.sample - epsilon)
        PDF2 = (np.array(CDF1) - np.array(CDF2)) / (2.0 * epsilon)
        np.testing.assert_almost_equal(np.array(PDF1), PDF2, decimal=self.decimal)

    def check_DDF(self):
        """Check the consistency of DDF against a finite difference of the PDF."""
        if self.verbose:
            print("check_DDF")
        DDF1 = self.distribution.computeDDF(self.sample)
        epsilon = ot.ResourceMap.GetAsScalar(
            "CenteredFiniteDifferenceGradient-DefaultEpsilon"
        )
        PDF1 = self.distribution.computePDF(self.sample + epsilon)
        PDF2 = self.distribution.computePDF(self.sample - epsilon)
        DDF2 = (np.array(PDF1) - np.array(PDF2)) / (2.0 * epsilon)
        np.testing.assert_almost_equal(np.array(DDF1), DDF2, decimal=self.decimal)

    def check_ComplementaryCDF(self):
        """Check the consistency of the complementary CDF against the CDF."""
        if self.verbose:
            print("check_ComplementaryCDF")
        CCDF1 = self.distribution.computeComplementaryCDF(self.sample)
        CCDF2 = 1.0 - np.array(self.distribution.computeCDF(self.sample))
        np.testing.assert_almost_equal(np.array(CCDF1), CCDF2, decimal=self.decimal)

    def check_MinimumVolumeIntervalWithMarginalProbability(self, probability=0.9):
        """Check the consistency of MinimumVolumeIntervalWithMarginalProbability against the CDF."""
        if self.verbose:
            print("check_MinimumVolumeIntervalWithMarginalProbability")
        interval = self.distribution.computeMinimumVolumeIntervalWithMarginalProbability(probability)[0]
        lower = interval.getLowerBound()[0]
        upper = interval.getUpperBound()[0]
        computed_probability = self.distribution.computeCDF(upper) - self.distribution.computeCDF(lower)
        np.testing.assert_almost_equal(probability, computed_probability, decimal=self.decimal)

    def check_MinimumVolumeLevelSetWithThreshold(self, probability=0.9):
        """Check the consistency of MinimumVolumeLevelSetWithThreshold against the PDF at the quantile."""
        if self.verbose:
            print("check_MinimumVolumeLevelSetWithThreshold")
        levelSet, threshold = self.distribution.computeMinimumVolumeLevelSetWithThreshold(probability)
        x = self.distribution.computeQuantile(1.0 - (1.0 - probability) / 2.0)
        computed_PDF = self.distribution.computePDF(x)
        np.testing.assert_almost_equal(threshold, computed_PDF, decimal=self.decimal)

    def check_all(self):
        self.check_PDF()
        self.check_logPDF()
        self.check_DDF()
        self.check_ComplementaryCDF()
        self.check_MinimumVolumeIntervalWithMarginalProbability()
        self.check_MinimumVolumeLevelSetWithThreshold()

From there, it is easy to loop over all continuous univariate distributions:

factory_list = ot.DistributionFactory_GetUniVariateFactories()

for i in range(len(factory_list)):
    distribution = factory_list[i].build()
    name = distribution.getName()
    if distribution.isContinuous():
        checker = UnivariateDistributionChecker(distribution, decimal=3)
        try:
            checker.check_all()
            print(i, name, ": OK")
        except AssertionError:
            print(i, name, ": error")

PS
The corresponding issue is: The log-PDF of the Pareto is wrong · Issue #1714 · openturns/openturns · GitHub

Here is the output:

0 Arcsine : error
1 Beta : OK
2 Burr : error
3 Chi : error
4 ChiSquare : error
5 Dirichlet : OK
6 Exponential : error
7 FisherSnedecor : error
8 Frechet : error
9 Gamma : error
10 GeneralizedPareto : error
11 Gumbel : error
12 Histogram : OK
13 InverseNormal : error
14 Laplace : OK
15 Logistic : OK
16 LogNormal : error
17 LogUniform : error
18 MeixnerDistribution : OK
19 Normal : OK
20 Pareto : error
21 Rayleigh : error
22 Rice : error
23 Student : OK
24 Trapezoidal : OK
25 Triangular : OK
26 TruncatedNormal : OK
27 Uniform : error
28 WeibullMax : error
29 WeibullMin : error

I am not 100% sure that all these warnings are real bugs (the MinimumVolumeLevelSet check is suspicious to me). However, Arcsine and FisherSnedecor are suspects, and GeneralizedPareto has a bug: its PDF is not consistent with the gradient of its CDF.

Continuing this way, we could implement additional checkers for PDFs (including discrete distributions) and perhaps find other, yet undetected, bugs.
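As a sketch of what a discrete checker could look like, here is the same PDF/CDF consistency idea applied to a hand-rolled geometric PMF (numpy only; the PMF, support, and truncation are illustrative, not the OT API):

```python
import numpy as np

# Toy discrete distribution: geometric on {1, 2, ...} with success probability p,
# truncated to the first 49 atoms for the check.
p = 0.3
support = np.arange(1, 50)

def pmf(k):
    return p * (1.0 - p) ** (k - 1)

def cdf(k):
    return 1.0 - (1.0 - p) ** k

# Discrete analogue of the PDF/CDF consistency check:
# PMF(k) must equal CDF(k) - CDF(k - 1) on the support.
np.testing.assert_almost_equal(pmf(support), cdf(support) - cdf(support - 1), decimal=12)

# And the PMF must sum to (almost) 1 over the truncated support.
total = pmf(support).sum()
```

The same two checks (PMF versus CDF increments, and total mass) would transpose directly to `computePDF`/`computeCDF` of an OT discrete distribution.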

Best regards,

Michaël

Hi Michaël,

Nice work! A few comments:

  • some of your tests are already implemented in the distribution tests (e.g. checking the PDF against a finite difference of the CDF, and the same for the DDF), but not in a systematic way
  • your test of the minimum volume level set is correct only for symmetric distributions. Remember the discussion we had some time ago about the many ways this level set is computed in OT: even in the 1D case, you have to take care of the symmetry, the unimodality, and so on.

I will have a look at the failed cases ASAP.

Cheers

Régis

Hi Régis,

I am quite happy that this is satisfactory to you, because I have had this idea in mind for quite some time, but I was not sure it could provide the quality I expect in general, so I avoided it as much as possible. This "Checker" will not detect the tiny accuracy limitations we like so much!

You are certainly right about the LevelSet check; I was not sure how to get it right in the general, asymmetric case. The algorithms which return intervals are easier to check, since it suffices to compute the CDF difference and check that the mass corresponds to the required probability. When only the PDF value is returned… well, I do not know how to test this easily. We can always perform a Monte Carlo simulation and check the integral on the domain

A_\alpha^\star = \{\mathbf{x} \in \mathbb{R}^d \; | \; p(\mathbf{x}) > p_\alpha\}

where p_\alpha is the value of the threshold, but it seems to me that this almost replicates the internal code. Perhaps we could combine several LevelSet algorithms?
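For a symmetric, unimodal case, this Monte Carlo check can be sketched with numpy alone (standard normal; the quantile is obtained by bisection rather than by any OT call, so the check is fully independent of the code under test):

```python
import math
import numpy as np

def norm_pdf(x):
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_quantile(q, lo=-10.0, hi=10.0):
    """Quantile by bisection on the CDF (illustrative, not an OT call)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.9
# For N(0, 1) the minimum volume level set is [-q, q] with q the 95% quantile,
# and the PDF threshold is p_alpha = pdf(q).
q = norm_quantile(1.0 - (1.0 - alpha) / 2.0)
p_alpha = norm_pdf(q)

# Monte Carlo estimate of P(p(X) > p_alpha), which should be close to alpha.
rng = np.random.default_rng(0)
x = rng.standard_normal(200000)
p_hat = np.mean(norm_pdf(x) > p_alpha)
```

With 200000 draws the standard error is below 1e-3, so `p_hat` matches `alpha` to two decimals; in the asymmetric case only the Monte Carlo part survives, since the quantile-based threshold is no longer valid.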

Notice that this algorithm must fail for the Uniform distribution, which has a flat PDF, so a failure for this distribution is expected and should not be considered an issue.

Would the UnivariateDistributionChecker class be for development only, or should this class be made publicly available? I guess that this might be handy for those of us who use the PythonDistribution class.

Regards,

Michaël

Hi Michaël,

I played with your code, with the following modification:

    def check_MinimumVolumeLevelSetWithThreshold(self, probability=0.9):
        """Check MinimumVolumeLevelSetWithThreshold against a Monte Carlo estimate of its probability content."""
        if self.verbose:
            print("check_MinimumVolumeLevelSetWithThreshold")
        levelSet, threshold = self.distribution.computeMinimumVolumeLevelSetWithThreshold(probability)
        if self.verbose:
            print("levelSet=", levelSet)
            print("threshold=", threshold)
        event = ot.DomainEvent(ot.RandomVector(self.distribution), levelSet)
        algo = ot.ProbabilitySimulationAlgorithm(event)
        HUGE = 10000000
        algo.setBlockSize(HUGE)
        algo.setMaximumOuterSampling(1)
        algo.run()
        p = algo.getResult().getProbabilityEstimate()
        if self.verbose:
            print("p=", p, "probability=", probability)
        np.testing.assert_almost_equal(p, probability, decimal=self.decimal)

and the remaining failing distributions are those with a flat PDF: Dirichlet, Histogram… and not the Uniform distribution, for which a dedicated algorithm provides an interval centered around the mean with the desired probability content! IMO this method should be tested this way, as it is the most straightforward validation: does the level set contain the requested probability content? The algorithm is different from the one used to compute the minimum volume level set, as it uses crude Monte Carlo instead of one of the many specific algorithms (Uniform, Normal…) or generic algorithms. For example, the default code for univariate distributions is here:

LevelSet DistributionImplementation::computeMinimumVolumeLevelSetWithThreshold(const Scalar prob,
    Scalar & threshold) const
{
  if (!isContinuous()) throw NotYetImplementedException(HERE) << "In DistributionImplementation::computeMinimumVolumeLevelSet()";
  // 1D special case here to avoid a double construction of minimumVolumeLevelSetFunction
  if ((dimension_ == 1) && (ResourceMap::GetAsBool("Distribution-MinimumVolumeLevelSetBySampling")))
  {
    LOGINFO("Compute the minimum volume level set by sampling (QMC)");
    const LevelSet result(computeUnivariateMinimumVolumeLevelSetByQMC(prob, threshold));
    return result;
  }
  Function minimumVolumeLevelSetFunction(MinimumVolumeLevelSetEvaluation(clone()).clone());
  minimumVolumeLevelSetFunction.setGradient(MinimumVolumeLevelSetGradient(clone()).clone());
  // If dimension_ == 1 the threshold can be computed analyticaly
  Scalar minusLogPDFThreshold;
  if (dimension_ == 1)
  {
    const CompositeDistribution composite(minimumVolumeLevelSetFunction, *this);
    minusLogPDFThreshold = composite.computeQuantile(prob)[0];
    LOGINFO("Compute the minimum volume level set by using a composite distribution quantile (univariate general case)");
  } // dimension == 1
  threshold = std::exp(-minusLogPDFThreshold);

  return LevelSet(minimumVolumeLevelSetFunction, LessOrEqual(), minusLogPDFThreshold);
}

So you see that no Monte Carlo sampling is used here: even when you force the use of a sampling method, it is done by QMC.

Your code should definitely come with OT, and we should call it in our tests.

A last remark: in several cases, the default distribution built by the corresponding factory is very special. For example, the Dirichlet and Histogram distributions do not have a flat PDF in general, so the test should not fail in these cases. This is why the systematic test done the way you did it, while simple and already very useful (it already gives me enough input for a good debug session), is not enough.

Many thanks for this good work.

Cheers

Régis


Hi guys,

Nice work! This is something I have had in mind for a while, but I had not found the time to move on to the implementation.

Some comments:

  • This should be implemented in C++ so that it can be exercised by the tests in the same way
  • We might start by testing the CDF (0 on the lower bound, 1 on the upper bound), as it is the key element, plus testing computeProbability
  • Derive the test for discrete distributions

Hi,

  • C++ implementation: +1. But I guess that a Python prototype will help us get an overview of what exactly the scope of the feature is; for the moment, I cannot see where it ends.
  • discrete: +1.
  • CDF: 0 on the lower bound, 1 on the upper bound. What do you mean?

Regards,

Michaël

Hi,
In fact, if you have a continuous distribution, you only need to define the computeCDF method. All the other methods (computePDF, computeDDF, computeQuantile, computeRange…) can rely on this one.
To make sure there is no bad definition, we can add the following method:

def check_cdf_range(self):
    """Check that the CDF is 0 on the lower bound and 1 on the upper bound of the range."""
    interval = self.distribution.getRange()
    epsilon = np.sqrt(ot.SpecFunc.ScalarEpsilon)
    np.testing.assert_almost_equal(self.distribution.computeCDF(interval.getLowerBound()) - epsilon, 0.0, decimal=self.decimal)
    np.testing.assert_almost_equal(self.distribution.computeCDF(interval.getUpperBound()) + epsilon, 1.0, decimal=self.decimal)
    np.testing.assert_almost_equal(self.distribution.computeProbability(interval), 1.0, decimal=self.decimal)

In fact, we should define a test for each method of the distribution class.
