Should methods be allowed to output infinity?

MichaelBaudin · January 14, 2021, 2:22pm

Hi,

I post below a summary of this morning’s discussion. I hope that this may allow to share the content and provide it for interested users.

The core topic is the evaluation of the PDF of a distribution which is defined on an interval, e.g. the Uniform distribution on [-1,1]. If x\not\in [-1,1], there are two approaches:

return 0,
generate an C++ exception.

The associated topic is the value of the LogPDF. If x\not\in [-1,1], there are two approaches:

return Specfun::LowestScalar \approx -10^{308},
return -INF.

In the current API, the method Distribution.getRange().contains(x) returns True if x is in the range.

There are several places in the code where this issue appears.

Several implementations of the PDF are based on the exponential of the log-PDF, e.g. NormalGamma:

For distributions which are defined on an interval, the Log-PDF is currently set to LowestScalar when the input is not in the interval, e.g. Beta:

The RandomWalkMetropolisHastings::getRealization evaluates the Log-PDF of the prior:

DistributionImplementation::computeMinimumVolumeLevelSetWithThreshold is based on the Log-PDF too:

There are several applications of the management of INF and NANs in the library:

The MCMC algorithm, a topic partially solved by Correct the formula of MCMC::computeLogLikelihood [1.16] by josephmure · Pull Request #1648 · openturns/openturns · GitHub
The maximum likelihood estimation (see Estimating parameters of a bounded distribution with MLE? for a related topic)

Several Bayesian algorithms are planned to be developed, which may benefit from this management:

A Metropolis-Hastings algorithm, independent from the current bayesian calibration context,
A bayesian calibration algorithm, based on the “new” CalibrationResult object.

What was unclear, however, is how these new algorithms may benefit from a change in the management of INF/NAN, in the sense that the current state would not forbid these developements (see below for another point of view).

Based on these informations, 4 options were discussed.

Option A: manage -INF everywhere necessary
- Create a new Specfunc::Infinity and set it with -INF
- This corresponds to updating and merging the experimental branch : GitHub - josephmure/openturns at infinity
- This option uses floating point arithmetic to manage a point which is not in the range.
- One issue may be that optimization solvers such as Cobyla may not manage consistently such a situation. For example, an infinite loop was found when estimating the three parameters of a Triangular distribution with MLE. It is likely (!) that the bound parameters were not managed correctly by the algorithm.
Option B: generate exceptions where necessary
- This would require to throw exceptions wherever the code currently returns LowestScalar
- The exception would be InvalidArgumentException
- This requires changes in all functions which currently use exp(logPDF) to catch the exception, if necessary.
Option C: a new computeLogPDFUnsafe()
- This new computeLogPDFUnsafe() method of any Distribution would return -INF when the input point is not in the range.
- This would also allow to manage exceptional parameters of the distribution, e.g. \alpha=10^{50} in a Beta distribution.
- The current behaviour of computeLogPDF() would be unchanged.
- The new computeLogPDFUnsafe() would be used in MCMC, and in any algorithm which would need it.
- If a Sample contains a Point which is not in the range, computeLogPDFUnsafe() would allow to search for the index in the Sample ; an exception would not allow this.

Three variants of the option C were discussed, but I am unfortunately unable to describe them .

Feel free to comment these options or suggest new ones.

Best regards,

Michaël

Topic		Replies	Views
Computation time Python usage	4	33	October 9, 2024
OpenTURNS ships 1.24 RC1 Announcements	0	276	May 6, 2024
Version 1.20 released Announcements	0	346	November 10, 2022
Version 1.20 RC1 released Announcements	0	417	October 13, 2022
Frozen inputs for calibration not really frozen Python usage calibration	6	340	May 27, 2021

Should methods be allowed to output infinity?

Related topics