Should methods be allowed to output infinity?

This question is related to Issue #1615, but not limited to it.

Some of the people present at the technical committee meeting on October 13th, 2020 (disclaimer: I am one of them) have argued that some methods in OpenTURNS should be allowed to produce infinity (+INF or -INF) as output, or to accept infinity as an input Scalar (Scalar being the OT name for a float).

This would represent a change in OpenTURNS coding policy and the discussion could not be finished during the technical committee meeting. The main arguments have been summarized on the linked Github issue, and I reproduce them below. Thanks to @MichaelBaudin for the summary.

There are two correct strategies to deal with exceptional cases such as when computeLogPDF is applied to a point outside the distribution support:

  • generate exceptions,
  • generate exceptional floating point numbers such as INFs and NANs.

There are also wrong strategies:

  • A wrong strategy is to produce warnings, because these relatively silent messages go unnoticed by most users.
  • A wrong strategy is to produce MaxScalar when the result is INF, because using a finite floating point number is very different from using INF. For example, we may use the following code to produce (approximately) the number 1 starting from MaxScalar:
import sys

x = sys.float_info.max  # the value of MaxScalar for IEEE 754 doubles
for i in range(1024):
    x /= 2.0
# x is now 1 - 2**-53, i.e. approximately 1

This is impossible when x is INF, which is good (see the check below).
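
By contrast, here is a tiny check in plain Python showing that no amount of halving brings an infinity back into the finite range:

import math

# Halving inf any number of times still yields inf: the exceptional
# value keeps signalling that something went wrong upstream.
x = math.inf
for i in range(1024):
    x /= 2.0
print(x)  # inf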

Here are the arguments against INF and NANs:

  • An INF or a NAN is not a real number, hence not what most users expect as the return value of a function. However, since we always perform computations with floating point numbers, nobody should really expect a real number as the output of a C++ or Python function returning a float.
  • Using them slows down the computation, because libraries perform poorly when dealing with exceptional floating point numbers. @regislebrun: this assertion requires some factual proof.
  • We do not know how OT dependencies (e.g. NLOPT, etc…) react to these exceptional numbers. @regislebrun: this assertion requires some factual proof, with examples of poor management of exceptional floating point numbers.

The lowest() member is provided by std::numeric_limits:

However, according to the previous page, “For floating-point types: implementation-dependent; generally, the negative of max().” Hence, the code in OT would be:

const Scalar SpecFunc::Lowest = -MaxScalar;

Then, we would use Lowest in place of -MaxScalar wherever appropriate.
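
In Python terms, the same convention can be checked with NumPy (a quick illustration, not OT code):

import numpy as np

# For IEEE 754 doubles, the lowest finite value is exactly the negative
# of the largest finite one, as the std::numeric_limits quote suggests.
print(np.finfo(np.float64).min == -np.finfo(np.float64).max)  # True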

One possibility would be to build the code with an option that raises a floating point exception each time an INF is produced.
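
As an illustration of this kind of trapping (an analogy with NumPy's runtime switch, not an existing OpenTURNS build option):

import numpy as np

# Ask NumPy to raise an exception instead of silently returning inf/nan.
np.seterr(divide='raise', over='raise', invalid='raise')

try:
    np.float64(1.0) / np.float64(0.0)
except FloatingPointError as err:
    print("Trapped:", err)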

Here are further arguments against the policy change from @regislebrun:

We all should definitely read the series of posts here, in particular this one: https://randomascii.wordpress.com/2012/05/20/thats-not-normalthe-performance-of-odd-floats/
As I said during the technical committee meeting, I learned from experience that exceptional floating point numbers carried a huge performance penalty. But that knowledge was based on quite old experiments, from the days when x87 was the norm and SSE2 was not so widespread. As you can read in the article above, the performance penalty could be up to a 900x slowdown. Nowadays, it looks like NaNs and Infs are processed at full speed.
Those who are too optimistic about the guarantees provided by a standard regarding e.g. the reproducibility of a computation should read this article, or at least have a look at the final flowchart: https://randomascii.files.wordpress.com/2012/03/image6.png
Remember that the different rounding modes are not taken into account in this flowchart!
Concerning the added value of exceptions, read this, already referenced in the summary. Note that the author underlines the utility of letting inf sneak into the computation from time to time, so the use of inf is not systematically evil, but its unchecked global propagation is.


Another discussion related to this popped up on Github:

In order to look into the consequences of allowing computeLogPDF to output -inf when the data are outside the distribution support, I have created a branch where SpecFunc::LowestScalar is -inf instead of -SpecFunc::MaxScalar: https://github.com/josephmure/openturns/tree/infinity

Non-exhaustive list of languages or libraries that use infinity:

A larger list is available here:

The author of the piece about exceptional floating point numbers that @regislebrun links to also warns against try/except blocks:

TryDivByZero does a floating-point divide-by-zero inside a Win32 __try/__except block in order to catch exceptions, print a message, and allow the tests to continue. This type of structured exception handling block should not (repeat not) be used in production code, except possibly to record crashes and then exit. I hesitate to demonstrate this technique because I fear it will be misused. Continuing after unexpected structured exceptions is pure evil.

@josephmure Yes, it is crystal clear: “Continuing after unexpected structured exceptions is pure evil”. This does not mean that you should avoid throwing exceptions when a computation goes wrong, but that you should stop your code when such an exception occurs! The author does not promote the silent production of infs as if they were innocuous.

Forgive me if I go back to the original problem that spawned this discussion: how should computeLogPDF methods deal with inputs that are outside the distribution support? There are 2 solutions:

  1. output -inf
  2. throw an exception

If the user submitted an input outside the support by mistake, then solution 2 is an adequate response.
The problem becomes trickier if computeLogPDF is called from inside an algorithm, for example MCMC @MerlinKeller. In this case, the algorithm can try proposals outside the support of the distribution. Such proposals must be rejected, but they can happen. If computeLogPDF throws an exception, then the algorithm must handle it with try/except instructions. This is not a case where the code can simply be stopped, otherwise the MCMC would be likely to stop prematurely.
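
To make this concrete, here is a minimal random-walk Metropolis sketch (the uniform target and the step size are invented for illustration) showing how a -inf log-PDF lets the acceptance test reject out-of-support proposals without any try/except machinery:

import numpy as np

rng = np.random.default_rng(0)

def log_pdf(x):
    # Toy target: uniform on [-1, 1]; -inf outside the support.
    return -np.log(2.0) if -1.0 <= x <= 1.0 else -np.inf

x = 0.0
for _ in range(1000):
    proposal = x + rng.normal(scale=0.5)
    # If the proposal is outside the support, the right-hand side is -inf
    # and the proposal is rejected naturally.
    if np.log(rng.random()) < log_pdf(proposal) - log_pdf(x):
        x = proposal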

Another interesting read:

Some information about how NLOPT treats infinities:

If a lower/upper bound is not set, the default is no bound (unconstrained, i.e. a bound of infinity); it is possible to have lower bounds but not upper bounds or vice versa. Alternatively, the user can call one of the above functions and explicitly pass a lower bound of -HUGE_VAL and/or an upper bound of +HUGE_VAL for some optimization parameters to make them have no lower/upper bound, respectively. (HUGE_VAL is the standard C constant for a floating-point infinity, found in the math.h header file.)

On modern implementations, HUGE_VAL resolves to positive infinity:

https://en.cppreference.com/w/cpp/numeric/math/HUGE_VAL
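
In the Python bindings of NLOPT, this convention can be spelled with inf directly (a minimal sketch, assuming the nlopt Python package):

import math
import nlopt

# Bounds of -inf/+inf mean "unbounded", mirroring -HUGE_VAL/+HUGE_VAL in C.
opt = nlopt.opt(nlopt.LN_COBYLA, 2)
opt.set_lower_bounds([-math.inf, 0.0])  # first variable unbounded below
opt.set_upper_bounds([math.inf, 1.0])   # first variable unbounded above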

Another interesting read:
https://people.eecs.berkeley.edu/~demmel/ExceptionHandling_Reproducibility_Feb20.pptx
(linked from dgesdd: Handle norm nan value by jschueller · Pull Request #471 · Reference-LAPACK/lapack · GitHub)

Hi,

I post below a summary of this morning’s discussion. I hope it allows the content to be shared with interested users.

The core topic is the evaluation of the PDF of a distribution which is defined on an interval, e.g. the Uniform distribution on [-1,1]. If x\not\in [-1,1], there are two approaches:

  • return 0,
  • generate a C++ exception.
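
For reference, the library currently follows the first approach for the PDF itself:

import openturns as ot

# The PDF of a bounded distribution is 0 outside the support,
# and no exception is raised.
distribution = ot.Uniform(-1.0, 1.0)
print(distribution.computePDF([2.0]))  # 0.0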

The associated topic is the value of the LogPDF. If x\not\in [-1,1], there are two approaches:

  • return SpecFunc::LowestScalar \approx -10^{308},
  • return -INF.

In the current API, the method Distribution.getRange().contains(x) returns True if x is in the range.
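
For example:

import openturns as ot

distribution = ot.Uniform(-1.0, 1.0)
print(distribution.getRange().contains([0.5]))  # True: inside the support
print(distribution.getRange().contains([2.0]))  # False: outside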

There are several places in the code where this issue appears.

  • Several implementations of the PDF are based on the exponential of the log-PDF, e.g. NormalGamma (see the sketch after this list).

  • For distributions which are defined on an interval, the log-PDF is currently set to LowestScalar when the input is not in the interval, e.g. Beta.

  • The RandomWalkMetropolisHastings::getRealization method evaluates the log-PDF of the prior.

  • DistributionImplementation::computeMinimumVolumeLevelSetWithThreshold is based on the log-PDF too.
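
The first bullet explains why the two log-PDF conventions give the same PDF: exp maps both a huge negative finite value (the LowestScalar convention) and -INF to 0. A quick check in plain Python:

import math

# exp() underflows LowestScalar-like values to 0.0 and maps -inf to 0.0,
# so computePDF = exp(computeLogPDF) is the same under both conventions.
print(math.exp(-1.7976931348623157e308))  # 0.0
print(math.exp(-math.inf))                # 0.0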

There are several applications of the management of INF and NANs in the library:

Several Bayesian algorithms are planned to be developed, which may benefit from this management:

  • A Metropolis-Hastings algorithm, independent of the current Bayesian calibration context,
  • A Bayesian calibration algorithm, based on the “new” CalibrationResult object.

What was unclear, however, is how these new algorithms would benefit from a change in the management of INF/NAN, in the sense that the current state does not forbid these developments (see below for another point of view).

Based on this information, four options were discussed.

  • Option A: manage -INF everywhere necessary
    • Create a new SpecFunc::Infinity constant and set it to -INF
    • This corresponds to updating and merging the experimental branch: GitHub - josephmure/openturns at infinity
    • This option uses floating point arithmetic to manage a point which is not in the range.
    • One issue may be that optimization solvers such as Cobyla may not handle such a situation consistently. For example, an infinite loop was found when estimating the three parameters of a Triangular distribution with MLE. It is likely (!) that the bound parameters were not managed correctly by the algorithm.
  • Option B: generate exceptions where necessary
    • This would require throwing exceptions wherever the code currently returns LowestScalar
    • The exception would be InvalidArgumentException
    • This would require changes in all functions which currently use exp(logPDF), so that they catch the exception if necessary.
  • Option C: a new computeLogPDFUnsafe()
    • This new computeLogPDFUnsafe() method of any Distribution would return -INF when the input point is not in the range.
    • This would also make it possible to manage exceptional parameters of the distribution, e.g. \alpha=10^{50} in a Beta distribution.
    • The current behaviour of computeLogPDF() would be unchanged.
    • The new computeLogPDFUnsafe() would be used in MCMC, and in any algorithm which would need it.
    • If a Sample contains a Point which is not in the range, computeLogPDFUnsafe() would make it possible to search for its index in the Sample; an exception would not allow this (see the sketch below).
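
Here is a hypothetical emulation of option C (computeLogPDFUnsafe() does not exist; the helper below mimics the proposed behaviour with the current API):

import numpy as np
import openturns as ot

def log_pdf_unsafe(distribution, point):
    # Emulates the proposed computeLogPDFUnsafe(): -inf outside the
    # range, the usual log-PDF inside.
    if distribution.getRange().contains(point):
        return distribution.computeLogPDF(point)
    return -np.inf

distribution = ot.Uniform(-1.0, 1.0)
sample = ot.Sample([[0.5], [2.0], [-0.3]])
log_pdfs = [log_pdf_unsafe(distribution, point) for point in sample]
print([i for i, v in enumerate(log_pdfs) if v == -np.inf])  # [1]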

Three variants of option C were discussed, but I am unfortunately unable to describe them. 🙂

Feel free to comment on these options or to suggest new ones.

Best regards,

Michaël


Apparently, the Point and Sample constructors accept NaNs as input.

import numpy as np
import openturns as ot

mypoint = ot.Point([np.nan])                # accepted, no exception raised
mysample = ot.Sample([[np.nan], [np.nan]])  # accepted as well

This means NaNs could potentially creep into the library when reading CSV files with missing values. We really need to make library services robust to NaNs.
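
Until then, a defensive pre-check on the user side might look like this (a sketch; the sample with a missing value is invented for illustration):

import numpy as np
import openturns as ot

# A Sample with a missing value, as could result from reading a CSV file.
sample = ot.Sample([[1.0, np.nan], [2.0, 3.0]])

# Detect rows containing NaN before handing the sample to library services.
mask = np.isnan(np.array(sample)).any(axis=1)
print(np.flatnonzero(mask))  # [0]: the first row contains a NaN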

Cross-posted from GitHub: a comparison of several implementations of the DiGamma and TriGamma functions with respect to how limits are handled: R, Scipy, OpenTURNS.

The DiGamma function

DiGamma function in R:

> digamma(Inf)
[1] Inf

> digamma(0)
[1] NaN
Warning message:
In digamma(0) : NaNs produced

> digamma(-1)
[1] NaN
Warning message:
In digamma(-1) : NaNs produced

> digamma(-Inf)
[1] NaN
Warning message:
In digamma(-Inf) : NaNs produced

DiGamma function in Scipy (check out the implementation):

from scipy.special import digamma
from numpy import inf

digamma(inf)
Out[27]: inf

digamma(0) # positive 0
Out[29]: -inf

digamma(-1.0/inf) # negative 0
Out[33]: inf

digamma(-1)
Out[34]: nan

digamma(-inf)
Out[35]: nan

DiGamma function in OpenTURNS:

from openturns import SpecFunc_DiGamma as DiGamma
from numpy import inf

DiGamma(inf)
Out[39]: inf

DiGamma(0)
TypeError: InvalidArgumentException : Error: the argument of the DiGamma function cannot be a non positive integer.

DiGamma(-1)
TypeError: InvalidArgumentException : Error: the argument of the DiGamma function cannot be a non positive integer.

DiGamma(-inf)
TypeError: InvalidArgumentException : Error: the argument of the DiGamma function cannot be a non positive integer.

Summary of DiGamma behaviors

  • Scipy chooses to return limits whenever possible (at positive infinity, at positive 0 and at negative 0) and returns nan when this is not possible.
  • R does the same except it does not distinguish between positive and negative 0: since the two one-sided limits are different, it returns nan. Also, whenever it returns nan, it issues a warning.
  • OpenTURNS has the same behavior as R except it throws instead of issuing warnings.

The TriGamma function

TriGamma function in R:

> trigamma(Inf)
[1] 0

> trigamma(0)
[1] Inf

> trigamma(-1)
[1] Inf

> trigamma(-Inf)
[1] Inf

TriGamma function in Scipy

from scipy.special import polygamma
from numpy import inf

polygamma(1, inf)
Out[47]: array(0.)

polygamma(1, 0)
Out[50]: array(inf)

polygamma(1, -1)
Out[51]: array(inf)

polygamma(1, -inf)
Out[53]: array(inf)

TriGamma function in OpenTURNS:

from openturns import SpecFunc_TriGamma as TriGamma
from numpy import inf

TriGamma(inf)
Out[55]: 0.0

TriGamma(0)
TypeError: InvalidArgumentException : Error: the argument of the TriGamma function cannot be a non positive integer.

TriGamma(-1)
TypeError: InvalidArgumentException : Error: the argument of the TriGamma function cannot be a non positive integer.

TriGamma(-inf)
TypeError: InvalidArgumentException : Error: the argument of the TriGamma function cannot be a non positive integer.

Summary of TriGamma behaviors

  • The limit at positive infinity (namely 0) and the limits at the nonpositive integers (namely positive infinity) are well defined, so R and Scipy return them.
  • The limit at negative infinity is not defined: along the half-integers, the sequence TriGamma(-x + 0.5), with x in the set of nonnegative integers, tends to pi^2 rather than to positive infinity. So I do not understand why R and Scipy return positive infinity at negative infinity instead of nan (see the numerical check below).
  • OpenTURNS only returns the limit at positive infinity, and throws in all other cases.
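
A quick numerical check of the half-integer claim, using the reflection formula TriGamma(x) = pi^2/sin(pi x)^2 - TriGamma(1 - x) so that Scipy is only evaluated at positive arguments:

import numpy as np
from scipy.special import polygamma

# At half-integers sin(pi*x)**2 is 1, so TriGamma(-n + 0.5) reduces to
# pi**2 - TriGamma(n + 0.5), which tends to pi**2, not +inf.
for n in [10, 100, 1000]:
    x = -n + 0.5
    value = np.pi ** 2 / np.sin(np.pi * x) ** 2 - polygamma(1, 1 - x)
    print(n, float(value))
print(np.pi ** 2)  # ~ 9.8696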

An Interval is defined from 4 sequences:

  • lowerBound: a float sequence
  • upperBound: a float sequence
  • finiteLowerBound: a bool sequence
  • finiteUpperBound: a bool sequence

This is inconvenient when dealing with intervals with infinite boundaries:

ot.Interval([1.0] * 2, [3.0] * 2, [True] * 2, [False] * 2)

is actually the set [1, +\infty) \times [1, +\infty).

Essentially, finiteLowerBound (resp. finiteUpperBound) is about whether or not to ignore lowerBound (resp. upperBound). I think this could easily cause errors.

Things would be more straightforward if infinities were allowed: the interval above would simply be constructed as

ot.Interval([1.0] * 2, [np.inf] * 2)