Frozen inputs for calibration not really frozen

logistic_calage.py (5.0 KB)

Hi everyone,
I have a calibration problem that closely resembles the one described in the logistic calibration example in the OpenTURNS documentation.

However, I have had some trouble using OpenTURNS because the input parameters that are supposed to be frozen during the calibration procedure (i.e. the observed years) are slightly changed. The change is small, around the fifth or sixth decimal, but it is enough to trigger useless calls to the function. You can see the problem in the attached file, which comes from the OT documentation with a print call added inside the calibrated function in order to monitor the inputs on which the function is called.

Since my function is fairly costly to evaluate, these extra calls are cumbersome, and they defeat any memoization because the perturbed inputs never match a cached entry.
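To illustrate why memoization does not help here, the following minimal sketch uses only the Python standard library. The function `costly_model` and its arguments are hypothetical stand-ins, not the model from the attached file: a cache keyed on the exact input values misses as soon as the "frozen" year is perturbed at the fifth decimal.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def costly_model(year, a, c):
    # Hypothetical stand-in for an expensive model evaluation.
    return a * year + c

costly_model(1790.0, 0.03, -22.0)          # first call: cache miss
costly_model(1790.0, 0.03, -22.0)          # same inputs: cache hit
costly_model(1790.0 + 1e-5, 0.03, -22.0)   # perturbed year: cache miss again

info = costly_model.cache_info()
print(info.hits, info.misses)
```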

Is there a way to ensure that the frozen input parameters are effectively frozen during the calibration procedure?

Regards,
Sanaa

Hi Sanaa,

Using LinearLeastSquaresCalibration, at one point you need to compute the gradient of the function wrt the frozen parameters. Either you provide a specific implementation of the gradient, or it is computed using finite differences (the default). This is exactly what is happening here, and there is no way to avoid it if you don't provide the gradient. It is not a limitation of OT, but a characteristic of linear least squares calibration.
Nevertheless, you can reduce the cost a little by providing your own finite difference gradient based on a non-centered finite difference instead of the default centered one. It reduces the number of evaluations from 2d to d+1, and even to d, as the central point is also needed in another part of the algorithm and will be cached. Have a look here:
https://openturns.github.io/openturns/latest/user_manual/_generated/openturns.NonCenteredFiniteDifferenceGradient.html
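The evaluation counts mentioned above can be checked with a small sketch in plain Python. It does not use the OT classes; `toy_model`, the step `h` and the helper names are assumptions made for the illustration. The centered scheme calls the function 2d times, while the non-centered (forward) scheme calls it d+1 times because all components share the evaluation at the central point.

```python
import math

def make_counted(f):
    """Wrap f so that every call is counted in wrapper.calls."""
    def wrapper(x):
        wrapper.calls += 1
        return f(x)
    wrapper.calls = 0
    return wrapper

def centered_gradient(f, x, h=1e-5):
    """Centered scheme: two evaluations per input, 2*d in total."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

def forward_gradient(f, x, h=1e-5):
    """Non-centered scheme: one evaluation at x plus one per input, d+1 in total."""
    f0 = f(x)
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grad.append((f(xp) - f0) / h)
    return grad

def toy_model(x):
    # Hypothetical scalar model with d = 3 inputs.
    return x[0] ** 2 + math.sin(x[1]) + x[0] * x[2]

x0 = [1.0, 0.5, 2.0]
f_centered = make_counted(toy_model)
g_centered = centered_gradient(f_centered, x0)
f_forward = make_counted(toy_model)
g_forward = forward_gradient(f_forward, x0)
print(f_centered.calls, f_forward.calls)  # 6 and 4 for d = 3
```

The price of the cheaper scheme is accuracy: the forward difference has a first-order error in `h`, against second order for the centered one.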

Cheers

Régis

Thank you Régis,
It makes a bit more sense in the light of your explanation. Still, I don't understand why OT needs to compute the full gradient and not just a partial one. The optimisation is carried out on two parameters only, after all (in the example). To my understanding, OT does not need the gradient with respect to the whole input, only the partial gradient with respect to the parameters to be calibrated.

Thank you !
Sanaa


In fact it does exactly that: if you look at the LinearLeastSquaresCalibration.cxx file, line 72:

const Matrix parameterGradient(parametrizedModel.parameterGradient(inputObservations[i]));

But if your model has been built using the ParametricFunction class, this method is based on the gradient() method of the underlying full function, where the parameters and the other inputs are merged, and there is no way to tell a function to compute only a part of its gradient. I could add a specific case for finite differences, but at some point it becomes crappy software design. I will propose an evolution in this spirit and discuss this point with the development team.
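The behaviour described above can be sketched in plain Python, without OT. The model, the merged input layout `[t, a, c]` and the helper `finite_difference` are hypothetical and only serve the illustration. Differentiating the full function and then extracting the parameter part perturbs the frozen input `t`, whereas differentiating with respect to the parameter indices alone never does.

```python
def finite_difference(f, x, indices, h=1e-5):
    """Forward differences of f at x, restricted to the given input indices."""
    f0 = f(x)
    grad = {}
    for i in indices:
        xp = list(x)
        xp[i] += h
        grad[i] = (f(xp) - f0) / h
    return grad

def run(indices):
    """Return the (a, c) gradient and every value of t the model was called with."""
    seen_t = []

    def toy_model(x):
        # Hypothetical model: merged input [t, a, c], t is the frozen input.
        t, a, c = x
        seen_t.append(t)
        return a * t + c

    grad = finite_difference(toy_model, [1790.0, 0.03, -22.0], indices)
    # Keep only the components for the calibrated parameters a and c.
    return [grad[1], grad[2]], seen_t

# Full gradient, then extraction of the parameter part: t gets perturbed once.
grad_full, t_full = run([0, 1, 2])
# Gradient restricted to the parameters: t is never perturbed.
grad_partial, t_partial = run([1, 2])

print(len(t_full), len(t_partial))
print(sorted(set(t_full)), sorted(set(t_partial)))
```

Both routes give the same parameter gradient here; the difference is only in the extra evaluation at a perturbed value of `t`.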

Cheers

Régis

Thank you Régis,
Here is what I ended up doing to work around this problem:

from functools import partial
import numpy as np
import openturns as ot

# logisticModel, timeObservationsVector, populationObservationsVector,
# thetaPrior and nbdates come from the OT logistic calibration example.

def logisticFun(arg_cal, arg_froz):
    # Rebuild the full input of logisticModel: frozen years first, then (a, c).
    X = list(arg_froz) + list(arg_cal)
    return logisticModel(X)

def create_frozen_input_func(func, arg_froz):
    # Bind the frozen arguments so that only the calibrated ones remain.
    return partial(func, arg_froz=arg_froz)

# Frozen arguments:
arg_froz = np.array(timeObservationsVector).flatten().tolist()
# Wrapped function: its only input is the two calibrated parameters
logisticFrozen = create_frozen_input_func(logisticFun, arg_froz)
logisticFrozen(thetaPrior)  # quick check that the wrapper runs
logisticFrozenOT = ot.PythonFunction(2, nbdates, logisticFrozen)

# The new parametric function, with a 0-dimension input
logisticParametric = ot.ParametricFunction(logisticFrozenOT, [0, 1], thetaPrior)

populationPredicted = logisticParametric([])

algo = ot.LinearLeastSquaresCalibration(logisticParametric, [[]],
               populationObservationsVector, thetaPrior)
algo.run()

Sanaa

I just implemented an OT version of this trick, taking into account the fact that the gradients are computed using finite differences (a very common case).
Your implementation works because your full model does not provide its gradient. If a gradient were provided, for example an analytical one (here: logisticFun implemented as e.g. a SymbolicFunction), then your construction would lead to a parameter gradient computed by finite differences over the frozen input function, without using any of the components of the analytical gradient.
In OT we made the decision to favor accuracy over speed as much as possible, and we assumed that if users provide the gradient, it is because they took the time to fine-tune it. Hence the call to the full gradient and the extraction of its relevant subset.
Stay tuned to the following OT pull request:
https://github.com/openturns/openturns/pull/1846


Hi!
I created an issue on this topic to make the problem as clear as possible:

Regards,
Michaël