Understanding the getParameterPosterior method in a calibration setting - the Dirac case

Hello

I’ve had some trouble using and understanding the getParameterPosterior method when called on a calibration object. I don’t understand when it is supposed to return a Dirac distribution and when it is supposed to return a non-degenerate distribution.

For instance, I used it on a GaussianNonLinearCalibration object and it returned a Dirac distribution (whereas I was expecting it to return a Gaussian distribution).

Regards

Hi,

You are using an old version of OT (probably 1.15). There was a bug in the calibration algorithms: they returned a Dirac distribution (i.e. a Normal with zero variance) instead of a Normal with very large variance (i.e. an improper uniform measure over R). It has been fixed in OT 1.16, please update your version of OT.
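
To check the installed version, a one-line sketch:

import openturns as ot
print(ot.__version__)  # the fix is included from 1.16 onwards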

Cheers

Régis

Thank you Régis,

I’ve just updated my OT to 1.17 but unfortunately the problem still persists.

sanaa

Can you share a script or a notebook? There is no reference to Dirac in the whole calibration sub-directory of the C++ sources, so I don’t see how it can pop up in your scripts!

Cheers

Régis

I am going to try to build a runnable example.
Here is my code, which calls an external code wrapped in ParametricPrior:


import openturns as ot
import numpy as np

# populationObservationsVector, ParametricPrior and fcPrior are defined
# earlier in my script (ParametricPrior wraps the external code).
sigma = ot.CovarianceMatrix(1, [0.01])  # prior covariance of the parameter
sigma_obs = 0.01
errObs = ot.CovarianceMatrix(np.identity(populationObservationsVector.getDimension()) * sigma_obs)  # observation error covariance
algo3 = ot.GaussianNonLinearCalibration(ParametricPrior,
                                        [[]],
                                        populationObservationsVector,
                                        fcPrior,
                                        sigma,
                                        errObs)
algo3.run()
calibrationResult3 = algo3.getResult()

fcPosterior3 = calibrationResult3.getParameterPosterior()
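
A quick way to see what actually comes back (reusing the names above):

print(fcPosterior3.getImplementation().getClassName())  # prints "Dirac" here instead of "Normal"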

Hello,
here is a runnable example:
logistic_calage.py (4.9 KB)

Thank you so much, Régis!

Ok, I see what is going on. You are using GaussianNonLinearCalibration with a posterior distribution computed by bootstrap, which triggers a call to KernelSmoothing to build the posterior distribution from the bootstrap sample. As all the values in this sample are equal, KernelSmoothing returns a Dirac distribution. So you have to check why the bootstrap sample is constant if you are not happy with a Dirac distribution. Another thing to test is the Laplace approximation, which is obtained by deactivating the bootstrap estimation:

ot.ResourceMap.SetAsUnsignedInteger("GaussianNonLinearCalibration-BootstrapSize", 0)

By default the size is 100 and the parameter posterior is obtained by bootstrap+kernel smoothing.
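
To see the mechanism in isolation, here is a small sketch (independent of your script) of what KernelSmoothing does with a constant sample:

import openturns as ot

constant_sample = ot.Sample(100, [1.0])  # 100 identical points
fitted = ot.KernelSmoothing().build(constant_sample)
print(fitted.getImplementation().getClassName())  # degenerates to "Dirac"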

Cheers

Régis

Thank you Régis, I’ve stopped having Dirac distributions by using your idea, and I now have a Gaussian posterior whose variance seems to be consistent with the physics of the problem.

I still need to dig into what is implemented in OT to get a better understanding. In particular, I don’t get why I kept having Diracs even when specifying my calibration problem with very large prior and/or observation variances.

thanks again for your help!!
sanaa

I answered your post before seeing your script. Now I better understand your problem: you have a unique point in your sample, of dimension 24! So each time you generate a bootstrap sample you get the exact same point (it is unique) and the resulting parameter estimate is constant, hence the Dirac distribution. Either you find a way to express your problem in terms of a sample of larger size, or you use the Laplace approximation. I think I should add either a warning or an exception in the code when the user asks for a bootstrap estimation of the distribution and the underlying observation sample is of size 1.
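
A tiny sketch (not from your script) of why a size-1 sample makes the bootstrap degenerate:

import openturns as ot

sample = ot.Sample([[1.0, 2.0]])  # a single observation
resample = ot.BootstrapExperiment(sample).generate()
print(resample)  # always identical to the original sample, whatever the draw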

Cheers

Régis

I think that the formulation of the problem is wrong: you should consider a model which takes (t, a, b) and returns y, and you should calibrate this model based on a sample of size 22. This way you have access to the bootstrap estimate of the distribution. The use of functools makes the code a little bit obfuscated to me, so I cannot make the change beyond the new definition of the model:

def logisticModel(X):
    # X = [t, a, c]: one date and the two parameters to calibrate
    t = X[0]
    a = X[1]
    c = X[2]
    t0 = 1790.
    y0 = 3.9e6
    b = np.exp(c)
    y = a*y0/(b*y0+(a-b*y0)*np.exp(-a*(t-t0)))
    return [y/1.e6]  # convert into millions

Try it if you can and tell me if it works better. IMO the example in the documentation should be changed too.
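
For what it is worth, here is a sketch of how this single-date model could be wrapped so that only (a, c) are calibrated; the candidate values below are placeholders, not taken from your script:

import openturns as ot

logisticFunction = ot.PythonFunction(3, 1, logisticModel)
thetaPrior = ot.Point([0.03, -23.0])  # placeholder candidate values for (a, c)
# Freeze inputs 1 and 2 (a and c) as parameters; the remaining input is t.
model = ot.ParametricFunction(logisticFunction, [1, 2], thetaPrior)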

Cheers

Régis

Well, I had to go this way because otherwise I was confronted with the other problem of having OT compute the gradient by finite differences over all the input variables. With my implementation I ensured that finite differences were computed only over the parameters to be calibrated, with no useless calls to the code (my code is a bit costly, it takes about a minute to finish).

But yes, if this skews the ensuing calculations I should probably think of something else. Your suggestion sounds good because it would mean that a finite-difference calculation is done over only one useless dimension (t) while preserving the consistency of the ensuing calculations.
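
A possible mitigation for the cost (just a sketch, assuming the wrapped model is called logisticFunction): ot.MemoizeFunction caches evaluations, so repeated calls at the same input point do not reach the external code again:

import openturns as ot

cachedModel = ot.MemoizeFunction(logisticFunction)  # identical inputs hit the cache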

thanks for your help!
sanaa

Hi Sanaa, are you on Linux? If so, you can already use the nightly pip packages.