I’ve had some trouble using and understanding the method getParameterPosterior when called on a calibration object. I don’t understand when it is supposed to return a Dirac distribution and when it is supposed to return a non-degenerate distribution.
For instance, I used it on a GaussianNonLinearCalibration object and it returned a Dirac distribution (whereas I was expecting it to return a Gaussian distribution).
You are using an old version of OT (probably 1.15). There was a bug in the calibration algorithms: they returned a Dirac distribution (i.e. a Normal with zero variance) instead of a Normal with very large variance (i.e. an improper uniform measure over R). It has been fixed in OT 1.16, so please update your version of OT.
Can you share a script or a notebook? There is no reference to Dirac in the whole calibration sub-directory of the C++ sources, so I don’t see how it can pop up in your scripts!
Ok I see what is going on. You are using GaussianNonLinearCalibration with a posterior distribution computed by bootstrap, which triggers a call to KernelSmoothing to build the posterior distribution based on the bootstrap sample. As all the values are equal in this sample, KernelSmoothing returns a Dirac distribution. So you have to check why the bootstrap sample is constant if you are not happy with a Dirac distribution. Another thing to test is to use the Laplace approximation, which is done by deactivating the bootstrap estimation:
Thank you Régis, I’ve stopped getting Dirac distributions by using your idea, and I now have a Gaussian posterior whose variance seems to be consistent with the physics of the problem.
I still need to dig into what is implemented in OT to get a better understanding. In particular, I don’t get why I kept getting Diracs even when specifying my calibration problem with very large prior and/or observation variances.
I answered your post before seeing your script. Now I better understand your problem: you have a single point in your sample, of dimension 24! So each time you generate a bootstrap sample you get the exact same point (it is unique), and the resulting parameter estimate is constant, hence the Dirac distribution. Either you find a way to express your problem in terms of a sample of larger size, or you use the Laplace approximation. I think I should add either a warning or an exception in the code when the user asks for a bootstrap estimation of the distribution and the underlying observation sample is of size 1.
I think that the formulation of the problem is wrong: you should consider a model which takes (t, a, b) and returns y, and you should calibrate this model based on a sample of size 22. This way you have access to the bootstrap estimate of the distribution. The use of functools makes the code a little bit obfuscated to me, so I cannot make the change beyond the new definition of the model:
def logisticModel(X):
    # print([X[i] for i in range(len(X))])  # debug
    t = X[0]
    a = X[1]
    c = X[2]
    t0 = 1790.
    y0 = 3.9e6
    b = np.exp(c)
    y = a * y0 / (b * y0 + (a - b * y0) * np.exp(-a * (t - t0)))
    return [y / 1.e6]  # convert into millions
Try it if you can and tell me if it works better. IMO the example in the documentation should be changed too.
Well, I had to go this way because otherwise I was confronted with the other problem of having OT calculate the gradient by finite differences over all the input variables. With my implementation I ensured that finite differences were calculated only over the parameters to be calibrated, with no useless calls to the code (my code is a bit costly, it takes about a minute to finish).
But yes, if this skews the ensuing calculations, I should probably think of something else. Your suggestion sounds good because it would mean that a finite-difference calculation is done over only one useless dimension (t), while preserving the consistency of the ensuing calculations.