This post is an answer to a question by @BinglinW on Gitter.
Question
Hello everyone. In the Kriging model, OpenTURNS lets you choose a variety of parameters, such as the Basis and the CovarianceModel. I want to know whether there is a way in OpenTURNS to automatically select the best parameters. Thank you for your attention.
When I use ConstantBasisFactory and MaternModel, I get the result below, which I think is unreasonable.
Answer
I have tried to emulate the data from your graph, and with the standard options I get basically the same result:
import openturns as ot
from openturns.viewer import View
X = [[0.07], [0.19], [0.3], [0.45], [0.55], [0.68], [0.81], [0.92]]
Y = [[0.5], [-0.8], [0.0], [0.6], [0.9], [-4.0], [-4.2], [11.0]]
# Prepare graph
graph = ot.Graph("", "X", "Y", True, '')
cloud = ot.Cloud(X, Y)
graph.add(cloud)
# Set covariance kernel and basis
mat = ot.MaternModel([1.0], 2.5)
basis = ot.ConstantBasisFactory(1).build()
# Run the OT Kriging
kri = ot.KrigingAlgorithm(X, Y, mat, basis)
kri.run()
res = kri.getResult()
print("MLE for the scale : {}".format(res.getCovarianceModel().getScale()))
meta = res.getMetaModel()
graph.add(meta.draw(0.0, 1.0))
graph.setColors(ot.Drawable.BuildDefaultPalette(graph.getDrawables().getSize()))
graph.setLegends(['Guessed data', 'MLE metamodel'])
graph.setLegendPosition("topleft")
View(graph)
The output is:
MLE for the scale : [0.01]
We can check that the optimum really sits at the lower bound of the optimization range by plotting the reduced log-likelihood function of the scale:
View(kri.getReducedLogLikelihoodFunction().draw(0.01, 2.0))
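To see what the optimizer is up against, the concentrated (profile) log-likelihood can also be recomputed by hand. The sketch below uses plain NumPy and the textbook formulas (constant trend and variance profiled out), not the OpenTURNS internals, so the values differ from `getReducedLogLikelihoodFunction` by additive constants, but it tells the same story:

```python
import numpy as np

X = np.array([0.07, 0.19, 0.3, 0.45, 0.55, 0.68, 0.81, 0.92])
Y = np.array([0.5, -0.8, 0.0, 0.6, 0.9, -4.0, -4.2, 11.0])

def matern52(h, theta):
    # Matern 5/2 correlation as a function of the distance h for scale theta
    u = np.sqrt(5.0) * h / theta
    return (1.0 + u + u**2 / 3.0) * np.exp(-u)

def profile_loglik(theta, jitter=1e-10):
    # Log-likelihood with the constant trend and the variance profiled out
    n = len(X)
    R = matern52(np.abs(X[:, None] - X[None, :]), theta) + jitter * np.eye(n)
    L = np.linalg.cholesky(R)
    ones = np.ones(n)
    # GLS estimate of the constant trend
    beta = (ones @ np.linalg.solve(R, Y)) / (ones @ np.linalg.solve(R, ones))
    resid = Y - beta
    sigma2 = resid @ np.linalg.solve(R, resid) / n  # profiled variance
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * n * np.log(sigma2) - 0.5 * logdet

for theta in [0.01, 0.1, 0.5, 1.0]:
    print(f"theta = {theta:4.2f}   profile log-likelihood = {profile_loglik(theta):9.3f}")
```

On this very rough data, the profile log-likelihood is highest at the smallest scale in the range, consistent with the optimizer stopping at 0.01.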
How can we do better? A while ago, we built an OpenTURNS module based on Gu's methods. One of them penalizes the log-likelihood with the reference prior (see the references below for further information), so that the optimization yields the MAP (maximum a posteriori) estimate instead of the MLE.
Unfortunately, this module is no longer up to date; I had to tweak it and recompile to get the result below:
# Run the OTGU Kriging
import otgu
krigu = otgu.KrigingAlgorithm(X, Y, mat, basis)
# Penalize the likelihood with the reference prior:
# The optimal scale parameter is no longer the MLE, but the MAP.
krigu.setScalePrior(otgu.GeneralLinearModelAlgorithm.REFERENCE)
krigu.run()
resgu = krigu.getResult()
print("MAP for the scale : {}".format(resgu.getCovarianceModel().getScale()))
metagu = resgu.getMetaModel()
graph.add(metagu.draw(0.0, 1.0))
graph.setColors(ot.Drawable.BuildDefaultPalette(graph.getDrawables().getSize()))
graph.setLegends(['Guessed data', 'MLE metamodel', 'MAP metamodel'])
graph.setLegendPosition("topleft")
View(graph)
Output:
MAP for the scale : [0.999779]
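If you cannot recompile otgu, the mechanics of MAP estimation can be sketched in plain NumPy: add a log-prior to a hand-rolled profile log-likelihood and maximize the sum. Be warned that the log-normal prior below is an arbitrary stand-in chosen purely for illustration; it is NOT the reference prior of Berger et al. that otgu implements, which has a more involved, data-dependent expression.

```python
import numpy as np

X = np.array([0.07, 0.19, 0.3, 0.45, 0.55, 0.68, 0.81, 0.92])
Y = np.array([0.5, -0.8, 0.0, 0.6, 0.9, -4.0, -4.2, 11.0])

def matern52(h, theta):
    # Matern 5/2 correlation as a function of the distance h for scale theta
    u = np.sqrt(5.0) * h / theta
    return (1.0 + u + u**2 / 3.0) * np.exp(-u)

def profile_loglik(theta, jitter=1e-10):
    # Log-likelihood with the constant trend and the variance profiled out
    n = len(X)
    R = matern52(np.abs(X[:, None] - X[None, :]), theta) + jitter * np.eye(n)
    L = np.linalg.cholesky(R)
    ones = np.ones(n)
    beta = (ones @ np.linalg.solve(R, Y)) / (ones @ np.linalg.solve(R, ones))
    resid = Y - beta
    sigma2 = resid @ np.linalg.solve(R, resid) / n
    return -0.5 * n * np.log(sigma2) - np.sum(np.log(np.diag(L)))

def penalized_loglik(theta, sdlog=0.5):
    # MAP objective: profile log-likelihood plus a log-prior on the scale.
    # CAUTION: a log-normal prior is used here as a stand-in for illustration
    # only; it strongly penalizes tiny scales, unlike the flat behavior of
    # the unpenalized likelihood, but it is not the otgu reference prior.
    log_prior = -np.log(theta) - 0.5 * (np.log(theta) / sdlog) ** 2
    return profile_loglik(theta) + log_prior

for theta in [0.01, 0.1, 0.5, 1.0]:
    print(f"theta = {theta:4.2f}   penalized log-likelihood = {penalized_loglik(theta):9.3f}")
```

The penalty makes degenerate small-scale solutions expensive, which is how a prior can pull the optimum away from the boundary.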
Bottom line
With this set of data, optimizing the likelihood yields poor results: you are better off keeping the initial, unoptimized covariance model. You can tell OpenTURNS not to optimize its parameters with:
# Set covariance kernel and basis
mat = ot.MaternModel([1.0], 2.5)
basis = ot.ConstantBasisFactory(1).build()
# Run the OT Kriging
kri = ot.KrigingAlgorithm(X, Y, mat, basis)
kri.setOptimizeParameters(False) # do not optimize covariance model parameters
kri.run()
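What does the unoptimized model look like? The universal Kriging mean predictor with a constant trend, the Matern 5/2 kernel, and the scale held at its initial value of 1 can be written out in plain NumPy. This is the textbook formula, not the OpenTURNS implementation:

```python
import numpy as np

X = np.array([0.07, 0.19, 0.3, 0.45, 0.55, 0.68, 0.81, 0.92])
Y = np.array([0.5, -0.8, 0.0, 0.6, 0.9, -4.0, -4.2, 11.0])
theta = 1.0  # scale kept at its initial value, no optimization

def matern52(h, theta):
    # Matern 5/2 correlation as a function of the distance h for scale theta
    u = np.sqrt(5.0) * h / theta
    return (1.0 + u + u**2 / 3.0) * np.exp(-u)

R = matern52(np.abs(X[:, None] - X[None, :]), theta)
ones = np.ones(len(X))
# GLS estimate of the constant trend
beta = (ones @ np.linalg.solve(R, Y)) / (ones @ np.linalg.solve(R, ones))
alpha = np.linalg.solve(R, Y - beta)  # kriging weights

def predict(x):
    # Universal kriging mean: trend plus correlation-weighted residuals
    r = matern52(np.abs(np.atleast_1d(x)[:, None] - X[None, :]), theta)
    return beta + r @ alpha

print(predict(np.array([0.5])))                      # prediction between data points
print(np.allclose(predict(X), Y, atol=1e-6))         # no nugget: interpolates, True
```

Without a nugget term, this predictor interpolates the data exactly, whatever the scale; the scale only changes how the predictor behaves between the points.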
References
Berger, J. O., De Oliveira, V., & Sansó, B. (2001). Objective Bayesian analysis of spatially correlated data. Journal of the American Statistical Association, 96(456), 1361-1374.
Muré, J. (2021). Propriety of the reference posterior distribution in Gaussian process modeling. The Annals of Statistics, 49(4), 2356-2377.