Bug with noisy kriging?

Hi,

It seems that there is a bug when calibrating a kriging model with noise. It looks like the training set outputs need to be normalized for the noise to be properly accounted for. For example, if I build a kriging model on np.sin(x)*1000 for x in [0, np.pi] and add a small noise with

vec_noise = np.ones(len(X_train))*0.1  
algokriging.setNoise(vec_noise)

I get a large uncertainty on the predictions (even on the training set). I can send you some small scripts that show the problem more clearly.

Is this a bug or am I not calibrating the noisy kriging properly?

Thanks !


Hello,

Noisy kriging is unfortunately not supported!

With the above usage, you are setting a nugget factor (or noise), but the latter is not calibrated.

The model you are writing is \sigma^2 \rho_{\theta}(x, y) + \delta^2, where \rho_{\theta}(x, y) might be the AbsoluteExponential or SquaredExponential kernel for example, and only the scale \theta and the amplitude \sigma parameters are calibrated.
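For concreteness, here is a minimal sketch (not your script, just an illustration of the usage above with made-up data) showing that the noise passed through setNoise is taken as given, while only \theta and \sigma are optimized:

import numpy as np
import openturns as ot

x = np.linspace(0.0, np.pi, 6).reshape(-1, 1)
y = np.sin(x)
cov = ot.SquaredExponential([1.0])           # rho_theta with scale theta and amplitude sigma
basis = ot.ConstantBasisFactory(1).build()
algo = ot.KrigingAlgorithm(x, y, cov, basis)
algo.setNoise([0.1] * len(x))                # fixed noise values, one per training point, not calibrated
algo.run()
print(algo.getResult().getCovarianceModel()) # only theta and sigma have been optimized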

Hopefully this feature will be implemented soon.

Cheers

Hi, thanks for your answer.

I’m not sure I understand. Actually, I don’t need to calibrate the noise: I know its value. What is strange is that, whether I add a noise of variance 0.1 to sin(x) or to sin(x)*1000, I get a significant uncertainty in the prediction in both cases. Here are the graphs of these two krigings with the same noise.

[Figure: kriging of sin(x) with noise 0.1]

In the second case, I should get a small uncertainty since a noise of 0.1 for sin(x)*1000 is negligible.

Here is the second graph:
[Figure: kriging of sin(x)*1000 with the same noise 0.1]
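As a rough order of magnitude (assuming the 0.1 passed to setNoise is read as a noise variance, which is how I set it):

import numpy as np

noise_std = np.sqrt(0.1)
print(2 * noise_std)           # ~0.63: half-width of the +/- 2 std band due to the noise alone
print(2 * noise_std / 1000.0)  # ~6.3e-4: the same half-width relative to an output amplitude of 1000

So the noise band should be invisible at the scale of the second graph, yet the uncertainty shown there is huge.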

Hi,

Sorry for the late reply.
To make sure I understand your point: you are considering a covariance model of the form \sigma^2 \exp(-\frac{|\tau|^2}{2\theta^2}) + \delta, where \theta and \sigma are calibrated?
And is the noise \delta above defined as \delta(x) = factor * \sin(x), with factor \in \{0.1, 1000\}?

Could you share your script please?

Hi,

Thanks for your reply and sorry if I was not clear. It seems that I can’t upload attachments as a new user, so I put below the script that produces the first graph above. To obtain the second one, you just need to change the value of factor to 1000 in the definition of the fun_ref function. Let me know if I can provide the script in a better way.

import numpy as np
import openturns as ot
import matplotlib.pyplot as plt


# %% definitions of functions for kriging calibration

def get_param(X):  
    # return objects needed for normalization and standardization
    vec_min_col = np.amin(X,axis=0) 
    vec_max_col = np.amax(X,axis=0)
    vec_mean_col = np.mean(X, axis =0)
    vec_std_col  = np.std(X, axis =0)
    
    return vec_min_col, vec_max_col, vec_mean_col, vec_std_col

def normalize(X, params, bool_standard = True):
    # return normalization or standardization of X using params
    vec_min_col = params[0] 
    vec_max_col = params[1] 
    vec_mean_col = params[2] 
    vec_std_col  = params[3] 
    
    if not(bool_standard):
        return (X - vec_min_col)/(vec_max_col-vec_min_col) 
    else:
        return (X-vec_mean_col) / vec_std_col
        
def kriging_calibration(DoE, Response, cov_str= 'Gaussian', Hyperparam_old=[  ], return_likelihood = False, bool_noise = False, noise_vec = [  ] ):
    #Inputs :   DoE : DoE used to build the kriging (vector of input points of the model)       
    #           Response : vector of outputs of the model at the points of the DoE
    #Outputs :  the calibrated Kriging model                                 
    dim_space= DoE.shape[1]
    Response = np.vstack(Response)

    #ot.RandomGenerator.SetSeed(1235)
    
    # --- 1. basis ---
    
    #basis = [] # no trend
    basis = ot.ConstantBasisFactory(dim_space).build() # constant trend
    #basis = ot.LinearBasisFactory(dim_space).build() # linear trend
    #basis = ot.QuadraticBasisFactory(dim_space).build() # quadratic trend
    
    # --- 2. covariance model ---
    
    if cov_str == 'Matern1.5':
        cov = ot.MaternModel(np.ones(dim_space), 1.5)
    elif cov_str == 'Matern2.5':
        cov = ot.MaternModel(np.ones(dim_space), 2.5)
    elif cov_str == 'Gaussian':
        cov = ot.SquaredExponential(np.ones(dim_space))

    #cov.setNuggetFactor(1e-6)   # to add a nugget effect

    #cov.setScale(np.array([1.57188,0.992836,1.27414,2.05575,5,1.32729]))  # to set the hyperparameters
    
    # --- 3. kriging algorithm ---
    
    algokriging = ot.KrigingAlgorithm(DoE, Response, cov, basis)

    if bool_noise:
        algokriging.setNoise(noise_vec)
        
    # --- 4. Optimization ---

    # for kriging calibration
    nb_multistart_hyperparam = 100
    lower_bound_theta = 1e-5
    upper_bound_theta = 10.

    if type(Hyperparam_old)== list :
        distribution = ot.ComposedDistribution([ot.Uniform(lower_bound_theta, upper_bound_theta)] * dim_space)
    else :
        distrib = []
        Lower_bounds = []
        Upper_bounds = []
        for theta in Hyperparam_old :
            Lower_bounds.append(max(lower_bound_theta,theta - 1.))
            Upper_bounds.append(min(theta +1.,upper_bound_theta))
            distrib.append(ot.Uniform(max(lower_bound_theta,theta - 1.) ,  min(theta +1.,upper_bound_theta)))
        distribution = ot.ComposedDistribution(distrib)
    
    startingPoint = ot.LHSExperiment(distribution, nb_multistart_hyperparam).generate()
    #startingPoint =  loguniform.rvs(1e-3,10.,size = (10, dim_space))
    algokriging.setOptimizationAlgorithm(ot.MultiStart(ot.TNC(), startingPoint))
    #algokriging.setOptimizationAlgorithm(ot.MultiStart(ot.NLopt('LD_LBFGS'), startingPoint))
    algokriging.setOptimizationBounds(ot.Interval(np.ones( dim_space)*lower_bound_theta, np.ones(dim_space)* upper_bound_theta ))
    #algokriging.setOptimizationBounds(ot.Interval(np.array(Lower_bounds), np.array(Upper_bounds)))
    algokriging.setOptimizeParameters(True) 
        
    # # if we choose not to optimize parameters
    #algokriging.setOptimizeParameters(False)
    
    # --- 5. run the algorithm ---
    
    algokriging.run()
    
    # # get the metamodel
    krigingResult = algokriging.getResult()
    #krigingMeta = krigingResult.getMetaModel()
   
    if return_likelihood:
        F = algokriging.getReducedLogLikelihoodFunction()
        cov_model = krigingResult.getCovarianceModel()
        print(cov_model)
        print(cov_model.getParameter())
        
        
        likelihood = F(cov_model.getParameter()[:dim_space])
        return krigingResult, cov_model, likelihood
    else :
        return krigingResult 

# %% toy case

def fun_ref(x):
    factor = 1               # !!! or factor = 1000 !!!
    return factor*np.sin(x)     


# --- DoE ---

nb_point = 6

x_max = np.pi
np.random.seed(1) ; DoE = np.random.random_sample(nb_point)*x_max
params_DoE = get_param(np.array(DoE))
DoE_normalized = np.vstack(normalize(np.array(DoE), params_DoE))

Y_true = fun_ref(DoE)

# -- noise ---
vec_noise = 0.1 * np.ones(nb_point)

# --- kriging calibration ---

krigingResult = kriging_calibration(DoE_normalized, Y_true, cov_str='Matern2.5', bool_noise = True, noise_vec = vec_noise)

# --- kriging prediction at x_plot ---

x_plot = np.linspace(0, x_max, 1000)
x_plot_normalized = np.vstack(normalize(np.array(x_plot), params_DoE))

means = np.squeeze(krigingResult.getConditionalMean(np.vstack(x_plot_normalized)))
    
variances =  np.array(krigingResult.getConditionalMarginalVariance(np.vstack(x_plot_normalized))) 
stds = np.squeeze(np.sqrt(variances.clip(min=0)))

# --- plot ---
    
lb_predict = np.squeeze(means) - 2 * stds
ub_predict = np.squeeze(means) + 2 * stds
plt.plot(x_plot, np.squeeze(means), label = 'km', linestyle=  '--', c = 'black')
plt.fill_between(x_plot, lb_predict , ub_predict , alpha = 0.1, color = 'black')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.scatter(DoE, Y_true)
plt.plot(x_plot, fun_ref(x_plot), label = 'sin(x)')
plt.legend()
#plt.savefig('sin')
plt.show()

Hello

Sorry for the late reply.
I now see what you mean w.r.t. the scale (0.01 vs 1 / 0.01 vs 1000). My initial reading of the noise you set was wrong.

I might be wrong, but I don’t think there should be a direct linear link between the noise \delta^2 (here 0.01) and the y scale (1000 or 1 in your use cases). The thing is that the covariance amplitude \sigma^2 becomes very large when the y scale is 1000.
If we compute the total variance \sigma^2 + \delta^2, then the ratio \frac{\delta^2}{\sigma^2 + \delta^2} becomes smaller when the y scale is set to 1000. Thus the part of the uncertainty that comes from \delta does not scale linearly with the output sample scale.
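To illustrate (the \sigma values below are only assumptions for the sake of the example, roughly following the output scale, not the actual calibrated amplitudes):

import numpy as np

delta2 = 0.01  # the fixed noise term
for sigma in [1.0, 1000.0]:  # assumed amplitudes, one per output scale
    ratio = delta2 / (sigma**2 + delta2)
    print("sigma = %g: delta^2 / (sigma^2 + delta^2) = %.2e" % (sigma, ratio))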

@josephmure @JPelamatti any thought on that ?

Hi, thanks for your reply.

What is strange is that \delta plays a significant role in the plotted result, even in the case where factor = 1000. In that case, the impact of \delta should be minor since the output is very large compared to the noise.

You can see that there seems to be a problem: with factor = 1000, changing delta = 0.1 to delta = 0 should barely change the result, but you can see that this is not the case.

Moreover, the case factor = 1000 and delta = 0.1 is “as if” there were almost no noise at all. Thus the standard deviation of the kriging at the training points should be close to 0, which is not the case.

using “np.array(krigingResult.getConditionalMarginalVariance(np.vstack(DoE_normalized)))”

In fact, the std of the kriging at the training points is exactly 1000 times larger when factor = 1000 compared to the case factor = 1.
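Here is the check I ran (just a sketch, assuming krigingResult_1 and krigingResult_1000 are the results returned by kriging_calibration above with factor = 1 and factor = 1000 respectively):

import numpy as np

var_1 = np.array(krigingResult_1.getConditionalMarginalVariance(np.vstack(DoE_normalized)))
var_1000 = np.array(krigingResult_1000.getConditionalMarginalVariance(np.vstack(DoE_normalized)))
print(np.sqrt(var_1000.clip(min=0)) / np.sqrt(var_1.clip(min=0)))  # the ratios are all ~1000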

Alexis

Hi @Alexis

Sorry for the (very) late reply.
I took the time for a complete code review. My conclusion is that you are right: there is a bug!
Thanks for catching it.

The optimisation process is done correctly, i.e. the additional noise is taken into account there, but the output results (both the trend and the conditional variance) are incorrect.

I will open a ticket and propose a patch to fix it.
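In the meantime, a possible workaround, along the lines of your initial observation that normalizing the outputs behaves better (just a sketch reusing the objects from your script; the noise vector has to be rescaled consistently, and I am assuming here that the values passed to setNoise are variances, otherwise divide by the standard deviation instead):

import numpy as np

# Standardize the outputs, fit on the standardized sample, then map the
# predictions back to the original scale.
y_mean, y_std = np.mean(Y_true), np.std(Y_true)
Y_standardized = (Y_true - y_mean) / y_std
krigingResult = kriging_calibration(DoE_normalized, Y_standardized, cov_str='Matern2.5',
                                    bool_noise=True, noise_vec=vec_noise / y_std**2)
means = y_mean + y_std * np.squeeze(krigingResult.getConditionalMean(np.vstack(x_plot_normalized)))
variances = y_std**2 * np.array(krigingResult.getConditionalMarginalVariance(np.vstack(x_plot_normalized)))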

Thanks again

Great, thanks :slight_smile:

How can I know when the bug will be fixed, please?

Alexis