Calculate Sobol’ indices without a model

Hi there,

Over the past three weeks I ran a very large Matlab model in a Monte Carlo simulation.
I use Python to analyze the outputs. One part of the analysis is the calculation of the first-order and total-order Sobol’ indices. The model runs in Matlab, so I cannot rerun it in Python.
My data is a pandas DataFrame called data, which I split into input and output values:

X = data.iloc[:, :-1]
y = data.iloc[:, -1]

I cannot work out how to calculate the Sobol’ indices from this data alone. Could you tell me how this is done?

Regards,
Chris

Hi, and welcome to the forum!
So you have a (x, y) dataset and you want to compute Sobol’ indices. I see several ways to compute Sobol’ indices in this case.

  • The most basic method is to plot the data. You can find an example of the plotXvsY function here.
  • The simplest way is to compute the squared SRC indices. If the model is approximately linear and the inputs are independent, these indices approximate the first-order Sobol’ indices (and, since a linear model has no interactions, the total indices as well). You can check that the linear model has a good fit using the MetaModelValidation class. If the fit is good (say, a Q2 larger than 0.9), estimating the squared SRC indices can then be done with the CorrelationAnalysis.computeSquaredSRC() method. See the example here.
  • You can create a meta-model, and use an estimator based on this.
    • One method is to create a FunctionalChaosAlgorithm. Then the Sobol’ indices can be obtained from FunctionalChaosSobolIndices. See the example here.
    • Another method is to create a Kriging metamodel using the KrigingAlgorithm class. See the example here. Then you can use any sampling-based Sobol’ indices estimator. The easiest way is to use the SobolSimulationAlgorithm class, which lets the algorithm find the sample size from a pre-defined tolerance criterion. See the example here.
  • You may want to use the new RankSobolSensitivityAlgorithm class. There is an example here. This gives only first-order Sobol’ indices (total Sobol’ indices cannot be estimated with this method), which is why I rank this method as the third option here.
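
To make the squared SRC and rank-based options more concrete, here is a minimal NumPy sketch of what these two estimators compute from a given (x, y) sample. It does not use the OpenTURNS classes named above: the function names `squared_src` and `rank_first_order` and the synthetic linear test data are assumptions made for illustration, and the inputs are assumed to be independent.

```python
import numpy as np


def squared_src(X, y):
    """Squared standardized regression coefficients and R^2 of the linear fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
    residuals = y - A @ beta
    r2 = 1.0 - residuals.var() / y.var()           # quality of the linear fit
    src2 = (beta[1:] * X.std(axis=0) / y.std()) ** 2
    return src2, r2


def rank_first_order(X, y):
    """Rank-based estimate of the first-order Sobol' indices from a given sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_sq = y.mean() ** 2
    variance = np.mean(y ** 2) - mean_sq
    indices = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        y_sorted = y[np.argsort(X[:, i])]          # outputs ordered by input i
        y_next = np.roll(y_sorted, -1)             # output at the next rank (wraps around)
        indices[i] = (np.mean(y_sorted * y_next) - mean_sq) / variance
    return indices


# Synthetic stand-in for the Matlab results (hypothetical data, not from the thread).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5000, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(0.0, 0.1, size=5000)

src2, r2 = squared_src(X, y)
first_order = rank_first_order(X, y)
print("R2 of linear fit:", round(r2, 3))
print("squared SRC:     ", np.round(src2, 3))
print("rank-based S_i:  ", np.round(first_order, 3))
```

With the DataFrame split from the question, the equivalent calls would be `squared_src(X.to_numpy(), y.to_numpy())` and `rank_first_order(X.to_numpy(), y.to_numpy())`. On a model that is far from linear, only the rank-based values remain meaningful as first-order indices.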

You asked for Sobol’ indices, and this is why my first answer is the set of three methods above. Computing the Sobol’ indices from the polynomial chaos expansion is very accurate, provided you can get an accurate meta-model, of course. There are, however, other sensitivity indices that may be worth trying in your setup. Please have a look at HSIC indices: they can be used in a given-data setting, such as yours.
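
As a rough illustration of why HSIC indices fit a given-data setting, the sketch below estimates a normalized HSIC index (often called R2-HSIC) for each input directly from the sample. It is a plain NumPy sketch, not the OpenTURNS implementation: the Gaussian kernel, the bandwidth choice (sample standard deviation), the helper names and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def gaussian_gram(v):
    """Gram matrix of a Gaussian kernel; bandwidth = sample standard deviation (assumed choice)."""
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    bandwidth = v.std()
    return np.exp(-0.5 * ((v - v.T) / bandwidth) ** 2)


def hsic(K, L):
    """Biased (V-statistic) HSIC estimate from two Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2


def r2_hsic(X, y):
    """Normalized HSIC index of each input column with respect to y."""
    X = np.asarray(X, dtype=float)
    L = gaussian_gram(y)
    hsic_yy = hsic(L, L)
    indices = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        K = gaussian_gram(X[:, i])
        # The 1/n^2 factors cancel in this ratio, which lies between 0 and 1.
        indices[i] = hsic(K, L) / np.sqrt(hsic(K, K) * hsic_yy)
    return indices


# Hypothetical data: x0 acts through a square, x1 linearly, x2 not at all.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(400, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=400)

hsic_indices = r2_hsic(X, y)
print("R2-HSIC:", np.round(hsic_indices, 3))
```

The first input has essentially zero linear correlation with the output here, yet its HSIC index should stand clearly above that of the inert third input. Note that the Gram matrices are n-by-n, so this direct formulation is only practical for samples of moderate size.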

Regards,
Michaël