Sobol Indices do not add up to 1

Hello everybody, I am new to openTURNS and fairly new to python in general.

I am trying to perform an UQ of some complex physical model, for that my aim is to compute the sobol indices using sobolIndicesExperiment and the SaltelliAlgorithm.

Now, my physical model has a ton of variables and is actually written in Fortran so I am basically calling it from python. I am selecting a handful of random variables (parameters of interest), and giving them a determined distribution, then save them into a dataframe so I can extract them and propagate them through my model which is located in another script:

ot.RandomGenerator.SetSeed(0)
marginals = [ot.LogNormal(np.log(0.03),np.log(4/3)), ot.Normal(-40,4), ot.Normal(1,0.1), ot.Normal(1,0.1)]
distribution = ot.ComposedDistribution(marginals, ot.IndependentCopula(len(marginals)))

size = 1000
sie = ot.SobolIndicesExperiment(distribution, size)
inputDesign = sie.generate()
print(inputDesign.getSize())
print(inputDesign)

AAA = inputDesign.asDataFrame()
AAA.to_csv('AAA.csv', index=False)

After I run all the simulations I get back my output:

results_AAA = pd.read_csv('results_AAA.csv')
outputSample = ot.Sample.BuildFromDataFrame(results_AAA.loc[:,['ai']])
print(outputSample)

I know I have to use SobolIndicesExperiment to have the right format so that SaltelliSensitivityAlgorithm can be applied.

sensitivityAnalysis = ot.SaltelliSensitivityAlgorithm(inputDesign, outputSample, 1000)

But then when i check the results and sum the first order indices, these are less than 1, impossible!

I do´nt really know this is actually working, as OT is completly unaware of my model and it´s strong non-linearities, so in the first place I am not even sure this is the right thing to do. I know that a good idea is to parametrized my physical mdoel using PCE but I wanted to try with the model without an polynomial approximation first.

So basically my questions are:

  • Which can beethe root of the problem for summing the 1st order indices and these not adding up to 1?
  • Can I use SobolIndicesExperiment and Saltelli (or Martinez) to compute sobol indices of a model that OT is completly unaware of, receiving only the input and output distributions, but the rest being a black box?
  • Is there any papers/documents in which somebody has used OT to perform a UQ of some software/model or any example in which sobol indices are computed for a function that is not already pre-built in OT?

Thank you for your answers!!

Hi,

There is no theoretical reason why the first order Sobol indices should sum to 1.
It is the case only for additive models f(x_1,...,x_d)=f_1(x_1)+...+f_d(x_d), in the other cases the sum is strictly less than 1. It is the sum of all the Sobol indices of all orders which is equal to 1.
But in many applications, the sum is very close to 1 because many models are nearly additive, so they are often presented as pie chart, enforcing the (wrong) idea that they sum to 1.
As a result:

  • There is no problem having first order Sobol indices not summing to 1
  • Yes, you can use OT with a black-box function
  • There is no specific implementation of Sobol indices computation on pre-built OT functions. These algorithms are routinely used at EDF R&D, Airbus, Renault and many other industries, on simple symbolic functions as well as large scale numerical simulations. These functions are here to illustrate the use of the algorithms, you just have to plug your own code in one of the examples and adapt it to your needs as a first start.

Cheers

Régis

1 Like

Thank you for your speedy answer @regislebrun, it is gladly recieved.

I think I got confused the first time I read the theory:

Now I understad better that is the sum of all first order and group first order interactions what need to sum up to 1. In a function with only 2 predictor variables that would be the addition of S1+S2+S{1,2}=1 right?

Nice to hear that the algorithms work with the model being a black-box. I assume that there are complicated transformations underneath the sampling of the distributions which is performed. I am still a bit confused with the methodology used by Saltelli’s algorithm. The sample received is n(k+2) being the first n rows vector split A and the next n rows veector split B. Regarding the last n.k rows I read these are generated by shifting columns, is this some type of bootstraping method?

For me, as an engineer, is difficult to see how is the mapping between the OutputDesign and InputDesign performed, such that the model doesn’t need to be known in order to calculate sobol indices.

Thank you for your previous response, is very much appreciated!

Cheers.