Mauntz-Kucherenko formula

This post follows the discussion started on GitHub:

In the following, I use the notations defined on this page of the doc:

Applying Equation (12) of the original paper, the Mauntz-Kucherenko estimator of the first-order Sobol index is:

\hat{V}_i = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{B}_k) \left[ \tilde{G}(\boldsymbol{E}_k^{i}) - \tilde{G}(\boldsymbol{A}_k) \right].
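As a quick numeric illustration, here is a minimal numpy sketch of this estimator. Everything in it is an assumption made for the sketch (the toy model, the sample size, and the convention that \boldsymbol{E}^{i} takes its i-th column from \boldsymbol{B} and the others from \boldsymbol{A}, which is consistent with the formula above); it is not the OpenTURNS implementation.

import numpy as np

# Mauntz-Kucherenko first-order estimator on a toy additive model.
# Convention assumed here: E^i equals A except for its i-th column,
# which is taken from B, so G(B) and G(E^i) share only the input x_i.
rng = np.random.default_rng(0)
N, n_X = 100_000, 3

def g(x):
    # toy additive model, total variance 1 + 4 + 9 = 14
    return x[:, 0] + 2.0 * x[:, 1] + 3.0 * x[:, 2]

A = rng.normal(size=(N, n_X))
B = rng.normal(size=(N, n_X))
mean = g(A).mean()                    # centering constant (here: sample A)
yA_t, yB_t = g(A) - mean, g(B) - mean

for i in range(n_X):
    E = A.copy()
    E[:, i] = B[:, i]                 # build E^i
    yE_t = g(E) - mean
    V_i = np.mean(yB_t * (yE_t - yA_t))
    print(f"S_{i + 1} ~ {V_i / yA_t.var():.3f}")  # expect 1/14, 4/14, 9/14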

Logically, the estimator \hat{V}_{-i} should then be

\hat{V}_{-i} = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{B}_k) \left[ \tilde{G}(\boldsymbol{C}_k^{i}) - \tilde{G}(\boldsymbol{A}_k) \right].

If we cannot do that (for example because we have not implemented \boldsymbol{C}_k^{i} :wink: ), then we could define

\hat{V}_{-i}^{alt} = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) \left[ \tilde{G}(\boldsymbol{E}_k^{i}) - \tilde{G}(\boldsymbol{B}_k) \right].

and that would remain consistent with the paper.
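Indeed, assuming the convention that \boldsymbol{E}^{i} coincides with \boldsymbol{A} except for its i-th column (taken from \boldsymbol{B}), the pick-freeze identity gives

\mathbb{E}\left[ G(\boldsymbol{A}) \, G(\boldsymbol{E}^{i}) \right] = V_{-i} + \mathbb{E}[G]^2 \quad \text{and} \quad \mathbb{E}\left[ G(\boldsymbol{A}) \, G(\boldsymbol{B}) \right] = \mathbb{E}[G]^2,

so the difference converges to V_{-i} (any common centering constant inside \tilde{G} cancels in the difference).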

In that case, we would define \widehat{VT}_{i}^{alt} as \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)^2 - \hat{V}_{-i}^{alt} (keep in mind that \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) is supposed to be null, so this first sum is the natural variance estimator). Therefore

\widehat{VT}_{i}^{alt} = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) \left[ \tilde{G}(\boldsymbol{A}_k) + \tilde{G}(\boldsymbol{B}_k) - \tilde{G}(\boldsymbol{E}_k^{i}) \right].

Of course, in OpenTURNS, we typically use unbiased variance estimators so \frac{1}{N} should be replaced in all formulas by \frac{1}{N-1}. Not that it matters with the large N involved in Sobol estimation.

Here is the formula for \widehat{VT}_i^{alt} actually implemented in the code:

VTi(q, p) = referenceVariance_[q] + (size * muA[q] * muA[q] - yEDotyA[q]) / (size - 1.0);

In the code, q is the output index. For the sake of simplicity, let us assume that the output is one-dimensional, so q = 0. The input index (denoted by i in the post above) is denoted by p in the code: p = i.
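To make the case analysis below easier to follow, here is a hypothetical Python transcription of that C++ line (the variable names mirror the code; this is just a restatement, not the actual OpenTURNS source):

def VTi(referenceVariance, muA, yEDotyA, size):
    # referenceVariance_[q] + (size * muA[q]^2 - yEDotyA[q]) / (size - 1)
    return referenceVariance + (size * muA * muA - yEDotyA) / (size - 1.0)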

Currently, \tilde{G} is centered with respect to the full sample (G(\boldsymbol{A}), G(\boldsymbol{B}), G(\boldsymbol{E}^1),...,G(\boldsymbol{E}^{n_X}))^T:

Case 1: \tilde{G} centered with respect to the sample \boldsymbol{A}

In this case, we assume that the code snippet above is changed in order to center the output sample with respect to \boldsymbol{A}.

That is to say that \tilde{G}(\cdot) = G(\cdot) - \frac{1}{N} \sum_{k=1}^N G(\boldsymbol{A}_k). In this case we would have:

referenceVariance_[q] = \frac{1}{N-1} \sum_{k=1}^{N} \tilde{G}(\boldsymbol{A}_k)^2,

size = N,

muA[q] = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) = 0,

yEDotyA[q] = \sum_{k=1}^N \tilde{G}(\boldsymbol{E}_k^i) \tilde{G}(\boldsymbol{A}_k).

Therefore,

VTi(q, p) = \frac{1}{N-1} \sum_{k=1}^{N} \tilde{G}(\boldsymbol{A}_k) \left( \tilde{G}(\boldsymbol{A}_k) - \tilde{G}(\boldsymbol{E}_k^i) \right).
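As a sanity check, here is a small numpy sketch of Case 1 (the toy model and the \boldsymbol{E}^i convention are the same illustrative assumptions as in the sketch above): center the outputs with respect to \boldsymbol{A} and verify that the code formula reduces to the expression just derived.

import numpy as np

# Case 1 sanity check: with A-centering, muA vanishes (up to round-off)
# and the code formula equals (1/(N-1)) * sum G~(A)(G~(A) - G~(E^i)).
rng = np.random.default_rng(1)
N, n_X, i = 50_000, 3, 0

def g(x):
    return x[:, 0] + 2.0 * x[:, 1] + 3.0 * x[:, 2]

A = rng.normal(size=(N, n_X))
B = rng.normal(size=(N, n_X))
E = A.copy()
E[:, i] = B[:, i]

mean_A = g(A).mean()
yA_t, yE_t = g(A) - mean_A, g(E) - mean_A   # centered with respect to A

referenceVariance = np.sum(yA_t**2) / (N - 1)
muA = yA_t.mean()                            # ~ 0
yEDotyA = np.sum(yE_t * yA_t)
code_formula = referenceVariance + (N * muA**2 - yEDotyA) / (N - 1.0)
direct_formula = np.sum(yA_t * (yA_t - yE_t)) / (N - 1.0)
print(np.isclose(code_formula, direct_formula))  # True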

Case 2: \tilde{G} centered with respect to the full sample

This is what is currently implemented, so the formulas below are what is really computed in OT version 1.16.

With \tilde{G}(\cdot) = G(\cdot) - \frac{1}{(2+n_X)N} \left( \sum_{k=1}^N G(\boldsymbol{A}_k) + \sum_{k=1}^N G(\boldsymbol{B}_k) + \sum_{j=1}^{n_X} \sum_{k=1}^N G(\boldsymbol{E}_k^j) \right) (the dummy index j avoids a clash with the fixed input index i), we have

referenceVariance_[q] = \frac{1}{N-1} \sum_{k=1}^{N} \left( \tilde{G}(\boldsymbol{A}_k) - \frac{1}{N} \sum_{k'=1}^N \tilde{G}(\boldsymbol{A}_{k'})\right)^2,

size = N,

muA[q] = \frac{1}{N} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) \neq 0,

yEDotyA[q] = \sum_{k=1}^N \tilde{G}(\boldsymbol{E}_k^i) \tilde{G}(\boldsymbol{A}_k).

Therefore,

VTi(q, p) = \frac{1}{N-1} \sum_{k=1}^{N} \left( \tilde{G}(\boldsymbol{A}_k) - \frac{1}{N} \sum_{k'=1}^N \tilde{G}(\boldsymbol{A}_{k'})\right)^2 + \frac{N}{N-1}\left( \frac{1}{N} \sum_{k=1}^{N} \tilde{G}(\boldsymbol{A}_k) \right)^2
- \frac{1}{N-1} \sum_{k=1}^N \tilde{G}(\boldsymbol{E}_k^i) \tilde{G}(\boldsymbol{A}_k).
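Note that, using \sum_{k=1}^N (x_k - \bar{x})^2 = \sum_{k=1}^N x_k^2 - N \bar{x}^2, the first two terms combine into \frac{1}{N-1} \sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)^2, so this condenses to

VTi(q, p) = \frac{1}{N-1} \sum_{k=1}^{N} \tilde{G}(\boldsymbol{A}_k) \left( \tilde{G}(\boldsymbol{A}_k) - \tilde{G}(\boldsymbol{E}_k^i) \right),

that is, the same form as in Case 1, but with \tilde{G} centered on the full sample instead of on \boldsymbol{A}.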

Conclusion

Neither the Case 1 formula nor the Case 2 formula matches the theory. The doc is wrong too, but in a different way.

@josephmure Using what you wrote and analysing the theory and other sources a bit, we can refine the analysis.

Starting from the first-order estimate:

\hat{V}_i = \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{B}_k)\left[\tilde{G}(\boldsymbol{E}_k^{i}) - \tilde{G}(\boldsymbol{A}_k)\right]

We get:

\hat{V}_{-i} = \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{B}_k)\left[\tilde{G}(\boldsymbol{C}_k^{i}) - \tilde{G}(\boldsymbol{A}_k)\right]

As the \boldsymbol{C}_k^{i} are missing, we use (as you mentioned):

\widehat{V}_{-i}^{alt} = \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)\left[\tilde{G}(\boldsymbol{E}_k^{i}) - \tilde{G}(\boldsymbol{B}_k)\right].

Then to get \widehat{VT}_{i}^{alt}, we use:

\widehat{VT}_{i}^{alt} = \mathrm{Var}\left( \tilde{G}(\boldsymbol{A}) \right) - \widehat{V}_{-i}^{alt}

The trick here consists, as for the first order, in replacing \left(\frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)\right)^2 by \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k) \tilde{G}(\boldsymbol{B}_k). Thus we get:

\widehat{VT}_{i}^{alt} = \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)\left[\tilde{G}(\boldsymbol{A}_k) - \tilde{G}(\boldsymbol{E}_k^{i})\right]
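Indeed, spelling out the variance with the substituted cross term:

\widehat{VT}_{i}^{alt} = \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)^2 - \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)\tilde{G}(\boldsymbol{B}_k) - \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)\left[\tilde{G}(\boldsymbol{E}_k^{i}) - \tilde{G}(\boldsymbol{B}_k)\right],

and the two \tilde{G}(\boldsymbol{B}_k) sums cancel, which gives the expression above.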

If we look, for example, at the R sensitivity package, the estimator is implemented this way.

However, in OpenTURNS, we replace the first term \frac{1}{N}\sum_{k=1}^N \tilde{G}(\boldsymbol{A}_k)^2 by \mathrm{var}(y_A) + \mu_A^2. In other words, we should implement the correct estimator using the previous equation.
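To make the proposal concrete, here is a minimal sketch of the advocated total-order estimator (the toy model and the \boldsymbol{E}^i convention are again illustrative assumptions, not the OpenTURNS code):

import numpy as np

# Sketch of the estimator advocated above:
# VT_i^alt = (1/N) * sum G~(A) [ G~(A) - G~(E^i) ], with A-centering.
rng = np.random.default_rng(2)
N, n_X = 100_000, 3

def g(x):
    return x[:, 0] + 2.0 * x[:, 1] + 3.0 * x[:, 2]

A = rng.normal(size=(N, n_X))
B = rng.normal(size=(N, n_X))
yA_t = g(A) - g(A).mean()

for i in range(n_X):
    E = A.copy()
    E[:, i] = B[:, i]
    yE_t = g(E) - g(A).mean()
    VT_i = np.mean(yA_t * (yA_t - yE_t))
    # for this additive model, ST_i = S_i: expect 1/14, 4/14, 9/14
    print(f"ST_{i + 1} ~ {VT_i / yA_t.var():.3f}")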