Impossible truncation degree for Functional Chaos

Hi,

I worked on this topic, and wanted to see how I may be able to reproduce your experiments.

In your script, I noticed the variable name input_sample_training_rescaled. I do not know exactly what this means, but I guess that you rescaled the inputs into the unit cube [0,1]^p, where p is the dimension. In general, this is not necessary, because the PCE uses standardized variables anyway. This is straightforward in your example, since you provide the distribution as an input argument. In other words, rescaling the inputs seems unnecessary to me in this case.
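
To illustrate this point, here is a minimal sketch where the distribution is passed directly to the algorithm, so no manual rescaling is performed. I use the Ishigami function from openturns.usecases and an arbitrary sample size; these choices are mine, not taken from your script.

import openturns as ot
from openturns.usecases import ishigami_function

im = ishigami_function.IshigamiModel()
X = im.distributionX.getSample(500)  # raw inputs, no rescaling
Y = im.model(X)
# Passing the input distribution lets the algorithm build the orthonormal
# basis and the standardization transformation internally.
algo = ot.FunctionalChaosAlgorithm(X, Y, im.distributionX)
algo.run()
print(algo.getResult().getCoefficients().getSize())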

I wanted to see how the number of coefficients increases in this example. According to [1], page 34, eq. 2.53, the number of coefficients of a PCE with total degree d and p input variables is:

\textrm{Card}(\mathcal{J}) = \binom{p + d}{d}

where the binomial coefficient is:

\binom{p + d}{d} = \frac{(p + d)!}{p! \, d!}.

The following script computes the number of coefficients when the dimension is equal to p = 3.

import numpy as np
import openturns as ot

dimension = 3
degree_maximum = 15
# Number of coefficients of a full PCE as a function of the total degree
degree_vs_coeffs = np.zeros((degree_maximum, 2))
for totalDegree in range(1, 1 + degree_maximum):
    degree_vs_coeffs[totalDegree - 1, 0] = totalDegree
    degree_vs_coeffs[totalDegree - 1, 1] = ot.SpecFunc_BinomialCoefficient(
        dimension + totalDegree, totalDegree
    )
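
In case it helps reproduce the figure, here is a minimal plotting sketch (matplotlib is my choice here; any plotting tool would do):

import matplotlib.pyplot as plt

# Plot the number of coefficients against the total degree
plt.plot(degree_vs_coeffs[:, 0], degree_vs_coeffs[:, 1], "o-")
plt.xlabel("Total degree $d$")
plt.ylabel("Number of coefficients")
plt.title("Full PCE, $p = 3$")
plt.show()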

I got this figure:

[Figure: number of coefficients of the full PCE versus the total degree, p = 3]

This shows that the polynomial degree required to get more than 800 coefficients is 15, which is a little larger than the polynomial degree 11 you mentioned.
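Indeed, using the formula above:

\binom{3 + 11}{11} = \binom{14}{3} = 364, \qquad \binom{3 + 15}{15} = \binom{18}{3} = 816,

so the 800-coefficient mark is only passed at degree 15.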

I wondered how many coefficients are obtained when we use a full polynomial chaos with total degree 50, as you did in your simulation. Increasing the maximum polynomial degree up to 50 and using a log scale for the Y axis produces the following figure.

[Figure: number of coefficients of the full PCE versus the total degree up to 50, p = 3, log scale on the Y axis]

This shows that for a model with p = 3 inputs, the total polynomial degree d = 50 produces more than 10^4 coefficients. This seems much larger than usual.
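More precisely, the formula gives:

\binom{3 + 50}{50} = \binom{53}{3} = \frac{53 \times 52 \times 51}{6} = 23\,426,

i.e. more than 23,000 coefficients to estimate.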

This indicates the number of coefficients of the full (non-sparse) PCE, but it does not show the reduction in the number of coefficients achieved by the sparse PCE with the LARS selection method. So I created a sparse PCE with the LARS selection method and least squares, and counted the number of coefficients actually kept in the selected basis. I compared that with the number of coefficients of the full PCE. For this experiment, I used a training sample of size 800 and a simple Monte Carlo DOE (a sketch of the experiment is given below, after the figure).

[Figure: number of coefficients of the full PCE versus the sparse (LARS) PCE as a function of the total degree, training size 800]

We see that the sparse PCE drastically reduced the number of coefficients.
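
Here is a rough sketch of that experiment for a single total degree. The Ishigami function from openturns.usecases, the degree value and the variable names are my own choices; the figure above simply loops over the degree.

import openturns as ot
from openturns.usecases import ishigami_function

im = ishigami_function.IshigamiModel()
distribution = im.distributionX
dimension = distribution.getDimension()

# Training design: simple Monte Carlo sample of size 800
inputTrain = distribution.getSample(800)
outputTrain = im.model(inputTrain)

# Full basis truncated at a given total degree
totalDegree = 12
marginals = [distribution.getMarginal(i) for i in range(dimension)]
basis = ot.OrthogonalProductPolynomialFactory(marginals)
basisSize = basis.getEnumerateFunction().getStrataCumulatedCardinal(totalDegree)
adaptiveStrategy = ot.FixedStrategy(basis, basisSize)

# Least squares with LARS model selection: sparse PCE
selectionAlgorithm = ot.LeastSquaresMetaModelSelectionFactory(
    ot.LARS(), ot.CorrectedLeaveOneOut()
)
projectionStrategy = ot.LeastSquaresStrategy(selectionAlgorithm)

algo = ot.FunctionalChaosAlgorithm(
    inputTrain, outputTrain, distribution, adaptiveStrategy, projectionStrategy
)
algo.run()
result = algo.getResult()
print("Full basis size      :", basisSize)
print("Selected coefficients:", result.getIndices().getSize())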

In order to reproduce your results, I performed the same experiment as before, but using a full PCE this time, with maximum polynomial degree equal to 12. This produces the following figure:

[Figure: validation of the full PCE with total degree 12]

I was surprised to see that the Q2 coefficient is much better than I would have guessed: the PCE performs rather well from this point of view. Looking more closely at the results, I was also surprised that the coefficients are rather accurate with the full PCE. It is difficult to exhibit overfitting in this case, perhaps because the training sample is quite large: with a polynomial degree equal to 12, we have approximately 450 coefficients to estimate with 800 points in the training DOE. This seems to be more than enough for the Ishigami test function.
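
For completeness, here is a sketch of how the full (non-sparse) PCE of degree 12 and its Q2 on an independent test sample could be computed. The test sample size and the by-hand Q2 formula are my choices, made to stay independent of any particular validation class.

import numpy as np
import openturns as ot
from openturns.usecases import ishigami_function

im = ishigami_function.IshigamiModel()
distribution = im.distributionX

inputTrain = distribution.getSample(800)
outputTrain = im.model(inputTrain)

totalDegree = 12
marginals = [distribution.getMarginal(i) for i in range(distribution.getDimension())]
basis = ot.OrthogonalProductPolynomialFactory(marginals)
basisSize = basis.getEnumerateFunction().getStrataCumulatedCardinal(totalDegree)
adaptiveStrategy = ot.FixedStrategy(basis, basisSize)
# Plain least squares, no model selection: full PCE
projectionStrategy = ot.LeastSquaresStrategy()

algo = ot.FunctionalChaosAlgorithm(
    inputTrain, outputTrain, distribution, adaptiveStrategy, projectionStrategy
)
algo.run()
metamodel = algo.getResult().getMetaModel()

# Q2 estimated on an independent test sample
inputTest = distribution.getSample(1000)
yTest = np.array(im.model(inputTest)).flatten()
yPred = np.array(metamodel(inputTest)).flatten()
q2 = 1.0 - np.sum((yTest - yPred) ** 2) / np.sum((yTest - yTest.mean()) ** 2)
print("Q2 =", q2)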

I compared the time required to produce the previous figure with two different PCE decompositions:

  • a sparse PCE using least squares and the LARS selection method: 41 seconds,
  • a full PCE using least squares: 2 min 34 s.
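
In case it is useful, these are wall-clock times; a self-contained sketch of how such a duration can be measured (here with the default chaos built from samples, not the exact settings above, and time.perf_counter as my choice of timer):

import time
import openturns as ot
from openturns.usecases import ishigami_function

im = ishigami_function.IshigamiModel()
X = im.distributionX.getSample(800)
Y = im.model(X)

t0 = time.perf_counter()
algo = ot.FunctionalChaosAlgorithm(X, Y, im.distributionX)
algo.run()
print(f"Elapsed: {time.perf_counter() - t0:.1f} s")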

Therefore, to create a full PCE with total degree up to 50, you had to compute more than 10^4 coefficients, which must have required much more than 10 minutes, didn't it? What CPU / wall-clock time was required to produce the figure you showed in your message?

Best regards,

Michaël


  1. Le Maître, O. P. and Knio, O. M. (2010). Spectral Methods for Uncertainty Quantification: With Applications to Computational Fluid Dynamics. Scientific Computation series, Springer.