PCE design matrix and coefficient fit

Greetings,

I am trying to reproduce the results of the example “Create a polynomial chaos for the Ishigami function: a quick start guide to polynomial chaos” (OpenTURNS 1.20 documentation) by computing the design matrix for the polynomial basis built by OpenTURNS and then fitting the PCE coefficients with numpy.linalg.lstsq. However, I get incorrect results.

In particular, once OpenTURNS is done with its PCE approximation, I extract the PCE basis and compute the design matrix and the PCE coefficients as follows:

[screenshot of the script: the design matrix is assembled from the PCE basis and the coefficients are fitted with numpy.linalg.lstsq]
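
Since the screenshot is no longer available, here is a minimal reconstruction of the kind of code it showed, based on the description above (inputTrain, outputTrain and result are the variables of the quick-start example):

import numpy as np

# Reconstruction, not the original screenshot: evaluate each function of the
# PCE basis on the raw training inputs and fit the coefficients by least squares.
reduced_basis = result.getReducedBasis()
design = np.hstack([np.array(psi(inputTrain)) for psi in reduced_basis])
result2, _, _, _ = np.linalg.lstsq(design, np.array(outputTrain), rcond=None)
print(np.linalg.cond(design))  # close to 1e6 here, as reported above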

Note that “result2” plays the role of “result” in the original implementation. I would have expected this approach to yield exactly the same results as the FunctionalChaosAlgorithm, but it does not: the coefficients are way off and the design matrix has a very large condition number (close to 10^6). Apparently I am missing something here. Could anyone point out where my error is? In particular, what is the correct way to compute the design matrix?

Best,
Dimitris

Hi Dimitris,

Are you aware of the fact that OT performs a change of measure in many situations in order to build the orthonormal basis? In the Ishigami case, the orthonormal basis is the tensorized Legendre polynomial basis, so it is associated with the uniform distribution over [-1,1]^3 and not the uniform distribution over [-\pi,\pi]^3. You have to map the inputTrain sample to [-1,1]^3 using the DistributionTransformation class (see DistributionTransformation — OpenTURNS 1.20 documentation), which is a linear transformation here. Then you evaluate the reduced basis on the transformed sample. This way you should get a well-conditioned design matrix and the correct coefficients.
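
A minimal sketch of that mapping in code (assuming inputTrain and result from the quick-start example; the bounds follow the Ishigami input distribution):

import numpy as np
import openturns as ot
from math import pi

# Physical input distribution of the Ishigami function and the measure of
# the tensorized Legendre basis:
physical = ot.ComposedDistribution([ot.Uniform(-pi, pi)] * 3)
standard = ot.ComposedDistribution([ot.Uniform(-1.0, 1.0)] * 3)
transformation = ot.DistributionTransformation(physical, standard)

z_train = transformation(inputTrain)  # map [-pi, pi]^3 onto [-1, 1]^3
# Evaluate the reduced basis on the transformed sample:
design = np.hstack([np.array(psi(z_train)) for psi in result.getReducedBasis()])
print(np.linalg.cond(design))  # should now be close to 1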

Cheers

Régis

Hi Regis,

Thank you very much for the insight. Indeed, this transformation did the trick.

[screenshot of the corrected script: inputTrain is mapped to [-1,1]^3 with DistributionTransformation before evaluating the basis]

Based on my prior experience with other UQ packages (e.g., UQpy, UQLab), I was expecting the transformation to be already included in the evaluation of the orthogonal polynomials, see e.g. UQpy/Hermite.py at master · SURGroup/UQpy · GitHub. But the OT way also works just fine. Thanks a lot!

Best regards,
Dimitris

Hi Dimitris,

There is no need to use NumPy to get the design matrix: the library has everything you need. But we have to combine four classes to get the job done.

There are several points that we have to take into account:

  • there is a transformation of the input,
  • depending on the way we create the polynomial, model selection can be involved, which selects specific functions in the basis (or, equivalently, columns in the design matrix),
  • we need an algorithm to actually solve the least squares problem (e.g. the Cholesky decomposition of the normal equations, or an orthogonal decomposition such as QR or SVD).

One of the keys is the DesignProxy class, which provides methods to solve the least squares problem, including the computeDesign() method that you need. But please see the script below, because in many cases we do not need that matrix itself, but the Gram matrix (or its inverse). The DesignProxy is consumed by a least squares solver, e.g. QRMethod, which actually computes the coefficients.

On the specific topic of the transformation, @regislebrun already mentioned the DistributionTransformation class, which provides this feature. Notice that this topic is also presented in the example that I wrote to illustrate the principle.

Another key class is Basis, which manages a set of functions as the functional basis of the decomposition. This basis is needed by the constructor of the DesignProxy because it defines the columns of the design matrix.

Here is the relevant part of a script that I use when I want to manage the design matrix. In my case, I do not need the design matrix itself, but the inverse of the Gram matrix.

[...]
chaosalgo = ot.FunctionalChaosAlgorithm(
    x_train, y_train, input_distribution, adaptiveStrategy, projectionStrategy
)
chaosalgo.run()  # Must be called before getResult()
result = chaosalgo.getResult()
reduced_basis = result.getReducedBasis()  # As a result of the model selection
n_parameters = len(reduced_basis)
all_indices = result.getIndices()
transformation = result.getTransformation()  # As a result of the input distribution
z_train = transformation(x_train)  # Map from the physical input into the transformed input
basis = ot.Basis(reduced_basis)
design_proxy = ot.DesignProxy(z_train, basis)  # This is it!
design_matrix = design_proxy.computeDesign(all_indices)  # That is what you asked for (edit: with the indices)
indices = range(n_parameters)
lsq_method = ot.QRMethod(design_proxy, indices)
beta_hat = lsq_method.solve(y_train.asPoint())
inverse_gram = lsq_method.getGramInverse()  # That is what we need in many cases
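
As a quick sanity check (a sketch reusing the variables above), beta_hat should match the coefficients stored in the result, since both solve the same least squares problem:

import numpy as np

# Both vectors should agree up to numerical precision.
print(np.array(beta_hat))
print(np.array(result.getCoefficients()).ravel())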

I have a technical report in the works that I wish to publish, which contains examples of how to use this script. As soon as this can be done, I will push for a new getDesignProxy() method to be created, with a consistent set of examples to show its use.

As far as I understand the topic, you should get a condition number pretty close to 1, but not exactly. This is because the orthogonal polynomials are … orthogonal! Their orthogonality holds with respect to the scalar product that defines the Hilbert space associated with the marginal probability density function:

\langle g_1, g_2 \rangle_{c} = \int_{\mathcal{Z}} g_1(z) g_2(z) f(z) dz

where the subscript “c” is for “continuous”, \mathcal{Z} \subseteq \mathbb{R} is the transformed domain, z is the transformed input, f : \mathcal{Z} \rightarrow \mathbb{R} is the probability density function of the transformed input and g_1 and g_2 are two square-integrable functions. Hence, the orthogonal polynomials \{\pi_k\}_{k \geq 0} are orthogonal with respect to that (integral) scalar product, but not with respect to the discrete scalar product:

\langle g_1, g_2 \rangle_d = \sum_{i = 1}^n g_1(z_i) g_2(z_i)

where the subscript “d” is for “discrete” and \{z_i \in \mathcal{Z}\}_{i = 1, ..., n} is an i.i.d. Monte Carlo sample. But the integral can be approximated by the Monte Carlo sum:

\int_{\mathcal{Z}} g_1(z) g_2(z) f(z) dz \approx \frac{1}{n} \sum_{i = 1}^n g_1(z_i) g_2(z_i)

provided that the sample size n is sufficiently large. Hence, the columns of the design matrix associated with the orthonormal basis are not exactly orthogonal with respect to the discrete scalar product, but very nearly so: the design matrix is close to an orthogonal matrix, which implies that its condition number is close to 1. (The design matrix would be exactly orthogonal with respect to the discrete scalar product only if the family of polynomials were designed to be so, which is not the case for most families of orthogonal polynomials for continuous distributions, e.g. Legendre, Hermite, etc.) This is why @regislebrun sometimes mentions that we may not need orthogonal matrix decompositions to solve the linear least squares problem associated with the polynomial chaos expansion: using the normal equations can be sufficient in some cases.
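
To make the discrete-versus-continuous distinction concrete, here is a small numerical sketch (pure NumPy; the first two Legendre polynomials, normalized with respect to the uniform density f(z) = 1/2 on [-1, 1]):

import numpy as np

rng = np.random.default_rng(0)
n = 10000
z = rng.uniform(-1.0, 1.0, n)  # i.i.d. Monte Carlo sample from Uniform(-1, 1)

# Legendre polynomials of degrees 1 and 2, orthonormal w.r.t. f(z) = 1/2:
pi_1 = np.sqrt(3.0) * z
pi_2 = np.sqrt(5.0) * 0.5 * (3.0 * z**2 - 1.0)

# The continuous scalar product <pi_1, pi_2>_c is exactly 0; its Monte Carlo
# approximation (1/n) * sum of pi_1(z_i) * pi_2(z_i) is small but not exactly 0.
print(np.mean(pi_1 * pi_2))
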
Regards,
Michaël

PS
Chebyshev polynomials are orthogonal with respect to the discrete scalar product on some well-chosen nodes, but the weight function that makes them orthogonal does not necessarily correspond to the distribution that we are interested in. See e.g. (Mason, Handscomb, 2003), section 4.6 “Discrete orthogonality of Chebyshev polynomials”, page 85.

  • J.C. Mason and D.C. Handscomb, “Chebyshev Polynomials”, CRC Press, 2003.

Dear Michaël,

Thank you very much for the thorough reply. I will also use the script you provided; it might indeed simplify things.

One note: you mention that “As far as I understand the topic, you should get a condition number pretty close to 1, but not exactly.” At least for regression-based PCE, the condition number of the design matrix (and of the Gram matrix as well) depends significantly on the size of the experimental design; see, e.g., the paper “Analysis of Discrete L2 Projection on Polynomial Spaces with Random Evaluations”. In fact, the condition number should converge to 1 as the number of sample points in the experimental design increases (that is, as the discrete sum converges to the continuous integral).
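
A quick way to observe this convergence (a sketch with a univariate Legendre basis up to degree 3, scaled by 1/sqrt(n) so that the Gram matrix approximates the identity):

import numpy as np

rng = np.random.default_rng(1)
for n in [50, 500, 5000, 50000]:
    z = rng.uniform(-1.0, 1.0, n)
    # Orthonormal Legendre polynomials of degrees 0 to 3 w.r.t. f(z) = 1/2:
    design = np.column_stack([
        np.ones(n),
        np.sqrt(3.0) * z,
        np.sqrt(5.0) * 0.5 * (3.0 * z**2 - 1.0),
        np.sqrt(7.0) * 0.5 * (5.0 * z**3 - 3.0 * z),
    ]) / np.sqrt(n)
    print(n, np.linalg.cond(design))  # tends to 1 as n grows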

Best regards,
Dimitris

P.S. I think that you forgot to include the all_indices variable in the design_proxy.computeDesign() call:

design_matrix = design_proxy.computeDesign(all_indices) 

When using design_proxy.computeDesign() (i.e., without the indices argument), the following error appears:

TypeError: DesignProxy.computeDesign() missing 1 required positional argument: 'indices'