Error while building chaos algo

Hi everyone!
I built a sample from my physics simulation tool. From this sample I would like to build a metamodel.
I first tried a chaos algo, but an error occurred and I don't understand why. I generated my sample with an LHS DOE. I keep the sample input values and associated results in a CSV file that I reload when I need it.

OT version 1.15 with Python 3.8

See the attached screenshots.

[screenshot: erreur_chaos_polynomial_part1]

Hi, this means that your output_sample is constant…

Sofiane is right: if your output_sample is constant, you will have a problem with the simplified interface of FunctionalChaosAlgorithm.
IMO you should first find out why your output sample is constant (maybe a wrong decimal separator or a wrong field separator in the CSV).

The logic implemented by the simplified interface is the following:

  • If your sample size is less than ResourceMap.GetAsUnsignedInteger("FunctionalChaosAlgorithm-SmallSampleSize"), the model selection is done using KFold() cross-validation, which requires a nonzero output variance.
  • Else, if your sample size is less than ResourceMap.GetAsUnsignedInteger("FunctionalChaosAlgorithm-LargeSampleSize"), the model selection is done using CorrectedLeaveOneOut() cross-validation, which also requires a nonzero output variance.
  • Otherwise, there is no model selection and the full basis is used to create the metamodel.

A simple workaround is to set these two keys to zero to force the algorithm to use the last method. Add the following lines anywhere before the creation of chaosalgo:
ot.ResourceMap.SetAsUnsignedInteger("FunctionalChaosAlgorithm-SmallSampleSize", 0)
ot.ResourceMap.SetAsUnsignedInteger("FunctionalChaosAlgorithm-LargeSampleSize", 0)

and it should work like a charm!

Cheers

Régis

Hi Régis, Sofiane, thanks for your answers!

My sample is not constant.
I generated it with a DOE and saved it in a CSV file; I reload this CSV file to use these points to perform some statistics, try to build a metamodel, etc.

Some details: I have a numerical model based on a 2D finite element method.
I chose 7 input parameters and more than 10 results to study.
It's a test case "just to see" what I can do with OpenTURNS.

My sample has 100 points (I have another sample with 1000 points).
[100 points because, with 7 input parameters, I thought it was enough to build a "good" metamodel.]

My results sample, called "output_sample", is not constant: I can compute the mean, median, standard deviation and quantile values, which show it is not constant.
But maybe I have to build a metamodel for each result separately?

Today I'm working from home, so I can't execute my OpenTURNS Python script.
I will try your suggestion about SmallSampleSize / LargeSampleSize.
I did not change the default values (1000 / 10000 a priori).

The next step is to try to build a kriging metamodel.
Which covariance models and which bases can I try for a kriging metamodel?
I see in the examples:

  • bases generated with ConstantBasisFactory, LinearBasisFactory, Quadratic…
  • always the SquaredExponential covariance model

Any other ideas to suggest?

Thanks a lot for everything!

Flore

I've just understood my mistake!
At first I worked with the same solver and the same geometry, but with 27 input parameters. I thought that would be too many to build a metamodel, so I reduced the number of inputs… but in this case some results always take the same values!
So I will remove these results and I think it will be better.

If you can answer the other questions in my previous message, that would be great!

Thanks again!
Flore

Hi @Flore! The choice of Kriging covariance model depends on how smooth you expect the model to be.

  • If you know your model is continuous (but not differentiable), try out ExponentialModel.
  • If you know your model is exactly once continuously differentiable, use a MaternModel with \nu parameter 1.5.
  • If you know your model is exactly twice continuously differentiable, use a MaternModel with \nu parameter 2.5.
  • If you know your model to be really smooth (i.e. infinitely differentiable), a SquaredExponential model might actually be the best solution.
  • If you do not know, a MaternModel with \nu parameter 2.5 is usually a solid choice.

Hope this helps,
Joseph

Thanks a lot Joseph!
I will try.

@josephmure let me fix your first sentence: if you know your model is continuous (but not differentiable), try out AbsoluteExponential.
ExponentialModel relies on GeneralizedExponential and thus requires more regularity.


Hi
I managed to build metamodels using polynomial chaos, thanks a lot!
For 3 of my 5 results I observe a very good predictivity factor Q2, very close to 1!

Now I'm trying the kriging algo,
but it's weird: my Python script stops during algo.run()
without any message.

I have 7 input parameters and 5 results.
My sample size (to build the metamodel) is 100,
and my "check" sample size is 10.

if you have any idea about the reason why…

I used exactly the same samples to build different chaos metamodels without any problem.

Thanks in advance
Flore

Hi Flore,
Welcome to this forum.
The algo.run() method performs two main steps:

  • it learns the coefficients of the covariance model using an optimization algorithm,
  • then it builds the actual Gaussian process.

The usual suspect is the optimization algorithm, which may fail. There are various ways to fail: the likelihood may go into unbounded zones of the parameter space, etc.

Turning on the logger may help to print intermediate messages:

http://openturns.github.io/openturns/master/user_manual/_generated/openturns.Log.html

Calling ot.Log.Show(ot.Log.ALL) turns on the logger and should print lots of messages. Please look in the Linux / Windows terminal, because not all messages are printed in IDEs (e.g. Spyder) or Jupyter notebooks.

Once that is done, I would personally change the optimization algorithm. The following example shows how to configure it:

http://openturns.github.io/openturns/master/auto_meta_modeling/kriging_metamodel/plot_kriging_hyperparameters_optimization.html

The tricks we often use are: change the starting point, fine tune the bounds, change the covariance model.

I hope this will help!

Best regards,

Michaël

Thanks a lot Michaël!
I will try (but not today) and I'll keep you informed.
Best regards
Flore

Hi
I managed to build kriging metamodels! I just built a metamodel for each of my outputs; it took a little time.
I tried the constant, linear and quadratic basis factories, always with the SquaredExponential covariance model. For some of my results the Q2 predictivity factor is very good (0.999), but for others Q2 is only about 0.6 or 0.8. The constant basis seems to be the best choice (strange?).
I will try other covariance models later.
Thanks a lot for all your advice!
Flore


Hi @Flore, I am not completely surprised by your findings regarding the constant basis. Linear and quadratic terms in the estimated trend are only useful if the actual trend of your data is linear or quadratic. Besides, expanding the basis means that a bigger portion of your data will need to be used to estimate the corresponding coefficients.

Think of it this way:

  • With a constant basis, the \boldsymbol{\beta} parameter has size 1.
  • With a “linear” basis (actually an affine basis because linear terms are added to the constant term), the size of \boldsymbol{\beta} is d+1, with d being the input dimension.
  • With a “quadratic” basis (where the quadratic terms are added to the linear terms and the constant term), the size of \boldsymbol{\beta} is d(d+1)/2 + d + 1.

Practically speaking, the larger the size of \boldsymbol{\beta}, the bigger the part of your data used to estimate it. The part of the data used to estimate \boldsymbol{\beta} cannot be reused to estimate the other parameters \sigma and \boldsymbol{\theta}: there is a tradeoff. This could explain why a constant basis can yield better results.

Hi @Flore

Depending on the smoothness of the output with respect to the input, the choice of a SquaredExponential model may be questionable. This covariance model will give you an infinitely smooth interpolation regardless of the actual smoothness of your function. You can explore this point by using the MaternModel with increasing values of its nu parameter, starting from 1.5 up to, say, 5.5 by steps of 1. If you get better results with low values of nu, it means that your function is not so smooth.

Cheers