Hello,
There are some sampling methods in OT, and my question is how to compare those sampling method in the specified problem? We can suppose the problem is optimization problem in bound space. Thanks!
Hello there,
I’m not sure I understand your question correctly. You want to compare different sampling methods, such as standard Monte Carlo, optimized LHS and Sobol’ sequences, for a given task?
And concerning your example, when you mention optimization, do you wish to optimize a given bounded problem through brute-force sampling, or rather use different sampling methods in order to generate the initial solution in the context of a multi-start optimization OT class? I will consider the first option here.
I will try to answer given my understanding of your question.
First of all, most of the sampling methods in OT are related to the parent Experiment class; you can find a list of all derived experiment classes here. We consider here three different approaches: standard Monte Carlo, a standard LHS, and an optimized LHS. Just for illustrative purposes, I consider the first marginal (x) as following a uniform distribution, whereas the second marginal (y) follows a truncated normal distribution.
Here’s a small example (OT version 1.20):
import openturns as ot
from matplotlib import pyplot as plt
from openturns.viewer import View
ot.RandomGenerator.SetSeed(0)
def fun(inp):
    # Beale function, to be minimized over [-4.5, 4.5]^2
    x, y = inp
    return [(1.5 - x + x * y) ** 2 + (2.25 - x + x * y ** 2) ** 2 + (2.625 - x + x * y ** 3) ** 2]
bealFun = ot.PythonFunction(2, 1, fun)
bounds = ot.Interval([-4.5, -4.5], [4.5, 4.5])
xDistribution = ot.Uniform(-4.5, 4.5)
yDistribution = ot.TruncatedNormal(0, 2, -4.5, 4.5)
#Alternatively, one can also use the TruncatedDistribution class directly
#We consider x and y as independent
inpDistribution = ot.ComposedDistribution([xDistribution, yDistribution])
designSize = 100
#Monte Carlo sampling
monteCarlo = ot.MonteCarloExperiment(inpDistribution, designSize)
# non optimized lhs
lhs = ot.LHSExperiment(inpDistribution, designSize)
#optimized LHS (through simulated annealing; we consider the PhiP space-filling criterion, other options are available)
annealingLHS = ot.SimulatedAnnealingLHS(lhs, ot.SpaceFillingPhiP())
#we now generate the samples associated with the three methods
monteCarloSample = monteCarlo.generate()
lhsSample = lhs.generate()
annealingLHSSample = annealingLHS.generate()
#we evaluate the optimization function on these samples:
monteCarloOutput = bealFun(monteCarloSample)
lhsOutput = bealFun(lhsSample)
annealingLHSOutput = bealFun(annealingLHSSample)
# we can plot the input samples, and see the differences
graph = ot.Graph("Monte Carlo", "x", "y", True, "")
cloud = ot.Cloud(monteCarloSample)
graph.add(cloud)
View(graph)
graph = ot.Graph("LHS", "x", "y", True, "")
cloud = ot.Cloud(lhsSample)
graph.add(cloud)
View(graph)
graph = ot.Graph("Optimized LHS", "x", "y", True, "")
cloud = ot.Cloud(annealingLHSSample)
graph.add(cloud)
View(graph)
# Post processing
print("Monte Carlo result mininum = ", monteCarloOutput.getMin())
print("LHS mininum = ", lhsOutput.getMin())
print("Optimized LHS mininum = ", annealingLHSOutput.getMin())
# We repeat the experiment 20 times to assess the average behaviour; this might take a few seconds
minMC = []
minLHS = []
minOptLHS = []
for i in range(20):
    monteCarloSample = monteCarlo.generate()
    lhsSample = lhs.generate()
    annealingLHSSample = annealingLHS.generate()
    monteCarloOutput = bealFun(monteCarloSample)
    lhsOutput = bealFun(lhsSample)
    annealingLHSOutput = bealFun(annealingLHSSample)
    minMC.append(monteCarloOutput.getMin()[0])
    minLHS.append(lhsOutput.getMin()[0])
    minOptLHS.append(annealingLHSOutput.getMin()[0])
plt.figure()
plt.boxplot([minMC, minLHS, minOptLHS])
plt.xticks([1, 2, 3], ["Monte Carlo", "LHS", "Optimized LHS"])
plt.ylabel("Min")
plt.show()
This produces a boxplot comparing the minima obtained with the three sampling methods.
Please note that although the results seem very different, the function varies over a range from 0 to 500, so the scale is fairly large. Also, for a brute-force sampling optimization, I would personally advise not using anything other than standard Monte Carlo sampling, unless you are very constrained by your computation time. Another potential alternative would be to use a deterministic Sobol’ sequence (here), but in this case please be mindful of the sample size, which should be a power of 2 for optimal coverage if I recall correctly (please verify this information).
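As an illustration, a minimal sketch of such a Sobol’-based design, assuming uniform marginals over the same bounds and a power-of-two sample size:
# Hedged sketch: Sobol' low-discrepancy design of size 128 (a power of 2)
uniformDistribution = ot.ComposedDistribution([ot.Uniform(-4.5, 4.5)] * 2)
sobolExperiment = ot.LowDiscrepancyExperiment(ot.SobolSequence(2), uniformDistribution, 128)
sobolSample = sobolExperiment.generate()
sobolOutput = bealFun(sobolSample)
print("Sobol' minimum = ", sobolOutput.getMin())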
Edit: please note also that low-discrepancy sequences, such as Sobol’, require you to work with uniform marginals.
Cheers, and do not hesitate to ask if you have follow-up questions.
@JPelamatti Thanks for your reply!
First of all, let me clarify the problem. I am now solving a multi-objective optimization problem and adopt the NSGA-II algorithm. From my experience, the solver performance and the quality of the solutions depend on the first feasible solutions, which are generated using a sampling method.
And yes, I can compare the Pareto fronts from the MOO runs using different sampling methods. I just wonder whether there is any general evaluation of the sampling methods, so that I don’t need to run the MOO program (it takes around 30-60 min to finish).
Oh I see,
Well, of course the best option would be to repeat the MOO for different initial NSGA-II populations, generated with different sampling methods. The issue here is that there is no ‘best’ sampling method in general; everything depends on the task at hand and on the underlying hypotheses. Every specific task will have its own criterion for assessing a given sampling method’s performance.
However, I guess the next best thing would be to compare the generated NSGA-II population samples on the objective and constraint functions directly (similarly to what I did in my example). For instance, you might compare the ratio of initial feasible solutions which is obtained, on average, by the different sampling methods used to generate the population (see the sketch below).
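As an illustration, a minimal sketch of that idea, assuming a hypothetical constraints(point) function that returns values which are all non-positive for feasible points:
import numpy as np
def feasibility_ratio(sample, constraints):
    # fraction of points of the sample satisfying all constraints (values <= 0)
    feasible = [np.all(np.asarray(constraints(point)) <= 0.0) for point in sample]
    return float(np.mean(feasible))
# hypothetical usage, comparing initial populations of the same size:
# print(feasibility_ratio(monteCarloSample, constraints))
# print(feasibility_ratio(annealingLHSSample, constraints))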
Hoping that helps a bit,
Cheers
Thanks for your suggestion! It helps a lot
Hi Jack,
@JPelamatti has already given some answers to your question: thank you for that
I might add a few details.
The LowDiscrepancyExperiment class (doc) takes the distribution as input. If you provide a distribution which is not uniform, the corresponding iso-probabilistic transformation is applied.
Regards,
Michaël
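A minimal sketch of this transformation in practice (using a standard normal distribution purely for illustration):
import openturns as ot
# Sobol' points are built in the unit cube, then mapped through the
# iso-probabilistic transformation of the target distribution
sobolNormal = ot.LowDiscrepancyExperiment(ot.SobolSequence(2), ot.Normal(2), 128)
sample = sobolNormal.generate()
print(sample.computeMean())  # close to the Normal mean (0, 0), not to the centre of a cube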
Hello @MichaelBaudin, First of all, let me clarify the optimization problem:
The optimization problem contains 24x4 = 96 decision variables (continuous), and the value range is around [0, 10000].
The solver is based on the Pagmo NSGA-II; we extended it to a reference-point-based NSGA-II and also added a makeFeasible operator to the algorithm. The constraints of the problem are complex, and some are dynamic, which means the feasible space depends on the input values, as shown in the figure below:
All the blue square points are generated by sampling within the static boundary constraints, while the real feasible space is the red polygon; the points outside the red polygon are turned into feasible ones (the blue dots outside the polygon become the red dots) using the makeFeasible operator. That’s why we need some way to evaluate the sampling methods.
In your informative reply, I notice that when you compare the sampling methods, you consider “smaller elementary volumes than other sequences”. I found the “elementary volumes” calculation for the 2D case in the documentation, and my question is how to calculate this metric in a high-dimensional space.
I also read the documentation (Comparing initial sampling methods — scikit-optimize 0.8.1 documentation), which takes the Euclidean distance as an evaluation metric. Do you think the Euclidean distance makes sense for the evaluation?
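For instance, is something like the following minimum pairwise distance (a sketch of my understanding of that page; larger should be better, and it is defined in any dimension) a reasonable criterion for 96 variables?
import numpy as np
def min_pairwise_distance(design):
    # maximin criterion: the smallest Euclidean distance between two design points
    x = np.asarray(design)
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    return dists.min()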
BR
/Jack
Hello there
I simplified the optimization problem using the ZDT3 problem (ZDT test suite — pagmo 2.19.0 documentation) with 2 objectives and 96 decision variables, and added the “dynamic” constraints:
"""
The constraints for sequence amplitude
"""
def amplitude_constraints(x):
delta = np.diff(x)
v_changes = abs(np.diff(delta))
threshold = 2 * 0.4
return v_changes - threshold # < 0
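For example, a quick hypothetical check of how often randomly sampled candidates (96 variables in [0, 10000]) already satisfy this constraint could look like:
np.random.seed(0)
candidates = np.random.uniform(0.0, 10000.0, size=(100, 96))
feasible = [np.all(amplitude_constraints(c) < 0) for c in candidates]
print("feasible ratio:", np.mean(feasible))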