SubsetSampling intermediary quantiles estimation

efekhari27 · January 17, 2023, 2:46pm

Hi everyone,

I recently started using the OpenTURNS implementation of the Subset sampling algorithm to estimate rare event probabilities. As described in the documentation example, I tried to isolate the different subset samples to generate nice plots. In this process, I also recomputed the intermediary quantiles returned by the accessor “getThresholdPerStep()”. In the following minimalist example, you will notice that the estimated values are different (intermediary quantiles returned by “getThresholdPerStep” vs. recomputed quantiles on samples from the algorithm).

import openturns as ot
## Problem definition
X = ot.RandomVector(ot.Normal([0.25] * 2, [1] * 2, ot.IdentityMatrix(2)))
g = ot.SymbolicFunction(["x1", "x2"], ["20-(x1-x2)^2-8*(x1+x2-4)^3"])
g = ot.MemoizeFunction(g)
Y = ot.CompositeRandomVector(g, X)
myEvent = ot.ThresholdEvent(Y, ot.LessOrEqual(), 0.0)
## Subset sampling
algo = ot.SubsetSampling(myEvent)
algo.setKeepEventSample(True)
algo.run()
inputSampleSubset = g.getInputHistory()
outputSampleSubset = g.getOutputHistory()
result = algo.getResult()
proba = result.getProbabilityEstimate()
print("Estimated failure probability = {:.2e}".format(proba))
levels = algo.getThresholdPerStep()
N = algo.getMaximumOuterSampling() * algo.getBlockSize()
Ns = algo.getStepsNumber()
## Retrieve subset samples from Memoize function
list_subSamples = list()
list_subSamples_outputs = list()
for i in range(Ns):
    list_subSamples.append(inputSampleSubset[i * N: i * N + N])
    list_subSamples_outputs.append(outputSampleSubset[i * N: i * N + N])
alpha = algo.getConditionalProbability()
quantiles = [ss_output.computeQuantile(alpha)[0] for ss_output in list_subSamples_outputs]
print("Quantiles estimated on subset samples: \n", ot.Point(quantiles))
print("Quantiles from SubsetSampling accessor: \n", levels)

I understand why the two computations of the last quantile are different but not the intermediary ones. Is it coming from the use of the MemoizeFunction to retrieve the algorithm’s samples? Thanks for your help!

Elias

dumas · January 18, 2023, 11:28am

Hello,

I also used the history sample to draw such graphs and for almost the same reason I would like to have access to the real sample used to compute the probabilities. Indeed due the Markov Chain process, the computed realization is not always selected as point to compute the intermediate probabilities, I think this is why there is such a difference.

Openturns provides 2 methods called getEventInputSample and getEventOutputSample, however it only returns the last step sample. Maybe we could suggest to return the full sample or a collection of sample.

Antoine

vchabri · January 24, 2023, 2:50pm

Hi @dumas,

Thanks for your answer. To be honest, I didn’t get the point about your sentende “indeed due to the Markov Chain process […] intermediate probabilities”. Could you please explain us a little bit more what you have in mind?

Thanks for your help,
Regards,
Vince

dumas · January 26, 2023, 9:39am

When I looked into the SubsetSampling class in the function generatePoints, there is this piece of code:

    Sample blockSample(standardEvent_.getFunction()(inputSample));

    for (UnsignedInteger j = 0; j < getBlockSize(); ++ j)
    {
      // 2. accept the new point if in the failure domain
      if (getEvent().getOperator()(blockSample(j, 0), threshold))
      {
        currentPointSample_[i * blockSize + j] = inputSample[j];
        currentLevelSample_[i * blockSize + j] = blockSample[j];
      }
    }

The function evaluates a new input sample but all realizations are not accepted for the evaluation of the the conditional probability of failure. This is why there is a difference when you get history from the function.
I did not try to modify OT but on my opinion it would be nice to get currentPointSample and currentLevelSample.

schueller · October 30, 2023, 3:16pm

I modified subset to add accessors to the samples at each step

Topic		Replies	Views
Wilks quantile in OpenTURNS Python usage	10	964	September 19, 2021
OpenTURNS 1.25 is available Announcements	0	38	June 17, 2025
Why does the LHS class exist? Python usage reliability	8	277	November 20, 2023
Inconsistencies in OpenTURNS API Development api	0	343	December 14, 2021
How to evaluate the sampling methods Methodology sampling	7	448	March 29, 2023

SubsetSampling intermediary quantiles estimation

Related topics