Hi everyone,
I recently started using the OpenTURNS implementation of the Subset sampling algorithm to estimate rare event probabilities. As described in the documentation example, I tried to isolate the different subset samples to generate nice plots. In this process, I also recomputed the intermediary quantiles returned by the accessor “getThresholdPerStep()”. In the following minimalist example, you will notice that the estimated values are different (intermediary quantiles returned by “getThresholdPerStep” vs. recomputed quantiles on samples from the algorithm).
import openturns as ot
## Problem definition
X = ot.RandomVector(ot.Normal([0.25] * 2, [1] * 2, ot.IdentityMatrix(2)))
g = ot.SymbolicFunction(["x1", "x2"], ["20-(x1-x2)^2-8*(x1+x2-4)^3"])
g = ot.MemoizeFunction(g)
Y = ot.CompositeRandomVector(g, X)
myEvent = ot.ThresholdEvent(Y, ot.LessOrEqual(), 0.0)
## Subset sampling
algo = ot.SubsetSampling(myEvent)
algo.setKeepEventSample(True)
algo.run()
inputSampleSubset = g.getInputHistory()
outputSampleSubset = g.getOutputHistory()
result = algo.getResult()
proba = result.getProbabilityEstimate()
print("Estimated failure probability = {:.2e}".format(proba))
levels = algo.getThresholdPerStep()
N = algo.getMaximumOuterSampling() * algo.getBlockSize()
Ns = algo.getStepsNumber()
## Retrieve subset samples from Memoize function
list_subSamples = list()
list_subSamples_outputs = list()
for i in range(Ns):
list_subSamples.append(inputSampleSubset[i * N: i * N + N])
list_subSamples_outputs.append(outputSampleSubset[i * N: i * N + N])
alpha = algo.getConditionalProbability()
quantiles = [ss_output.computeQuantile(alpha)[0] for ss_output in list_subSamples_outputs]
print("Quantiles estimated on subset samples: \n", ot.Point(quantiles))
print("Quantiles from SubsetSampling accessor: \n", levels)
I understand why the two computations of the last quantile are different but not the intermediary ones. Is it coming from the use of the MemoizeFunction to retrieve the algorithm’s samples? Thanks for your help!
Elias