Empirical Bernstein copula on a weighted sample?

efekhari27 · November 27, 2023, 10:33am

Hi everyone,

I was wondering if there is a way in OpenTURNS to fit an empirical Bernstein copula (EBC) using a weighted sample.

Let us consider two samples \mathbf{X}_{0, n} and \mathbf{X}_{1, n} (each in \mathbb{R}^d, of size n), independently drawn from two known distributions: \mathbf{X}_{0, n} \sim h_0 and \mathbf{X}_{1, n} \sim h_1. I would like to use both samples to fit the copula associated with the distribution h_0.

Would it be legitimate to apply importance sampling weights to the sample \mathbf{X}_{1, n} and fit an EBC on the union of the weighted \mathbf{X}_{1, n} and \mathbf{X}_{0, n}? And is there a way to use the ot.EmpiricalBernsteinCopula class to do so?

Thanks in advance for your answers!
Elias

josephmure · November 30, 2023, 8:15am

Hi Elias, for now the only way I see to do that is to repeat the points with larger weights in the sample. That is because the EmpiricalBernsteinCopula constructor requires a Sample, whereas a weighted sample would be represented in the library by a WeightedExperiment.
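
A minimal sketch of this repetition workaround, under stated assumptions: the helper `replicate_by_weight`, the toy points and weights, `target_size` and the bin number are all hypothetical and only serve as an illustration. Only `ot.Sample` and `ot.EmpiricalBernsteinCopula` come from the library, and the OpenTURNS step is skipped when the library is not installed.

```python
def replicate_by_weight(points, weights, target_size):
    """Repeat each point roughly in proportion to its weight.

    points      : list of d-dimensional points (lists of floats)
    weights     : nonnegative weights, one per point
    target_size : approximate size of the replicated sample
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    replicated = []
    for point, weight in zip(points, weights):
        # Rounding discretizes the weights: a larger target_size reduces
        # the discretization error but increases the sample size.
        count = int(round(target_size * weight / total))
        replicated.extend(list(point) for _ in range(count))
    return replicated


# Toy data (hypothetical): three 2-d points, the second one weighs twice as much.
points = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.5]]
weights = [1.0, 2.0, 1.0]
replicated = replicate_by_weight(points, weights, target_size=8)

try:
    import openturns as ot
except ImportError:
    ot = None  # the replication step above does not need OpenTURNS

if ot is not None:
    bin_number = 2  # hypothetical choice, to be tuned on real data
    # The replicated points are passed as an ordinary (unweighted) Sample.
    copula = ot.EmpiricalBernsteinCopula(ot.Sample(replicated), bin_number)
```

The replicated sample grows with the resolution needed to represent the weights, so both memory use and evaluation cost increase as `target_size` increases.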

efekhari27 · December 1, 2023, 11:06am

Hi Joseph,

Thanks for your answer. Repeating points works perfectly to emulate weights, even if it’s probably not optimal numerically.

Best,
Elias

regislebrun · December 5, 2023, 11:34am

In addition to Joseph’s excellent answer (repeating points according to their weights), I would like to confirm that the approach you describe is perfectly sound. With w_i=h_0(\mathbf{X}_{1,n}^i)/h_1(\mathbf{X}_{1,n}^i), the weighted sample (w,\mathbf{X}_{1,n}) is distributed according to h_0.
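
As an illustration of these weights, here is a small pure-Python sketch. The two densities are hypothetical stand-ins for h_0 and h_1 (independent unit-variance normals centered at 0 and at 0.5), and the sample size and seed are arbitrary. It computes w_i = h_0(x_i)/h_1(x_i) on a sample drawn from h_1 and checks that a weighted average recovers a quantity known under h_0.

```python
import math
import random


def normal_pdf(x, mean):
    """Density of a unit-variance normal distribution centered at `mean`."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def h0_pdf(point):
    # Hypothetical target density h_0: independent standard normals.
    return math.prod(normal_pdf(x, 0.0) for x in point)


def h1_pdf(point):
    # Hypothetical sampling density h_1: independent normals centered at 0.5.
    return math.prod(normal_pdf(x, 0.5) for x in point)


random.seed(0)
n, d = 20000, 2
# Sample drawn from h_1, playing the role of X_{1,n}.
sample_1 = [[random.gauss(0.5, 1.0) for _ in range(d)] for _ in range(n)]

# Importance weights w_i = h_0(x_i) / h_1(x_i).
weights = [h0_pdf(p) / h1_pdf(p) for p in sample_1]

# Self-normalized weighted mean of the first component: it should be
# close to 0, the mean under h_0, rather than 0.5, the mean under h_1.
weighted_mean = sum(w * p[0] for w, p in zip(weights, sample_1)) / sum(weights)
print(weighted_mean)
```

In the setting of the question, the points of \mathbf{X}_{0, n} would simply carry a weight of 1, since they are already drawn from h_0.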

Unfortunately, we don’t have the concept of a weighted sample in OT yet. We handle the sample and the weights separately, as produced e.g. by a WeightedExperiment (which is indeed a weighted sample generator).

The current implementation of EmpiricalBernsteinCopula relies heavily on uniform weights. It could be adapted to nonuniform weights (and BernsteinCopulaFactory too), but the cost in terms of performance (sampling, PDF/CDF computation) would probably be significant. This can be added to the wish list on GitHub with a short description of the context, in particular the dimension and the size of the samples you want to use.
