Hello,
I am taking the liberty of asking a question in this forum because I want to use the kernel smoothing method to build a generic PDF for my experimental data. I used this statement to create the distribution:
But doing this gives me a warning that I never had when I used the KDE method from SciPy: WRN - Warning! The distribution number 1023 has a too small weight=0 for a relative threshold equal to Mixture-SmallWeight=1e-12 with respect to the maximum weight=13.0606. It is removed from the collection.
I don't really understand it, and it does this for 1023 points.
I hope I'll find some help here.
Thank you very much.
PS: I did not know which language to write in, so if you want to answer in French, I understand.
Hi @WinterSpark and sorry for the late answer. This warning is probably not important, but maybe you can tell us a little more about your dataset. What is its dimension? Are there repeated points in the dataset? If you could share it, we could perhaps try to reproduce the warning.
This is produced because the KernelSmoothing.build() method produces a Mixture whose atoms are KernelMixture distributions. The number of atoms is equal to the number of bins.
I assume that some kernels in the mixture have a very small weight. This might be because the sample size is so large that the contribution of some bins to the value of the PDF at some point x is negligible.
@MichaelBaudin and @josephmure are right. The message is produced by the above class (Mixture) and is not important (it is just a warning).
The weights of the kernel smoothing function are evaluated and passed to a Mixture distribution, which keeps only significant coefficients (> 1e-12 relative to the maximum weight). In your example, one weight was smaller than the threshold and was not kept in the collection, so you have a collection of 1023 atoms instead of 1024.
If you don't want OpenTURNS to drop the coefficients, use ot.ResourceMap.SetAsScalar("Mixture-SmallWeight", 0.0) for example before performing the kernel smoothing.
I have also noticed that kernel smoothing on samples with size above 1000 tends to struggle in OpenTURNS. For example, something as simple as this does not converge on my machine:
>>> import openturns as ot
>>> sample = ot.WeibullMin(1.3, 1.2).getSample(2000)
>>> ks = ot.KernelSmoothing()
>>> ks_weibull = ks.build(sample)
By increasing the value of the following default setting, the algorithm converges: