hello,
This might be a naive question, but I was surprised to find that the KS test in OT does not behave "deterministically": for a given sample and a given theoretical DistributionFactory object, the resulting p-value is sometimes large enough and sometimes very close to 0.
Am I missing something here, or is it just some numerical approximation issue?
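Here is roughly the kind of call I mean (a minimal sketch with a synthetic sample; the exact name of the factory-based test may depend on the OpenTURNS version, older releases exposed it through a factory overload of FittingTest.Kolmogorov):

```python
import openturns as ot

# Synthetic sample, used only to illustrate the behaviour.
sample = ot.Normal().getSample(50)
factory = ot.NormalFactory()

# Same sample, same factory, yet the p-value changes from one call to the next.
for _ in range(2):
    fitted, result = ot.FittingTest.Lilliefors(sample, factory)
    print(result.getPValue())
```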
Hi,
You will find a lot of information here on this topic. To make a long story short, when some parameters are estimated from the sample rather than known in advance, the exact p-value is itself estimated using a Monte Carlo method. This was the idea of Lilliefors for the normal distribution, and it has been extended (and implemented this way) e.g. in the Matlab Statistics Toolbox.
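A minimal sketch of the consequence, assuming the recent API where the factory-based test is FittingTest.Lilliefors: since the p-value is a Monte Carlo estimate, fixing the seed of the global random generator makes it reproducible.

```python
import openturns as ot

sample = ot.Normal().getSample(50)
factory = ot.NormalFactory()

# The p-value is a Monte Carlo estimate, so fixing the seed of the global
# random generator makes two calls return exactly the same value.
ot.RandomGenerator.SetSeed(0)
fitted1, result1 = ot.FittingTest.Lilliefors(sample, factory)

ot.RandomGenerator.SetSeed(0)
fitted2, result2 = ot.FittingTest.Lilliefors(sample, factory)

print(result1.getPValue(), result2.getPValue())  # identical
```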
Please do not feel sorry about asking questions, even ones that have already been answered: the forum is designed for exactly this purpose. What is unwanted is unkind messages, a category into which your message does not fall!
There is a series of examples in the doc that presents the topic and the issue arising when the parameters are estimated.
PS
Notice that there is a third case which is not presented in the doc: the parameters are estimated from the sample, but the user wrongly uses the Kolmogorov class:
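A sketch of that misuse, with a synthetic sample and the usual factory/test calls (names as in recent OpenTURNS versions):

```python
import openturns as ot

sample = ot.Normal().getSample(50)

# The parameters are estimated from the very sample under test...
fitted = ot.NormalFactory().build(sample)

# ...but the plain Kolmogorov test is then applied as if they were known a
# priori: the resulting p-value is too optimistic (biased upwards).
result = ot.FittingTest.Kolmogorov(sample, fitted)
print(result.getPValue())
```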
This wrong usage could nevertheless be used to improve the speed, as presented in https://github.com/openturns/openturns/issues/1061: denote by p1 the p-value evaluated assuming that the parameters are known; p1 is fast to compute. Denote by p2 the p-value assuming that the parameters are estimated. We always have p2 < p1.
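As a rough illustration of that screening idea (a hypothetical helper, not an OpenTURNS function; it relies only on the inequality p2 < p1 stated above):

```python
import openturns as ot

def screened_gof_test(sample, factory, alpha=0.05):
    """Hypothetical helper: reject early with the cheap p1, and run the
    expensive Monte Carlo test only when the cheap test does not reject."""
    fitted = factory.build(sample)
    p1 = ot.FittingTest.Kolmogorov(sample, fitted).getPValue()  # fast, optimistic
    if p1 < alpha:
        # Since p2 < p1, the Monte Carlo estimate would also lead to rejection.
        return False, p1
    _, result = ot.FittingTest.Lilliefors(sample, factory)  # slow, unbiased
    p2 = result.getPValue()
    return p2 >= alpha, p2

accepted, p = screened_gof_test(ot.Normal().getSample(100), ot.NormalFactory())
print(accepted, p)
```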