# Quality of student parameter estimation

Dear developers,

I was trying to improve on my estimation of a Student distribution from scipy.stats, but found that my OpenTURNS results were much worse. So much worse that I fear I made a huge mistake in how I use OpenTURNS.
Yet the SciPy estimator gives the right answer on a Student sample generated by OpenTURNS.
So I am perplexed…
Any ideas?

Thanks

Here is my code:

import openturns as ot
from scipy import stats

Nbre_mesures = 327680
ecart_type = 1
df = 1.5

# Test OpenTURNS: generate the sample with SciPy, then estimate
# the parameters with both SciPy and OpenTURNS
student = stats.t.rvs(df, scale=ecart_type, size=Nbre_mesures)

sample = ot.Sample.BuildFromPoint(student)
distribution = ot.StudentFactory().build(sample)
print(stats.t.fit(student), distribution.getParameter())

# Reverse case: generate the sample with OpenTURNS, then estimate with both
sample = ot.Student(df, 0.0, ecart_type).getSample(Nbre_mesures)
distribution = ot.StudentFactory().buildAsStudent(sample)
print(stats.t.fit(sample), distribution.getParameter())


and I got:

WRN - TNC went to an abnormal point=[nan]

WRN - Warning! As nu <= 2, the covariance of the distribution will not be defined
(1.5033992737466937, 0.0016595840358914786, 0.9979708660554066) [2.00211,-0.0396894,1.11723]
(1.4966359121213684, -0.0023588844106721698, 0.9989014331961306) [2.01209,-0.00297962,1.12155]



The OpenTURNS estimate of the degrees of freedom is very bad.

Hi,

Unfortunately, the StudentFactory class does not work well for the heaviest tails, that is, for $\nu \leq 2$ here, because it depends on the existence of the variance (which is finite only for $\nu > 2$) in a (slightly) obfuscated way. If you replace df=1.5 with df=2.5 you get the correct result, but with df=2.05, even though it is greater than 2, the results already start to look strange. You may have better success with the MaximumLikelihoodFactory class, but you will have to set the optimization bounds manually, which is not very satisfactory.
Would you be kind enough to open an issue on GitHub?

Cheers

Régis