Hi!
I am trying to run a second-order sensitivity analysis of a gradient-boosted model from the scikit-learn package. However, OpenTURNS gives the error message shown below the following line:
model = ot.SymbolicFunction(feature_list,[GB1(feature_list)])
'GradientBoostingRegressor' object is not callable
Kindly help
with regards
Saurav
Hi,
Indeed, GradientBoosting models are not analytical functions, so they cannot be cast like that.
You need to rely on a PythonFunction. Here is an example for that purpose:
import openturns as ot
import numpy as np


class SklearnPyFunction(ot.OpenTURNSPythonFunction):
    """
    Define an OpenTURNS Function using machine learning algorithms from scikit-learn.

    Parameters
    ----------
    algo : a scikit algo
        Algo for response surface, already trained/validated
    in_dim : int
        Input dimension
    out_dim : int
        Output dimension
    """

    def __init__(self, algo, in_dim, out_dim):
        super(SklearnPyFunction, self).__init__(in_dim, out_dim)
        self.algo = algo

    def _exec(self, x):
        # Single point: scikit-learn expects a 2-d array of shape (1, in_dim)
        X = np.reshape(x, (1, -1))
        return self.algo.predict(X).ravel()

    def _exec_sample(self, x):
        # Whole sample: predict all points in one call
        X = np.array(x)
        size = len(X)
        return self.algo.predict(X).reshape(size, self.getOutputDimension())


class GradientBoosting(ot.Function):
    """
    Define an OpenTURNS Function using sklearn algorithms

    Parameters
    ----------
    algo : a scikit algo
        Algo for response surface, already trained/validated
    in_dim : int
        Input dimension
    out_dim : int
        Output dimension
    """

    def __new__(cls, algo, in_dim, out_dim):
        python_function = SklearnPyFunction(algo, in_dim, out_dim)
        return ot.Function(python_function)
As an example:
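import numpy as np
import openturns as ot
from sklearn.ensemble import GradientBoostingRegressor

size = 10
# Reference model, used only to generate training data
model = ot.SymbolicFunction("x", "(1.0 + sign(x)) * cos(x) - (sign(x) - 1) * sin(2*x)")
dataX = ot.Uniform().getSample(size)
dataY = model(dataX)

# scikit-learn expects a 2-d array of inputs and a 1-d array of targets
algo = GradientBoostingRegressor()
algo.fit(np.array(dataX), np.array(dataY).ravel())

# Wrap the trained regressor as an OpenTURNS function
f = GradientBoosting(algo, 1, 1)
print(f(dataX))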
Hope this helps
BR
Sofiane
Dear Sir
Thank you very much for your kind response. I am now able to run the sensitivity analysis. My data set has six input parameters, which follow log-normal distributions (as they cannot take negative values). The input parameters are correlated, so I am trying to run the sensitivity analysis based on ANCOVA indices. However, the analysis returns very small ANCOVA indices for all the parameters, and some of them are negative:
ANCOVA indices [-0.00427258,0.0114251,0.00705848,0.00992637,0.010744,0.00159234]
ANCOVA uncorrelated indices [0.000100539,0.000217306,0.00039698,0.00229071,0.000511348,0.000109577]
ANCOVA correlated indices [-0.00437312,0.0112077,0.0066615,0.00763566,0.0102326,0.00148276]
I am not able to interpret these results. Kindly help.
Saurav
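A note on the negative values: each ANCOVA index is a covariance divided by Var(Y), and it splits into an uncorrelated part (a variance ratio, never negative) and a correlated part (a covariance ratio, which can be negative). The sketch below is a hand-worked toy case, assuming a hypothetical additive model Y = X1 + X2 with made-up variances and covariance. It is not the gradient-boosted model above and only illustrates the decomposition.

```python
# Toy ANCOVA decomposition for Y = X1 + X2 (hypothetical model, made-up numbers).
var_x1, var_x2 = 1.0, 4.0
cov_x12 = -1.5  # negatively correlated inputs (assumption)

var_y = var_x1 + var_x2 + 2.0 * cov_x12  # Var(X1 + X2)


def ancova_indices(var_xi, cov_with_other, var_y):
    uncorrelated = var_xi / var_y        # S_i^U = Var(h_i) / Var(Y)
    correlated = cov_with_other / var_y  # S_i^C = Cov(h_i, other terms) / Var(Y)
    total = uncorrelated + correlated    # S_i = Cov(h_i, Y) / Var(Y)
    return total, uncorrelated, correlated


s1 = ancova_indices(var_x1, cov_x12, var_y)
s2 = ancova_indices(var_x2, cov_x12, var_y)
print("X1 (total, uncorrelated, correlated):", s1)
print("X2 (total, uncorrelated, correlated):", s2)
print("sum of indices:", s1[0] + s2[0])
```

In this toy case X1 gets a negative index although its uncorrelated share is positive, and the two indices still sum to 1 because the model is purely additive.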