TreeFeatureSelectionTransform
- class TreeFeatureSelectionTransform(model: Literal['catboost'] | Literal['random_forest'] | DecisionTreeRegressor | ExtraTreeRegressor | RandomForestRegressor | ExtraTreesRegressor | GradientBoostingRegressor | CatBoostRegressor, top_k: int, features_to_use: List[str] | Literal['all'] = 'all', return_features: bool = False)
Bases: BaseFeatureSelectionTransform
Transform that selects features according to the feature importance reported by a tree-based model.
Notes
The transform works with any type of feature; however, most models work only with regressors. It is therefore recommended to pass regressors to feature selection transforms.
Init TreeFeatureSelectionTransform.
- Parameters:
model (Literal['catboost'] | Literal['random_forest'] | DecisionTreeRegressor | ExtraTreeRegressor | RandomForestRegressor | ExtraTreesRegressor | GradientBoostingRegressor | CatBoostRegressor) –
Model used to make the selection; it should have a feature_importances_ property (e.g. all tree-based regressors in sklearn).
If catboost.CatBoostRegressor is given with no cat_features parameter, then cat_features are set during fit to be equal to the columns of category type.
Pre-defined options are also available:
- catboost: catboost.CatBoostRegressor(iterations=1000, silent=True);
- random_forest: sklearn.ensemble.RandomForestRegressor(n_estimators=100, random_state=0).
top_k (int) – number of features to select; if fewer features are available, all of them are selected
features_to_use (List[str] | Literal['all']) – columns of the dataset to select from; if “all” value is given, all columns are used
return_features (bool) – indicates whether to return features or not.
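The interaction between top_k and features_to_use described above can be illustrated with a small standalone sketch. It is not the library's implementation: the select_top_k helper, the guard on negative top_k, and the feature names and importance values are hypothetical, chosen only to show the "take the k most important of the allowed columns, or all of them if there are fewer than k" behaviour.

```python
from typing import Dict, List, Literal, Union


def select_top_k(
    importances: Dict[str, float],
    top_k: int,
    features_to_use: Union[List[str], Literal["all"]] = "all",
) -> List[str]:
    """Return up to top_k feature names, ordered by descending importance.

    Hypothetical helper for illustration; not part of etna.
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    # Restrict the candidates to the requested columns; "all" keeps every feature.
    if features_to_use == "all":
        candidates = dict(importances)
    else:
        candidates = {
            name: importances[name] for name in features_to_use if name in importances
        }
    # Rank by importance, highest first.
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    # Slicing past the end is safe: with too few candidates, all are returned.
    return ranked[:top_k]


# Made-up importances, as a fitted tree model might report them.
importances = {"lag_7": 0.50, "lag_14": 0.30, "holiday_flag": 0.15, "month": 0.05}

print(select_top_k(importances, top_k=2))
# ['lag_7', 'lag_14']
print(select_top_k(importances, top_k=10, features_to_use=["month", "lag_14"]))
# ['lag_14', 'month']
```

In the second call top_k exceeds the number of candidate columns, so both candidates are kept.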
Methods
fit(ts) – Fit the transform.
fit_transform(ts) – Fit and transform TSDataset.
get_regressors_info() – Return the list with regressors created by the transform.
inverse_transform(ts) – Inverse transform TSDataset.
load(path) – Load an object.
params_to_tune() – Get default grid for tuning hyperparameters.
save(path) – Save the object.
set_params(**params) – Return new object instance with modified parameters.
to_dict() – Collect all information about etna object in dict.
transform(ts) – Transform TSDataset inplace.
Attributes
This class stores its __init__ parameters as attributes.
- fit(ts: TSDataset) → Transform
Fit the transform.
- Parameters:
ts (TSDataset) – Dataset to fit the transform on.
- Returns:
The fitted transform instance.
- Return type:
Transform
- fit_transform(ts: TSDataset) → TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- inverse_transform(ts: TSDataset) → TSDataset
Inverse transform TSDataset.
Apply the _inverse_transform method.
- classmethod load(path: Path) → Self
Load an object.
- Parameters:
path (Path) – Path to load object from.
- Returns:
Loaded object.
- Return type:
Self
- params_to_tune() → Dict[str, BaseDistribution]
Get default grid for tuning hyperparameters.
This grid tunes parameters: model, top_k. Other parameters are expected to be set by the user.
For the model parameter, only pre-defined options are suggested. For the top_k parameter, the maximum suggested value is not greater than self.top_k.
- Returns:
Grid to tune.
- Return type:
Dict[str, BaseDistribution]
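As a rough illustration of the shape such a grid might take, the sketch below builds a dictionary with one categorical entry for model and one bounded integer entry for top_k. The CategoricalChoice and IntRange classes and the default_grid function are hypothetical stand-ins, not etna or optuna objects, and the lower bound chosen for top_k is an assumption; only the two pre-defined model names and the upper bound of self.top_k come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class CategoricalChoice:
    """Hypothetical stand-in for a categorical search distribution."""

    choices: Tuple[str, ...]


@dataclass(frozen=True)
class IntRange:
    """Hypothetical stand-in for an integer search distribution (inclusive)."""

    low: int
    high: int


def default_grid(top_k: int) -> Dict[str, Union[CategoricalChoice, IntRange]]:
    """Sketch of a tuning grid over the two tuned parameters."""
    return {
        # Only the pre-defined string options are suggested for the model.
        "model": CategoricalChoice(("catboost", "random_forest")),
        # The upper bound never exceeds the configured top_k;
        # the lower bound of 1 (or 0 when top_k is 0) is an assumption.
        "top_k": IntRange(low=min(1, top_k), high=top_k),
    }


grid = default_grid(top_k=5)
print(grid["model"].choices)
print(grid["top_k"])
```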
- set_params(**params: dict) → Self
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters:
**params (dict) – Estimator parameters
- Returns:
New instance with changed parameters
- Return type:
Self
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
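The dotted-path convention used by set_params can be pictured with a plain-Python sketch that applies keys such as "transforms.0.value" to a nested structure of dicts and lists and returns a modified copy. This is only an analogy under simplifying assumptions: the set_nested and _as_key helpers and the config layout are hypothetical, and the library operates on object parameters rather than on raw dictionaries.

```python
import copy
from typing import Any, Dict


def _as_key(container: Any, part: str) -> Any:
    # Lists are indexed by position, so "0" becomes 0; dicts keep string keys.
    return int(part) if isinstance(container, list) else part


def set_nested(config: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Return a modified deep copy of config; the input is left untouched.

    Hypothetical helper for illustration; not part of etna.
    """
    result = copy.deepcopy(config)
    for dotted_key, value in updates.items():
        # Components are separated by dots; the last one names the parameter.
        *parents, last = dotted_key.split(".")
        node: Any = result
        for part in parents:
            node = node[_as_key(node, part)]
        node[_as_key(node, last)] = value
    return result


config = {
    "model": {"lag": 1},
    "transforms": [{"in_column": "target", "value": 1}],
    "horizon": 3,
}
updated = set_nested(config, {"model.lag": 3, "transforms.0.value": 2})

print(updated["model"]["lag"])            # 3
print(updated["transforms"][0]["value"])  # 2
print(config["model"]["lag"])             # 1, the original is unchanged
```

Returning a copy rather than mutating in place mirrors the documented behaviour of returning a new object instance.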