# deeppavlov.models.sklearn

class deeppavlov.models.sklearn.sklearn_component.SklearnComponent(model_class: str, save_path: Union[str, pathlib.Path] = None, load_path: Union[str, pathlib.Path] = None, infer_method: str = 'predict', ensure_list_output: bool = False, **kwargs)

Wrapper class for sklearn components used for feature extraction, feature selection, classification, regression, etc.

Parameters
• model_class – string with full name of sklearn model to use, e.g. sklearn.linear_model:LogisticRegression

• save_path – path to save the model to, given either as a full name (model_path/model.pkl) or as a prefix (model_path/model); in both cases the model is saved to model_path/model.pkl

• load_path – path to load the model from, given either as a full name (model_path/model.pkl) or as a prefix (model_path/model); in both cases the model is loaded from model_path/model.pkl

• infer_method – name of the model method to use for inference, e.g. predict, predict_proba, predict_log_proba, transform

• ensure_list_output – whether to ensure that output for each sample is iterable (but not string)

• kwargs – dictionary with parameters for the sklearn model
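The `module:ClassName` form of `model_class` can be resolved to a class with the standard library alone. The sketch below only illustrates that idea and is not DeepPavlov's own loader: `resolve_class` is a hypothetical helper, and a standard-library class stands in for an sklearn estimator so that the snippet runs without extra dependencies.

```python
import importlib


def resolve_class(full_name: str) -> type:
    """Resolve a 'package.module:ClassName' string to the class object.

    Hypothetical helper, for illustration only.
    """
    module_name, _, class_name = full_name.partition(":")
    if not module_name or not class_name:
        raise ValueError(f"expected 'module:ClassName', got {full_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A stdlib class stands in for e.g. "sklearn.linear_model:LogisticRegression".
counter_cls = resolve_class("collections:Counter")

# Extra keyword arguments (the `kwargs` parameter) would go to the constructor.
instance = counter_cls(a=2)
```

With sklearn installed, the same helper would accept `"sklearn.linear_model:LogisticRegression"`, and the remaining `kwargs` would play the role of the estimator's constructor arguments.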

model

sklearn model instance

model_class

string with full name of sklearn model to use, e.g. sklearn.linear_model:LogisticRegression

model_params

dictionary with parameters for the sklearn model without pipe parameters

pipe_params

dictionary with parameters for pipe: in, out, fit_on, main, name

save_path

path to save the model to, given either as a full name (model_path/model.pkl) or as a prefix (model_path/model); in both cases the model is saved to model_path/model.pkl

load_path

path to load the model from, given either as a full name (model_path/model.pkl) or as a prefix (model_path/model); in both cases the model is loaded from model_path/model.pkl

infer_method

name of the model method to use for inference, e.g. predict, predict_proba, predict_log_proba, transform

ensure_list_output

whether to ensure that output for each sample is iterable (but not string)
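The split between `pipe_params` and `model_params` can be pictured as a simple partition of the keyword arguments by key. The following is a minimal sketch under that assumption; `split_params` and `PIPE_KEYS` are hypothetical names, and the real component may filter the model parameters further, for instance against the estimator's constructor signature.

```python
from typing import Any, Dict, Tuple

# Pipe-level keys named in the attribute description above.
PIPE_KEYS = ("in", "out", "fit_on", "main", "name")


def split_params(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split keyword arguments into (pipe_params, model_params).

    Hypothetical helper, for illustration only.
    """
    pipe_params = {k: v for k, v in kwargs.items() if k in PIPE_KEYS}
    model_params = {k: v for k, v in kwargs.items() if k not in PIPE_KEYS}
    return pipe_params, model_params


pipe_params, model_params = split_params(
    {"in": ["x"], "out": ["y_pred"], "fit_on": ["x", "y"], "C": 10, "penalty": "l2"}
)
```

Here `C` and `penalty` end up in `model_params`, while `in`, `out` and `fit_on` end up in `pipe_params`.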

__call__(*args)

Run inference on the given data using the infer method specified in the config, e.g. "predict", "predict_proba", "transform"

Parameters

*args – list of inputs

Returns

predictions, e.g. list of labels, array of probability distribution, sparse array of vectorized samples
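One way to read the combination of `infer_method` and `ensure_list_output` is: look up the method by name, call it, and optionally wrap scalar per-sample outputs in a list. The sketch below assumes exactly that and nothing more; `ToyModel` and `infer` are hypothetical stand-ins, not part of DeepPavlov or sklearn.

```python
from typing import Any, List, Sequence


class ToyModel:
    """Hypothetical stand-in for a fitted sklearn estimator."""

    def predict(self, x: Sequence[float]) -> List[int]:
        return [int(v > 0) for v in x]

    def predict_proba(self, x: Sequence[float]) -> List[List[float]]:
        return [[0.0, 1.0] if v > 0 else [1.0, 0.0] for v in x]


def infer(model: Any, x: Sequence[float], infer_method: str = "predict",
          ensure_list_output: bool = False) -> list:
    """Call the method named by `infer_method`; optionally wrap scalar outputs."""
    predictions = list(getattr(model, infer_method)(x))
    if ensure_list_output:
        # Wrap each per-sample output that is not iterable (or is a string).
        predictions = [
            p if hasattr(p, "__iter__") and not isinstance(p, str) else [p]
            for p in predictions
        ]
    return predictions


labels = infer(ToyModel(), [-1.0, 2.0])
wrapped = infer(ToyModel(), [-1.0, 2.0], ensure_list_output=True)
probs = infer(ToyModel(), [-1.0, 2.0], infer_method="predict_proba")
```

In this sketch `labels` is a flat list of class labels, `wrapped` holds one single-element list per sample, and `probs` is left unchanged by the wrapping rule because each row is already iterable.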

fit(*args) → None

Fit model on the given data

Parameters

*args – list of x-inputs and, optionally, one y-input (the last one) to fit on. Possible inputs are (x0, …, xK, y) or (x0, …, xK), where K is the number of input data elements (the length of the in list from the config). If there are several inputs (K > 1), the input features are stacked. For example, given x0: (n_samples, n_features0), …, xK: (n_samples, n_featuresK), the model is trained on x: (n_samples, n_features0 + … + n_featuresK).
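The shape arithmetic of feature stacking can be checked with plain NumPy. This is only an illustration of column-wise stacking for dense inputs, with made-up array sizes; it does not call the component itself.

```python
import numpy as np

n_samples = 4
x0 = np.ones((n_samples, 3))   # (n_samples, n_features0)
x1 = np.zeros((n_samples, 2))  # (n_samples, n_features1)

# Column-wise stacking keeps the sample axis and concatenates the feature axes.
x = np.hstack([x0, x1])
```

The stacked matrix keeps `n_samples` rows, and its number of columns is the sum of the feature counts of the inputs.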

Returns

None

init_from_scratch() → None

Initialize self.model as a new sklearn model with the parameters given in self.model_params.

Returns

None

load(fname: str = None) → None

Initialize self.model as an sklearn model loaded from a saved file, re-initializing the parameters from self.model_params. If warm_start is set to True in the newly given parameters and the model accepts a warm_start parameter, the model is initialized from the saved one and fitting can be continued.

Parameters

fname – string name of path to model to load from

Returns

None

save(fname: str = None) → None

Save self.model to the file given by fname or, if fname is not given, to self.save_path. If the path does not have a .pkl extension, the file name is replaced with str(Path(self.save_path).stem) + ".pkl".

Parameters

fname – string name of path to model to save to

Returns

None
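The path handling and the save/load round trip can be sketched with the standard library. The example assumes that the model is serialized with `pickle` and that the extension is normalized with `Path.with_suffix`, which keeps the parent directory and so matches the `model_path/model` to `model_path/model.pkl` example given for `save_path`. `normalize_model_path` is a hypothetical helper, and a plain dict stands in for a fitted estimator.

```python
import pickle
import tempfile
from pathlib import Path


def normalize_model_path(path) -> Path:
    """Return `path` with a .pkl extension (hypothetical helper)."""
    return Path(path).with_suffix(".pkl")


model = {"coef": [0.5, -1.2]}  # stand-in for a fitted sklearn estimator

with tempfile.TemporaryDirectory() as tmp_dir:
    # A prefix without extension, as in "model_path/model".
    target = normalize_model_path(Path(tmp_dir) / "model")
    with open(target, "wb") as f:
        pickle.dump(model, f)
    with open(target, "rb") as f:
        restored = pickle.load(f)
    saved_name = target.name
```

Passing a path that already ends in `.pkl` through `normalize_model_path` leaves it unchanged, so the full-name and prefix forms resolve to the same file.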

static compose_input_data(x: List[Union[Tuple[Union[numpy.ndarray, list, scipy.sparse.base.spmatrix, str]], List[Union[numpy.ndarray, list, scipy.sparse.base.spmatrix, str]], numpy.ndarray, scipy.sparse.base.spmatrix]]) → Union[scipy.sparse.base.spmatrix, numpy.ndarray]

Stack the given list of inputs of different types into a single matrix. If any of the inputs is a sparse matrix, the output is also a sparse matrix.

Parameters

x – list of data elements

Returns

sparse or dense array of stacked data
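The rule that one sparse input makes the whole result sparse can be illustrated with SciPy. This is a sketch of the general technique, not the method's actual code: it converts every dense part to CSR explicitly before stacking, which is one of several reasonable ways to do it.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack, issparse

dense = np.array([[1.0, 2.0], [3.0, 4.0]])               # (2, 2) dense features
sparse = csr_matrix([[0.0, 5.0, 0.0], [6.0, 0.0, 0.0]])  # (2, 3) sparse features

inputs = [dense, sparse]
if any(issparse(part) for part in inputs):
    # Convert dense parts so that every block is sparse, then stack column-wise.
    parts = [part if issparse(part) else csr_matrix(part) for part in inputs]
    stacked = hstack(parts, format="csr")
else:
    stacked = np.hstack(inputs)
```

Because one of the inputs is sparse, `stacked` is a sparse matrix with the columns of both inputs side by side; with only dense inputs the `else` branch would return a regular NumPy array.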

static get_class_attributes(cls: type) → List[str]

Get list of names of given class’ attributes

Parameters

cls – class

Returns

list of names of given class’ attributes

static get_function_params(f: Callable) → List[str]

Get list of names of given function’s parameters

Parameters

f – function

Returns

list of names of given function’s parameters
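Listing a function's parameter names is something the standard `inspect` module supports directly. The sketch below uses `inspect.signature` as one possible approach; `function_param_names` is a hypothetical helper, and the actual method may treat `*args` and `**kwargs` differently.

```python
import inspect
from typing import Callable, List


def function_param_names(f: Callable) -> List[str]:
    """List the parameter names of `f` in declaration order (hypothetical helper)."""
    return list(inspect.signature(f).parameters)


def example(x, y=1, *args, flag=False, **kwargs):
    return x


names = function_param_names(example)
```

A list like this can be used, for example, to keep only those keyword arguments that a given constructor or method actually accepts.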