deeppavlov.models.classifiers

class deeppavlov.models.classifiers.torch_classification_model.TorchTextClassificationModel(n_classes: int, model_name: str, embedding_size: Optional[int] = None, multilabel: bool = False, criterion: str = 'CrossEntropyLoss', optimizer: str = 'AdamW', optimizer_parameters: dict = {'lr': 0.1}, lr_scheduler: Optional[str] = None, lr_scheduler_parameters: dict = {}, embedded_tokens: bool = True, vocab_size: Optional[int] = None, lr_decay_every_n_epochs: Optional[int] = None, learning_rate_drop_patience: Optional[int] = None, learning_rate_drop_div: Optional[float] = None, return_probas: bool = True, **kwargs)[source]

Implements a PyTorch model for text classification. Input can be either embedded tokenized texts or indices of words in the vocabulary. The number of tokens is not fixed, but the samples in a batch should be padded to the same length (e.g. that of the longest sample).
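The padding requirement can be illustrated with a minimal sketch for the index-based input mode (embedded_tokens=False). The helper name pad_batch and the pad index 0 are assumptions made for illustration; they are not part of the DeepPavlov API.

```python
from typing import List


def pad_batch(batch: List[List[int]], pad_index: int = 0) -> List[List[int]]:
    """Right-pad every sample with pad_index up to the longest sample in the batch."""
    max_len = max(len(sample) for sample in batch)
    return [sample + [pad_index] * (max_len - len(sample)) for sample in batch]


# Three tokenized samples of different lengths (word indices in a vocabulary).
batch = [[4, 8, 15], [16, 23], [42]]
padded = pad_batch(batch)
```

After padding, every row has the same length, so the batch can be stacked into a single rectangular array or tensor.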

Parameters
  • n_classes – number of classes

  • model_name – name of the TorchTextClassificationModel method that initializes the model architecture

  • embedding_size – size of vector representation of words

  • multilabel – is multi-label classification (if so, sigmoid activation will be used, otherwise, softmax)

  • criterion – criterion name from torch.nn

  • optimizer – optimizer name from torch.optim

  • optimizer_parameters – dictionary with the optimizer's parameters, e.g. {'lr': 0.1, 'weight_decay': 0.001, 'momentum': 0.9}

  • lr_scheduler – string name of scheduler class from torch.optim.lr_scheduler

  • lr_scheduler_parameters – parameters for scheduler

  • embedded_tokens – True if the input contains embedded tokenized texts; False if the input contains indices of words in the vocabulary

  • vocab_size – vocabulary size; used when embedded_tokens=False, in which case the embedding is a layer of the network

  • lr_decay_every_n_epochs – number of epochs between learning rate decays

  • learning_rate_drop_patience – how many validations with no improvements to wait

  • learning_rate_drop_div – the divider of the learning rate after learning_rate_drop_patience unsuccessful validations

  • return_probas – whether to return probabilities or class indices (only for multilabel=False)
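The multilabel parameter switches the output activation between softmax and sigmoid. The following sketch shows the difference on plain Python lists; the function names are illustrative and the model itself applies the activation to tensors.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Mutually exclusive classes: outputs are positive and sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits: List[float]) -> List[float]:
    """Independent labels: each output lies in (0, 1) on its own."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


logits = [2.0, 0.0, -2.0]
single_label = softmax(logits)  # corresponds to multilabel=False
multi_label = sigmoid(logits)   # corresponds to multilabel=True
```

With softmax the scores compete with each other, so exactly one class is favoured. With sigmoid several labels can exceed a threshold at the same time, which is what multi-label classification needs.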

opt

dictionary with all model parameters

n_classes

number of considered classes

model

torch model itself

epochs_done

number of epochs that were done

optimizer

torch optimizer instance

criterion

torch criterion instance

__call__(texts: List[numpy.ndarray], *args) → Union[List[List[float]], List[int]][source]

Infer on the given data.

Parameters
  • texts – list of tokenized text samples

  • labels – labels

  • *args – additional arguments

Returns

for each sentence, a vector of probabilities of belonging to each class or the label (class index) the sentence belongs to
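The two return shapes in the signature follow from the return_probas constructor parameter. A minimal sketch of that post-processing step, assuming the per-class probabilities are already computed (the function name postprocess is hypothetical):

```python
from typing import List, Union


def postprocess(
    probas: List[List[float]], return_probas: bool
) -> Union[List[List[float]], List[int]]:
    """Return the probabilities unchanged, or the index of the most probable class."""
    if return_probas:
        return probas
    return [max(range(len(row)), key=row.__getitem__) for row in probas]


probas = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
as_probas = postprocess(probas, return_probas=True)
as_indices = postprocess(probas, return_probas=False)
```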

cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, dropout_rate: float = 0.0, **kwargs) → torch.nn.Module[source]

Build un-compiled model of shallow-and-wide CNN.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • dropout_rate – dropout rate applied after convolutions and between dense layers.

  • kwargs – other parameters

Returns

instance of torch.nn.Module

Return type

torch.nn.Module
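The idea behind a shallow-and-wide CNN is that convolutions with different kernel sizes read the same input in parallel and their pooled outputs are concatenated. The sketch below is a pure-Python illustration with one filter per kernel size. Global max pooling, the ReLU activation and all names are assumptions for illustration, and the real model additionally has filters_cnn filters per size plus dense layers.

```python
from typing import List

Vector = List[float]


def conv1d_max(tokens: List[Vector], kernel: List[Vector]) -> float:
    """Slide one filter over the token axis, apply ReLU, and max-pool over time."""
    k = len(kernel)
    responses = []
    for start in range(len(tokens) - k + 1):
        window = tokens[start:start + k]
        value = sum(
            w * x
            for w_vec, x_vec in zip(kernel, window)
            for w, x in zip(w_vec, x_vec)
        )
        responses.append(max(0.0, value))  # ReLU
    return max(responses)


def shallow_and_wide(tokens: List[Vector], kernels: List[List[Vector]]) -> Vector:
    """Apply all filters in parallel to the same input and concatenate the pooled outputs."""
    return [conv1d_max(tokens, kernel) for kernel in kernels]


# Four tokens with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter per kernel size (sizes 1, 2 and 3); all weights are 1 for readability.
kernels = [[[1.0, 1.0]] * size for size in (1, 2, 3)]
features = shallow_and_wide(tokens, kernels)
```

The sketch assumes that every text has at least as many tokens as the largest kernel size, which padding normally guarantees.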

process_event(event_name: str, data: dict)[source]

Process an event sent after an epoch or a batch.

Parameters
  • event_name – whether the event is sent after an epoch or after a batch. Possible values: "after_epoch", "after_batch"

  • data – event data (dictionary)

Returns

None
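The constructor parameters learning_rate_drop_patience and learning_rate_drop_div describe a patience-based learning rate schedule. The sketch below shows one plausible reading of that rule. The class name, the "higher metric is better" convention and the idea that such a rule is triggered from event processing are all assumptions, not a description of the library's implementation.

```python
class PatienceLrDrop:
    """Divide the learning rate after `patience` validations without improvement."""

    def __init__(self, lr: float, patience: int, div: float) -> None:
        self.lr = lr
        self.patience = patience
        self.div = div
        self.best = float("-inf")
        self.bad_validations = 0

    def step(self, metric: float) -> float:
        """Register one validation result (higher is better) and return the current lr."""
        if metric > self.best:
            self.best = metric
            self.bad_validations = 0
        else:
            self.bad_validations += 1
            if self.bad_validations >= self.patience:
                self.lr /= self.div
                self.bad_validations = 0
        return self.lr


schedule = PatienceLrDrop(lr=0.1, patience=2, div=10.0)
history = [schedule.step(m) for m in (0.70, 0.72, 0.71, 0.72, 0.73)]
```

In this run the third and fourth validations do not beat the best score of 0.72, so the learning rate is divided by 10 at the fourth validation and stays there.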

train_on_batch(texts: List[List[numpy.ndarray]], labels: list) → Union[float, List[float]][source]

Train the model on the given batch.

Parameters
  • texts – vectorized texts

  • labels – list of labels

Returns

metrics values on the given batch

class deeppavlov.models.classifiers.keras_classification_model.KerasClassificationModel(*args, **kwargs)[source]

Implements a Keras model for multi-class and multi-label classification.

Parameters
  • embedding_size – embedding_size from embedder in pipeline

  • n_classes – number of considered classes

  • model_name – particular method of this class to initialize model configuration

  • optimizer – function name from keras.optimizers

  • loss – function name from keras.losses.

  • last_layer_activation – parameter that determines activation function after classification layer. For multi-label classification use sigmoid, otherwise, softmax.

  • restore_lr – when loading a pre-trained model, whether to initialize the learning rate with the final learning rate value from the saved opt

  • classes – list or generator of considered classes

  • text_size – maximal length of text in tokens (words), longer texts are cut, shorter ones are padded with zeros (pre-padding)

  • padding – pre or post padding to use

opt

dictionary with all model parameters

n_classes

number of considered classes

model

keras model itself

epochs_done

number of epochs that were done

batches_seen

number of batches that were seen

train_examples_seen

number of training samples that were seen

sess

tf session

optimizer

keras.optimizers instance

classes

list of considered classes

padding

pre or post padding to use

__call__(data: List[List[numpy.ndarray]]) → List[List[float]][source]

Infer on the given data

Parameters

data – list of tokenized text samples

Returns

for each sentence, a vector of probabilities of belonging to each class or the list of labels the sentence belongs to

bigru_model(units_gru: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build an uncompiled BiGRU model.

Parameters
  • units_gru – number of units for GRU.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for GRU. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiGRU and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for GRU. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model
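The coef_reg_* parameters of the model builders are L2 regularization coefficients. As a reminder of what such a coefficient does, an L2 regularizer adds the coefficient times the sum of squared weights to the loss. The sketch below computes that term on a plain nested list; the function name and the sample weights are illustrative only.

```python
from typing import List


def l2_penalty(weights: List[List[float]], coef: float) -> float:
    """Return coef * sum of squared weights, the term an L2 regularizer adds to the loss."""
    return coef * sum(w * w for row in weights for w in row)


dense_kernel = [[1.0, -2.0], [0.5, 0.0]]
penalty = l2_penalty(dense_kernel, coef=0.5)
no_penalty = l2_penalty(dense_kernel, coef=0.0)  # the default coefficient disables the term
```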

bigru_with_max_aver_pool_model(units_gru: int, dense_size: int, coef_reg_gru: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, **kwargs) → tensorflow.keras.models.Model[source]

Build an uncompiled bidirectional GRU model with concatenation of max and average pooling after the BiGRU.

Parameters
  • units_gru – number of units for GRU.

  • dense_size – number of units for dense layer.

  • coef_reg_gru – l2-regularization coefficient for GRU. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiGRU and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for GRU. Default: 0.0.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model
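Concatenating max and average pooling means reducing the sequence of recurrent outputs over the time axis in two ways and joining the results, which doubles the feature size. A minimal sketch on plain lists, with a hypothetical function name:

```python
from typing import List

Vector = List[float]


def max_and_average_pool(states: List[Vector]) -> Vector:
    """Pool over the time axis twice (max and mean) and concatenate both results."""
    columns = list(zip(*states))  # one tuple per feature, spanning all time steps
    max_pooled = [max(col) for col in columns]
    avg_pooled = [sum(col) / len(col) for col in columns]
    return max_pooled + avg_pooled


# Three time steps of a recurrent layer with two output features each.
states = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]
pooled = max_and_average_pool(states)
```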

bilstm_bilstm_model(units_lstm_1: int, units_lstm_2: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled two-layer BiLSTM.

Parameters
  • units_lstm_1 – number of units for the first LSTM layer.

  • units_lstm_2 – number of units for the second LSTM layer.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_cnn_model(units_lstm: int, kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled BiLSTM-CNN.

Parameters
  • units_lstm – number of units for LSTM.

  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_model(units_lstm: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled BiLSTM.

Parameters
  • units_lstm (int) – number of units for LSTM.

  • dense_size (int) – number of units for dense layer.

  • coef_reg_lstm (float) – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den (float) – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate (float) – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate (float) – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_self_add_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build an uncompiled BiLSTM model with additive self-attention.

Parameters
  • units_lstm – number of units for LSTM.

  • self_att_hid – number of hidden units in self-attention

  • self_att_out – number of output units in self-attention

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_self_mult_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build an uncompiled BiLSTM model with multiplicative self-attention.

Parameters
  • units_lstm – number of units for LSTM.

  • self_att_hid – number of hidden units in self-attention

  • self_att_out – number of output units in self-attention

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model
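The additive and multiplicative attention variants differ in how they score a pair of positions. The sketch below uses the common textbook forms, v · tanh(W_q q + W_k k) for additive and q · (W k) for multiplicative scoring. These forms, all names, and the reading of self_att_hid as the size of the hidden projection are assumptions; the layers used by the library may differ in detail.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def additive_score(q: Vector, k: Vector, w_q: Matrix, w_k: Matrix, v: Vector) -> float:
    """score = v . tanh(W_q q + W_k k)"""
    hidden = [math.tanh(a + b) for a, b in zip(matvec(w_q, q), matvec(w_k, k))]
    return dot(v, hidden)


def multiplicative_score(q: Vector, k: Vector, w: Matrix) -> float:
    """score = q . (W k)"""
    return dot(q, matvec(w, k))


def attention_weights(scores: Vector) -> Vector:
    """Softmax over the scores of one query against every position."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


identity = [[1.0, 0.0], [0.0, 1.0]]
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # e.g. recurrent outputs, one per token
query = states[0]

add_scores = [additive_score(query, s, identity, identity, [1.0, 1.0]) for s in states]
mult_scores = [multiplicative_score(query, s, identity) for s in states]
weights = attention_weights(mult_scores)
```

In self-attention every position serves as a query against all positions of the same sequence; the example scores only the first position to stay short.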

check_input(texts: List[List[numpy.ndarray]]) → numpy.ndarray[source]

Check and convert input to array of tokenized embedded samples

Parameters

texts – list of tokenized embedded text samples

Returns

array of tokenized embedded text samples that are cut and padded

cnn_bilstm_model(kernel_sizes_cnn: List[int], filters_cnn: int, units_lstm: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled CNN-BiLSTM.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • units_lstm – number of units for LSTM.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of shallow-and-wide CNN.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions.

  • coef_reg_den – l2-regularization coefficient for dense layers.

  • dropout_rate – dropout rate used after convolutions and between dense layers.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

cnn_model_max_and_aver_pool(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of shallow-and-wide CNN where average pooling after convolutions is replaced with concatenation of average and max pooling.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate used after convolutions and between dense layers. Default: 0.0.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

compile(model: tensorflow.keras.models.Model, optimizer_name: str, loss_name: str, learning_rate: Optional[Union[float, List[float]]], learning_rate_decay: Optional[Union[float, str]]) → tensorflow.keras.models.Model[source]

Compile model with given optimizer and loss

Parameters
  • model – keras uncompiled model

  • optimizer_name – name of optimizer from keras.optimizers

  • loss_name – loss function name (from keras.losses)

  • learning_rate – learning rate.

  • learning_rate_decay – learning rate decay.

Returns

compiled instance of Keras Model

dcnn_model(kernel_sizes_cnn: List[int], filters_cnn: List[int], dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of deep CNN.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – list of numbers of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions.

  • coef_reg_den – l2-regularization coefficient for dense layers.

  • dropout_rate – dropout rate used after convolutions and between dense layers.

  • input_projection_size – if not None, adds a Dense layer (with relu activation) of size input_projection_size right after the input layer. Useful for input dimensionality reduction. Default: None.

  • kwargs – other unused parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

get_optimizer()[source]

Return an instance of keras optimizer

init_model_from_scratch(model_name: str) → tensorflow.keras.models.Model[source]

Initialize uncompiled model from scratch with given params

Parameters

model_name – name of model function described as a method of this class

Returns

compiled model with given network and learning parameters

pad_texts(sentences: List[List[numpy.ndarray]]) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]][source]

Cut and pad tokenized texts to self.opt["text_size"] tokens

Parameters

sentences – list of lists of tokens

Returns

array of embedded texts
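Cutting and padding to a fixed text_size can be sketched as follows for a single text. The function name is hypothetical, zero vectors are used as filler as the text_size description states, and keeping the first text_size tokens when truncating is an assumption.

```python
from typing import List

Vector = List[float]


def cut_and_pad(
    tokens: List[Vector], text_size: int, embedding_size: int, padding: str = "pre"
) -> List[Vector]:
    """Truncate to text_size tokens, then add zero vectors before ("pre") or after ("post")."""
    if padding not in ("pre", "post"):
        raise ValueError("padding must be 'pre' or 'post'")
    cut = tokens[:text_size]  # assumption: the first text_size tokens are kept
    filler = [[0.0] * embedding_size for _ in range(text_size - len(cut))]
    return filler + cut if padding == "pre" else cut + filler


short_text = [[1.0, 1.0], [2.0, 2.0]]
long_text = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]

pre_padded = cut_and_pad(short_text, text_size=3, embedding_size=2, padding="pre")
post_padded = cut_and_pad(short_text, text_size=3, embedding_size=2, padding="post")
truncated = cut_and_pad(long_text, text_size=3, embedding_size=2)
```

Applying the same function to every text in a batch yields rows of identical shape, which is what allows the batch to be converted into one array.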

save(fname: str = None) → None[source]

Save the model parameters into <<fname>>_opt.json (or <<ser_file>>_opt.json) and model weights into <<fname>>.h5 (or <<ser_file>>.h5).

Parameters

fname – file path to save the model to. If not given explicitly, self.opt["ser_file"] will be used

Returns

None
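The naming scheme used by save means that both files share one path prefix. A small sketch of how the two paths could be derived; the helper name and the example prefix models/intents/model are hypothetical.

```python
from pathlib import Path
from typing import Tuple


def model_file_paths(fname: str) -> Tuple[Path, Path]:
    """Derive the options file and the weights file from a common path prefix."""
    return Path(f"{fname}_opt.json"), Path(f"{fname}.h5")


# Hypothetical prefix, used only to show the resulting file names.
opt_path, weights_path = model_file_paths("models/intents/model")
```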

train_on_batch(texts: List[List[numpy.ndarray]], labels: list) → Union[float, List[float]][source]

Train the model on the given batch

Parameters
  • texts – list of tokenized embedded text samples

  • labels – list of labels

Returns

metrics values on the given batch

class deeppavlov.models.classifiers.cos_sim_classifier.CosineSimilarityClassifier(top_n: int = 1, save_path: str = None, load_path: str = None, **kwargs)[source]

Classifier based on cosine similarity between vectorized sentences

Parameters
  • save_path – path to save the model

  • load_path – path to load the model

__call__(q_vects: Union[scipy.sparse.csr.csr_matrix, List]) → Tuple[List[str], List[int]][source]

Find the most similar answers for the input vectorized questions

Parameters

q_vects – vectorized questions

Returns

tuple of answers and scores

fit(x_train_vects: Tuple[Union[scipy.sparse.csr.csr_matrix, List]], y_train: Tuple[str]) → None[source]

Train classifier

Parameters
  • x_train_vects – vectorized questions of the training dataset

  • y_train – answers of the training dataset

Returns

None

load() → None[source]

Load classifier parameters

save() → None[source]

Save classifier parameters
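The principle behind a cosine similarity classifier can be shown with a small stand-in that works on dense Python lists and a single query. The class below is hypothetical and much simpler than CosineSimilarityClassifier, which accepts sparse matrices and batches; only the fit-then-rank idea and the top_n parameter are taken from the description above.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class TinyCosineClassifier:
    """Hypothetical stand-in: returns the answers whose questions are closest to the query."""

    def __init__(self, top_n: int = 1) -> None:
        self.top_n = top_n
        self.x_train: List[Sequence[float]] = []
        self.y_train: List[str] = []

    def fit(self, x_train_vects: List[Sequence[float]], y_train: List[str]) -> None:
        self.x_train, self.y_train = list(x_train_vects), list(y_train)

    def __call__(self, q_vect: Sequence[float]) -> Tuple[List[str], List[float]]:
        scored = sorted(
            ((cosine(q_vect, x), y) for x, y in zip(self.x_train, self.y_train)),
            key=lambda pair: pair[0],
            reverse=True,
        )[: self.top_n]
        return [y for _, y in scored], [s for s, _ in scored]


clf = TinyCosineClassifier(top_n=2)
clf.fit(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    ["reset password", "billing", "contact support"],
)
answers, scores = clf([0.9, 0.1])
```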

class deeppavlov.models.classifiers.proba2labels.Proba2Labels(max_proba: bool = None, confident_threshold: float = None, top_n: int = None, **kwargs)[source]

Converts probabilities to labels in one of the following ways: choosing the one index or the top_n indices with maximal probability, or choosing every index whose probability is higher than the given confidence threshold

Parameters
  • max_proba – whether to choose label with maximal probability

  • confident_threshold – probability threshold above which a sample is assigned to the class (best used for multi-label classification)

  • top_n – how many top labels with the highest probabilities to return

max_proba

whether to choose label with maximal probability

confident_threshold

probability threshold above which a sample is assigned to the class (best used for multi-label classification)

top_n

how many top labels with the highest probabilities to return

__call__(data: Union[numpy.ndarray, List[List[float]], List[List[int]]], *args, **kwargs) → Union[List[List[int]], List[int]][source]

Process probabilities to labels

Parameters

data – list of vectors with probability distribution

Returns

list of labels (single-label classification) or list of lists of labels (multi-label classification)
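The three conversion modes can be sketched as one function on plain lists. The function name is hypothetical, and the order in which the options are checked when several are set is an assumption.

```python
from typing import List, Optional, Union


def proba_to_labels(
    data: List[List[float]],
    max_proba: Optional[bool] = None,
    confident_threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> Union[List[int], List[List[int]]]:
    """Convert probability vectors to label indices using exactly one of three rules."""
    if confident_threshold is not None:
        # Multi-label: every index whose probability exceeds the threshold.
        return [[i for i, p in enumerate(row) if p > confident_threshold] for row in data]
    if max_proba:
        # Single-label: the index with the highest probability.
        return [max(range(len(row)), key=row.__getitem__) for row in data]
    if top_n:
        # The top_n indices, ordered from most to least probable.
        return [
            sorted(range(len(row)), key=row.__getitem__, reverse=True)[:top_n]
            for row in data
        ]
    raise ValueError("set one of max_proba, confident_threshold or top_n")


data = [[0.1, 0.7, 0.2], [0.6, 0.55, 0.05]]
by_threshold = proba_to_labels(data, confident_threshold=0.5)
by_max = proba_to_labels(data, max_proba=True)
by_top_n = proba_to_labels(data, top_n=2)
```

The threshold rule can return zero, one or several labels per sample, which is why the description recommends it for multi-label classification.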