class deeppavlov.models.classifiers.torch_classification_model.TorchTextClassificationModel(n_classes: int, model_name: str, embedding_size: Optional[int] = None, multilabel: bool = False, criterion: str = 'CrossEntropyLoss', optimizer: str = 'AdamW', optimizer_parameters: dict = {'lr': 0.1}, lr_scheduler: Optional[str] = None, lr_scheduler_parameters: dict = {}, embedded_tokens: bool = True, vocab_size: Optional[int] = None, lr_decay_every_n_epochs: Optional[int] = None, learning_rate_drop_patience: Optional[int] = None, learning_rate_drop_div: Optional[float] = None, return_probas: bool = True, **kwargs)

Torch model for text classification. The input can be either embedded tokenized texts or indices of words in the vocabulary. The number of tokens is not fixed, but all samples in a batch must be padded to the same length (e.g. that of the longest sample).
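The padding requirement can be illustrated with a minimal sketch for the case where the input holds word indices. The helper name pad_batch and the choice of 0 as the padding index are assumptions made for illustration; they are not part of the DeepPavlov API.

```python
from typing import List


def pad_batch(batch: List[List[int]], pad_index: int = 0) -> List[List[int]]:
    """Right-pad every sample with pad_index up to the longest sample in the batch.

    Hypothetical helper: the name and the pad_index default are assumptions.
    """
    max_len = max((len(sample) for sample in batch), default=0)
    return [sample + [pad_index] * (max_len - len(sample)) for sample in batch]


batch = [[5, 12, 7], [3], [8, 2]]
print(pad_batch(batch))  # [[5, 12, 7], [3, 0, 0], [8, 2, 0]]
```

After padding, every sample in the batch has the same length, so the batch can be stacked into a single tensor.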

  • n_classes – number of classes

  • model_name – name of the TorchTextClassificationModel method that initializes the model architecture

  • embedding_size – size of vector representation of words

  • multilabel – whether the task is multi-label classification (if so, sigmoid activation is used; otherwise, softmax)

  • criterion – criterion name from torch.nn

  • optimizer – optimizer name from torch.optim

  • optimizer_parameters – dictionary with the optimizer's parameters, e.g. {'lr': 0.1, 'weight_decay': 0.001, 'momentum': 0.9}

  • lr_scheduler – string name of scheduler class from torch.optim.lr_scheduler

  • lr_scheduler_parameters – parameters for scheduler

  • embedded_tokens – True if the input contains embedded tokenized texts; False if the input contains indices of words in the vocabulary

  • vocab_size – vocabulary size; used when embedded_tokens=False, in which case the embedding is a layer of the network

  • lr_decay_every_n_epochs – how often to decay lr

  • learning_rate_drop_patience – how many validations with no improvements to wait

  • learning_rate_drop_div – the divider of the learning rate after learning_rate_drop_patience unsuccessful validations

  • return_probas – whether to return probabilities or class indices (only for multilabel=False)
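How learning_rate_drop_patience and learning_rate_drop_div interact can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: the class LearningRateDropper and its on_validation method are hypothetical, and a higher score is assumed to be better.

```python
class LearningRateDropper:
    """Hypothetical helper: divide the learning rate by `div` after
    `patience` consecutive validations without improvement."""

    def __init__(self, lr: float, patience: int, div: float) -> None:
        self.lr = lr
        self.patience = patience
        self.div = div
        self.best_score = None
        self.bad_validations = 0

    def on_validation(self, score: float) -> float:
        """Record a validation score (assumed higher-is-better) and return the current lr."""
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.bad_validations = 0
        else:
            self.bad_validations += 1
            if self.bad_validations >= self.patience:
                self.lr /= self.div       # drop the learning rate
                self.bad_validations = 0  # start waiting again
        return self.lr


dropper = LearningRateDropper(lr=0.1, patience=2, div=10.0)
history = [dropper.on_validation(s) for s in [0.70, 0.72, 0.71, 0.72, 0.73]]
# The rate is divided once, after the second validation in a row without improvement.
```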


dictionary with all model parameters


number of considered classes


torch model itself


number of epochs that were done


torch optimizer instance


torch criterion instance

__call__(texts: List[numpy.ndarray], *args) → Union[List[List[float]], List[int]]

Infer on the given data.

  • texts – list of tokenized text samples

  • labels – labels

  • *args – additional arguments


for each sentence, either a vector of probabilities of belonging to each class or the label(s) the sentence belongs to
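The two output modes can be sketched in plain Python. The functions below are illustrative assumptions, not DeepPavlov code: they mirror the documented behaviour that multi-label models use a sigmoid, single-label models use a softmax, and return_probas only applies when multilabel=False.

```python
import math
from typing import List, Union


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one sample's logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits: List[float]) -> List[float]:
    """Element-wise sigmoid, written to avoid overflow for large negative inputs."""
    return [1.0 / (1.0 + math.exp(-x)) if x >= 0 else math.exp(x) / (1.0 + math.exp(x))
            for x in logits]


def predict(logits: List[float], multilabel: bool = False,
            return_probas: bool = True) -> Union[List[float], int]:
    """Hypothetical post-processing of one sample's logits."""
    probas = sigmoid(logits) if multilabel else softmax(logits)
    if multilabel or return_probas:
        return probas
    # Single-label case with return_probas=False: index of the most probable class.
    return max(range(len(probas)), key=probas.__getitem__)


print(predict([2.0, 1.0, 0.1], return_probas=False))  # 0
```

In the multi-label case the probabilities are independent per class and need not sum to one, which is why an index cannot be returned in place of the vector.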

cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, dropout_rate: float = 0.0, **kwargs) → torch.nn.Module

Build un-compiled model of shallow-and-wide CNN.

  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • dropout_rate – dropout rate applied after the convolutions and between the dense layers.

  • kwargs – other parameters


the built model, an instance of torch.nn.Module


process_event(event_name: str, data: dict)

Process event after epoch

  • event_name – whether the event is sent after an epoch or after a batch. Set of values: "after_epoch", "after_batch"

  • data – event data (dictionary)



train_on_batch(texts: List[List[numpy.ndarray]], labels: list) → Union[float, List[float]]

Train the model on the given batch.

  • texts – vectorized texts

  • labels – list of labels


metrics values on the given batch

class deeppavlov.models.classifiers.cos_sim_classifier.CosineSimilarityClassifier(top_n: int = 1, save_path: Optional[str] = None, load_path: Optional[str] = None, **kwargs)

Classifier based on cosine similarity between vectorized sentences

  • save_path – path to save the model

  • load_path – path to load the model

__call__(q_vects: Union[scipy.sparse.csr.csr_matrix, List]) → Tuple[List[str], List[int]]

Find the most similar answer for each input vectorized question


q_vects – vectorized questions


Tuple of Answer and Score

fit(x_train_vects: Tuple[Union[scipy.sparse.csr.csr_matrix, List]], y_train: Tuple[str]) → None

Train classifier

  • x_train_vects – vectorized questions of the training dataset

  • y_train – answers for train dataset
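The idea behind fit and __call__ can be sketched with dense Python lists. TinyCosineClassifier is a hypothetical stand-in written for illustration: it does not handle sparse matrices, and it returns lists of top_n answers and scores per question, which is a simplification of the documented return type.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two dense vectors; 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class TinyCosineClassifier:
    """Hypothetical illustration: memorise training vectors, answer by nearest cosine."""

    def __init__(self, top_n: int = 1) -> None:
        self.top_n = top_n
        self.x_train: List[List[float]] = []
        self.y_train: List[str] = []

    def fit(self, x_train_vects: Sequence[Sequence[float]], y_train: Sequence[str]) -> None:
        # "Training" only stores the vectorized questions and their answers.
        self.x_train = [list(v) for v in x_train_vects]
        self.y_train = list(y_train)

    def __call__(self, q_vects: Sequence[Sequence[float]]) -> Tuple[List[List[str]], List[List[float]]]:
        answers, scores = [], []
        for q in q_vects:
            ranked = sorted(((cosine(q, x), y) for x, y in zip(self.x_train, self.y_train)),
                            key=lambda pair: pair[0], reverse=True)[:self.top_n]
            answers.append([y for _, y in ranked])
            scores.append([s for s, _ in ranked])
        return answers, scores


clf = TinyCosineClassifier(top_n=1)
clf.fit([[1.0, 0.0], [0.0, 1.0]], ["yes", "no"])
answers, scores = clf([[0.9, 0.1]])
print(answers)  # [['yes']]
```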




Load classifier parameters


Save classifier parameters

class deeppavlov.models.classifiers.proba2labels.Proba2Labels(max_proba: Optional[bool] = None, confidence_threshold: Optional[float] = None, top_n: Optional[int] = None, is_binary: bool = False, **kwargs)

Converts probabilities to labels in one of the following ways: choosing the one or the top_n indices with the highest probability, or choosing every index whose probability exceeds the given confidence threshold

  • max_proba – whether to choose label with maximal probability

  • confidence_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)

  • top_n – how many top labels with the highest probabilities to return


whether to choose label with maximal probability


boundary probability value for a sample to belong to the class (best used for multi-label)


how many top labels with the highest probabilities to return

__call__(*args, **kwargs)

Process probabilities to labels.

  • *args – every argument is a list of vectors with a probability distribution


list of labels (single-label classification) or list of lists of labels (multi-label classification), or a list of such lists (in the multitask setting) for every argument
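The three selection strategies can be sketched as a standalone function. The name proba_to_labels and the order in which the options are checked are assumptions made for this illustration; the actual class may resolve conflicting options differently.

```python
from typing import List, Optional, Union


def proba_to_labels(probas: List[List[float]],
                    max_proba: Optional[bool] = None,
                    confidence_threshold: Optional[float] = None,
                    top_n: Optional[int] = None) -> List[Union[int, List[int]]]:
    """Hypothetical sketch of converting probability vectors to label indices."""
    if confidence_threshold is not None:
        # Multi-label style: every index whose probability exceeds the threshold.
        return [[i for i, p in enumerate(sample) if p > confidence_threshold]
                for sample in probas]
    if max_proba:
        # Single index with the highest probability.
        return [max(range(len(sample)), key=sample.__getitem__) for sample in probas]
    if top_n is not None:
        # The top_n indices, most probable first.
        return [sorted(range(len(sample)), key=sample.__getitem__, reverse=True)[:top_n]
                for sample in probas]
    raise ValueError("set max_proba, confidence_threshold or top_n")


probas = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.55]]
print(proba_to_labels(probas, max_proba=True))             # [1, 0]
print(proba_to_labels(probas, confidence_threshold=0.5))   # [[1], [0, 2]]
print(proba_to_labels(probas, top_n=2))                    # [[1, 2], [0, 2]]
```

With a threshold, a sample may receive no labels at all if none of its probabilities exceeds the boundary, which is usually acceptable only in the multi-label setting.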