deeppavlov.models.classifiers

class deeppavlov.models.classifiers.keras_classification_model.KerasClassificationModel(embedding_size: int, n_classes: int, model_name: str, optimizer: str = 'Adam', loss: str = 'binary_crossentropy', learning_rate: Union[None, float, List[float]] = None, learning_rate_decay: Union[float, str, None] = 0.0, last_layer_activation: str = 'sigmoid', restore_lr: bool = False, classes: Union[list, Generator, None] = None, text_size: Optional[int] = None, padding: Optional[str] = 'pre', **kwargs)[source]

Keras model for text classification on multi-class and multi-label data.

Parameters
  • embedding_size – embedding_size from embedder in pipeline

  • n_classes – number of considered classes

  • model_name – particular method of this class to initialize model configuration

  • optimizer – function name from keras.optimizers

  • loss – function name from keras.losses.

  • last_layer_activation – parameter that determines activation function after classification layer. For multi-label classification use sigmoid, otherwise, softmax.

  • restore_lr – in case of loading pre-trained model whether to init learning rate with the final learning rate value from saved opt

  • classes – list or generator of considered classes

  • text_size – maximal length of text in tokens (words), longer texts are cut, shorter ones are padded with zeros (pre-padding)

  • padding – pre or post padding to use

opt

dictionary with all model parameters

n_classes

number of considered classes

model

keras model itself

epochs_done

number of epochs that were done

batches_seen

number of batches that were seen

train_examples_seen

number of training samples that were seen

sess

tf session

optimizer

keras.optimizers instance

classes

list of considered classes

padding

pre or post padding to use
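The choice of last_layer_activation described above can be illustrated without Keras. The following is a minimal pure-Python sketch, not the library's code; the helper names and the logits are hypothetical. It shows why sigmoid suits multi-label data (each class is scored independently) while softmax suits single-label data (scores form one distribution).

```python
import math
from typing import List


def sigmoid(logits: List[float]) -> List[float]:
    """Element-wise sigmoid: each class gets an independent probability."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits: List[float]) -> List[float]:
    """Softmax: probabilities compete and sum to 1 across classes."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.5, -3.0]       # hypothetical scores for three classes
multi_label = sigmoid(logits)   # several classes may exceed 0.5 at once
single_label = softmax(logits)  # one probability distribution over classes
```

With sigmoid, the first two classes both score above 0.5, so a sample can carry both labels; with softmax, the scores sum to 1 and only the largest is normally taken.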

__call__(data: List[List[numpy.ndarray]]) → List[List[float]][source]

Infer on the given data

Parameters

data – list of tokenized text samples

Returns

for each sentence, a vector of probabilities of belonging to each class, or a list of labels the sentence belongs to

Return type

List[List[float]]

bigru_model(units_gru: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Method builds uncompiled model BiGRU.

Parameters
  • units_gru – number of units for GRU.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for GRU. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiGRU and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for GRU. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model
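The effect of input_projection_size can be sketched in plain Python: a dense layer with relu activation maps every token embedding to a smaller vector before the recurrent or convolutional layers see it. The weights, bias, and function name below are hypothetical illustration values, not anything taken from the library.

```python
from typing import List


def project_token(token: List[float],
                  weights: List[List[float]],
                  bias: List[float]) -> List[float]:
    """Apply relu(W x + b); `weights` holds one row per output unit."""
    return [
        max(0.0, sum(x * w for x, w in zip(token, row)) + b)
        for row, b in zip(weights, bias)
    ]


# Hypothetical embedding of size 3 projected down to size 2.
token = [1.0, -2.0, 0.5]
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]

projected = project_token(token, weights, bias)
```

Here the second output unit receives -2.0 before activation, so relu clips it to 0.0.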

bigru_with_max_aver_pool_model(units_gru: int, dense_size: int, coef_reg_gru: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, **kwargs) → tensorflow.keras.models.Model[source]

Method builds uncompiled model Bidirectional GRU with concatenation of max and average pooling after BiGRU.

Parameters
  • units_gru – number of units for GRU.

  • dense_size – number of units for dense layer.

  • coef_reg_gru – l2-regularization coefficient for GRU. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiGRU and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for GRU. Default: 0.0.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_bilstm_model(units_lstm_1: int, units_lstm_2: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled two-layer BiLSTM.

Parameters
  • units_lstm_1 – number of units for the first LSTM layer.

  • units_lstm_2 – number of units for the second LSTM layer.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_cnn_model(units_lstm: int, kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled BiLSTM-CNN.

Parameters
  • units_lstm – number of units for LSTM.

  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_model(units_lstm: int, dense_size: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled BiLSTM.

Parameters
  • units_lstm (int) – number of units for LSTM.

  • dense_size (int) – number of units for dense layer.

  • coef_reg_lstm (float) – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den (float) – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate (float) – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate (float) – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_self_add_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Method builds uncompiled model of BiLSTM with self additive attention.

Parameters
  • units_lstm – number of units for LSTM.

  • self_att_hid – number of hidden units in self-attention

  • self_att_out – number of output units in self-attention

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

bilstm_self_mult_attention_model(units_lstm: int, dense_size: int, self_att_hid: int, self_att_out: int, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Method builds uncompiled model of BiLSTM with self multiplicative attention.

Parameters
  • units_lstm – number of units for LSTM.

  • self_att_hid – number of hidden units in self-attention

  • self_att_out – number of output units in self-attention

  • dense_size – number of units for dense layer.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

check_input(texts: List[List[numpy.ndarray]]) → numpy.ndarray[source]

Check and convert input to array of tokenized embedded samples

Parameters

texts – list of tokenized embedded text samples

Returns

array of tokenized embedded texts samples that are cut and padded

cnn_bilstm_model(kernel_sizes_cnn: List[int], filters_cnn: int, units_lstm: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_lstm: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, rec_dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled CNN-BiLSTM.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • units_lstm – number of units for LSTM.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_lstm – l2-regularization coefficient for LSTM. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate to be used after BiLSTM and between dense layers. Default: 0.0.

  • rec_dropout_rate – dropout rate for LSTM. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

cnn_model(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of shallow-and-wide CNN.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions.

  • coef_reg_den – l2-regularization coefficient for dense layers.

  • dropout_rate – dropout rate used after convolutions and between dense layers.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

cnn_model_max_and_aver_pool(kernel_sizes_cnn: List[int], filters_cnn: int, dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of shallow-and-wide CNN where average pooling after convolutions is replaced with concatenation of average and max poolings.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – number of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions. Default: 0.0.

  • coef_reg_den – l2-regularization coefficient for dense layers. Default: 0.0.

  • dropout_rate – dropout rate used after convolutions and between dense layers. Default: 0.0.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model
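The concatenation of max and average pooling mentioned for this model can be shown on a toy feature map. This is a sketch in plain Python under the assumption that both poolings are global over the time axis and that the max-pooled vector comes first; the actual layer order in the library may differ.

```python
from typing import List


def global_max_pool(features: List[List[float]]) -> List[float]:
    """Maximum over the time axis for every channel."""
    return [max(channel) for channel in zip(*features)]


def global_average_pool(features: List[List[float]]) -> List[float]:
    """Mean over the time axis for every channel."""
    steps = len(features)
    return [sum(channel) / steps for channel in zip(*features)]


# Hypothetical convolution output: 3 time steps x 2 channels.
conv_output = [[1.0, 4.0],
               [3.0, 0.0],
               [2.0, 2.0]]

# Concatenation doubles the channel dimension: 2 max values + 2 mean values.
pooled = global_max_pool(conv_output) + global_average_pool(conv_output)
```

The pooled vector has twice as many entries as there are channels, which is what the following dense layers receive.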

compile(model: tensorflow.keras.models.Model, optimizer_name: str, loss_name: str, learning_rate: Union[float, List[float], None], learning_rate_decay: Union[float, str, None]) → tensorflow.keras.models.Model[source]

Compile model with given optimizer and loss

Parameters
  • model – keras uncompiled model

  • optimizer_name – name of optimizer from keras.optimizers

  • loss_name – loss function name (from keras.losses)

  • learning_rate – learning rate.

  • learning_rate_decay – learning rate decay.

Returns

compiled Keras model

dcnn_model(kernel_sizes_cnn: List[int], filters_cnn: List[int], dense_size: int, coef_reg_cnn: float = 0.0, coef_reg_den: float = 0.0, dropout_rate: float = 0.0, input_projection_size: Optional[int] = None, **kwargs) → tensorflow.keras.models.Model[source]

Build un-compiled model of deep CNN.

Parameters
  • kernel_sizes_cnn – list of kernel sizes of convolutions.

  • filters_cnn – list of numbers of filters for convolutions.

  • dense_size – number of units for dense layer.

  • coef_reg_cnn – l2-regularization coefficient for convolutions.

  • coef_reg_den – l2-regularization coefficient for dense layers.

  • dropout_rate – dropout rate used after convolutions and between dense layers.

  • input_projection_size – if not None, adds Dense layer (with relu activation) right after input layer to the size input_projection_size. Useful for input dimensionality reduction. Default: None.

  • kwargs – other non-used parameters

Returns

uncompiled instance of Keras Model

Return type

keras.models.Model

get_optimizer()[source]

Return an instance of keras optimizer

init_model_from_scratch(model_name: str) → tensorflow.keras.models.Model[source]

Initialize uncompiled model from scratch with given params

Parameters

model_name – name of model function described as a method of this class

Returns

uncompiled model with given network parameters

pad_texts(sentences: List[List[numpy.ndarray]]) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]][source]

Cut and pad tokenized texts to self.opt["text_size"] tokens

Parameters

sentences – list of lists of tokens

Returns

array of embedded texts
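Cutting and padding to text_size can be sketched as follows. This is an illustration in plain Python, not the method's source: the function name is hypothetical, and it assumes that the first text_size tokens are kept when a text is too long.

```python
from typing import List


def cut_and_pad(tokens: List[List[float]], text_size: int,
                embedding_size: int, padding: str = "pre") -> List[List[float]]:
    """Return exactly text_size vectors: cut long texts, zero-pad short ones."""
    if padding not in ("pre", "post"):
        raise ValueError("padding must be 'pre' or 'post'")
    cut = tokens[:text_size]  # assumption: the first text_size tokens are kept
    zeros = [[0.0] * embedding_size for _ in range(text_size - len(cut))]
    # "pre" puts the zero vectors before the tokens, "post" after them.
    return zeros + cut if padding == "pre" else cut + zeros


sample = [[1.0, 1.0], [2.0, 2.0]]  # two tokens, embedding_size = 2
pre = cut_and_pad(sample, text_size=3, embedding_size=2, padding="pre")
post = cut_and_pad(sample, text_size=3, embedding_size=2, padding="post")
```

Both results contain three vectors; they differ only in where the zero vector sits.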

save(fname: str = None) → None[source]

Save the model parameters into <<fname>>_opt.json (or <<ser_file>>_opt.json) and model weights into <<fname>>.h5 (or <<ser_file>>.h5).

Parameters

fname – file path to save model. If not explicitly given, self.opt["ser_file"] will be used

Returns

None
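The naming scheme for the two saved files can be sketched with pathlib. The helper below is hypothetical and only derives the file names described above; it does not write anything and is not the library's implementation.

```python
from pathlib import Path
from typing import Tuple


def artifact_paths(fname: str) -> Tuple[Path, Path]:
    """Derive the parameters file and the weights file from a base path."""
    base = Path(fname)
    opt_path = base.with_name(base.name + "_opt.json")  # model parameters
    weights_path = base.with_name(base.name + ".h5")    # model weights
    return opt_path, weights_path


# "models/intent" is a made-up base path used only for illustration.
opt_path, weights_path = artifact_paths("models/intent")
```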

train_on_batch(texts: List[List[numpy.ndarray]], labels: list) → Union[float, List[float]][source]

Train the model on the given batch

Parameters
  • texts – list of tokenized embedded text samples

  • labels – list of labels

Returns

metrics values on the given batch

class deeppavlov.models.classifiers.cos_sim_classifier.CosineSimilarityClassifier(top_n: int = 1, save_path: str = None, load_path: str = None, **kwargs)[source]

Classifier based on cosine similarity between vectorized sentences

Parameters
  • save_path – path to save the model

  • load_path – path to load the model

__call__(q_vects: Union[scipy.sparse.csr.csr_matrix, List]) → Tuple[List[str], List[int]][source]

Find the most similar answers for the input vectorized questions

Parameters

q_vects – vectorized questions

Returns

tuple of answers and scores
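The idea behind the classifier can be sketched in plain Python: score the query vector against every training vector by cosine similarity and return the answers attached to the top_n best matches. The function names and data below are hypothetical, dense lists stand in for sparse matrices, and scores are returned as floats here, which may differ from the annotated return type.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(q_vect: Sequence[float],
                 x_train_vects: Sequence[Sequence[float]],
                 y_train: Sequence[str],
                 top_n: int = 1) -> Tuple[List[str], List[float]]:
    """Return the top_n answers and their similarity scores, best first."""
    scored = [(cosine(q_vect, x), y) for x, y in zip(x_train_vects, y_train)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    best = scored[:top_n]
    return [y for _, y in best], [s for s, _ in best]


train_vects = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_answers = ["a", "b", "c"]
answers, scores = most_similar([1.0, 0.1], train_vects, train_answers, top_n=2)
```

The query points almost along the first axis, so answer "a" ranks first and "c" second.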

fit(x_train_vects: Tuple[Union[scipy.sparse.csr.csr_matrix, List]], y_train: Tuple[str]) → None[source]

Train classifier

Parameters
  • x_train_vects – vectorized questions for the train dataset

  • y_train – answers for the train dataset

Returns

None

load() → None[source]

Load classifier parameters

save() → None[source]

Save classifier parameters

class deeppavlov.models.classifiers.proba2labels.Proba2Labels(max_proba: bool = None, confident_threshold: float = None, top_n: int = None, **kwargs)[source]

Class converts probabilities to labels in one of the following ways: choosing the one or top_n indices with maximal probability, or choosing every index whose probability is higher than the given confident threshold.

Parameters
  • max_proba – whether to choose label with maximal probability

  • confident_threshold – boundary probability value for a sample to belong to the class (best used for multi-label)

  • top_n – how many top labels with the highest probabilities to return

max_proba

whether to choose label with maximal probability

confident_threshold

boundary probability value for a sample to belong to the class (best used for multi-label)

top_n

how many top labels with the highest probabilities to return

__call__(data: Union[numpy.ndarray, List[List[float]], List[List[int]]], *args, **kwargs) → Union[List[List[int]], List[int]][source]

Process probabilities to labels

Parameters

data – list of vectors with probability distribution

Returns

list of labels (single-label classification) or list of lists of labels (multi-label classification)
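The three conversion modes can be sketched in plain Python for a single probability vector. This is an illustration under stated assumptions, not the class's source: the function name is hypothetical, the order in which the options are checked is assumed, and a list is always returned even when a single label is chosen.

```python
from typing import List, Optional


def proba_to_labels(proba: List[float],
                    max_proba: Optional[bool] = None,
                    confident_threshold: Optional[float] = None,
                    top_n: Optional[int] = None) -> List[int]:
    """Convert one probability vector to class indices."""
    if confident_threshold is not None:
        # Multi-label: every class above the threshold is kept.
        return [i for i, p in enumerate(proba) if p > confident_threshold]
    # Indices ordered from the most to the least probable class.
    ranked = sorted(range(len(proba)), key=lambda i: proba[i], reverse=True)
    if max_proba:
        return ranked[:1]
    if top_n is not None:
        return ranked[:top_n]
    raise ValueError("set max_proba, confident_threshold or top_n")


proba = [0.1, 0.7, 0.6]  # hypothetical output for three classes
by_threshold = proba_to_labels(proba, confident_threshold=0.5)
by_max = proba_to_labels(proba, max_proba=True)
by_top_n = proba_to_labels(proba, top_n=2)
```

Thresholding returns indices in class order, while top_n returns them ordered by probability.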