deeppavlov.models.ranking

Ranking classes.

class deeppavlov.models.ranking.bilstm_siamese_network.BiLSTMSiameseNetwork(*args, **kwargs)[source]

The class implementing a siamese neural network with BiLSTM and max pooling.

The model can be trained either with a binary cross-entropy loss or with a triplet loss using random or hard negative sampling.

Parameters
  • len_vocab – Size of the vocabulary used to build the embedding layer.

  • seed – Random seed.

  • shared_weights – Whether to use shared weights in the model to encode contexts and responses.

  • embedding_dim – Dimensionality of token (word) embeddings.

  • reccurent – A type of the RNN cell. Possible values are lstm and bilstm.

  • hidden_dim – Dimensionality of the hidden state of the RNN cell. If reccurent is bilstm, the actual output dimensionality is twice hidden_dim.

  • max_pooling – Whether to use max-pooling operation to get context (response) vector representation. If False, the last hidden state of the RNN will be used.

  • triplet_loss – Whether to use a model with triplet loss. If False, a model with crossentropy loss will be used.

  • margin – A margin parameter for triplet loss. Only required if triplet_loss is set to True.

  • hard_triplets – Whether to use hard triplet sampling to train the model, i.e. to choose negative samples close to positive ones. If set to False, random sampling will be used. Only required if triplet_loss is set to True.
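
The interplay of triplet_loss, margin, and hard_triplets can be sketched in plain NumPy. This is an illustrative formulation of a margin-based triplet loss with hard negative selection, not the class's actual implementation; the helper names are hypothetical:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Margin-based triplet loss on embedding batches (illustrative).

    Penalizes triplets where the anchor is not at least `margin`
    closer to the positive than to the negative.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(0.0, margin + d_pos - d_neg).mean()

def hardest_negative(anchor, candidates):
    """Hard sampling: pick the negative candidate closest to the anchor."""
    dists = np.linalg.norm(candidates - anchor, axis=-1)
    return candidates[np.argmin(dists)]
```

With random sampling a negative is drawn uniformly instead of via hardest_negative; hard sampling tends to produce more informative gradients at the cost of extra distance computations.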

class deeppavlov.models.ranking.keras_siamese_model.KerasSiameseModel(*args, **kwargs)[source]

The class implementing base functionality for siamese neural networks in Keras.

Parameters
  • learning_rate – Learning rate.

  • use_matrix – Whether to use a trainable matrix with token (word) embeddings.

  • emb_matrix – An embeddings matrix to initialize an embeddings layer of a model. Only used if use_matrix is set to True.

  • max_sequence_length – A maximum length of text sequences in tokens. Longer sequences will be truncated and shorter ones will be padded.

  • dynamic_batch – Whether to use dynamic batching. If True, the maximum length of a sequence for a batch will be equal to the maximum of all sequences lengths from this batch, but not higher than max_sequence_length.

  • attention – Whether any attention mechanism is used in the siamese network.

  • *args – Other parameters.

  • **kwargs – Other parameters.
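
The combined effect of max_sequence_length and dynamic_batch on padding and truncation can be sketched as follows. This is a hypothetical helper illustrating the documented behavior, not part of the class API:

```python
import numpy as np

def pad_batch(sequences, max_sequence_length, dynamic_batch=False):
    """Pad (with zeros) or truncate token-id sequences to a common length.

    With dynamic_batch=True the target length is the longest sequence
    in the batch, capped at max_sequence_length; otherwise it is always
    max_sequence_length.
    """
    if dynamic_batch:
        length = min(max(len(s) for s in sequences), max_sequence_length)
    else:
        length = max_sequence_length
    batch = np.zeros((len(sequences), length), dtype=np.int32)
    for i, seq in enumerate(sequences):
        trimmed = seq[:length]
        batch[i, :len(trimmed)] = trimmed
    return batch
```
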

class deeppavlov.models.ranking.siamese_model.SiameseModel(batch_size: int, num_context_turns: int = 1, *args, **kwargs)[source]

The class implementing base functionality for siamese neural networks.

Parameters
  • batch_size – A size of a batch.

  • num_context_turns – A number of context turns in data samples.

  • *args – Other parameters.

  • **kwargs – Other parameters.

load(*args, **kwargs) → None[source]
save(*args, **kwargs) → None[source]
train_on_batch(samples_generator: Iterable[List[numpy.ndarray]], y: List[int]) → float[source]

This method is called by the trainer to make one training step on one batch. The number of samples returned by samples_generator is always equal to batch_size, so we need to: 1) accumulate data for all of the inputs of the model; 2) format the inputs of the model in a proper way using the self._make_batch function; 3) run the model on the provided inputs and ground-truth labels (y) using the self._train_on_batch function; 4) return the mean loss value on the batch.

Parameters
  • samples_generator (Iterable[List[np.ndarray]]) – generator that returns a list of numpy arrays of words of all sentences represented as integers. Its shape: (number_of_context_turns + 1, max_number_of_words_in_a_sentence)

  • y (List[int]) – labels, one per sample, with shape (batch_size,)

Returns

value of mean loss on the batch

Return type

float
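
The four steps above can be sketched as a standalone function. The formatting and training helpers are injected as callables here instead of being the model's private self._make_batch and self._train_on_batch methods, so this is an illustrative skeleton rather than the library's implementation:

```python
import numpy as np
from typing import Callable, Iterable, List

def train_on_batch(samples_generator: Iterable[List[np.ndarray]],
                   y: List[int],
                   make_batch: Callable,
                   run_train_step: Callable) -> float:
    """One training step on one batch (illustrative skeleton)."""
    inputs = [sample for sample in samples_generator]  # 1) accumulate inputs
    batch = make_batch(inputs)                         # 2) format the batch
    loss = run_train_step(batch, y)                    # 3) train on inputs + labels
    return float(loss)                                 # 4) mean loss on the batch
```
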

__call__(samples_generator: Iterable[List[numpy.ndarray]]) → Union[numpy.ndarray, List[str]][source]

This method is called by trainer to make one evaluation step on one batch.

Parameters
  • samples_generator (Iterable[List[np.ndarray]]) – generator that returns a list of numpy arrays of words of all sentences represented as integers. Its shape: (number_of_context_turns + 1, max_number_of_words_in_a_sentence)

Returns

predictions for the batch of samples

Return type

np.ndarray
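
At evaluation time a siamese model scores each context–response pair from the two encoders' outputs. A minimal version of such scoring, assuming cosine similarity (one common choice for siamese networks, not necessarily what this model computes), is:

```python
import numpy as np

def score_pairs(context_vecs, response_vecs):
    """Cosine similarity between row-aligned context and response embeddings."""
    c = context_vecs / np.linalg.norm(context_vecs, axis=1, keepdims=True)
    r = response_vecs / np.linalg.norm(response_vecs, axis=1, keepdims=True)
    return np.sum(c * r, axis=1)
```
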

reset() → None[source]