deeppavlov.models.squad

class deeppavlov.models.squad.squad.SquadModel(word_emb: numpy.ndarray, char_emb: numpy.ndarray, context_limit: int = 450, question_limit: int = 150, char_limit: int = 16, train_char_emb: bool = True, char_hidden_size: int = 100, encoder_hidden_size: int = 75, attention_hidden_size: int = 75, keep_prob: float = 0.7, min_learning_rate: float = 0.001, noans_token: bool = False, **kwargs)[source]

SquadModel predicts the start and end positions of an answer in a given context for a given question.

High-level architecture: Word embeddings -> Contextual embeddings -> Question-Context Attention -> Self-Attention -> Pointer Network

If the noans_token flag is True, a special noans_token is appended to the output of the self-attention layer. The Pointer Network can select this token when the given context contains no answer.

Parameters
  • word_emb – pretrained word embeddings

  • char_emb – pretrained char embeddings

  • train_char_emb – whether to train char embeddings

  • context_limit – max context length in tokens

  • question_limit – max question length in tokens

  • char_limit – max number of characters in token

  • char_hidden_size – hidden size of charRNN

  • encoder_hidden_size – hidden size of encoder RNN

  • attention_hidden_size – size of projection layer in attention

  • keep_prob – dropout keep probability

  • min_learning_rate – minimal learning rate, used in learning rate decay

  • noans_token – whether to add the special noans_token, which allows the model to decline to answer when the context contains no answer
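
A hedged usage sketch: in practice the model is built from a DeepPavlov pipeline config rather than constructed directly, since word_emb and char_emb must be pretrained embedding matrices. The standard SQuAD config configs.squad.squad is assumed below; the pipeline handles tokenization and vocabulary lookup around SquadModel:

    from deeppavlov import build_model, configs

    # Build the full SQuAD pipeline (preprocessing, vocabularies, SquadModel).
    # download=True fetches pretrained embeddings and model weights.
    model = build_model(configs.squad.squad, download=True)

    # The pipeline wraps SquadModel.__call__ and returns the answer together
    # with its position in the context and a confidence score.
    answers = model(['DeepPavlov is a library for NLP and dialog systems.'],
                    ['What is DeepPavlov?'])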

__call__(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, *args, **kwargs) → Tuple[numpy.ndarray, numpy.ndarray, List[float]][source]

Predicts answer start and end positions for the given contexts and questions.

Parameters
  • c_tokens – batch of tokenized contexts

  • c_chars – batch of tokenized contexts, with each token split into characters

  • q_tokens – batch of tokenized questions

  • q_chars – batch of tokenized questions, with each token split into characters

Returns

answer start positions, answer end positions, and answer logits representing the model's confidence
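
A minimal sketch of calling the model directly, assuming squad_model is an already-built SquadModel instance (hypothetical name) and that the surrounding pipeline has already converted text into padded index arrays; the shapes follow the default limits above, and the zero-filled values are dummies:

    import numpy as np

    # Padded word-index and char-index batches for two examples, shaped by
    # context_limit=450, question_limit=150, char_limit=16 (dummy values).
    c_tokens = np.zeros((2, 450), dtype=np.int32)
    c_chars = np.zeros((2, 450, 16), dtype=np.int32)
    q_tokens = np.zeros((2, 150), dtype=np.int32)
    q_chars = np.zeros((2, 150, 16), dtype=np.int32)

    starts, ends, logits = squad_model(c_tokens, c_chars, q_tokens, q_chars)
    # starts[i] and ends[i] are token positions of the predicted answer span
    # in context i; logits[i] reflects the model's confidence in that span.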

train_on_batch(c_tokens: numpy.ndarray, c_chars: numpy.ndarray, q_tokens: numpy.ndarray, q_chars: numpy.ndarray, y1s: Tuple[List[int], …], y2s: Tuple[List[int], …]) → float[source]

This method is called by the trainer to make one training step on one batch.

Parameters
  • c_tokens – batch of tokenized contexts

  • c_chars – batch of tokenized contexts, with each token split into characters

  • q_tokens – batch of tokenized questions

  • q_chars – batch of tokenized questions, with each token split into characters

  • y1s – batch of ground truth answer start positions

  • y2s – batch of ground truth answer end positions

Returns

value of the loss function on the batch
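
A hedged sketch of one training step, reusing the dummy batches from the sketch above; y1s and y2s hold lists of positions because a SQuAD example may carry several annotated answers:

    # One annotated answer span per example: tokens 3..5 and 10..12.
    y1s = ([3], [10])
    y2s = ([5], [12])

    loss = squad_model.train_on_batch(c_tokens, c_chars, q_tokens, q_chars,
                                      y1s, y2s)
    print(f'batch loss: {loss:.4f}')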

process_event(event_name: str, data) → None[source]

Processes events sent by the trainer. Implements learning rate decay.

Parameters
  • event_name – name of the event sent by the trainer

  • data – number of examples seen, epochs done, and metrics sent by the trainer
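
A hedged illustration of how the trainer drives learning rate decay through this hook; the event name and data keys below follow common DeepPavlov trainer conventions but are assumptions here:

    # After each validation round the trainer reports progress; the model can
    # use such events to decay its learning rate towards min_learning_rate.
    squad_model.process_event('after_validation',
                              {'epochs_done': 2,
                               'batches_seen': 1000,
                               'impatience': 1})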