TfidfRanker(vectorizer: deeppavlov.models.vectorizers.hashing_tfidf_vectorizer.HashingTfIdfVectorizer, top_n=5, active: bool = True, **kwargs)
Rank documents according to input strings.
- vectorizer – a vectorizer class
- top_n – a number of doc ids to return
- active – whether to return the number of doc ids specified by top_n (True) or all ids (False)
Attributes:
- top_n – a number of doc ids to return
- vectorizer – an instance of the vectorizer class
- iterator – a dataset iterator used for generating batches while fitting the vectorizer
__call__(questions: List[str]) → Tuple[List[Any], List[float]]
Rank documents and return top n document titles with scores.
Parameters: questions – list of queries used in ranking
Returns: a tuple of selected doc ids and their scores
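The effect of top_n and active can be illustrated with a minimal, library-independent sketch. It is not the DeepPavlov implementation: the function name rank_docs, the dict-based sparse vectors, and the single-query interface are all assumptions made for illustration (the real __call__ takes a list of questions and relies on the fitted vectorizer).

```python
from typing import Any, Dict, List, Tuple


def rank_docs(query_vec: Dict[str, float],
              doc_vecs: Dict[Any, Dict[str, float]],
              top_n: int = 5,
              active: bool = True) -> Tuple[List[Any], List[float]]:
    """Hypothetical helper: score documents against one query, best first."""
    scored = []
    for doc_id, doc_vec in doc_vecs.items():
        # Sparse dot product over the terms that appear in the query.
        score = sum(w * doc_vec.get(term, 0.0) for term, w in query_vec.items())
        scored.append((doc_id, score))

    # Highest score first.
    scored.sort(key=lambda pair: pair[1], reverse=True)

    # active=True keeps only the top_n ids; active=False keeps all of them.
    if active:
        scored = scored[:top_n]

    ids = [doc_id for doc_id, _ in scored]
    scores = [score for _, score in scored]
    return ids, scores


# Toy TF-IDF-like weights, invented for this example.
docs = {
    "doc_a": {"neural": 0.5, "network": 0.5},
    "doc_b": {"neural": 0.9},
    "doc_c": {"pasta": 1.0},
}
query = {"neural": 1.0, "network": 1.0}

top_ids, top_scores = rank_docs(query, docs, top_n=2)
all_ids, all_scores = rank_docs(query, docs, top_n=2, active=False)
```

With these toy weights, the first call keeps only the two best-scoring ids, while the second call ignores top_n and returns every id in ranked order.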
LogitRanker(squad_model: deeppavlov.core.models.component.Component, batch_size: int = 50, sort_noans: bool = False, **kwargs)
Select the best answer using squad model logits. Split a single batch into several sub-batches, send each sub-batch to the squad model separately, and select a single best answer for each original batch.
- squad_model – a loaded squad model
- batch_size – batch size to use with squad model
- sort_noans – whether to downgrade noans tokens among the most probable answers
Attributes:
- squad_model – a loaded squad model
- batch_size – batch size to use with squad model
__call__(contexts_batch: List[List[str]], questions_batch: List[List[str]]) → List[str]
Sort obtained results from squad reader by logits and get the answer with a maximum logit.
- contexts_batch – a batch of contexts which should be treated as a single batch in the outer JSON config
- questions_batch – a batch of questions which should be treated as a single batch in the outer JSON config
Returns: a batch of best answers
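The sub-batching and maximum-logit selection described for this class can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: the reader callable and its (answer, logit) return shape are hypothetical stand-ins for a loaded squad model, the empty-string fallback is an assumption, and sort_noans handling is omitted.

```python
from typing import Callable, List, Tuple

# Hypothetical reader interface: contexts and questions in, (answer, logit) pairs out.
Reader = Callable[[List[str], List[str]], List[Tuple[str, float]]]


def select_best_answers(reader: Reader,
                        contexts_batch: List[List[str]],
                        questions_batch: List[List[str]],
                        batch_size: int = 50) -> List[str]:
    """Return one best answer per item of the outer batch."""
    best_answers = []
    for contexts, questions in zip(contexts_batch, questions_batch):
        results: List[Tuple[str, float]] = []
        # Send at most batch_size context/question pairs to the reader at a time.
        for start in range(0, len(contexts), batch_size):
            results.extend(reader(contexts[start:start + batch_size],
                                  questions[start:start + batch_size]))
        if not results:
            # Assumption: fall back to an empty answer when there is nothing to rank.
            best_answers.append("")
            continue
        # Keep the answer with the maximum logit.
        best_answer, _ = max(results, key=lambda pair: pair[1])
        best_answers.append(best_answer)
    return best_answers


def toy_reader(contexts: List[str], questions: List[str]) -> List[Tuple[str, float]]:
    # Stand-in for a squad model: the "answer" is the first word of each
    # context and the "logit" is simply the context length.
    return [(context.split()[0], float(len(context))) for context in contexts]


contexts_batch = [["Paris is the capital of France", "Berlin", "Rome is old"]]
questions_batch = [["What is the capital of France?"] * 3]
answers = select_best_answers(toy_reader, contexts_batch, questions_batch, batch_size=2)
```

Here the three contexts are sent to the toy reader in two sub-batches (two contexts, then one), and the single returned answer comes from the context with the largest toy logit.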