Ranking and paraphrase identification¶
This library model solves the tasks of ranking and paraphrase identification based on semantic similarity which is trained with siamese neural networks. The trained network can retrieve the response closest semantically to a given context from some database or answer whether two sentences are paraphrases or not. It is possible to build automatic semantic FAQ systems with such neural architectures.
Training and inference models on predifined datasets¶
Before using models make sure that all required packages are installed running the command:
python -m deeppavlov install ranking_ubuntu_v2_torch_bert_uncased
Before using the model make sure that all required packages are installed running the command:
python -m deeppavlov install paraphraser_rubert
To train the model on the paraphraser.ru dataset one can use the following code in Python:
from deeppavlov import configs, train_model para_model = train_model('paraphraser_rubert', download=True)
train.csv: the same as for ranking.
valid.csv, test.csv: each line in the file contains
label separated by the tab key.
binary, i.e. 1 or 0 corresponding to the correct or incorrect
response for the given
context it can be simply two phrases which are paraphrases or non-paraphrases as indicated by the
Classification metrics on the valid and test dataset parts (the parameter
metrics in the JSON configuration file)
log_loss can be calculated.