Features¶

Models ¶

NER model [docs]¶

In DeepPavlov, the Named Entity Recognition task is solved with BERT-based models. The models predict a tag (in BIO format) for each token in the input.

The BERT-based model is described in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

Dataset	Lang	Model	Test F1
Persons-1000 dataset with additional LOC and ORG markup (Collection 3)	Ru	ner_rus_bert.json	97.9
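Persons-1000 dataset with additional LOC and ORG markup (Collection 3)	Ru	ner_rus_convers_distilrubert_2L.json	88.4 ± 0.5
Persons-1000 dataset with additional LOC and ORG markup (Collection 3)	Ru	ner_rus_convers_distilrubert_6L.json	93.3 ± 0.3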
Ontonotes	Multi	ner_ontonotes_bert_mult.json	88.9
Ontonotes	En	ner_ontonotes_bert.json	89.2
CoNLL-2003	En	ner_conll2003_bert.json	91.7
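
The BIO format mentioned above marks the first token of an entity with `B-<type>`, continuation tokens with `I-<type>`, and all other tokens with `O`. The following is a minimal sketch of how such per-token tags could be grouped into entities. The helper name `bio_to_spans` and the sample tokens and tags are hypothetical and are not part of the DeepPavlov API.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag with no matching open entity) closes the
            # open entity; the token itself is not part of any entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical model output for one sentence.
tokens = ["Bob", "Ross", "lived", "in", "Florida"]
tags = ["B-PERSON", "I-PERSON", "O", "O", "B-GPE"]
print(bio_to_spans(tokens, tags))
```

In this sketch an `I-` tag that does not continue an open entity is discarded rather than repaired, which is one of several possible conventions.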

Classification model [docs]¶

Model for classification tasks (intents, sentiment, etc.) at the word level. Shallow-and-wide CNN, Deep CNN, BiLSTM, BiLSTM with self-attention, and other architectures are available. The model also supports multilabel text classification. Several pre-trained models are listed in the table below.

Task	Dataset	Lang	Model	Metric	Valid	Test	Downloads
Insult detection	Insults	En	English BERT	ROC-AUC	0.9327	0.8602	1.1 Gb
Sentiment	SST	En	5-classes SST on conversational BERT	Accuracy	0.6293	0.6626	1.1 Gb
Sentiment	Twitter mokoron	Ru	RuWiki+Lenta emb w/o preprocessing	Accuracy	0.9918	0.9923	5.8 Gb
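Sentiment	RuSentiment	Ru	Multi-language BERT	F1-weighted	0.6787	0.7005	1.3 Gb
Sentiment	RuSentiment	Ru	Conversational RuBERT	F1-weighted	0.739	0.7724	1.5 Gb
Sentiment	RuSentiment	Ru	Conversational DistilRuBERT-tiny	F1-weighted	0.703 ± 0.0031	0.7348 ± 0.0028	690 Mb
Sentiment	RuSentiment	Ru	Conversational DistilRuBERT-base	F1-weighted	0.7376 ± 0.0045	0.7645 ± 0.035	1.0 Gb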
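
The table above reports an F1-weighted metric without defining it. The sketch below assumes the common definition (per-class F1 averaged with weights proportional to each class's share of the true labels, as in scikit-learn's `average="weighted"`). The function name `weighted_f1` and the sample labels are hypothetical, and the source does not confirm that DeepPavlov computes the metric exactly this way.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Per-class F1 averaged with weights equal to class support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        # Weight each class by its share of the true labels.
        score += f1 * n_true / total
    return score


# Hypothetical gold and predicted sentiment labels.
gold = ["pos", "pos", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu"]
print(round(weighted_f1(gold, pred), 4))
```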

Because no intent recognition results had been published for the DSTC-2 data, the presented model is compared on the SNIPS dataset instead. The scores were evaluated in the same way as in [3], so that they can be compared with the results reported by the authors of the dataset. The results were obtained with tuned parameters and with embeddings trained on the Reddit dataset.

Model	AddToPlaylist	BookRestaurant	GetWeather	PlayMusic	RateBook	SearchCreativeWork	SearchScreeningEvent
api.ai	0.9931	0.9949	0.9935	0.9811	0.9992	0.9659	0.9801
ibm.watson	0.9931	0.9950	0.9950	0.9822	0.9996	0.9643	0.9750
microsoft.luis	0.9943	0.9935	0.9925	0.9815	0.9988	0.9620	0.9749
wit.ai	0.9877	0.9913	0.9921	0.9766	0.9977	0.9458	0.9673
snips.ai	0.9873	0.9921	0.9939	0.9729	0.9985	0.9455	0.9613
recast.ai	0.9894	0.9943	0.9910	0.9660	0.9981	0.9424	0.9539
amazon.lex	0.9930	0.9862	0.9825	0.9709	0.9981	0.9427	0.9581
Shallow-and-wide CNN	0.9956	0.9973	0.9968	0.9871	0.9998	0.9752	0.9854

3: https://www.slideshare.net/KonstantinSavenkov/nlu-intent-detection-benchmark-by-intento-august-2017

Automatic spelling correction model [docs]¶

These pipelines correct spelling errors by searching for candidates in a static dictionary and ranking them with an ARPA language model.

Note

About 4.4 GB of disk space is required for the Russian language model and about 7 GB for the English one.

Comparison on the test set for the SpellRuEval competition on Automatic Spelling Correction for Russian:

Correction method	Precision	Recall	F-measure	Speed (sentences/s)
Yandex.Speller	83.09	59.86	69.59
Damerau Levenshtein 1 + lm	53.26	53.74	53.50	29.3
Hunspell + lm	41.03	48.89	44.61	2.1
JamSpell	44.57	35.69	39.64	136.2
Hunspell	30.30	34.02	32.06	20.3
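
The "Damerau Levenshtein 1 + lm" row above combines a dictionary search within edit distance 1 with language-model ranking. The sketch below illustrates the idea under simplifying assumptions: a lowercase Latin alphabet, and a plain word-frequency table standing in for the ARPA language model. The names `edits1` and `correct` and the sample counts are hypothetical and do not reflect DeepPavlov's actual components.

```python
from typing import Dict, Set

# Assumption: lowercase Latin letters only; a real pipeline would use the
# alphabet of the target language.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word: str) -> Set[str]:
    """All strings within Damerau-Levenshtein distance 1 of `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [
        left + ch + right[1:] for left, right in splits if right for ch in ALPHABET
    ]
    inserts = [left + ch + right for left, right in splits for ch in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str, word_counts: Dict[str, int]) -> str:
    """Return the most frequent dictionary word within distance 1 of `word`."""
    if word in word_counts:
        return word  # already a known word
    candidates = sorted(c for c in edits1(word) if c in word_counts)
    if not candidates:
        return word  # nothing close enough in the dictionary
    # Stand-in for language-model scoring: pick the most frequent candidate.
    return max(candidates, key=lambda c: word_counts[c])


# Hypothetical dictionary with word frequencies.
counts = {"spelling": 120, "spewing": 3, "correction": 80}
print(correct("speling", counts), correct("corection", counts))
```

A frequency table ignores context, whereas a language model scores candidates within the whole sentence, so this sketch only shows the candidate-search half of the approach faithfully.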

Ranking model [docs]¶

Available pre-trained models for paraphrase identification:

Dataset	Model config	Val (accuracy)	Test (accuracy)	Val (F1)	Test (F1)	Val (log_loss)	Test (log_loss)	Downloads
paraphraser.ru	paraphrase_rubert	89.8	84.2	92.2	87.4	–	–	1325M
paraphraser.ru	paraphraser_convers_distilrubert_2L	76.1 ± 0.2	64.5 ± 0.5	81.8 ± 0.2	73.9 ± 0.8	–	–	618M
paraphraser.ru	paraphraser_convers_distilrubert_6L	86.5 ± 0.5	78.9 ± 0.4	89.6 ± 0.3	83.2 ± 0.5	–	–	930M


TF-IDF Ranker model [docs]¶

The model is based on Reading Wikipedia to Answer Open-Domain Questions and solves the task of retrieving documents for a given query.

Dataset	Model	Wiki dump	Recall@5	Downloads
SQuAD-v1.1	doc_retrieval	enwiki (2018-02-11)	75.6	33 GB
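
As a rough illustration of TF-IDF document retrieval, the sketch below ranks a handful of documents against a query by TF-IDF-weighted overlap. It is a simplified unigram version with whitespace tokenization and smoothed IDF; these choices, the function names, and the sample documents are assumptions made for the example and are not the configuration of `doc_retrieval`.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization is enough for the demo.
    return text.lower().split()


def build_index(docs: List[str]) -> Tuple[Dict[str, float], List[Dict[str, float]]]:
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    term_counts = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(docs)
    idf = {t: math.log((n_docs + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in counts.items()} for counts in term_counts]
    return idf, vectors


def retrieve(query: str, idf: Dict[str, float], vectors: List[Dict[str, float]],
             top_k: int = 5) -> List[int]:
    """Return indices of the top_k documents, best match first."""
    q_counts = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    scored = []
    for doc_id, d_vec in enumerate(vectors):
        dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        # Normalize by document length; the query norm is the same for
        # every document, so it does not affect the ranking.
        norm = math.sqrt(sum(w * w for w in d_vec.values())) or 1.0
        scored.append((dot / norm, doc_id))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


docs = [
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "The Eiffel Tower is in Paris",
]
idf, vectors = build_index(docs)
print(retrieve("capital of Germany", idf, vectors))
```

Recall@5 in the table above would then be the share of questions for which a relevant document appears among the five indices returned by such a ranker.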

Question Answering model [docs]¶

Models in this section find the answer to a question in a given context (SQuAD task format). There are two models for this task in DeepPavlov: BERT-based and R-Net. Both models predict the start and end positions of the answer in the given context.

The BERT-based model is described in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

The RuBERT-based model is described in Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language.

Dataset	Model config	lang	EM (dev)	F-1 (dev)	Downloads
SQuAD-v1.1	DeepPavlov BERT	en	81.49	88.86	1.2 Gb
SQuAD-v2.0	DeepPavlov BERT	en	75.71	80.72	1.2 Gb
SDSJ Task B	DeepPavlov RuBERT	ru	66.21	84.71	1.7 Mb
SDSJ Task B	DeepPavlov RuBERT, trained with tfidf-retrieved negative samples	ru	66.24	84.71	1.6 Gb
SDSJ Task B	DeepPavlov DistilRuBERT-tiny	ru	44.2 ± 0.46	65.1 ± 0.36	867 Mb
SDSJ Task B	DeepPavlov DistilRuBERT-base	ru	61.23 ± 0.42	80.36 ± 0.28	1.18 Gb

When the answer is not necessarily present in the given context, use the qa_squad2_bert model. This model outputs an empty string if there is no answer in the context.
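
To make the start/end prediction concrete, here is a minimal sketch of how an answer span could be chosen from per-token start and end scores, returning an empty string when a separate no-answer score wins. The scoring rule (sum of start and end scores), the `max_len` limit, the `no_answer_score` comparison, and all names are assumptions based on common SQuAD practice; the source does not describe how DeepPavlov's models decode their outputs.

```python
from typing import List, Optional, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int, float]:
    """Find (start, end, score) maximizing start + end score with start <= end."""
    best = (0, 0, float("-inf"))
    for start, s_score in enumerate(start_scores):
        # Only consider ends at or after the start, within max_len tokens.
        for end in range(start, min(start + max_len, len(end_scores))):
            score = s_score + end_scores[end]
            if score > best[2]:
                best = (start, end, score)
    return best


def extract_answer(tokens: List[str], start_scores: List[float],
                   end_scores: List[float],
                   no_answer_score: Optional[float] = None) -> str:
    """Return the best span as text, or "" if the no-answer score is higher."""
    start, end, score = best_span(start_scores, end_scores)
    if no_answer_score is not None and no_answer_score >= score:
        return ""
    return " ".join(tokens[start:end + 1])


# Hypothetical context tokens and model scores.
tokens = ["DeepPavlov", "was", "released", "in", "2018"]
start_scores = [0.1, 0.0, 0.2, 0.3, 2.5]
end_scores = [0.0, 0.1, 0.2, 0.1, 3.0]
print(extract_answer(tokens, start_scores, end_scores))
print(repr(extract_answer(tokens, start_scores, end_scores, no_answer_score=6.0)))
```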

ODQA [docs]¶

An open domain question answering model. The model accepts free-form questions about the world and outputs an answer based on its Wikipedia knowledge.

Dataset	Model config	Wiki dump	F1	Downloads
SQuAD-v1.1	ODQA	enwiki (2018-02-11)	46.24	9.7 Gb
SDSJ Task B	ODQA with RuBERT	ruwiki (2018-04-01)	37.83	4.3 Gb

Examples of some models ¶

Run the insult detection model with the console interface:

python -m deeppavlov interact insults_kaggle_bert -d

Run the insult detection model with a REST API:

python -m deeppavlov riseapi insults_kaggle_bert -d

Predict whether each line in a file is an insult:

python -m deeppavlov predict insults_kaggle_bert -d --batch-size 15 < /data/in.txt > /data/out.txt
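
Once a server has been started with the `riseapi` command above, it can be queried over HTTP. The sketch below builds such a request with the Python standard library. The URL `http://localhost:5000/model` and the argument name `x` are assumptions about the server's defaults, so check the actual host, port, endpoint, and input names of your deployment before relying on them. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, List


def build_request(texts: List[str],
                  url: str = "http://localhost:5000/model",  # assumed default
                  arg_name: str = "x") -> urllib.request.Request:  # assumed name
    """Build a JSON POST request carrying a batch of texts."""
    body = json.dumps({arg_name: texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(texts: List[str], **kwargs: Any) -> Any:
    """Send the batch to a running server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(texts, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need a running server; classify() does.
request = build_request(["You are kind.", "You are stupid."])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling `classify([...])` only works while the server is running, and the shape of the reply depends on the model configuration.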
