Open Domain Question Answering Skill on Wikipedia

Task definition

Open Domain Question Answering (ODQA) is the task of finding an exact answer to an arbitrary question in Wikipedia articles. Given only a question, the system outputs the best answer it can find:

:: What is the name of Darth Vader's son?
>> Luke Skywalker


Pretrained ODQA models for English and Russian are available in DeepPavlov.


The ODQA skill is modular and consists of two models: a ranker and a reader. The ranker selects the Wikipedia articles most relevant to the question and is based on DrQA [1], proposed by Facebook Research. The reader extracts the answer from the selected articles and is based on R-NET [2], proposed by Microsoft Research Asia, and on its implementation [3] by Wenxuan Zhou.
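The two-stage flow can be illustrated with a minimal, self-contained sketch. This is not the DeepPavlov implementation: the function names (rank, read, answer) are hypothetical, the ranker is a plain TF-IDF term-overlap scorer standing in for the DrQA-based model, and the reader simply returns the best-matching sentence instead of extracting an answer span as R-NET does.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def rank(question, documents, top_n=5):
    """Toy ranker (assumption, not DrQA): TF-IDF-weighted term overlap."""
    doc_counts = [Counter(tokenize(d)) for d in documents]
    n_docs = len(documents)
    q_terms = set(tokenize(question))

    def idf(term):
        df = sum(1 for counts in doc_counts if term in counts)
        return math.log((n_docs + 1) / (df + 1)) + 1.0

    scores = [sum(counts[t] * idf(t) for t in q_terms) for counts in doc_counts]
    order = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return [documents[i] for i in order[:top_n]]


def read(question, document):
    """Toy reader (assumption, not R-NET): pick the sentence sharing the
    most terms with the question and return it with its overlap score."""
    q_terms = set(tokenize(question))
    best, best_score = "", -1
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        score = len(q_terms & set(tokenize(sentence)))
        if score > best_score:
            best, best_score = sentence, score
    return best, best_score


def answer(question, documents, top_n=5):
    """Run the ranker first, then the reader on each retrieved document."""
    candidates = [read(question, d) for d in rank(question, documents, top_n)]
    return max(candidates, key=lambda c: c[1])[0]


if __name__ == "__main__":
    docs = [
        "Luke Skywalker is the son of Darth Vader. He trained on Dagobah.",
        "Paris is the capital of France.",
        "The Moon orbits the Earth.",
    ]
    print(answer("Who is the son of Darth Vader?", docs))
```

The point of the sketch is the division of labour: the ranker narrows the whole collection down to a few documents cheaply, so the more expensive reader only has to process those.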

Running ODQA

TensorFlow 1.8.0 with GPU support is required to run this skill.

About 16 GB of RAM is required.



The ODQA ranker and the ODQA reader are trained separately. Training the ranker is described in the ranker documentation, and training the reader in the separate reader tutorial.


When interacting, the ODQA skill returns a plain answer to the user’s question.

Run the following to interact with English ODQA:

cd deeppavlov/
python deep.py interact deeppavlov/configs/odqa/en_odqa_infer_wiki.json -d

Run the following to interact with Russian ODQA:

cd deeppavlov/
python deep.py interact deeppavlov/configs/odqa/ru_odqa_infer_wiki.json -d


The ODQA configs are intended for inference only. For training, use the ranker configs and the reader configs, respectively.


Scores for the ODQA skill:

Model        Dataset       Wiki dump             F1     EM
DeepPavlov   SQuAD (dev)   enwiki (2018-02-11)   28.0   22.2
DrQA [1]     SQuAD (dev)   enwiki (2016-12-21)   -      27.1

EM stands for “exact-match accuracy”. The metrics are computed over the top 5 documents returned by the retrieval module.
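For readers unfamiliar with these metrics, the sketch below shows how EM and F1 are commonly computed in SQuAD-style evaluation. It assumes the usual answer normalization (lowercasing, removing punctuation and English articles); whether the scores in the table were produced with exactly this procedure is an assumption, and the function names are illustrative only.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions, references):
    """Average EM and F1 in percent.

    `references` holds a list of acceptable answers per question; the best
    score over those answers is taken for each prediction.
    """
    n = len(predictions)
    em = sum(max(exact_match(p, r) for r in refs)
             for p, refs in zip(predictions, references))
    f = sum(max(f1(p, r) for r in refs)
            for p, refs in zip(predictions, references))
    return {"EM": 100.0 * em / n, "F1": 100.0 * f / n}
```

EM rewards only answers that match a reference exactly after normalization, while F1 gives partial credit for overlapping tokens, which is why F1 is never lower than EM on the same predictions.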