Open Domain Question Answering Skill on Wikipedia
Task definition
Open Domain Question Answering (ODQA) is the task of finding an exact answer to any question in Wikipedia articles. Given only a question, the system outputs the best answer it can find:
:: What is the name of Darth Vader's son?
>> Luke Skywalker
Languages
DeepPavlov provides pretrained ODQA models for English and Russian.
Models
The architecture of the ODQA skill is modular and consists of two models: a ranker and a reader. The ranker is based on DrQA [1], proposed by Facebook Research. The reader is based on R-NET [2], proposed by Microsoft Research Asia, and on its implementation [3] by Wenxuan Zhou.
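The two-stage flow described above can be illustrated with a toy sketch. This is not the DeepPavlov implementation: `ToyRanker`, `ToyReader`, and `odqa` are hypothetical names, the ranker here is a simple TF-IDF overlap scorer rather than DrQA, and the reader picks the best-overlapping sentence rather than running R-NET. It only shows how a ranker narrows the corpus and a reader extracts the answer from the shortlisted passages.

```python
import math
from collections import Counter

PUNCT = ".,?!'\""


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation from token edges."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


class ToyRanker:
    """Hypothetical stand-in for the ranker: scores documents by TF-IDF overlap."""

    def __init__(self, documents):
        self.documents = documents
        self.doc_counts = [Counter(tokenize(doc)) for doc in documents]
        n_docs = len(documents)
        doc_freq = Counter(tok for counts in self.doc_counts for tok in counts)
        # Smoothed inverse document frequency for every known token.
        self.idf = {
            tok: math.log((1 + n_docs) / (1 + freq)) + 1.0
            for tok, freq in doc_freq.items()
        }

    def top_k(self, question, k=5):
        """Return the k documents with the highest overlap score."""
        q_tokens = tokenize(question)
        scored = []
        for idx, counts in enumerate(self.doc_counts):
            score = sum(counts[tok] * self.idf.get(tok, 0.0) for tok in q_tokens)
            scored.append((score, idx))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [self.documents[idx] for _, idx in scored[:k]]


class ToyReader:
    """Hypothetical stand-in for the reader: returns the best-matching sentence."""

    def answer(self, question, passages):
        q_tokens = set(tokenize(question))
        best_sentence, best_overlap = "", -1
        for passage in passages:
            for sentence in passage.split(". "):
                overlap = len(q_tokens & set(tokenize(sentence)))
                if overlap > best_overlap:
                    best_sentence, best_overlap = sentence.strip(), overlap
        return best_sentence


def odqa(question, ranker, reader, k=5):
    """Ranker shortlists passages; reader extracts the answer from them."""
    return reader.answer(question, ranker.top_k(question, k=k))


corpus = [
    "Luke Skywalker is the son of Darth Vader.",
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
]
print(odqa("Who is the son of Darth Vader?", ToyRanker(corpus), ToyReader(), k=2))
```

Keeping the two stages behind separate interfaces mirrors the modular design: either component can be retrained or swapped without touching the other.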
Running ODQA
Note
TensorFlow 1.8 with GPU support is required to run this skill.
About 16 GB of RAM is required.
Training
The ODQA ranker and the ODQA reader are trained separately. For the ranker, see the ranker training documentation; for the reader, see the separate reader tutorial.
Interacting
When interacting, the ODQA skill returns a plain answer to the user’s question.
Run the following to interact with English ODQA:
cd deeppavlov/
python deep.py interact deeppavlov/configs/odqa/en_odqa_infer_wiki.json -d
Run the following to interact with Russian ODQA:
cd deeppavlov/
python deep.py interact deeppavlov/configs/odqa/ru_odqa_infer_wiki.json -d
Configuration
The ODQA configs are intended for inference only. For training, use the ranker configs and the reader configs, respectively.
Comparison
Scores for the ODQA skill:
Skill | Config | Ranker Recall (top 5) | Reader F1 |
---|---|---|---|
ODQA English | en_odqa_infer_wiki.json | 0.756 | 0.257 |
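The page does not say how the two scores in the table are computed. As a rough illustration only, the sketch below assumes common definitions: recall at top 5 as the fraction of questions for which at least one of the five highest-ranked texts contains the gold answer string, and F1 as SQuAD-style token overlap between the predicted and gold answers. The function names `recall_at_k` and `token_f1` are hypothetical and do not come from DeepPavlov.

```python
from collections import Counter


def recall_at_k(retrieved_per_question, answers, k=5):
    """Share of questions whose top-k retrieved texts contain the gold answer.

    Assumed definition: a hit is a case-insensitive substring match.
    """
    if not answers:
        return 0.0
    hits = 0
    for retrieved, answer in zip(retrieved_per_question, answers):
        if any(answer.lower() in text.lower() for text in retrieved[:k]):
            hits += 1
    return hits / len(answers)


def token_f1(prediction, truth):
    """Token-overlap F1 between a predicted and a gold answer (assumed SQuAD-style)."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    n_same = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if n_same == 0:
        return 0.0
    precision = n_same / len(pred_tokens)
    recall = n_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


retrieved = [
    ["Luke Skywalker is the son of Darth Vader."],
    ["Paris is the capital of France."],
]
gold = ["Luke Skywalker", "Berlin"]
print(recall_at_k(retrieved, gold, k=5))
print(token_f1("Luke", "Luke Skywalker"))
```

Under these assumed definitions the two numbers measure different stages: the recall score reflects only the ranker, while the F1 score reflects the answer produced after the reader runs on the ranker's output.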