Conceptual overview¶
Our goal is to enable AI-application developers and researchers with:
set of pre-trained NLP models, pre-defined dialog system components (ML/DL/Rule-based) and pipeline templates;
a framework for implementing and testing their own dialog models;
tools for application integration with adjacent infrastructure (messengers, helpdesk software etc.);
benchmarking environment for conversational models and uniform access to relevant datasets.
Key Concepts¶
Skillfulfills user’s goal in some domain. Typically, this is accomplished by presenting information or completing transaction (e.g. answer question by FAQ, booking tickets etc.). However, for some tasks a success of interaction is defined as continuous engagement (e.g. chit-chat).Modelis any NLP model that doesn’t necessarily communicates with user in natural language.Componentis a reusable functional part ofModelorSkill.Rule-based Modelscannot be trained.Machine Learning Modelscan be trained only stand alone.Deep Learning Modelscan be trained independently and in an end-to-end mode being joined in a chain.Skill Managerperforms selection of theSkillto generate response.Chainerbuilds a model pipeline from heterogeneous components (Rule-based/ML/DL). It allows to train and infer models in a pipeline as a whole.
The smallest building block of the library is Component.
Component stands for any kind of function in an NLP pipeline. It can
be implemented as a neural network, a non-neural ML model or a
rule-based system.
Components can be joined into a Model or a Skill. Model
solves a larger NLP task compared to Component. However, in terms of
implementation Modelss are not different from Components. The
difference of a Skill from a Model is that its input and output should
both be strings. Therefore, Skills are usually associated with
dialogue tasks.
DeepPavlov is built on top of machine learning frameworks TensorFlow and Keras. Other external libraries can be used to build basic components.