Few-shot Text Classification¶
1. Introduction to the task¶
Text classification is the task of assigning one of a set of pre-defined labels to an utterance, where the label is one of N classes or “OOS” (out-of-scope: utterances that do not belong to any of the pre-defined classes). We consider the few-shot setting, in which only a few examples (5 or 10) per intent class are available as a training set.
2. Get started with the model¶
First make sure you have the DeepPavlov Library installed. More info about the first installation.
[ ]:
!pip install -q deeppavlov
Then make sure that all the required packages are installed.
[ ]:
!python -m deeppavlov install few_shot_roberta
few_shot_roberta is the name of the model’s config_file. What is a Config File?
The configuration file defines the model and describes its hyperparameters. To use another model, change the name of the config_file here and in the commands that follow. Some few-shot classification models and their config names are listed in the Models list table.
3. Models list¶
At the moment, only the few_shot_roberta config supports out-of-scope detection.
Config name | Dataset | Shot | Model Size | In-domain accuracy | Out-of-scope recall | Out-of-scope precision |
---|---|---|---|---|---|---|
few_shot_roberta | | 5 | 1.4 GB | 84.1±1.9 | 93.2±0.8 | 97.8±0.3 |
few_shot_roberta | | 5 | 1.4 GB | 59.4±1.4 | 87.9±1.2 | 40.3±0.7 |
few_shot_roberta | | 5 | 1.4 GB | 51.4±2.1 | 93.7±0.7 | 82.7±1.4 |
fasttext_logreg* | | 5 | 37 KB | 24.8±2.2 | 98.2±0.4 | 74.8±0.6 |
fasttext_logreg* | | 5 | 37 KB | 13.4±0.5 | 98.6±0.2 | 20.5±0.1 |
fasttext_logreg* | | 5 | 37 KB | 10.7±0.8 | 99.0±0.3 | 36.4±0.2 |
With the confidence threshold set to zero, OOS detection is disabled, and the models reach the following classification accuracy:
Config name | Dataset | Shot | Model Size | Accuracy |
---|---|---|---|---|
few_shot_roberta | | 5 | 1.4 GB | 89.6 |
few_shot_roberta | | 5 | 1.4 GB | 79.6 |
few_shot_roberta | | 5 | 1.4 GB | 55.1 |
fasttext_logreg* | | 5 | 37 KB | 86.3 |
fasttext_logreg* | | 5 | 37 KB | 73.6 |
fasttext_logreg* | | 5 | 37 KB | 51.6 |
* - config file was modified to predict OOS examples
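The three metrics in the first table are not defined on this page. The sketch below shows one common way to compute them, assuming that in-domain accuracy is measured only on examples whose gold label is not out-of-scope, and that recall and precision are measured for the out-of-scope class. The function name `oos_metrics` and the label string `"oos"` are illustrative assumptions, not part of the DeepPavlov API.

```python
from typing import Dict, List

OOS = "oos"  # assumed name of the out-of-scope label


def oos_metrics(gold: List[str], pred: List[str]) -> Dict[str, float]:
    """Compute in-domain accuracy, OOS recall and OOS precision (assumed definitions)."""
    pairs = list(zip(gold, pred))
    in_domain = [(g, p) for g, p in pairs if g != OOS]  # gold label is a real class
    gold_oos = [(g, p) for g, p in pairs if g == OOS]   # truly out-of-scope
    pred_oos = [(g, p) for g, p in pairs if p == OOS]   # predicted out-of-scope

    def ratio(hits: int, total: int) -> float:
        # Return 0.0 instead of failing when a group is empty.
        return hits / total if total else 0.0

    return {
        "in_domain_accuracy": ratio(sum(g == p for g, p in in_domain), len(in_domain)),
        "oos_recall": ratio(sum(p == OOS for _, p in gold_oos), len(gold_oos)),
        "oos_precision": ratio(sum(g == OOS for g, _ in pred_oos), len(pred_oos)),
    }


gold = ["car_rental", "translate", "exchange_rate", "oos", "oos"]
pred = ["car_rental", "oos", "exchange_rate", "oos", "oos"]
metrics = oos_metrics(gold, pred)
```

In this toy example one in-domain utterance is wrongly rejected as out-of-scope, which lowers both in-domain accuracy and out-of-scope precision while leaving out-of-scope recall untouched.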
4. Use the model for prediction¶
The base model few_shot_roberta is already pre-trained to recognize similar utterances, so you can use it off the shelf for prediction and evaluation. No additional training is needed.
4.1 Dataset format¶
The DNNC (Discriminative Nearest Neighbor Classification) model compares the input text with every example in the dataset to determine which class the input belongs to. The dataset on which classification is based has the following format:
[
["text_1", "label_1"],
["text_2", "label_2"],
...
["text_n", "label_n"]
]
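If your examples are grouped by label, a small helper can flatten them into the list-of-pairs format shown above. This is a minimal sketch; `build_dataset` and `examples_by_label` are hypothetical names introduced for illustration and are not provided by DeepPavlov.

```python
from typing import Dict, List


def build_dataset(examples_by_label: Dict[str, List[str]]) -> List[List[str]]:
    """Flatten {label: [texts]} into [[text, label], ...]."""
    return [
        [text, label]
        for label, texts in examples_by_label.items()
        for text in texts
    ]


examples_by_label = {
    "car_rental": ["how can i rent a car in boston"],
    "exchange_rate": ["how many pesos can i get for one dollar"],
}
dataset = build_dataset(examples_by_label)
```

The resulting `dataset` has the same shape as the second argument passed to the model in the prediction example.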
4.2 Predict using Python¶
After installing the model, build it from the config and predict.
[ ]:
from deeppavlov import build_model
model = build_model("few_shot_roberta", download=True)
If you set the download flag to True, existing model weights will be overwritten.
Setting the install argument to True is equivalent to running the install command from the command line: all required packages are installed first.
Input: List[texts, dataset]
Output: List[labels]
[2]:
texts = [
"what expression would i use to say i love you if i were an italian",
"what's the currency conversion between krones and yen",
"i'd like to reserve a high-end car"
]
dataset = [
["please help me book a rental car for nashville", "car_rental"],
["how can i rent a car in boston", "car_rental"],
["help me get a rental car for march 2 to 6th", "car_rental"],
["how many pesos can i get for one dollar", "exchange_rate"],
["tell me the exchange rate between rubles and dollars", "exchange_rate"],
["what is the exchange rate in pesos for 100 dollars", "exchange_rate"],
["can you tell me how to say 'i do not speak much spanish', in spanish", "translate"],
["please tell me how to ask for a taxi in french", "translate"],
["how would i say thank you if i were russian", "translate"]
]
model(texts, dataset)
[2]:
['translate', 'exchange_rate', 'car_rental']
4.3 Predict using CLI¶
You can also get predictions in interactive mode through the CLI (Command Line Interface).
[ ]:
!python -m deeppavlov interact few_shot_roberta -d
-d is an optional download key (an alternative to download=True in Python code). The key -d is used to download the pre-trained model along with all other files needed to run the model.
Or make predictions for samples read from a file (without -f, samples are read from stdin).
[ ]:
!python -m deeppavlov predict few_shot_roberta -f <file-name>
5. Customize the model¶
Out-of-scope (OOS) examples are detected by comparing the model's confidence with the confidence_threshold parameter. For each input text, if the confidence of the model is lower than confidence_threshold, the input is considered out-of-scope. The higher the threshold, the more often the model predicts the “oos” class. By default the threshold is set to 0, but you can change it in the configuration file.
[4]:
from deeppavlov import build_model
from deeppavlov.core.commands.utils import parse_config
model_config = parse_config('few_shot_roberta')
model_config['chainer']['pipe'][-1]['confidence_threshold'] = 0.1
model = build_model(model_config)
0.0
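The effect of confidence_threshold can be illustrated with a toy nearest-neighbour classifier. This is only a sketch of the thresholding rule described in this section: the `token_overlap` similarity is a simple stand-in chosen for illustration and is not the score the RoBERTa-based model computes, and the function names are hypothetical.

```python
from typing import List


def token_overlap(a: str, b: str) -> float:
    """Toy similarity: Jaccard overlap of lowercase tokens (stand-in for the model score)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def classify(text: str, dataset: List[List[str]], confidence_threshold: float = 0.0) -> str:
    """Return the label of the most similar example, or "oos" if confidence is too low."""
    best_score, best_label = max(
        (token_overlap(text, example), label) for example, label in dataset
    )
    # Confidence lower than the threshold means the input is treated as out-of-scope.
    return best_label if best_score >= confidence_threshold else "oos"


toy_dataset = [
    ["how can i rent a car in boston", "car_rental"],
    ["how many pesos can i get for one dollar", "exchange_rate"],
]
in_scope = classify("i want to rent a car", toy_dataset, confidence_threshold=0.3)
rejected = classify("play some jazz music", toy_dataset, confidence_threshold=0.3)
forced = classify("play some jazz music", toy_dataset, confidence_threshold=0.0)
```

With a threshold of 0 the classifier always returns one of the known labels, even for an unrelated utterance, which is why the default setting performs no out-of-scope detection.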