deeppavlov.models.ner

class deeppavlov.models.ner.network.NerNetwork(n_tags: int, token_emb_dim: int = None, char_emb_dim: int = None, capitalization_dim: int = None, pos_features_dim: int = None, additional_features: int = None, net_type: str = 'rnn', cell_type: str = 'lstm', use_cudnn_rnn: bool = False, two_dense_on_top: bool = False, n_hidden_list: Tuple[int] = (128, ), cnn_filter_width: int = 7, use_crf: bool = False, token_emb_mat: numpy.ndarray = None, char_emb_mat: numpy.ndarray = None, use_batch_norm: bool = False, dropout_keep_prob: float = 0.5, embeddings_dropout: bool = False, top_dropout: bool = False, intra_layer_dropout: bool = False, l2_reg: float = 0.0, clip_grad_norm: float = 5.0, learning_rate: float = 0.003, gpu: int = None, seed: int = None, lr_drop_patience: int = 5, lr_drop_value: float = 0.1, **kwargs)

NerNetwork is a neural network for Named Entity Recognition and Slot Filling.

Parameters:
  • n_tags – Number of tags in the tag vocabulary.
  • token_emb_dim – Dimensionality of token embeddings, needed if embedding matrix is not provided.
  • char_emb_dim – Dimensionality of character embeddings.
  • capitalization_dim – Dimensionality of capitalization features, if they are provided.
  • pos_features_dim – Dimensionality of POS features, if they are provided.
  • additional_features – Dimensionality of any other additional features, if they are provided.
  • net_type – Type of the network, either 'rnn' or 'cnn'.
  • cell_type – Type of the cell in RNN, either 'lstm' or 'gru'.
  • use_cudnn_rnn – Whether to use CUDNN implementation of RNN.
  • two_dense_on_top – Additional dense layer before predictions.
  • n_hidden_list – Output feature dimensionality of each layer. For example, (100, 200) means that there will be two layers with 100 and 200 units, respectively.
  • cnn_filter_width – The width of the convolutional kernel for Convolutional Neural Networks.
  • use_crf – Whether to use Conditional Random Fields on top of the network (recommended).
  • token_emb_mat – Token embeddings matrix.
  • char_emb_mat – Character embeddings matrix.
  • use_batch_norm – Whether to use Batch Normalization or not. Affects only CNN networks.
  • dropout_keep_prob – Probability of keeping the hidden state, values from 0 to 1. 0.5 works well in most cases.
  • embeddings_dropout – Whether to use dropout on embeddings or not.
  • top_dropout – Whether to use dropout on output units of the network or not.
  • intra_layer_dropout – Whether to use dropout between layers or not.
  • l2_reg – L2 norm regularization for all kernels.
  • clip_grad_norm – Clip the gradients by norm.
  • learning_rate – Learning rate to use during training (usually from 0.1 to 0.0001).
  • gpu – Index of the GPU device to use.
  • seed – Random seed.
  • lr_drop_patience – Number of epochs to wait before dropping the learning rate.
  • lr_drop_value – Amount of learning rate drop.
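
The parameters above can be collected into a keyword dictionary before the network is built. The following is a minimal sketch, not part of DeepPavlov: the helper check_ner_kwargs is hypothetical and only mirrors a few constraints stated in the parameter list (the allowed net_type and cell_type values, the 0 to 1 range of dropout_keep_prob, and the need for token_emb_dim when no token_emb_mat is given). The positivity check on n_tags and all concrete values are assumptions chosen for illustration.

```python
from typing import Any, Dict

# Allowed values as stated in the parameter list above.
ALLOWED_NET_TYPES = ("rnn", "cnn")
ALLOWED_CELL_TYPES = ("lstm", "gru")


def check_ner_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: validate a few documented constraints.

    Not part of DeepPavlov; it only mirrors the parameter descriptions.
    """
    # Assumption: a tag vocabulary must contain at least one tag.
    if kwargs.get("n_tags", 0) <= 0:
        raise ValueError("n_tags must be a positive integer")
    if kwargs.get("net_type", "rnn") not in ALLOWED_NET_TYPES:
        raise ValueError("net_type must be 'rnn' or 'cnn'")
    if kwargs.get("cell_type", "lstm") not in ALLOWED_CELL_TYPES:
        raise ValueError("cell_type must be 'lstm' or 'gru'")
    if not 0.0 <= kwargs.get("dropout_keep_prob", 0.5) <= 1.0:
        raise ValueError("dropout_keep_prob must lie in [0, 1]")
    # token_emb_dim is needed if the embedding matrix is not provided.
    if kwargs.get("token_emb_dim") is None and kwargs.get("token_emb_mat") is None:
        raise ValueError("provide token_emb_dim or token_emb_mat")
    return kwargs


# Illustrative values only; none of them come from a real configuration.
ner_kwargs = check_ner_kwargs({
    "n_tags": 9,                  # size of the tag vocabulary
    "token_emb_dim": 100,         # set because no token_emb_mat is passed
    "char_emb_dim": 25,
    "net_type": "rnn",
    "cell_type": "lstm",
    "n_hidden_list": (100, 200),  # two layers with 100 and 200 units
    "use_crf": True,
    "dropout_keep_prob": 0.5,
    "learning_rate": 0.003,
})

# With DeepPavlov installed, the checked dictionary could be passed on as:
#   from deeppavlov.models.ner.network import NerNetwork
#   network = NerNetwork(**ner_kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to validate or log them before the network allocates any resources; parameters left out of the dictionary fall back to the defaults shown in the signature.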