Improving Arabic Named Entity Recognition with a Modified Transformer Encoder
- 1 Department of Computer Science, College of Basic Education, University of Diyala, Iraq
- 2 Department of Computer Science, Faculty of Computer Science and Mathematics, University of Kufa, Iraq
Abstract
This article investigates the use of a transformer encoder for Arabic Named Entity Recognition (NER). The original Transformer, proposed for machine translation, uses absolute sinusoidal position embeddings, which capture the distance between tokens but not the direction. In the NER task, however, both distance and direction are crucial. Therefore, instead of absolute sinusoidal position encoding, we employ relative positional encoding and incorporate directionality information into our NER model. More specifically, our proposed model uses a Bidirectional Long Short-Term Memory (BiLSTM) network to encode every input token. The encoder output is then fed to a multi-head attention layer, where both distance and directionality information are incorporated. A decoder layer consisting of a simple fully connected layer takes the attention output as input, and a prediction layer with Conditional Random Fields (CRF) predicts the tag of each token. We validate the proposed approach on a merger of two public datasets, ANERcorp and AQMAR. Our experimental results demonstrate significant improvements compared with the vanilla Transformer with absolute sinusoidal position encoding, and the model achieves a state-of-the-art result on the merged dataset.
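The claim that sinusoidal position embeddings are aware of distance but not of direction can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `sinusoidal_encoding`, the dimension `d_model=64`, and the positions `t` and `k` are illustrative assumptions, and the "relative" variant shown here (a sinusoid evaluated at a signed offset) is only one possible way to expose direction, since the abstract does not specify the exact formulation used.

```python
import numpy as np

def sinusoidal_encoding(positions, d_model=64):
    """Sinusoidal encodings for integer positions (negative values allowed).

    d_model is an illustrative choice, not a value taken from the article.
    """
    positions = np.asarray(positions, dtype=np.float64)[:, None]      # (n, 1)
    freqs = 1.0 / (10000.0 ** (np.arange(0, d_model, 2) / d_model))   # (d_model/2,)
    angles = positions * freqs                                        # (n, d_model/2)
    enc = np.empty((positions.shape[0], d_model))
    enc[:, 0::2] = np.sin(angles)   # even dimensions: sine
    enc[:, 1::2] = np.cos(angles)   # odd dimensions: cosine
    return enc

# Absolute encoding: the dot product between positions t and t+k reduces to
# sum_i cos(w_i * k), which is an even function of k.
pe = sinusoidal_encoding(np.arange(50))
t, k = 20, 7                      # hypothetical positions for the demo
forward = pe[t] @ pe[t + k]       # token k steps to the right
backward = pe[t] @ pe[t - k]      # token k steps to the left
same_score = np.isclose(forward, backward)   # direction is not visible

# Assumed relative variant: encode the signed offset itself. The sine
# components flip sign with the offset, so +k and -k get different vectors.
rel = sinusoidal_encoding([-k, k])
direction_visible = not np.allclose(rel[0], rel[1])

print("absolute scores equal for +k and -k:", same_score)
print("relative encodings differ for +k and -k:", direction_visible)
```

In this sketch the cosine components of the offset encoding carry the distance (they are identical for `+k` and `-k`), while the sine components carry the sign, which is one way to see why a signed relative encoding can separate left context from right context.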
DOI: https://doi.org/10.3844/jcssp.2023.599.609
Copyright: © 2023 Hamid Sadeq Mahdi Alsultani and Ahmed H. Aliwy. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Named Entity Recognition
- Transformer
- Natural Language Processing