Author: Abbad, Hamza; Xiong, Shengwu
                    Title: Multi-components System for Automatic Arabic Diacritization  Cord-id: w031hm7f  Document date: 2020_3_17
                    ID: w031hm7f
                    
                                Document: In this paper, we propose an approach to the automatic restoration of Arabic diacritics that stacks three components in a pipeline: a deep learning model (a multi-layer recurrent neural network with LSTM and Dense layers), a character-level rule-based corrector that applies deterministic operations to prevent some errors, and a word-level statistical corrector that uses context and distance information to fix some diacritization issues. The approach is novel in that it combines methods of different types and adds edit-distance-based corrections. We used a large public dataset of raw diacritized Arabic text (Tashkeela) for training and testing our system, after cleaning and normalizing it. On a newly released benchmark test set, our system outperformed all the tested systems, achieving a DER of 3.39% and a WER of 9.94% when taking all Arabic letters into account, and a DER of 2.61% and a WER of 5.83% when ignoring the diacritization of the last letter of every word.
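
The two reported metrics can be illustrated with a minimal sketch. It assumes the usual reading of DER as the fraction of letters whose diacritics are wrong and WER as the fraction of words containing at least one such letter. The function names, the diacritic set (U+064B to U+0652), and the choice to count every non-diacritic character as a letter are assumptions made for illustration, not the authors' evaluation script.

```python
# Assumed diacritic set: fathatan .. sukun (U+064B to U+0652).
DIACRITICS = set("\u064b\u064c\u064d\u064e\u064f\u0650\u0651\u0652")


def split_word(word):
    """Return a list of (letter, diacritics) pairs for one word."""
    pairs = []
    for ch in word:
        if ch in DIACRITICS:
            if pairs:  # attach the mark to the preceding letter
                letter, marks = pairs[-1]
                pairs[-1] = (letter, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


def der_wer(reference, prediction, ignore_last=False):
    """Compute (DER, WER) as fractions between 0 and 1.

    ignore_last=True skips the last letter of every word, mirroring the
    second pair of scores quoted in the abstract.
    """
    ref_words, pred_words = reference.split(), prediction.split()
    if len(ref_words) != len(pred_words):
        raise ValueError("reference and prediction differ in word count")

    letters = letter_errors = word_errors = 0
    for ref_word, pred_word in zip(ref_words, pred_words):
        ref, pred = split_word(ref_word), split_word(pred_word)
        if [l for l, _ in ref] != [l for l, _ in pred]:
            raise ValueError("reference and prediction differ in letters")
        if ignore_last:
            ref, pred = ref[:-1], pred[:-1]
        # Compare marks order-insensitively (e.g. shadda + vowel).
        wrong = sum(sorted(r[1]) != sorted(p[1]) for r, p in zip(ref, pred))
        letters += len(ref)
        letter_errors += wrong
        word_errors += 1 if wrong else 0

    der = letter_errors / letters if letters else 0.0
    wer = word_errors / len(ref_words) if ref_words else 0.0
    return der, wer
```

Calling `der_wer(reference, prediction)` scores every letter, while `der_wer(reference, prediction, ignore_last=True)` excludes word-final letters, where case endings make diacritization hardest.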
 