Methods of Arabic Language Baseline Detection: The State of Art


Atallah AL-Shatnawi, Khairuddin Omar


Vol. 8  No. 10  pp. 137-143


Preprocessing is the most important stage in an Arabic OCR system; it directly affects the reliability and efficiency of the segmentation and feature extraction stages. Arabic is written cursively, and each character takes between two and four shapes depending on its position in the word. An Arabic word typically consists of two or more characters connected along an imaginary line called the baseline. Detecting the baseline is one of the main tasks in preprocessing for Arabic OCR, since the baseline can be used for both skew normalization and character segmentation. This paper aims to provide a comprehensive review of the methods researchers have proposed to detect the Arabic baseline. These methods fall into four categories: (a) horizontal projection, (b) word skeleton, (c) contour tracing, and (d) principal component analysis. Each category has its own advantages and drawbacks.
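As an illustration of the first category, the general idea behind horizontal projection can be sketched as follows: count the ink pixels in each row of a binarized word image and take the row with the highest count as the baseline estimate. This is a minimal sketch of the generic technique, not a reproduction of any specific method reviewed in the paper. The function names, the toy image, and the assumptions that the input is already binarized (1 = ink, 0 = background) and not skewed are all illustrative.

```python
def horizontal_projection(image):
    """Count ink pixels (value 1) in each row of a binary image.

    `image` is assumed to be a list of equal-length rows of 0/1 values.
    """
    return [sum(row) for row in image]


def estimate_baseline_row(image):
    """Return the index of the row with the most ink pixels.

    Assumption: the text line is not skewed, so the baseline shows up
    as the peak of the horizontal projection profile. On ties, the
    topmost peak row is returned.
    """
    if not image:
        raise ValueError("image must contain at least one row")
    profile = horizontal_projection(image)
    return max(range(len(profile)), key=profile.__getitem__)


# Hypothetical 6x8 binary image of a connected word: row 3 holds the
# horizontal stroke that joins the characters.
word = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]

baseline_row = estimate_baseline_row(word)
```

A simple peak search like this relies on the line being horizontal and reasonably long; skewed or very short handwritten words flatten the profile, which is one reason the other three categories exist.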


Preprocessing, OCR, Handwritten, Offline, Arabic Baseline