Master's thesis defense: Fariba Haghbin
Speaker: Fariba Haghbin
Supervisor: Dr. C. Y. Suen
Examining Committee: Drs. T. D. Bui, L. Lam, R. Jayakumar (Chair)
Title: Dynamic Learning Approach for Arabic Word Spotting and Word Recognition
Date: Friday, June 6, 2014
Place: EV 3.309
Developing an Arabic word spotting system is a unique challenge, since Arabic script is cursive by nature. In addition, there are no clear boundaries between handwritten Arabic words, and the words can contain overlapping and touching Pieces of Arabic Words (PAWs). Arabic word spotting methods that are based on models of sub-words or PAWs fail to spot words with disconnected and touching PAWs.
In this thesis, we propose an effective method for Arabic word spotting that integrates over-segmentation with the recognition results from a discriminant classifier and uses a dynamic programming algorithm to spot and recognize Arabic handwritten words. Geometric models are used to refine the results. The proposed word spotting method was applied to the CENPARMI Arabic document database. The results are promising: the proposed system obtained an average precision of 62.83% on 73 documents, which compares well with other studies.
We also propose a new rule-based approach to handle overlapping PAWs and to reduce computation time. We reduced the dynamic programming computation by reducing the number of lines that must be searched to find a keyword. The techniques were applied to set (e) of the IFN/ENIT database for the purpose of word recognition. The over-segmentation approach combined with the rule-based approach resulted in an over-segmentation rate of 99.8%.
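The general idea of combining over-segmentation with classifier scores through dynamic programming can be illustrated with a minimal sketch. It is not the implementation from the thesis. It assumes that a text line has already been cut into n_pieces primitive pieces, and that a hypothetical function score(i, j, ch) returns a classifier log-score for the hypothesis that pieces i to j-1 together form the keyword unit ch. The names align_keyword, score and max_merge are illustrative only.

```python
from typing import Callable, List, Tuple

NEG_INF = float("-inf")


def align_keyword(
    n_pieces: int,
    keyword: str,
    score: Callable[[int, int, str], float],
    max_merge: int = 3,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align keyword units to groups of consecutive over-segmented pieces.

    best[k][j] is the highest total score for matching the first k keyword
    units to the first j pieces, where each unit covers 1..max_merge pieces.
    Returns the best total score and the (start, end) piece span per unit.
    """
    m = len(keyword)
    best = [[NEG_INF] * (n_pieces + 1) for _ in range(m + 1)]
    back = [[-1] * (n_pieces + 1) for _ in range(m + 1)]
    best[0][0] = 0.0  # nothing matched, nothing consumed

    for k in range(1, m + 1):
        for j in range(1, n_pieces + 1):
            # Try every allowed start i for the group that ends at piece j.
            for i in range(max(0, j - max_merge), j):
                if best[k - 1][i] == NEG_INF:
                    continue  # the prefix is not reachable
                cand = best[k - 1][i] + score(i, j, keyword[k - 1])
                if cand > best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i

    if best[m][n_pieces] == NEG_INF:
        return NEG_INF, []  # no valid alignment exists

    # Follow the back-pointers to recover the chosen piece spans.
    spans: List[Tuple[int, int]] = []
    j = n_pieces
    for k in range(m, 0, -1):
        i = back[k][j]
        spans.append((i, j))
        j = i
    spans.reverse()
    return best[m][n_pieces], spans
```

In a word spotting setting, a sketch like this would be run on candidate regions of each line and the resulting score compared against a threshold. Limiting the number of lines searched, as described above, reduces how often the alignment has to be computed.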