Script Identification of Machine Printed Documents Using NN Classifiers
Multi Script Document, Script Identification, OCR, Tick components, Bottom components, KNN, Heuristic Search.
In recent years, advances in computer technology have led to many multimedia documents being captured and stored, and the demand for recognizing and retrieving such documents has increased tremendously. In such an environment, the large volume of data and the variety of scripts make manual identification unworkable. The ability to automatically determine the script, and further the language, of a document would reduce the time and cost of document handling. Developing systems that identify the script in multilingual document images, and then retrieve document images by matching them against a query image (input image), has therefore become an important task. Research in this field is relatively thin and more remains to be done, particularly for machine printed and handwritten documents. Here we present a method that identifies the script in machine printed document images automatically, without manual intervention. The objective of this paper is to develop a procedure to identify the different text portions of a document. In this work, eight features, namely top max row, bottom max row, top horizontal lines, vertical lines, bottom components, tick components, top holes and bottom holes, are used to identify the script type. Using these features, two methods, a heuristic-based algorithm and a KNN approach, are proposed to identify the script type among Telugu, Hindi, English and Bangla. A large number of different approaches to script recognition are currently available in OCR systems.
In this paper we aim to identify the script of multilingual documents. In the proposed script identification system, we consider several scripts found in Indian documents: English, Devanagari, Kannada, Gurumukhi and Bangla.
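As an illustration of the KNN approach mentioned in the abstract, the following is a minimal sketch of a k-nearest-neighbour vote over an eight-element feature vector. It is not the authors' implementation: the feature encoding (one number per feature), the Euclidean distance, the helper names `euclidean` and `knn_predict`, the placeholder labels `script_A` and `script_B`, and the toy training values are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter

# The eight features named in the abstract. Encoding each one as a single
# number per text line is an assumption of this sketch.
FEATURES = (
    "top_max_row", "bottom_max_row", "top_horizontal_lines", "vertical_lines",
    "bottom_components", "tick_components", "top_holes", "bottom_holes",
)


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs. On a tied vote, the
    label whose sample is closest to the query wins, because votes are
    counted in order of increasing distance.
    """
    if len(query) != len(FEATURES):
        raise ValueError("expected %d features" % len(FEATURES))
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy samples with placeholder labels, for illustration only.
TOY_TRAIN = [
    ([1, 0, 1, 0, 0, 0, 0, 0], "script_A"),
    ([1, 0, 1, 1, 0, 0, 0, 0], "script_A"),
    ([0, 1, 0, 0, 1, 1, 1, 1], "script_B"),
    ([0, 1, 0, 1, 1, 1, 0, 1], "script_B"),
]

if __name__ == "__main__":
    print(knn_predict(TOY_TRAIN, [1, 0, 1, 0, 0, 0, 1, 0], k=3))
```

In practice the feature values would come from an image-processing stage that measures each feature on a segmented text line, and the choice of k and of the distance measure would need to be tuned on labelled data.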
Ankita Ahuja, Renu Singla. "Script Identification of Machine Printed Documents Using NN Classifiers". INTERNATIONAL JOURNAL OF ENGINEERING DEVELOPMENT AND RESEARCH, ISSN: 2321-9939, Vol. 3, Issue 2, pp. 248-251, URL: https://rjwave.org/ijedr/papers/IJEDR1502046.pdf
Volume 3 Issue 2
Pages 248-251