Deep Extreme Learning Machine-Based Optical Character Recognition System for Nastalique Urdu-Like Script Languages

Rizvi, Syed Saqib Raza; Khan, Muhammad Adnan; Abbas, Sagheer; Asadullah, Muhammad; Anwer, Nida; Fatima, Areej

Published in

Oxford University Press, The Computer Journal, 2(65), p. 331-344, 2020

DOI: 10.1093/comjnl/bxaa042

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving restricted

Published version: archiving forbidden

Abstract

Optical character recognition (OCR) systems convert printed or handwritten scripts into digital text formats such as ASCII or Unicode. Urdu-like script languages such as Urdu, Punjabi and Sindhi are widely spoken, especially in Asia. An enormous amount of printed and handwritten text in these languages exists and needs to be converted into computer-understandable formats for knowledge extraction. This study proposes an OCR system based on the deep extreme learning machine (DELM), the most recently proposed variant of the extreme learning machine (ELM), to improve the character recognition rate for Urdu-like script languages. The proposed DELM-based character recognition model optimizes the OCR process by reducing the overhead of the Pre-processing, Segmentation and Feature Extraction layers. In evaluation, the proposed system achieved 98.75% training accuracy with an RMSE of 1.492 × 10⁻³ and 98.12% testing accuracy with an RMSE of 1.587 × 10⁻³, using six DELM hidden layers. The results show that the proposed system attains a higher recognition rate than any previously proposed OCR system for Urdu-like script languages. The technique is applicable to machine-printed text and partially useful for handwritten text as well. This study will aid the development of more accurate OCR software for Urdu-like scripts.
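
For readers unfamiliar with the ELM family that DELM extends, the following is a minimal single-hidden-layer ELM sketch in Python with NumPy. It is not the authors' DELM model or their OCR pipeline: the function names (train_elm, predict_elm, rmse), the tanh activation, the hidden-layer size and the random toy data are all assumptions chosen for illustration. It shows only the two generic ideas the abstract relies on: hidden weights that are drawn at random and left untrained while the output weights are solved in closed form, and RMSE as the error measure.

```python
import numpy as np


def train_elm(X, T, n_hidden=64, seed=0):
    """Fit a basic single-hidden-layer ELM (illustrative, not the paper's DELM).

    X: (n_samples, n_features) inputs; T: (n_samples, n_outputs) targets.
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer activations
    # Output weights: least-squares solution via the Moore-Penrose pseudo-inverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


def rmse(Y, T):
    """Root-mean-square error between predictions Y and targets T."""
    return float(np.sqrt(np.mean((Y - T) ** 2)))


# Toy usage on random data (hypothetical; stands in for character features).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(40, 8))
T_demo = np.eye(2)[demo_rng.integers(0, 2, size=40)]  # one-hot class targets
W, b, beta = train_elm(X_demo, T_demo, n_hidden=64, seed=0)
print("training RMSE:", rmse(predict_elm(X_demo, W, b, beta), T_demo))
```

A deep variant would stack several hidden layers; the abstract reports six for DELM but does not say how they are trained, so that part is left out of the sketch.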
