Optical Character Recognition (OCR) in Cursive Scripts Using Object Detection Networks

Gandomkar, Mojtaba; Khoramipour, Sahar

doi:10.22034/tjee.2024.62945.4877

Tabriz Journal of Electrical Engineering
Volume 55, Issue 1 (Serial No. 111), Khordad 1404 (May/June 2025), pp. 67-76
Article type: Research article
Authors
Mojtaba Gandomkar*¹; Sahar Khoramipour²
¹ Jundi-Shapur University of Technology, Dezful, Iran
² Jundi-Shapur University of Technology, Dezful, Iran
Abstract
Optical character recognition (OCR) in cursive scripts, where the letters of a word are joined and overlap both horizontally and vertically, faces major challenges in segmenting characters that have not yet been recognized and in recognizing characters that have not yet been segmented. This paper proposes using object detection models to recognize characters in cursive scripts, and examines the simplicity of implementation and the effectiveness of this approach on handwriting-style fonts. As a case study, a YOLO network is used to segment and classify the characters of arbitrary three-letter words in Persian script. We first generated a dataset suitable for YOLO from handwriting-style Persian fonts such as Maneli and IranNastaliq. With YOLO we achieved an accuracy above 98.5% in recognizing characters in the Maneli font and 97.6% for the combination of words in the Maneli and IranNastaliq fonts. We then tested the limits of the proposed model's accuracy by adding noise, blur, and skew to the samples. In addition, we used a multilayer perceptron (MLP) to predict words from the characters detected and localized by YOLO, with an accuracy above 97.7%. This approach makes it possible to recognize complete words in complex handwriting-style fonts accurately without using a Persian dictionary.
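The hand-off from character detection to word prediction can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` record, the 0.5 confidence cut-off, the right-to-left sort, and the five-values-per-character feature layout are all assumptions made for illustration. It shows a naive baseline that reads boxes from right to left, and one hypothetical way to flatten detected classes and positions into a fixed-length vector that an MLP could take as input.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    """One character box in normalized coordinates (hypothetical layout)."""
    label: str         # predicted character class
    confidence: float  # detector score in [0, 1]
    x_center: float    # 0.0 = left edge of the image, 1.0 = right edge
    y_center: float
    width: float
    height: float


def order_right_to_left(dets: Sequence[Detection],
                        min_conf: float = 0.5) -> List[Detection]:
    """Drop low-confidence boxes, then sort by x from right to left."""
    kept = [d for d in dets if d.confidence >= min_conf]
    return sorted(kept, key=lambda d: d.x_center, reverse=True)


def naive_word(dets: Sequence[Detection], min_conf: float = 0.5) -> str:
    """Baseline only: join labels in right-to-left order (Persian reading order)."""
    return "".join(d.label for d in order_right_to_left(dets, min_conf))


def mlp_features(dets: Sequence[Detection], class_names: Sequence[str],
                 slots: int = 3, min_conf: float = 0.5) -> List[float]:
    """Flatten up to `slots` boxes into a fixed-length vector (assumed layout).

    Each slot holds [class_index + 1, x_center, y_center, width, height];
    unused slots are zero-padded, so class code 0 means "empty slot".
    """
    features: List[float] = []
    for d in order_right_to_left(dets, min_conf)[:slots]:
        features += [float(class_names.index(d.label) + 1),
                     d.x_center, d.y_center, d.width, d.height]
    features += [0.0] * (5 * slots - len(features))
    return features


# Made-up detections for the three-letter word "درس", listed out of order.
detections = [
    Detection("س", 0.97, 0.20, 0.55, 0.30, 0.40),
    Detection("د", 0.99, 0.82, 0.50, 0.15, 0.35),
    Detection("ر", 0.95, 0.55, 0.60, 0.18, 0.30),
    Detection("ن", 0.20, 0.40, 0.50, 0.10, 0.10),  # spurious, below the cut-off
]
print(naive_word(detections))
print(mlp_features(detections, class_names=["د", "ر", "س", "ن"]))
```

A pure x-coordinate sort can fail when characters overlap vertically or horizontally, as the abstract describes for cursive scripts. That may be one reason to pass positions to a learned model instead of relying on geometry alone.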
Keywords
Optical character recognition (OCR); object detection; YOLO network; multilayer perceptron (MLP) neural network; Persian script; handwriting-style fonts
