AI-Based OCR for Digitizing Ancient Indian Texts: Preserving Linguistic Heritage and Overcoming Script Challenges

Authors

  • Shivraj Gaikwad, Undergraduate Student (AI & DS), Vishwakarma Institute of Information Technology, Pune
  • Renu Kachhoria, Associate Professor (AI & DS), Vishwakarma Institute of Information Technology, Pune
  • Gitanjali Yadav, Associate Professor (AI & DS), Vishwakarma Institute of Information Technology, Pune

DOI:

https://doi.org/10.69889/ijlapt.v2i03(Mar).102

Keywords:

AI-based OCR, ancient Indian texts, script digitization, linguistic heritage, deep learning, natural language processing, manuscript preservation, historical texts, script recognition, cultural preservation.

Abstract

Preserving India's linguistic and cultural heritage depends on preserving its traditional literature. Many ancient manuscripts remain unintelligible or inaccessible because of linguistic variation, script degradation, and a lack of digital copies. This study applies AI-based optical character recognition (OCR) to the digitization of ancient Indian manuscripts. The work addresses the challenges posed by the texts themselves, their intricate language systems, and their scripts. Using deep learning models and natural language processing (NLP), AI-based OCR systems improve text recognition for the Devanagari, Tamil, Grantha, and Brahmi scripts. To resolve recognition problems, the study combines OCR with extensive pre-processing, character segmentation, and language modelling. The results suggest that AI can support work on indigenous knowledge, language studies, and classical literature. The study concludes that AI is essential for cultural preservation and the study of ancient texts.

Published

2025-04-05

How to Cite

Shivraj Gaikwad, Renu Kachhoria, & Gitanjali Yadav. (2025). AI-Based OCR for Digitizing Ancient Indian Texts: Preserving Linguistic Heritage and Overcoming Script Challenges. International Journal of Linguistics Applied Psychology and Technology (IJLAPT), 2(03(Mar)), 1–12. https://doi.org/10.69889/ijlapt.v2i03(Mar).102

Section

Articles