ShodhKosh: Journal of Visual and Performing Arts
ISSN (Online): 2582-7472
Machine Translation for Folk Narratives in Education
Dr. Prashant Wakhare 1
1 AISSMS Institute of Information Technology, Pune, Maharashtra, India
2 Department of Artificial Intelligence and Data Science, Vidya Pratikshan's Kamalnayan Bajaj Institute of Engineering and Technology, Baramati, Maharashtra, India
3 Savitribai Phule Pune University, Pune, Maharashtra, India
4 Department of Artificial Intelligence and Data Science, AISSMS Institute of Information Technology, Pune-01, Maharashtra, India
1. INTRODUCTION
The rapid growth of online education has heightened the need for multilingual learning materials that are not only linguistically precise but also culturally meaningful. In that regard, machine translation (MT) is critical for providing access to educational resources across language barriers. Although MT systems have been quite successful in translating technical, scientific, and informational texts, they perform poorly on culturally embedded documents such as folk narratives. Folk narratives (myths, legends, folktales, and oral histories) are deeply rooted in local traditions, collective memory, and symbolic language. Translating them requires sensitivity to metaphor, cultural allusion, narrative rhythm, and moral framing that goes beyond word-for-word translation.
Folk narratives serve several pedagogical purposes in educational institutions. They support language learning, transmit ethical values, preserve indigenous knowledge, and promote intercultural awareness among learners. Incorporating folk narratives into curricula has been found to increase learner engagement because it links abstract concepts to familiar elements of a particular culture Ba’ai and Aris (2024). Nevertheless, linguistic diversity and the dominance of a few international languages in educational institutions tend to marginalize regional and indigenous narratives. Consequently, most students cannot access culturally relevant learning materials in their native or preferred language. Machine translation offers a scalable solution to this problem, but standard MT models are not tailored to the peculiarities of folklore discourse Foroughi et al. (2025).
Figure 1 illustrates how AI-based translation takes cultural background into account so that folk narratives remain intact. Classical MT methods, such as rule-based and statistical systems, rely on predetermined linguistic rules or on probabilistic matches derived from parallel corpora. These techniques struggle with idioms, figurative speech, and oral storytelling features, which are frequent in folk narratives.
Figure 1
Table 1 Related Work on Machine Translation, Cultural Texts, and Educational Applications
| MT Paradigm | Domain Focus | Cultural Handling | Evaluation Metrics | Key Limitations |
|---|---|---|---|---|
| Statistical MT | General Text | None | BLEU | Poor idiom handling |
| Rule-Based MT | Literary Text | Rule-driven | Human judgment | Not scalable |
| Phrase-based SMT Lucas-Moreira and Núñez-Díaz (2025) | News/Text | Minimal | BLEU, TER | Cultural loss |
| Early NMT | General Domain | Implicit | BLEU | Data dependent |
| Transformer NMT | Multidomain | Implicit | BLEU, METEOR | Cultural neutrality |
| NMT Pavlidis (2025) | Literary Translation | Partial stylistic | BLEU, human eval | Figurative loss |
| SMT + NMT | Indigenous Texts | Manual post-edit | BLEU | Sparse data |
| Multilingual NMT Theodoropoulos et al. (2023) | Low-resource | Transfer-based | BLEU | Limited culture modeling |
| NLP Ethics | Cultural Texts | Conceptual focus | Qualitative | No MT system |
| Domain-adapted NMT | Folklore | Partial annotation | BLEU, TER | No pedagogy |
| NMT + Annotation | Narrative Texts | Figurative tagging | METEOR | Not educational |
| NMT Thomas (2024) | Regional Education | Minimal | BLEU, usability | Cultural dilution |
| Cultural NMT | Folk Narratives | Explicit cultural embeddings | BLEU, Cultural Fidelity, Edu Score | - |
3. Theoretical Framework
3.1. Cultural linguistics and narrative preservation
Cultural linguistics offers a critical theoretical lens for understanding language as a means of storing the shared cultural knowledge, values, and conceptualizations of a community. In this view, folk narratives can be understood as cultural frameworks expressed in linguistic structure, capturing shared experience, belief, and moral reasoning. Narrative patterns, metaphors, symbolic figures, and repetitive structures are carriers of cultural meaning that develop through oral transmission. Preserving these elements in translation is essential to the integrity of the narratives, especially when folk narratives are introduced in educational settings. Cultural linguistics emphasizes that meaning rests on culturally situated conceptualizations rather than universal semantics. As a result, literal translation methods can fail to convey the culturally specific meanings of folk discourse. For example, metaphors related to nature, kinship, or spirituality may rest on shared cultural assumptions unknown to target-language readers. Narrative preservation therefore requires interpretive sensitivity to these conceptual frameworks. In the context of machine translation, this position challenges purely data-driven models that prioritize statistical or neural adequacy without cultural sensitivity.
3.2. Pedagogical Theories for Integrating Translated Narratives in Classrooms
Pedagogical theories emphasize the value of narratives as tools for learning, meaning-making, and learner engagement. Constructivist learning theory holds that learners actively construct knowledge by relating new information to their prior experiences. Properly translated and contextualized, folk narratives offer culturally appropriate entry points for such learning by allowing students to connect abstract concepts to stories that resonate with them. Figure 2 shows how folk narratives are linked to sociocultural learning theory and culturally responsive pedagogy. From a sociocultural perspective, narratives are mediational tools that support language development, moral reasoning, and social identity formation through shared discussion and interpretation.
Figure 2 Framework Linking Folk Narratives, Sociocultural Learning Theory, and Culturally Responsive Pedagogy in Education
Principles of language-inclusive learning also apply in multilingual classrooms, where translated folk narratives can build on learners' linguistic and cultural assets. Culturally responsive pedagogy focuses on incorporating students' heritage knowledge into learning to improve motivation and understanding. Nevertheless, the effectiveness of translated narratives in learning depends critically on translation quality. Distorted metaphors, oversimplified moral structures, or a loss of narrative coherence can compromise pedagogical objectives and misrepresent source cultures. Educational theory therefore calls for translation practices that balance accessibility with authenticity.
3.3. Human–Machine Collaboration in Educational MT Workflows
Human-machine collaboration provides a practical and ethically grounded model for deploying machine translation in teaching environments, especially where culturally sensitive texts such as folk stories are involved. Rather than treating MT as a fully autonomous solution, this approach focuses on the complementary roles of computational systems and human expertise. Machine translation is scalable, fast, and consistent, and can translate large narrative collections quickly. Human input from educators, linguists, and cultural experts contributes the contextual knowledge, interpretation, and ethical oversight that machines lack. In a collaborative workflow, the MT system first produces a draft translation, which human reviewers then check, edit, or annotate. This process allows cultural misinterpretations to be corrected, metaphors to be adapted, and translations to be aligned with educational goals. Theoretically, such cooperation is consistent with the principles of human-centered AI, which rest on transparency, accountability, and user agency. In education, it also supports participatory knowledge practices in which community members take part in preserving and passing on their stories. Human-machine collaboration further creates an iterative learning loop for MT systems, in which post-edits and annotations are fed back to inform model refinement. This is especially useful for low-resource languages, where expert intervention can help compensate for insufficient training data.
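The collaborative workflow described above can be illustrated with a minimal Python sketch. The `Segment` record and the `feedback_pairs` helper are hypothetical names introduced here for illustration and are not taken from the study; the sketch only assumes that each MT draft may receive a human post-edit and that changed segments are collected as feedback for later model refinement.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """One narrative segment in the review workflow (hypothetical structure)."""
    source: str                      # text in the source language
    mt_draft: str                    # draft translation produced by the MT system
    post_edit: Optional[str] = None  # reviewer's corrected translation, if any
    reviewer_note: str = ""          # optional cultural or pedagogical annotation


def feedback_pairs(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Return (source, post_edit) pairs for segments a reviewer actually changed."""
    return [
        (seg.source, seg.post_edit)
        for seg in segments
        if seg.post_edit is not None and seg.post_edit != seg.mt_draft
    ]
```

Segments that were reviewed but left unchanged are excluded, so only genuine corrections would feed a later refinement step.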
4. Methodology
4.1. Dataset development
4.1.1. Collection of folk narratives
Compiling a high-quality corpus of folk narratives is the first step toward machine translation for educational settings. Collection begins with identifying stories from a wide variety of sources, including oral history archives, published folklore collections, community storytelling projects, and educational collections. Special care is taken to represent regional, indigenous, and minority traditions, which are usually underrepresented in digital corpora. For narratives delivered orally, audio recordings are transcribed carefully in order to preserve storytelling features such as repetition, rhythm, and discourse markers. The collection process is guided by ethical considerations. Informed consent, cultural ownership, and attribution are respected, particularly when working with indigenous communities. Stories are recorded together with metadata describing origin, cultural context, genre, and target audience, which later supports contextual translation and educational use.
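As an illustration of how such metadata might be stored alongside each story, the following Python sketch defines a simple record type. The class name, field names, and the `cleared_for_corpus` filter are assumptions made for this example; the source specifies only the kinds of metadata (origin, cultural context, genre, target audience) and the ethical requirements (consent and attribution), not a concrete schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NarrativeRecord:
    """A collected folk narrative with its metadata (hypothetical schema)."""
    text: str               # transcribed or published narrative text
    origin: str             # region or community of origin
    cultural_context: str   # short description of the cultural setting
    genre: str              # e.g. myth, legend, folktale, oral history
    target_audience: str    # intended learner group
    consent_obtained: bool  # whether informed consent was documented
    attribution: str        # storyteller, community, or archive to credit


def cleared_for_corpus(records: List[NarrativeRecord]) -> List[NarrativeRecord]:
    """Keep only narratives with documented consent and a non-empty attribution."""
    return [r for r in records if r.consent_obtained and r.attribution.strip()]
```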
4.1.2. Annotation of Cultural Markers and Figurative Expressions
Annotating cultural markers and figurative expressions is important so that machine translation systems can recognize and retain the culturally relevant content of folk stories. Examples of cultural markers include local practices, rituals, kinship terms, beliefs, ecological references, and artifacts that are particular to a culture. Figurative expressions include metaphors, idioms, proverbs, symbolism, and narrative motifs whose meaning goes beyond the literal. Identifying and labeling these components provides explicit signals that are used in model training and testing. Annotation is normally performed by interdisciplinary teams of linguists, cultural researchers, educators, and native speakers. Annotation guidelines are defined to standardize the process and to specify categories such as metaphor type, domain of cultural reference, and narrative function. Several layers of annotation can be used to differentiate between lexical, semantic, and discourse-level phenomena. Annotation reliability is checked with inter-annotator agreement measures to minimize subjectivity.
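The text does not name a specific agreement measure. One common choice for two annotators assigning categorical labels is Cohen's kappa, sketched below as an assumption rather than as the measure used in the study. The function compares observed agreement with the agreement expected by chance from each annotator's label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

Values near 1 indicate strong agreement, while values near 0 indicate agreement no better than chance.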
4.1.3. Language Pairs and Low-Resource Considerations
Choosing the right language pairs is a strategic decision in dataset development for folk narrative translation. Pairs that combine regional or indigenous languages with widely used languages of instruction are often prioritized in order to maximize educational reach. Most folk narrative languages are low-resource, with little digitized text, few parallel corpora, and inconsistent orthography. These constraints seriously limit data-driven machine translation methods. Strategies for dealing with low-resource conditions during dataset development include bilingual elicitation, community translation workshops, and alignment of comparable texts rather than strictly parallel ones. The availability of closely related language varieties or dialect clusters supports transfer learning and multilingual data collection. Metadata on dialectal variation and sociolinguistic context also supports adaptive modeling. From an educational standpoint, foregrounding low-resource language pairs can help counter digital language hierarchies and promote linguistic inclusivity.
4.2. Model selection and training
4.2.1. Baseline NMT system (e.g., Transformer)
The baseline neural machine translation system used in this research is based on the Transformer architecture, the standard architecture for state-of-the-art translation tasks. The Transformer relies on self-attention, which relates every token in a sentence to every other token and so captures long-range dependencies and contextual interactions. It processes the tokens of a sequence in parallel, which makes it more efficient and scalable than recurrent architectures. These properties are especially significant for folk narrative translation, where narratives can contain long sentences, repetition, and references that span sentences. The baseline model is trained on general-domain parallel corpora supplemented with available narrative or literary text in order to establish foundational translation competence. Subword tokenization is used to handle the morphological variation and rare words that are typical of folk languages.
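To make the self-attention mechanism concrete, the sketch below implements scaled dot-product attention in plain Python, assuming the standard Transformer formulation softmax(QK^T / sqrt(d_k))V. Learned projection matrices, multiple heads, and masking are deliberately omitted, and the function names are illustrative rather than taken from the study.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries: List[Vector], keys: List[Vector], values: List[Vector]) -> List[Vector]:
    """Scaled dot-product attention: each query attends to every key/value pair."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        dim = len(values[0])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return outputs
```

Because every query is scored against all keys in one pass, a token late in a long narrative sentence can draw directly on a token near its beginning.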
4.2.2. Fine-Tuning Strategies for Folklore Content
Fine-tuning strategies are used to adapt the baseline NMT model to the linguistic and cultural peculiarities of folk narratives. The pre-trained Transformer is trained further on a filtered corpus of folklore texts, which allows the model to adjust its parameters to folklore-specific patterns. Domain-specific fine-tuning helps the system acquire stylistic characteristics such as repetitive structures, formulaic openings and endings, and culturally grounded discourse markers. Gradual unfreezing, reduced learning rates, and early stopping are among the methods used to limit overfitting in low-resource settings. Back-translation and paraphrasing are data augmentation techniques that increase the size of the folklore data. Figure 3 outlines the folklore-aware fine-tuning strategies used to make neural translation more culturally sensitive. Curriculum learning can also be employed, in which simple story forms are presented before more intricate ones.
Figure 3 Flowchart of Folklore-Aware Fine-Tuning Strategies for Neural Machine Translation Models
This targeted adaptation substantially improves the model's ability to preserve narrative flow and cultural cues, which shows the relevance of folklore-aware training regimes for educational translation tasks.
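Two of the overfitting controls mentioned in this section, gradual unfreezing and early stopping, can be sketched independently of any deep learning framework. The layer count, stage length, and patience value below are illustrative assumptions, and both helper names are hypothetical; in practice their outputs would drive which parameters a training loop updates and when it halts.

```python
from typing import List


def trainable_layers(epoch: int, num_layers: int, epochs_per_stage: int) -> List[int]:
    """Gradual unfreezing: start with the top layer and unfreeze one more per stage."""
    stage = min(epoch // epochs_per_stage, num_layers - 1)
    # Layers are indexed 0 (bottom) to num_layers - 1 (top).
    return list(range(num_layers - (stage + 1), num_layers))


class EarlyStopping:
    """Signal a stop once validation loss fails to improve for `patience` checks."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def step(self, val_loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

Keeping the lower layers frozen at first leaves general-domain representations intact while the upper layers adapt to folklore-specific patterns.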
4.3. Evaluation metrics
4.3.1. BLEU, METEOR, TER
BLEU, METEOR, and Translation Edit Rate (TER) are common automatic metrics used to evaluate the linguistic quality of machine translation output. BLEU measures n-gram overlap between the machine translation and reference texts and is used as an indicator of lexical similarity and fluency. Although popular, BLEU is limited in capturing semantic equivalence and does not handle the paraphrasing or stylistic variation that is common in folk narratives. METEOR addresses some of these shortcomings through stemming, synonym matching, and word-level alignment, which makes it more sensitive to the preservation of meaning. TER measures translation quality by the number of edits required to change a system output into a reference translation and therefore serves as an indicator of post-editing effort. Together, these metrics provide a benchmark of linguistic accuracy and structural adequacy for educational folk narrative translation.
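The core computations behind two of these metrics can be shown in a few lines. The sketch below is a simplification under stated assumptions: `ngram_precision` computes only the clipped n-gram precision that BLEU builds on (full BLEU combines several n-gram orders with a brevity penalty), and `ter_without_shifts` counts insertions, deletions, and substitutions but ignores the block shifts that full TER allows. Tokenization is plain whitespace splitting, and the function names are illustrative.

```python
from collections import Counter
from typing import List


def ngram_precision(hyp: List[str], ref: List[str], n: int) -> float:
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return clipped / total


def edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def ter_without_shifts(hyp: List[str], ref: List[str]) -> float:
    """Edits needed to turn the hypothesis into the reference, per reference word."""
    if not ref:
        raise ValueError("Reference must not be empty.")
    return edit_distance(hyp, ref) / len(ref)
```

For reported results, an established evaluation toolkit is preferable to a hand-rolled implementation, since tokenization and smoothing choices affect the scores.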
4.3.2. Cultural Fidelity Score
The cultural fidelity score is designed to measure how well a machine translation preserves culturally embedded meanings in folk narratives. Unlike traditional measures, this score is concerned with the preservation of the cultural markers, metaphors, idioms, and symbolic features identified during dataset annotation. Translated outputs are compared against expert-validated references to determine whether culturally important expressions have been retained, culturally adapted, or lost. The score is typically calculated with a hybrid method that combines automatic detection and human judgment. The automatic component checks the correspondence of tagged cultural markers between source and target texts, while interpretive quality and cultural appropriateness are assessed through human judgment. Scoring criteria can include the preservation of metaphors, the consistency of narrative roles, and the absence of cultural distortion or oversimplification.
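The source does not give a formula for this score, so the following Python sketch shows only one plausible way to combine the two components. The outcome labels mirror the retained, adapted, and lost categories above, but the partial credit for adapted markers, the 0 to 1 human rating scale, and the equal weighting are all assumptions made for illustration.

```python
from typing import Sequence

RETAINED, ADAPTED, LOST = "retained", "adapted", "lost"


def marker_preservation_rate(outcomes: Sequence[str], adapted_credit: float = 1.0) -> float:
    """Automatic component: share of tagged cultural markers retained or adapted."""
    if not outcomes:
        raise ValueError("At least one tagged marker is required.")
    credit = {RETAINED: 1.0, ADAPTED: adapted_credit, LOST: 0.0}
    return sum(credit[o] for o in outcomes) / len(outcomes)


def cultural_fidelity(outcomes: Sequence[str], human_rating: float, auto_weight: float = 0.5) -> float:
    """Hybrid score: weighted mix of marker preservation and a human rating in [0, 1]."""
    if not 0.0 <= human_rating <= 1.0:
        raise ValueError("human_rating must lie in [0, 1].")
    auto = marker_preservation_rate(outcomes)
    return auto_weight * auto + (1.0 - auto_weight) * human_rating
```

Setting `adapted_credit` below 1.0 would penalize cultural adaptation relative to direct retention, a design decision that the annotation team would need to make.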
4.3.3. Educational Usability Score
The educational usability score assesses how well translated folk narratives can function as learning tools in classroom or online education settings. The measure captures the comprehensibility of the translations, their pedagogical fit, and their alignment with instructional goals. The most important criteria are readability, narrative appropriateness, age appropriateness, and conceptual appropriateness. In contrast to purely technical measures, educational usability reflects learning outcomes and teaching value. The evaluation usually involves teachers and learners reading translated stories in real learning situations. Learner understanding, engagement, and interpretive accuracy are measured using structured questionnaires, comprehension tests, and classroom observations. Teachers may also assess how well the translations support discussion, ethical reasoning, and cultural exploration. From a systems viewpoint, educational usability underscores the practical implications of translation quality. A linguistically correct translation with low usability can confuse learners or require extensive teacher mediation.
5. Results and Discussion
The findings indicate that folklore-adapted NMT models consistently outperform generic systems on linguistic, cultural, and educational dimensions. Quantitatively, fine-tuned models obtained higher BLEU and METEOR scores and lower TER, which indicates better fluency and adequacy. Cultural fidelity scores also improved markedly, showing that more metaphors, symbols, and narrative structure were preserved. The educational usability tests showed better learner understanding, engagement, and interpretive quality when the translations were culturally adapted. The qualitative analysis confirmed that culture-related expressions were frequently normalized by the baseline models, whereas the proposed method preserved the narrative voice and moral purpose.
Table 2 Translation Quality Performance: Baseline vs. Folklore-Adapted NMT
| Metric | Baseline NMT (Generic) | Folklore-Adapted NMT |
|---|---|---|
| BLEU Score ↑ | 24.8 | 32.6 |
| METEOR ↑ | 0.41 | 0.56 |
| TER ↓ (%) | 46.3 | 31.9 |
| Sentence Fluency Rating (%) | 62 | 84 |
| Narrative Coherence Score (%) | 68.5 | 84.1 |
Table 2 highlights the clear performance improvements of the folklore-adapted neural machine translation (NMT) model over the generic baseline system. The increase in BLEU score from 24.8 to 32.6 indicates markedly improved lexical choice and phrasing when translating folk stories. Figure 4 shows that the folklore-adapted NMT model produces considerably better translations than the baseline model.
Figure 4 Comparison of Translation Quality: Baseline NMT vs Folklore-Adapted NMT
Likewise, the increase in METEOR score from 0.41 to 0.56 indicates improved semantic matching and better handling of the morphological variation and synonymy found in narrative texts. The substantial drop in Translation Edit Rate (TER) from 46.3% to 31.9% shows that the folklore-adapted output requires far fewer post-editing corrections, which saves human effort in educational deployments.
Figure 5 Visualization of Metric Improvements in Folklore-Adapted NMT
In addition to standard MT metrics, the sharp increase in the sentence fluency rating (from 62% to 84%) supports the conclusion that fine-tuned models generate more readable and natural narrative translations. Figure 5 visualizes the metric gains obtained with folklore-adapted neural machine translation. The improvement in narrative coherence from 68.5% to 84.1% is of particular significance in educational settings, since coherent storytelling is directly related to learner comprehension and interest.
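The size of each gain relative to the baseline can be recomputed from the values reported in Table 2. The short Python sketch below does this. The dictionary layout and function name are illustrative, and the only assumption beyond the reported numbers is that TER is treated as lower-is-better, so its gain is measured as a reduction.

```python
from typing import Dict, Tuple

# (baseline, folklore-adapted, higher_is_better), values as reported in Table 2.
TABLE_2: Dict[str, Tuple[float, float, bool]] = {
    "BLEU": (24.8, 32.6, True),
    "METEOR": (0.41, 0.56, True),
    "TER (%)": (46.3, 31.9, False),
    "Sentence fluency (%)": (62.0, 84.0, True),
    "Narrative coherence (%)": (68.5, 84.1, True),
}


def relative_gain(baseline: float, adapted: float, higher_is_better: bool) -> float:
    """Improvement over the baseline, as a percentage of the baseline value."""
    change = adapted - baseline if higher_is_better else baseline - adapted
    return 100.0 * change / baseline


if __name__ == "__main__":
    for metric, (base, adapted, higher) in TABLE_2.items():
        print(f"{metric}: {relative_gain(base, adapted, higher):.1f}% relative improvement")
```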
6. Conclusion
This paper shows that machine translation can be a powerful facilitator of folk narrative integration into modern education, provided that cultural and pedagogical issues are explicitly considered. Moving beyond generic translation pipelines, the proposed framework demonstrates that domain-adapted neural models, culturally enriched datasets, and culturally informed evaluation metrics can contribute greatly to translation quality for narrative-based learning resources. The findings confirm that retaining metaphors, symbols, and narrative integrity is important for preserving educational meaning and cultural value. The research also highlights the inclusion of low-resource languages. Many folk stories are told by linguistic communities that are marginalized to some degree, and unless targeted MT approaches are implemented, these traditions risk being excluded from digital education. Human-machine collaboration is equally significant. The results affirm that teachers, linguists, and cultural experts are still needed to validate translations, contextualize narratives, and guide ethical use. Machine translation is most effective as an assistive technology integrated into participatory educational processes rather than as a fully autonomous solution.
CONFLICT OF INTERESTS
None.
ACKNOWLEDGMENTS
None.
REFERENCES
Ba’ai, N., and Aris, A. (2024). AI and Cultural Heritage: Preserving and Promoting Global Cultures Through Technology. Nanotechnology Perceptions, 20, 170–176. https://doi.org/10.62441/nano-ntp.vi.3454
Chen, Y., Zhang, L., and Dong, Q. (2024). Using Natural Language Processing to Evaluate Local Conservation Text: A Study of 624 Documents from 303 Sites of the World Heritage Cities Programme. Journal of Cultural Heritage, 70, 259–270. https://doi.org/10.1016/j.culher.2024.09.011
Dafiotis, P., Sylaiou, S., Stylianidis, E., Koukopoulos, D., and Fidas, C. (2025). Evaluating Uses of XR in Fostering Art Students’ Learning. Multimodal Technologies and Interaction, 9(4), 36. https://doi.org/10.3390/mti9040036
Desai, A. U. (2024). A Review of the Applications of Machine Learning in Cybersecurity and its Challenges. Journal of Digital Security and Forensics, 1(1), 26–29. https://doi.org/10.29121/digisecforensics.v1.i1.2024.17
Foroughi, M., Wang, T., and Roders, P. (2025). In Praise of Diversity in Participatory Heritage Planning Empowered by Artificial Intelligence: Windcatchers in Yazd. Urban Planning, 10, 8724. https://doi.org/10.17645/up.8724
Hannaford, E. D., Schlegel, V., Lewis, R., Ramsden, S., Bunn, J., Moore, J., and Nenadic, G. (2024). Our Heritage, Our Stories: Developing AI Tools to Link and Support Community-Generated Digital Cultural Heritage. Journal of Documentation, 80(5), 1133–1147. https://doi.org/10.1108/JD-03-2024-0057
Harisanty, D., Obille, K., Anna, N., Purwanti, E., and Retrialisca, F. (2024). Cultural Heritage Preservation in the Digital Age, Harnessing Artificial Intelligence for the Future: A Bibliometric Analysis. Digital Library Perspectives, 40(4), 609–630. https://doi.org/10.1108/DLP-01-2024-0018
Harth, A. (2024). The Study of Pigments in Cultural Heritage: A Review Using Machine Learning. Heritage, 7(7), 3664–3695. https://doi.org/10.3390/heritage7070174
He, Z., Su, J., Chen, L., Wang, T., and Li, R. (2025). “I Recall the Past”: Exploring How People Collaborate with Generative AI to Create Cultural Heritage Narratives. Proceedings of the ACM on Human-Computer Interaction, 9, 1–30. https://doi.org/10.1145/3711006
Lucas-Moreira, O. D., and Núñez-Díaz, J. (2025). Narratives in the Age of AI: Reflections on Literature and Communication. YUYAY: Estrategias Metodológicas y Didácticas Educativas, 4(2), 77–93. https://doi.org/10.59343/yuyay.v4i2.99
Münster, S., Maiwald, F., Di Lenardo, I., Henriksson, J., Isaac, A., Graf, M., Beck, C., and Oomen, J. (2024). Artificial Intelligence for Digital Heritage Innovation: Setting Up an R&D Agenda for Europe. Heritage, 7(2), 794–816. https://doi.org/10.3390/heritage7020038
Pavlidis, G. (2025). Agentic AI for Cultural Heritage: Embedding Risk Memory in Semantic Digital Twins. Computers, 14(7), 266. https://doi.org/10.3390/computers14070266
Sylaiou, S., Dafiotis, P., Koukopoulos, D., Koukoulis, K., Vital, R., Antoniou, A., and Fidas, C. (2024). From Physical to Virtual Art Exhibitions and Beyond: Survey and Some Issues for Consideration for the Metaverse. Journal of Cultural Heritage, 66, 86–98. https://doi.org/10.1016/j.culher.2023.11.002
Theodoropoulos, A., Stavropoulou, D., Papadopoulos, P., Platis, N., and Lepouras, G. (2023). Developing an Interactive VR CAVE for Immersive Shared Gaming Experiences. Virtual Worlds, 2(2), 162–181. https://doi.org/10.3390/virtualworlds2020010
Thomas, S. (2024). AI and Actors: Ethical Challenges, Cultural Narratives and Industry Pathways in Synthetic Media Performance. Emergent Media, 2(4), 523–546. https://doi.org/10.1177/27523543241289108
Trichopoulos, G., Konstantakis, M., Caridakis, G., Katifori, A., and Koukouli, M. (2023). Crafting a Museum Guide Using ChatGPT-4. Big Data and Cognitive Computing, 7(3), 148. https://doi.org/10.3390/bdcc7030148
Tsepapadakis, M., and Gavalas, D. (2023). Are You Talking to Me? An Audio Augmented Reality Conversational Guide for Cultural Heritage. Pervasive and Mobile Computing, 92, 101797. https://doi.org/10.1016/j.pmcj.2023.101797
Zhang, J., Xiang, R., Kuang, Z., Wang, B., and Li, Y. (2024). ArchGPT: Harnessing Large Language Models for Supporting Renovation and Conservation of Traditional Architectural Heritage. Heritage Science, 12, 220. https://doi.org/10.1186/s40494-024-01334-x
This work is licensed under a Creative Commons Attribution 4.0 International License
© ShodhKosh 2026. All Rights Reserved.