NLP MODELS FOR ARTISTIC STATEMENT GENERATION
DOI:
https://doi.org/10.29121/shodhkosh.v6.i1s.2025.6673
Keywords:
Natural Language Processing, Generative AI, Reinforcement Learning, Vision Language Fusion, Creativity, Transformer Models, Parallel Multimodal Learning, Creative Text Generation
Abstract [English]
This paper proposes a multimodal vision-language transformer framework for generating coherent, expressive, and visually grounded artistic statements, building vision-language modeling on top of a strong transformer-based text generation network. The proposed system comprises a visual encoder that interprets the compositional and stylistic aspects of an artwork, a fine-tuned transformer decoder that acts as a conceptually rich narrative engine, and a cross-modal fusion module that aligns visual cues with the linguistic output. Creativity- and grounding-based reward mechanisms from reinforcement learning further improve interpretive depth and stylistic grounding. Evaluation with automated similarity measures, multimodal alignment scores, and subjective ratings by human experts shows that the hybrid model substantially outperforms traditional captioning and text-only methods at capturing the artistic, emotional, and conceptual qualities of an artwork. Although the method shows promise, challenges remain in handling cultural bias, data limitations, interpretive subjectivity, and computational demands. Overall, the research advances the field of AI-assisted artistic communication and offers a scalable solution to help artists, curators, educators, and digital art platforms create high-quality artistic statements.
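The abstract does not specify how the fusion module or the reward is computed, so the following is only a minimal sketch of one common way such components are built, not the authors' implementation. It assumes that text states and visual features share the same dimension, that fusion is single-head scaled dot-product cross-attention with a residual connection, and that the two rewards are blended by a weight `alpha`. The names `cross_attend`, `combined_reward`, and `alpha` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(text_states, visual_feats):
    """Fuse text states with visual features (assumed design, not from the paper).

    Each text state acts as a query over the visual features, which serve
    as both keys and values. The attended visual context is added back to
    the text state as a residual.
    """
    scale = math.sqrt(len(visual_feats[0]))
    fused = []
    for query in text_states:
        weights = softmax([dot(query, key) / scale for key in visual_feats])
        context = [
            sum(w * feat[i] for w, feat in zip(weights, visual_feats))
            for i in range(len(query))
        ]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


def combined_reward(creativity, grounding, alpha=0.5):
    """Blend a creativity score and a grounding score (hypothetical weighting)."""
    return alpha * creativity + (1.0 - alpha) * grounding


# Toy inputs: one text state and two visual patch features, all 2-dimensional.
fused = cross_attend([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
reward = combined_reward(creativity=0.8, grounding=0.6, alpha=0.5)
```

In this toy case the text state is more similar to the first patch, so the fused vector leans toward that patch. In a reinforcement learning setup of the kind the abstract describes, a scalar such as `reward` would score a sampled statement during policy optimization.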
License
Copyright (c) 2025 Dr. R.M. Gomathi, Pooja Srishti, Prateek Garg, Dr. Roselin, Dr. Hemal Thakker, Sumeet Kaur

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY license, authors retain the copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution. The work must be properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.