NLP MODELS FOR ARTISTIC STATEMENT GENERATION

R.M. Gomathi; Pooja Srishti; Prateek Garg; Roselin; Hemal Thakker; Sumeet Kaur

doi:10.29121/shodhkosh.v6.i1s.2025.6673

Authors

Dr. R.M. Gomathi Associate Professor, Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
Pooja Srishti Assistant Professor, School of Business Management, Noida International University, India
Prateek Garg Chitkara Centre for Research and Development, Chitkara University, Himachal Pradesh, Solan, 174103, India
Dr. Roselin Associate Professor, Department of Computer Science and Engineering, Presidency University, Bangalore, Karnataka, India
Dr. Hemal Thakker Associate Professor, ISME - School of Management & Entrepreneurship, ATLAS SkillTech University, Mumbai, Maharashtra, India
Sumeet Kaur Centre of Research Impact and Outcome, Chitkara University, Rajpura- 140417, Punjab, India

DOI:

https://doi.org/10.29121/shodhkosh.v6.i1s.2025.6673

Keywords:

Natural Language Processing, Generative AI, Reinforcement Learning, Vision Language Fusion, Creativity, Transformer Models, Parallel Multimodal Learning, Creative Text Generation

Abstract [English]

In this paper we propose a multimodal vision-language transformer framework for generating coherent, expressive, and visually grounded artistic statements. The framework layers vision-language modeling on top of a strong transformer-based text generation network. The proposed system comprises a visual encoder that interprets the compositional and stylistic aspects of an artwork, a fine-tuned transformer decoder that acts as a conceptually rich story engine, and a cross-modal fusion module that keeps the linguistic output aligned with visual cues. Creativity- and grounding-based reward mechanisms from reinforcement learning further improve interpretive depth and stylistic grounding. Evaluation with automated similarity measures, multimodal alignment scores, and subjective ratings by human experts shows that the hybrid model outperforms traditional captioning and text-only methods at capturing the artistic, emotional, and conceptual qualities of an artwork. Although the method is promising, challenges remain in handling cultural bias, data limitations, interpretive subjectivity, and computational demands. Overall, the research advances the field of AI-assisted artistic communication and offers a scalable way to help artists, curators, educators, and digital art platforms produce high-quality artistic statements.
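The abstract does not give implementation details, so the following is only a minimal sketch of how a cross-modal fusion step and a combined reward could look, not the authors' method. It assumes single-head scaled dot-product cross-attention with no learned projections, a residual connection, and a linear mix of two scalar rewards. The names `cross_attend`, `combined_reward`, and the weight `alpha` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(text_states, visual_feats):
    """Fuse decoder text states with visual features (hypothetical sketch).

    Each text state acts as a query; the visual features act as both keys
    and values. Learned projection matrices are omitted for brevity.
    """
    dim = len(visual_feats[0])
    fused = []
    for query in text_states:
        # Scaled dot-product score between this query and every visual feature.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visual_feats
        ]
        weights = softmax(scores)
        # Attention-weighted average of the visual features.
        context = [
            sum(w * feat[i] for w, feat in zip(weights, visual_feats))
            for i in range(dim)
        ]
        # Residual connection: keep the text state and add visual context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


def combined_reward(creativity, grounding, alpha=0.5):
    """Assumed linear mix of a creativity score and a grounding score."""
    return alpha * creativity + (1 - alpha) * grounding


# Two text states attending over three visual patch features (toy values).
text_states = [[1.0, 0.0], [0.0, 1.0]]
visual_feats = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused_states = cross_attend(text_states, visual_feats)
```

In a real system the queries, keys, and values would pass through learned projections and multiple attention heads, and the two reward terms would come from trained scoring models rather than hand-set numbers.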

