EXPLAINABLE AI FOR STYLE INTERPRETATION IN CONTEMPORARY ART USING VISION TRANSFORMERS
DOI: https://doi.org/10.29121/shodhkosh.v6.i4s.2025.6941
Keywords: Explainable AI, Vision Transformers, Style Interpretation, Contemporary Art, Attention Maps
Abstract
This study investigates how Explainable AI (XAI) can be used to interpret style in contemporary art with Vision Transformers (ViTs). As the demand for interpretability in AI-driven art research has grown, models must not only perform well but also provide clear, understandable justifications for their decisions. We employ ViTs, a state-of-the-art deep learning architecture known for strong image-classification performance, to analyse and interpret the stylistic attributes of contemporary artworks. The study seeks to balance the need for high-performing AI models with the demand for transparency in the art world by showing how specific features of artworks, such as colour schemes, compositional structures, and brushstroke patterns, shape overall style. We present a hybrid framework that combines Vision Transformers with explanation techniques such as Grad-CAM and attention maps, making it possible to visualise and understand how the model arrives at its predictions. The results show that the model correctly identifies salient artistic traits and produces visual explanations that help viewers understand different styles of art. The proposed method is also evaluated on a broad range of contemporary artworks, demonstrating that it generalises across artistic genres. Beyond art analysis, this work offers curators, artists, and students a practical tool for engaging with AI systems more transparently, and it contributes to the field of explainable AI by applying these methods to art analysis, a domain that is highly subjective and difficult to explain.
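The abstract does not include implementation details, but the attention-map component it describes can be illustrated with a short sketch. The code below is a minimal, hypothetical example rather than the authors' implementation: it assumes a Hugging Face ViT checkpoint ("google/vit-base-patch16-224") as a stand-in for the paper's fine-tuned style classifier, a placeholder image path ("artwork.jpg"), and attention rollout (Abnar and Zuidema, 2020) as one way of turning per-layer attention into a single saliency map; the Grad-CAM component mentioned in the abstract is omitted here.

```python
# Hypothetical sketch: attention-rollout explanation for a ViT classifier.
# "google/vit-base-patch16-224" stands in for the paper's fine-tuned
# art-style model, and "artwork.jpg" is a placeholder input painting.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
model.eval()

image = Image.open("artwork.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

pred = outputs.logits.argmax(-1).item()

# Attention rollout: average the heads, add the residual identity,
# renormalise, and multiply the resulting matrices layer by layer.
num_tokens = outputs.attentions[0].size(-1)        # 197 = 1 CLS + 14x14 patches
rollout = torch.eye(num_tokens)
for layer_attn in outputs.attentions:              # each: (1, heads, tokens, tokens)
    attn = layer_attn.mean(dim=1)[0]               # head-averaged (tokens, tokens)
    attn = attn + torch.eye(num_tokens)            # account for residual connections
    attn = attn / attn.sum(dim=-1, keepdim=True)
    rollout = attn @ rollout

# CLS-token attention over the 14x14 patch grid gives a coarse saliency map.
cls_to_patches = rollout[0, 1:]                    # drop the CLS-to-CLS entry
saliency = cls_to_patches.reshape(14, 14)
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min())

print("predicted class id:", pred)
print("patch-level saliency map shape:", tuple(saliency.shape))
```

In practice the 14x14 saliency map would be upsampled to the input resolution and overlaid on the artwork so that the regions driving a style prediction, such as brushstroke texture or compositional elements, can be inspected visually.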
License
Copyright (c) 2025 Nikil Tiwari, Naman Soni, Rahul Anantrao Padgilwar, Dr. Preeti Pandurang Kale, Dr. Mandeep Kaur, Dr. Vinay Nagalkar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY licence, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.