AI-ASSISTED MACRO PHOTOGRAPHY LEARNING MODELS
DOI: https://doi.org/10.29121/shodhkosh.v6.i2s.2025.6734

Keywords: Macro Photography, AI-Assisted Learning, Computer Vision, Deep Learning Models, Reinforcement Learning, Aesthetic Evaluation, Image Quality Assessment, Generative Simulation, Diffusion Models, Educational Technology, Intelligent Tutoring Systems, Photography Training

Abstract [English]
Macro photography demands precise control of focus, lighting, depth of field, and camera stability, making it one of the most technically challenging forms of photography for beginners. Conventional learning approaches rely heavily on trial and error and lack real-time corrective feedback, which often leads to slow progress and inconsistent results. This paper presents an AI-Assisted Macro Photography Learning Model that combines deep learning, reinforcement learning, and generative simulation to deliver context-aware, personalized feedback to learners. The system uses a hybrid CNN-Transformer framework to analyze macro images and measure sharpness, illumination, exposure, and composition, while a reinforcement learning engine uses these measurements to recommend optimal camera settings for the current shooting conditions. A virtual macro simulation environment built on photorealistic, diffusion-based synthetic scenes also enables safe, repeatable practice. An evaluation with 60 beginner photographers indicates that AI-assisted trainees improved sharpness by up to 32 percent and lighting precision by 28 percent, and required 40 percent fewer attempts to capture usable photographs than peers trained conventionally. These findings support the value of AI-driven feedback in accelerating skill acquisition and improving both technical and aesthetic ability. The proposed framework offers a scalable way to turn macro photography learning into a technology-enabled, adaptive, and structured process.
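To make the pipeline described in the abstract concrete, the following is a minimal illustrative sketch in PyTorch, not the authors' implementation, of how a hybrid CNN-Transformer scorer for macro images could be wired: a small convolutional stem extracts local detail features, a Transformer encoder mixes global context, and a regression head outputs four scores corresponding to sharpness, illumination, exposure, and composition. All layer sizes, class names, and the four-way output are assumptions for illustration only; in the full system such scores would feed the reinforcement learning engine that recommends camera settings.

```python
# Hypothetical sketch of a hybrid CNN-Transformer quality scorer for macro shots.
# Architecture details are assumptions, not the paper's published model.
import torch
import torch.nn as nn

class MacroQualityScorer(nn.Module):
    def __init__(self, embed_dim=256, num_heads=4, num_layers=2):
        super().__init__()
        # CNN stem: downsamples the image and produces embed_dim feature maps (local detail).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, embed_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Transformer encoder over the flattened feature-map "tokens" (global context).
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Regression head: one score per quality dimension, squashed to [0, 1].
        self.head = nn.Sequential(nn.Linear(embed_dim, 4), nn.Sigmoid())

    def forward(self, images):                       # images: (B, 3, H, W)
        feats = self.cnn(images)                     # (B, C, H', W')
        tokens = feats.flatten(2).transpose(1, 2)    # (B, H'*W', C) token sequence
        tokens = self.transformer(tokens)            # attention mixes global context
        pooled = tokens.mean(dim=1)                  # (B, C) image-level embedding
        return self.head(pooled)                     # (B, 4) quality scores

# Usage: score a random stand-in image and name the four assumed dimensions.
scores = MacroQualityScorer()(torch.rand(1, 3, 224, 224))
print(dict(zip(["sharpness", "illumination", "exposure", "composition"],
               scores[0].tolist())))
```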
License
Copyright (c) 2025 Nihar Das, Savinder Kaur, Sidhant Das, Darshana Prajapati, Mithun M S, Dr. Anil Hingmire

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY license, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.