GENERATIVE ART PHOTOGRAPHY USING DIFFUSION MODELS
DOI: https://doi.org/10.29121/shodhkosh.v6.i1s.2025.6645
Keywords: Diffusion Models, Hybrid Diffusion Architecture, Generative Art Photography, Latent Denoising, Prompt Guidance, Photography-Aware Conditioning
Abstract [English]
This work introduces a hybrid diffusion model that improves generative art photography by combining latent-space denoising, dual guidance, and photography-aware conditioning. By pairing text-based semantic control with exposure, depth-of-field, and color-harmony cues, the system generates images that are more aesthetically consistent and more photographically realistic. Experimental results show that, compared with baseline diffusion frameworks, the proposed model achieves higher visual clarity, stronger prompt alignment, and more stable lighting, while requiring far fewer sampling steps thanks to an auxiliary consistency-refinement module. Further analyses, including aesthetic-score distributions, exposure heatmaps, the structure-creativity trade-off, and texture-sharpness comparisons, confirm the model's effectiveness.
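The abstract does not specify how the dual guidance is computed; one plausible reading, sketched below under that assumption, is a classifier-free-guidance-style combination of a text-conditional branch with a photography-conditional branch (exposure / depth-of-field / color cues), applied inside a deterministic latent-space denoising step. The function names, weights, and the DDIM-style update here are illustrative, not the paper's actual implementation.

```python
import numpy as np

def dual_guided_eps(eps_uncond, eps_text, eps_photo, w_text=7.5, w_photo=2.0):
    # Classifier-free-guidance-style combination of two conditioning branches:
    # the text branch steers semantics, the photography branch steers
    # exposure / depth-of-field / color-harmony cues (weights are illustrative).
    return (eps_uncond
            + w_text * (eps_text - eps_uncond)
            + w_photo * (eps_photo - eps_uncond))

def ddim_step(x_t, eps, alpha_t, alpha_prev):
    # Deterministic latent-space denoising update: estimate the clean latent
    # x0 from the current noisy latent, then re-noise it to the previous
    # (less noisy) timestep using the cumulative alpha schedule.
    x0_hat = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    return np.sqrt(alpha_prev) * x0_hat + np.sqrt(1.0 - alpha_prev) * eps

# Toy usage: one guided denoising step on a random latent, with stub
# noise predictions standing in for the three U-Net forward passes.
rng = np.random.default_rng(0)
x_t = rng.normal(size=(4, 4))
eps_u, eps_t, eps_p = (rng.normal(size=(4, 4)) for _ in range(3))
eps = dual_guided_eps(eps_u, eps_t, eps_p)
x_prev = ddim_step(x_t, eps, alpha_t=0.25, alpha_prev=0.5)
```

With both guidance weights set to zero the combined prediction reduces to the unconditional branch, which is the usual sanity check for this family of guidance rules.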
License
Copyright (c) 2025 Nidhi Tewatia, Lalit Khanna, Dr. Jeberson Retna Raj, Pavas Saini, Dr. Kunal Meher, Dr. Bichitrananda Patra

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY license, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author. No further permission from the author or the journal board is required.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.