NEURAL RENDERING SYSTEMS TO PRODUCE HYPER-REALISTIC ARTISTIC VISUALS FOR MULTIMEDIA PRODUCTIONS

Authors

  • Mandeep Kaur, School of Computer Science Engineering and Technology, Bennett University, Greater Noida, Uttar Pradesh 201310, India
  • Dr. Mercy Paul Selvan, Professor, Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu, India
  • Barkha Bhardwaj, Assistant Professor, Department of Computer Science and Engineering (AI), Noida Institute of Engineering and Technology, Greater Noida, Uttar Pradesh, India
  • Simranjeet Nanda, Centre of Research Impact and Outcome, Chitkara University, Rajpura 140417, Punjab, India
  • Shanthi P, Assistant Professor, Visual Communication, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamil Nadu 600080, India
  • Ashutosh Kulkarni, Associate Professor, Department of DESH, Vishwakarma Institute of Technology, Pune, Maharashtra 411037, India
  • Prasanna Kumar E, Assistant Professor, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamil Nadu 600080, India

DOI:

https://doi.org/10.29121/shodhkosh.v7.i4s.2026.7508

Keywords:

Neural Rendering, Neural Radiance Fields (NeRF), Deep Generative Models, Volumetric Rendering, Multimedia Visual Production, Photorealistic Image Synthesis

Abstract [English]

Neural rendering has emerged as a disruptive technology for producing highly realistic visual content in multimedia production, combining computer graphics with deep learning. This paper examines neural rendering systems that can be trained to produce hyper-realistic artistic images by learning complex scene representations from multi-view imagery. The proposed architecture combines neural radiance field modeling, deep neural networks, and volumetric rendering to reconstruct detailed three-dimensional scenes and synthesize photorealistic images from novel viewpoints. The system architecture comprises multi-view data acquisition, neural feature encoding, and radiance field estimation, built on deep learning models that capture the interaction of geometry, lighting, texture, and color within a scene. Experimental analysis shows that neural rendering methods substantially outperform standard computer graphics pipelines in visual fidelity, geometric consistency, and rendering realism. Quantitative evaluation of rendering-quality metrics, including the structural similarity index, perceptual realism scores, and reconstruction accuracy, demonstrates marked gains in visual detail and scene modeling.
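The volumetric rendering step that the abstract refers to can be illustrated with the standard NeRF alpha-compositing quadrature: along each camera ray, per-sample densities and colors are composited into a single pixel color. The sketch below is illustrative only, not the paper's implementation; the function name and NumPy formulation are our own.

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Composite per-sample densities and colors along one ray.

    sigmas: (N,) volume densities at the sampled points
    colors: (N, 3) RGB colors at the sampled points
    deltas: (N,) distances between adjacent samples
    Returns the expected ray color and per-sample weights.
    """
    # Opacity of each ray segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance up to each sample: T_i = prod_{j < i} (1 - alpha_j)
    trans = np.cumprod(1.0 - alphas + 1e-10)
    trans = np.concatenate([[1.0], trans[:-1]])
    # Contribution weight of each sample, then weighted color sum
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

In a full pipeline this quadrature runs per ray over densities and colors predicted by the radiance-field network; an opaque sample early on the ray dominates the result because later samples receive near-zero transmittance.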


Published

2026-04-11

How to Cite

Kaur, M., Selvan, M. P., Bhardwaj, B., Nanda, S., Shanthi P, Kulkarni, A., & Kumar E, P. (2026). NEURAL RENDERING SYSTEMS TO PRODUCE HYPER-REALISTIC ARTISTIC VISUALS FOR MULTIMEDIA PRODUCTIONS. ShodhKosh: Journal of Visual and Performing Arts, 7(4s), 409–418. https://doi.org/10.29121/shodhkosh.v7.i4s.2026.7508