AI-BASED EDUCATIONAL VIDEO SUMMARIZATION
DOI:
https://doi.org/10.29121/shodhkosh.v6.i2s.2025.6747
Keywords:
AI-Based Summarization, Educational Videos, Deep Learning, Natural Language Processing, Computer Vision, Multimodal Analysis, Adaptive Learning, Content Extraction, Automatic Speech Recognition, Personalized Education
Abstract [English]
The exponential growth of digital educational content has created an urgent need for efficient summarization methods that condense long instructional videos into concise, meaningful summaries. AI-based educational video summarization combines machine learning, natural language processing, and computer vision to produce short, context-rich summaries that make educational content more accessible and easier for learners to understand. The approach performs multimodal analysis, drawing on speech recognition, text transcription, and visual scene understanding, to identify the most important instructional points and discard superfluous information. Transformer-based deep learning architectures learn semantic associations among spoken words, visual content, and instructional gestures, and these models extract pedagogically coherent summaries aligned with learning objectives. The proposed framework proceeds in four stages: video segmentation, feature extraction, content ranking, and summary generation. In parallel, visual attention models analyze frames to identify slides, demonstrations, and the instructor's focus points so that the most important educational elements are retained. The summary can be delivered as text, as video, or as a combination of both, and it supports adaptive learning systems and personalized learning. AI summarization has been shown to be effective at reducing cognitive overload, improving content discoverability, and supporting efficient learning by letting students concentrate on key information. It also helps teachers and educational institutions produce highlight reels, course previews, and searchable knowledge bases.
Consequently, this technology can foster an inclusive learning environment in which diverse learners benefit from personalized learning experiences. Future work will integrate affective computing and learner feedback to further refine summary relevance and pedagogical impact.
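The four stages named in the abstract (segmentation, feature extraction, content ranking, summary generation) can be illustrated with a minimal, text-only sketch. This is not the authors' model: it assumes the video has already been segmented and transcribed by an ASR system, and it replaces the transformer-based multimodal scoring with a simple centroid-similarity score over bag-of-words vectors. The names Segment, STOPWORDS, score_segments, and summarize are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Assumption: a tiny illustrative stopword list, not a linguistic resource.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "with", "is", "in", "on"}


@dataclass
class Segment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # ASR transcript for this span (assumed to be given)


def features(text):
    """Feature extraction: bag-of-words counts without stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[tok] for tok, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def score_segments(segments):
    """Content ranking: similarity of each segment to the lecture centroid."""
    vectors = [features(s.text) for s in segments]
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)
    return [cosine(vec, centroid) for vec in vectors]


def summarize(segments, max_seconds):
    """Summary generation: greedily keep top-ranked segments within a time
    budget, then return them in chronological order."""
    scores = score_segments(segments)
    ranked = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0.0
    for i in ranked:
        duration = segments[i].end - segments[i].start
        if used + duration <= max_seconds:
            chosen.append(i)
            used += duration
    return [segments[i] for i in sorted(chosen)]


# Made-up transcript segments for illustration only.
segments = [
    Segment(0.0, 20.0, "hello everyone welcome back"),
    Segment(20.0, 50.0, "gradient descent updates the weights using the gradient of the loss"),
    Segment(50.0, 70.0, "please take a short break now"),
    Segment(70.0, 100.0, "backpropagation computes the gradient of the loss with the chain rule"),
]
summary = summarize(segments, max_seconds=60)
print([(s.start, s.end) for s in summary])  # [(20.0, 50.0), (70.0, 100.0)]
```

In this toy input the two instructional segments share vocabulary with the rest of the lecture and therefore rank above the greeting and the break announcement. A multimodal system of the kind the abstract describes would replace score_segments with learned scores that also draw on visual features such as slides and demonstrations.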
License
Copyright (c) 2025 Dr. Satish Choudhury, Mani Nandini Sharma, Rajeev Sharma, Ganesh Rambhau Gandal, Avni Garg

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY licence, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author.
It is not necessary to ask for further permission from the author or journal board.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.