ML AND RAG-BASED INTELLIGENT SYSTEM FOR YOGA POSE RECOGNITION AND CORRECTIVE GUIDANCE
DOI: https://doi.org/10.29121/ijetmr.v13.i4.2026.1768

Keywords: Yoga Pose Recognition, Machine Learning, Retrieval-Augmented Generation, Computer Vision, Human Pose Estimation, Digital Health, AI-Based Fitness Systems

Abstract
Yoga pose recognition has gained significant importance in digital health and fitness systems, where accurate posture assessment and corrective feedback are critical for safe practice. Traditional computer vision–based approaches rely on pose estimation models but often lack contextual understanding and personalized guidance. To address this limitation, this paper proposes a hybrid framework that integrates Machine Learning (ML)–based pose recognition with Retrieval-Augmented Generation (RAG) for intelligent feedback generation. The system utilizes human pose estimation techniques to extract skeletal keypoints and classify yoga poses using supervised learning models. Subsequently, a RAG module retrieves relevant expert knowledge from a curated yoga knowledge base and generates context-aware corrective suggestions. This dual-layer architecture ensures both high recognition accuracy and meaningful interpretability of results. The proposed approach aims to bridge the gap between static classification systems and interactive AI-driven coaching by enabling real-time feedback and adaptive recommendations. The framework is designed as a conceptual model with potential applicability in mobile health applications, smart fitness systems, and remote yoga training platforms. By combining data-driven learning with knowledge retrieval mechanisms, the system enhances both usability and reliability in real-world scenarios.
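The dual-layer architecture described above — keypoint-based pose classification followed by knowledge retrieval for corrective feedback — can be sketched in miniature. This is an illustrative toy, not the authors' implementation: the joint-angle templates, pose labels, and knowledge-base entries are invented for the example, a nearest-template rule stands in for the supervised learning model, and simple token overlap stands in for the RAG module's retriever (a real system would take keypoints from a pose estimator such as MediaPipe or OpenPose and pair dense retrieval with a generative language model).

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by keypoints a-b-c."""
    ba = (a[0] - b[0], a[1] - b[1])
    bc = (c[0] - b[0], c[1] - b[1])
    cos = (ba[0] * bc[0] + ba[1] * bc[1]) / (math.hypot(*ba) * math.hypot(*bc))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

# Layer 1: classify a pose from angle features via nearest template
# (stand-in for a trained supervised model; values are illustrative).
POSE_TEMPLATES = {
    "mountain":   [175.0, 178.0],   # [front-knee angle, elbow angle]
    "warrior_ii": [95.0, 170.0],
}

def classify(features):
    return min(
        POSE_TEMPLATES,
        key=lambda p: sum((f - t) ** 2
                          for f, t in zip(features, POSE_TEMPLATES[p])),
    )

# Layer 2: retrieve the most relevant expert tip from a tiny curated
# knowledge base (token overlap stands in for embedding retrieval).
KNOWLEDGE_BASE = [
    "mountain pose: stack shoulders over hips and lengthen the spine upward",
    "warrior ii pose: bend the front knee toward 90 degrees over the ankle",
]

def retrieve(query):
    q = set(query.split())
    return max(KNOWLEDGE_BASE, key=lambda d: len(q & set(d.split())))

def coach(features):
    """Full pipeline: classify the pose, then retrieve corrective guidance."""
    pose = classify(features)
    return pose, retrieve(pose.replace("_", " "))
```

For example, `coach([100.0, 168.0])` classifies the bent-knee configuration as `warrior_ii` and returns the knee-alignment tip, illustrating how recognition output can drive context-aware feedback.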
License
Copyright (c) 2026 Dr. Harish Barapatre, Pratik Malgunde, Atharva Pratap, Rayan Shaikh

This work is licensed under a Creative Commons Attribution 4.0 International License.