MACHINE LEARNING FOR MUSIC AND MOVEMENT COORDINATION
DOI: https://doi.org/10.29121/shodhkosh.v6.i3s.2025.6792

Keywords: Sensorimotor Synchronization, Multimodal Machine Learning, Music Information Retrieval, Motion Analysis, Deep Learning Models

Abstract
The coordination of music and movement involves a complex interplay of auditory perception, motor planning, and real-time sensorimotor integration. Recent advances in machine learning have opened new possibilities for modeling, predicting, and enhancing this interaction, with applications in performance analysis, rehabilitation, interactive systems, and human-computer collaboration. This paper examines a multimodal framework that combines audio features with kinematic movement data to capture the temporal and spatial dynamics of coordinated behavior. Building on prior work in rhythm perception, beat tracking, and gesture recognition, the proposed system employs current deep learning architectures, including CNNs, RNNs, LSTMs, and Transformers, to learn effective representations of rhythmic structure and movement patterns. An end-to-end signal-processing chain is used, comprising audio preprocessing, motion-capture or IMU-based tracking, and filtering to minimize noise and ensure data reliability. Feature extraction spans the temporal, spectral, and kinematic domains, enabling the models to infer synchronization accuracy, movement quality, and responsiveness to musical cues. The training strategy emphasizes cross-validation, hyperparameter optimization, and regularization to improve generalization across diverse datasets and movement styles. The findings indicate that multimodal learning outperforms unimodal approaches in predicting beat alignment, classifying gestures, and modeling temporal coordination.
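To make the fusion idea concrete, the sketch below shows one minimal way such a multimodal model could be wired up. It is an illustration, not the authors' implementation: all class, parameter, and variable names are hypothetical, PyTorch is assumed, and the audio and kinematic streams are assumed to be pre-aligned to a common frame rate (e.g., log-mel audio frames paired with IMU-derived kinematic frames).

```python
# A minimal sketch of the multimodal fusion described in the abstract.
# Hypothetical names throughout; assumes frame-synchronized audio and
# kinematic features (e.g., 64 log-mel bins and 9 IMU channels per frame).
import torch
import torch.nn as nn

class MultimodalSyncModel(nn.Module):
    """Fuses per-frame audio and kinematic features to score beat alignment."""
    def __init__(self, n_audio=64, n_kin=9, hidden=128):
        super().__init__()
        # Separate encoders learn modality-specific representations.
        self.audio_enc = nn.Sequential(nn.Linear(n_audio, hidden), nn.ReLU())
        self.kin_enc = nn.Sequential(nn.Linear(n_kin, hidden), nn.ReLU())
        # A shared LSTM models the temporal dynamics of the fused stream.
        self.temporal = nn.LSTM(2 * hidden, hidden, batch_first=True)
        # One output per frame: probability that the movement is on the beat.
        self.head = nn.Linear(hidden, 1)

    def forward(self, audio, kin):
        # audio: (batch, time, n_audio); kin: (batch, time, n_kin)
        fused = torch.cat([self.audio_enc(audio), self.kin_enc(kin)], dim=-1)
        out, _ = self.temporal(fused)
        return torch.sigmoid(self.head(out)).squeeze(-1)

# Usage on dummy data: 4 sequences of 200 frames each.
model = MultimodalSyncModel()
scores = model(torch.randn(4, 200, 64), torch.randn(4, 200, 9))
print(scores.shape)  # torch.Size([4, 200])
```

Encoding each modality separately before fusion, as here, is one common design choice; a Transformer or CNN front end could replace either encoder, and the per-frame output could equally be a gesture-class distribution rather than a beat-alignment score.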
License

Copyright (c) 2025 Krishna Reddy BN, Ansh Kataria, Vijayendra Kumar Shrivastava, Madhur Grover, Durga Prasad, Nishant Kulkarni

This work is licensed under a Creative Commons Attribution 4.0 International (CC BY) License. The authors retain copyright, and anyone may download, reuse, reprint, modify, distribute, and/or copy the contribution, provided the work is properly attributed to its authors; no further permission from the authors or the journal board is required. The journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.