REINFORCEMENT LEARNING-BASED ROUTING PROTOCOLS FOR INTERNET OF THINGS NETWORKS: A COMPREHENSIVE SURVEY AND FUTURE RESEARCH DIRECTIONS

Authors

  • Hitesh Parmar, K.S. School of Business Management & Information Technology, Gujarat University, Ahmedabad, India
  • Dr. Kamaljit Lakhtaria, K.S. School of Business Management & Information Technology, Gujarat University, Ahmedabad, India

DOI:

https://doi.org/10.29121/shodhkosh.v5.i2.2024.6227

Keywords:

Reinforcement Learning, Internet of Things, Routing Protocols, Q‑Learning, Deep Q‑Network, Multi‑Agent Systems, Energy Efficiency, Network Optimization

Abstract [English]

Background: The Internet of Things (IoT) connects billions of resource‑constrained devices, producing highly dynamic topologies and stringent energy constraints. Conventional routing protocols lack the adaptability required for such conditions, motivating reinforcement learning (RL) to enable intelligent and adaptive routing decisions.
Methods: This survey reviews over 150 peer‑reviewed studies published between 2020 and 2024, classifying RL‑based IoT routing protocols into energy‑efficient, congestion‑aware and multi‑objective categories, and analysing key performance metrics and emerging research trends.
Results: RL‑driven routing methods outperform traditional protocols, delivering significant gains in network lifetime, packet delivery ratio and energy consumption; deep RL and multi‑agent frameworks offer enhanced scalability, reliability and latency benefits.
Conclusions: RL shows strong potential for scalable and adaptive routing in IoT networks. Future work should explore federated multi‑agent learning, edge‑AI integration, software‑defined networking, quantum‑enhanced approaches, and security‑aware routing. This survey provides a comprehensive roadmap for researchers and practitioners seeking to advance RL‑based IoT routing.
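The Q‑learning formulation underlying many of the surveyed protocols can be made concrete with a minimal tabular sketch: each node holds one Q‑value per neighbour, the reward penalises per‑hop energy cost and rewards sink delivery, and epsilon‑greedy exploration trades off trying alternative routes against exploiting the best known one. The topology, link costs, reward shaping, and hyperparameters below are hypothetical illustrations, not drawn from any specific protocol covered by the survey.

```python
import random

# Illustrative sketch only: a tabular Q-learning agent learning next-hop
# choices on a tiny hypothetical topology. Link weights stand in for
# per-hop energy cost; all names and values are assumptions.
TOPOLOGY = {            # node -> {neighbour: energy cost of the link}
    "A": {"B": 2.0, "C": 1.0},
    "B": {"D": 1.0},
    "C": {"D": 3.0},
    "D": {},            # sink node
}
SINK = "D"
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # learning rate, discount, exploration

# One Q-value per (node, next-hop) pair, initialised to zero.
q = {node: {nb: 0.0 for nb in nbs} for node, nbs in TOPOLOGY.items()}

def choose_next_hop(node):
    """Epsilon-greedy next-hop selection at a forwarding node."""
    if random.random() < EPSILON:
        return random.choice(list(TOPOLOGY[node]))
    return max(q[node], key=q[node].get)

def run_episode(source="A"):
    """Forward one packet from source to sink, updating Q en route:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    node = source
    while node != SINK:
        nxt = choose_next_hop(node)
        # Reward: negative energy cost per hop, plus a bonus for delivery.
        reward = -TOPOLOGY[node][nxt] + (10.0 if nxt == SINK else 0.0)
        future = max(q[nxt].values()) if q[nxt] else 0.0
        q[node][nxt] += ALPHA * (reward + GAMMA * future - q[node][nxt])
        node = nxt

random.seed(0)
for _ in range(200):
    run_episode()

# Greedy route extraction after training: A -> B -> D beats A -> C -> D
# because its total energy cost is lower (3.0 vs 4.0).
path, node = ["A"], "A"
while node != SINK:
    node = max(q[node], key=q[node].get)
    path.append(node)
print(path)  # ['A', 'B', 'D']
```

Deep Q-Network and multi-agent variants discussed in the survey replace the table with a neural approximator and per-node learners, which is what allows the approach to scale beyond toy topologies like this one.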

References

Al-Fuqaha, A., Guizani, M., Mohammadi, M., Aledhari, M., & Ayyash, M. (2015). Internet of Things: A survey on enabling technologies, protocols, and applications. IEEE Communications Surveys & Tutorials, 17(4), 2347–2376. DOI: https://doi.org/10.1109/COMST.2015.2444095

Lin, J., Yu, W., Zhang, N., Yang, X., Zhang, H., & Zhao, W. (2017). A survey on Internet of Things: Architecture, enabling technologies, security and privacy, and applications. IEEE Internet of Things Journal, 4(5), 1125–1142. DOI: https://doi.org/10.1109/JIOT.2017.2683200

Chiang, M., & Zhang, T. (2016). Fog and IoT: An overview of research opportunities. IEEE Internet of Things Journal, 3(6), 854–864. DOI: https://doi.org/10.1109/JIOT.2016.2584538

Naik, N. (2017). Choice of effective messaging protocols for IoT systems: MQTT, CoAP, AMQP, and HTTP. In Proceedings of the IEEE International Systems Engineering Symposium (pp. 1–7). DOI: https://doi.org/10.1109/SysEng.2017.8088251

Yang, Y., Wu, L., Yin, G., Li, L., & Zhao, H. (2017). A survey on security and privacy issues in Internet-of-Things. IEEE Internet of Things Journal, 4(5), 1250–1258. DOI: https://doi.org/10.1109/JIOT.2017.2694844

Frikha, M. S., Gammar, S. M., Lahmadi, A., & Andrey, L. (2021). Reinforcement and deep reinforcement learning for wireless Internet of Things: A survey. Computer Communications, 178, 98–113. DOI: https://doi.org/10.1016/j.comcom.2021.07.014

Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6), 26–38. DOI: https://doi.org/10.1109/MSP.2017.2743240

Atzori, L., Iera, A., & Morabito, G. (2017). Understanding the Internet of Things: Definition, potentials, and societal role of a fast-evolving paradigm. Ad Hoc Networks, 56, 122–140. DOI: https://doi.org/10.1016/j.adhoc.2016.12.004

Zanella, A., Bui, N., Castellani, A., Vangelista, L., & Zorzi, M. (2014). Internet of Things for smart cities. IEEE Internet of Things Journal, 1(1), 22–32. DOI: https://doi.org/10.1109/JIOT.2014.2306328

Sicari, S., Rizzardi, A., Grieco, L. A., & Coen-Porisini, A. (2015). Security, privacy, and trust in the Internet of Things: The road ahead. Computer Networks, 76, 146–164. DOI: https://doi.org/10.1016/j.comnet.2014.11.008

Ray, P. P. (2018). A survey on Internet of Things architectures. Journal of King Saud University – Computer and Information Sciences, 30(3), 291–319. DOI: https://doi.org/10.1016/j.jksuci.2016.10.003

Gubbi, J., Buyya, R., Marusic, S., & Palaniswami, M. (2013). Internet of Things (IoT): A vision, architectural elements, and future directions. Future Generation Computer Systems, 29(7), 1645–1660. DOI: https://doi.org/10.1016/j.future.2013.01.010

Miorandi, D., Sicari, S., De Pellegrini, F., & Chlamtac, I. (2012). Internet of things: Vision, applications and research challenges. Ad Hoc Networks, 10(7), 1497–1516. DOI: https://doi.org/10.1016/j.adhoc.2012.02.016

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. DOI: https://doi.org/10.1038/nature14236

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.

Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3–4), 279–292. DOI: https://doi.org/10.1023/A:1022676722315

Van Hasselt, H., Guez, A., & Silver, D. (2016). Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). DOI: https://doi.org/10.1609/aaai.v30i1.10295

Heinzelman, W., Chandrakasan, A., & Balakrishnan, H. (2000). Energy-efficient communication protocol for wireless microsensor networks. In Proceedings of the 33rd Annual Hawaii International Conference on System Sciences.

Lindsey, S., & Raghavendra, C. S. (2002). PEGASIS: Power-efficient gathering in sensor information systems. In Proceedings of the IEEE Aerospace Conference, 3, 1125–1130. DOI: https://doi.org/10.1109/AERO.2002.1035242

Younis, O., & Fahmy, S. (2004). HEED: A hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks. IEEE Transactions on Mobile Computing, 3(4), 366–379. DOI: https://doi.org/10.1109/TMC.2004.41

Manjeshwar, A., & Agrawal, D. P. (2001). TEEN: A routing protocol for enhanced efficiency in wireless sensor networks. In Proceedings of the 15th International Parallel and Distributed Processing Symposium, 2009–2015.

Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning, 1861–1870.

Foerster, J., Assael, I. A., de Freitas, N., & Whiteson, S. (2018). Counterfactual multi-agent policy gradients. In Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). DOI: https://doi.org/10.1609/aaai.v32i1.11794

Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in Neural Information Processing Systems, 30.

Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W. M., Zambaldi, V., Jaderberg, M., … & Graepel, T. (2017). Value-decomposition networks for cooperative multi-agent learning. arXiv preprint arXiv:1706.05296.

Rashid, T., Samvelyan, M., De Witt, C. S., Farquhar, G., Foerster, J., & Whiteson, S. (2018). QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning. In Proceedings of the International Conference on Machine Learning, 4295–4304.

Son, K., Kim, D., Yoo, Y., Kim, J., Park, K., Kang, S., & Kim, C. (2019). QTRAN: Learning to factorize with transformation for cooperative multi-agent reinforcement learning. In Proceedings of the International Conference on Machine Learning, 5887–5896.

Kapoor, S., Jain, A., & Bajaj, R. (2021). A comparative study on routing protocols for wireless sensor networks. In Proceedings of the International Conference on Computing, Communication and Automation, 1–6.

Pantazis, N. A., Nikolidakis, S. A., & Vergados, D. D. (2013). Energy-efficient routing protocols in wireless sensor networks: A survey. IEEE Communications Surveys & Tutorials, 15(2), 551–591. DOI: https://doi.org/10.1109/SURV.2012.062612.00084

Liu, A., Dong, M., Ota, K., & Long, J. (2015). PHACK: An efficient scheme for selective forwarding attack detection in WSNs. Sensors, 15(12), 30942–30963. DOI: https://doi.org/10.3390/s151229835

Dong, M., Ota, K., & Liu, A. (2016). RMER: Reliable and energy-efficient data collection for large-scale wireless sensor networks. IEEE Internet of Things Journal, 3(4), 511–519. DOI: https://doi.org/10.1109/JIOT.2016.2517405

Lillicrap, T., Hunt, J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., … & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.

Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., … & Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, 1928–1937.

Schulman, J., Levine, S., Abbeel, P., Jordan, M., & Moritz, P. (2015). Trust region policy optimization. In Proceedings of the International Conference on Machine Learning, 1889–1897.

Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic policy gradient algorithms. In Proceedings of the International Conference on Machine Learning, 387–395.

Wang, Z., Schaul, T., Hessel, M., Van Hasselt, H., Lanctot, M., & de Freitas, N. (2016). Dueling network architectures for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, 1995–2003.

Hessel, M., Soyer, H., Espeholt, L., Czarnecki, W., Schmitt, S., Van Hasselt, H., … & Silver, D. (2018). Rainbow: Combining improvements in deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).

Lu, H., Chen, Y., & Lin, N. (2020). Energy-efficient depth-based opportunistic routing with Q-learning for underwater wireless sensor networks. Sensors, 20(4), 1025. DOI: https://doi.org/10.3390/s20041025

Jarwan, A., & Ibnkahla, M. (2023). Edge-based federated deep reinforcement learning for IoT traffic management. IEEE Internet of Things Journal, 10(5), 3799–3813. DOI: https://doi.org/10.1109/JIOT.2022.3174469

Adil, M., Usman, M., Jan, M. A., Abulkasim, H., Farouk, A., & Jin, Z. (2024). An improved congestion-controlled routing protocol for IoT applications in extreme environments. IEEE Internet of Things Journal, 11(3), 3757–3767. DOI: https://doi.org/10.1109/JIOT.2023.3310927

Li, J., Ye, M., Huang, L., Deng, X., Qiu, H., & Wang, Y. Y. (2023). An intelligent SDWN routing algorithm based on network situational awareness and deep reinforcement learning. arXiv preprint arXiv:2305.10441. DOI: https://doi.org/10.1109/ACCESS.2023.3302178

Huang, R., Guan, W., Zhai, G., He, J., & Chu, X. (2022). Deep graph reinforcement learning based intelligent traffic routing control for software-defined wireless sensor networks. Applied Sciences, 12(4), 1951. DOI: https://doi.org/10.3390/app12041951

Yao, J., Yan, C., Wang, J., & Jiang, C. (2022). Stable QoE-aware multi-SFCs cooperative routing mechanism based on deep reinforcement learning. IEEE Transactions on Network and Service Management, 1–1.

Ye, M., Huang, L., Deng, X., Wang, Y. Y., Jiang, Q., Qiu, H., & Wen, P. (2023). A new intelligent cross-domain routing method in SDN based on a proposed multiagent reinforcement learning algorithm. arXiv preprint arXiv:2303.07572. DOI: https://doi.org/10.21203/rs.3.rs-3347583/v1

Veeranjaneyulu, K., Lakshmi, M. B., Swamy, S. V., Sirisha, K., Nagarjuna, N., & Anupkant, S. (2023). Enhancing wireless sensor network routing strategies with machine learning protocols. In Proceedings of the International Conference on Networks and Wireless Communications.

Abadi, A. F. E., Asghari, S. E., Sharifani, S., Asghari, S. A., & Marvasti, M. B. (2022). A survey on utilizing reinforcement learning in wireless sensor networks routing protocols. In Proceedings of the Conference on Information and Knowledge Technology (pp. 1–7). DOI: https://doi.org/10.1109/IKT57960.2022.10039013

Farag, H., & Stefanovic, C. (2021). Congestion-aware routing in dynamic IoT networks: A reinforcement learning approach. arXiv preprint arXiv:2105.09678.

Jagannath, J., Polosky, N., Jagannath, A., Restuccia, F., & Melodia, T. (2019). Machine learning for wireless communications in the Internet of Things: A comprehensive survey. Ad Hoc Networks, 93, 101913. DOI: https://doi.org/10.1016/j.adhoc.2019.101913

Luong, N. C., Hoang, D. T., Gong, S., Niyato, D., Wang, P., Liang, Y., & Kim, D. I. (2018). Applications of deep reinforcement learning in communications and networking: A survey. arXiv preprint arXiv:1810.07862.

Bajpai, S., & Tiwari, N. K. (2024). Energy-efficient routing optimization for underwater Internet of Things using hybrid Q-learning and predictive learning approach. Procedia Computer Science, 235, 1–12. DOI: https://doi.org/10.1016/j.procs.2024.04.005

Published

2024-02-29

How to Cite

Parmar, H., & Lakhtaria, K. (2024). REINFORCEMENT LEARNING-BASED ROUTING PROTOCOLS FOR INTERNET OF THINGS NETWORKS: A COMPREHENSIVE SURVEY AND FUTURE RESEARCH DIRECTIONS. ShodhKosh: Journal of Visual and Performing Arts, 5(2), 1466–1477. https://doi.org/10.29121/shodhkosh.v5.i2.2024.6227