ADAPTIVE ONLINE REINFORCEMENT LEARNING FRAMEWORK FOR REAL-TIME OPINION MINING AND BIG DATA DECISION-MAKING IN SOCIAL MEDIA
DOI: https://doi.org/10.29121/shodhkosh.v5.i1.2024.6268
Keywords: Adaptive Reinforcement Learning, Real-Time Opinion Mining, Social Media Big Data, LSTM, RoBERTa, Deep Q-Network, Online Decision-Making, Sentiment Analysis, Misinformation Detection
Abstract [English]
The surge of user-generated content on social media has made these platforms important sources of public opinion, policy discussion, and market intelligence. Deriving actionable insights from such data requires models that are both accurate and adaptive to fast-changing linguistic styles, misinformation campaigns, and current events. However, because social media data distributions are nonstationary, typical supervised learning models may fail to capture variation across time and domains. To address this problem, we present an Adaptive Online Reinforcement Learning (AORL) framework for online opinion mining and big-data-driven decision-making. The framework combines deep sequential models for sentiment understanding with reinforcement learning agents that maintain an adaptive state over time.
In particular, we consider three architectures: (i) a bidirectional LSTM for contextual sentiment classification, (ii) a Deep Q-Network (DQN) for automated dialog policy learning that takes as input the ensemble embeddings produced by the LSTM, and (iii) a RoBERTa-DQN hybrid that combines transformer-based contextual embeddings with the flexibility of adaptive online learning. Experiments on large-scale Twitter streams show that AORL achieves competitive classification accuracy while remaining robust to distribution drift and emerging trends. In addition, the reinforcement signals are aligned with high-level decision-making goals, enabling applications in real-time crisis analysis, financial market news analysis, and fake news control. This work represents an effort to operationalize online reinforcement learning in the big data social media domain, paving the way toward scalable, adaptive, and credible opinion mining systems.
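The online adaptation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a linear Q-function stands in for the DQN, a normalized bag-of-words vector stands in for the LSTM or RoBERTa embeddings, each post is treated as a one-step episode (so the update target is just the immediate reward, with no bootstrapped next-state term), and the +1/-1 reward scheme is an assumption. The names `featurize`, `OnlineQAgent`, `run_stream`, and `LABELS` are hypothetical.

```python
import math
import random
from collections import Counter

LABELS = ("negative", "positive")  # hypothetical action set


def featurize(text):
    """L2-normalized bag of words; a stand-in for LSTM/RoBERTa embeddings."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


class OnlineQAgent:
    """Linear Q-function with epsilon-greedy actions and per-post updates."""

    def __init__(self, n_actions=len(LABELS), lr=0.1, epsilon=0.1, seed=0):
        self.weights = [dict() for _ in range(n_actions)]  # one sparse vector per action
        self.lr = lr
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def q_values(self, state):
        return [
            sum(w.get(token, 0.0) * x for token, x in state.items())
            for w in self.weights
        ]

    def act(self, state, explore=True):
        # Explore with probability epsilon, otherwise pick the highest Q-value.
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))
        q = self.q_values(state)
        return max(range(len(q)), key=q.__getitem__)

    def update(self, state, action, reward):
        # One-step episode (assumption): move Q(state, action) toward the reward.
        error = reward - self.q_values(state)[action]
        w = self.weights[action]
        for token, x in state.items():
            w[token] = w.get(token, 0.0) + self.lr * error * x


def run_stream(agent, stream):
    """Process (text, label_index) pairs one at a time; return per-post rewards."""
    rewards = []
    for text, label in stream:
        state = featurize(text)
        action = agent.act(state)
        reward = 1.0 if action == label else -1.0  # assumed reward scheme
        agent.update(state, action, reward)
        rewards.append(reward)
    return rewards
```

Feeding the agent a stream in which the label of a phrase flips partway through (for example, slang that shifts from negative to positive) shows the intended behavior: the negative rewards after the flip pull the old action's Q-value down until the agent switches, with no separate retraining step. A full system along the lines of the abstract would replace the linear model and features with the neural components it names.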
License
Copyright (c) 2024 Nilam Deepak Padwal, Dr. Kamal Alaskar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY license, authors retain copyright, and anyone may download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author.
No further permission from the author or the journal board is required.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.