ADAPTIVE ONLINE REINFORCEMENT LEARNING FRAMEWORK FOR REAL-TIME OPINION MINING AND BIG DATA DECISION-MAKING IN SOCIAL MEDIA

Nilam Deepak Padwal; Kamal Alaskar

doi:10.29121/shodhkosh.v5.i1.2024.6268

Authors

Nilam Deepak Padwal, Department of Computer Application, Bharati Vidyapeeth (Deemed to be University), Institute of Management, Kolhapur, India
Dr. Kamal Alaskar, Department of Computer Application, Bharati Vidyapeeth (Deemed to be University), Institute of Management, Kolhapur, India

Keywords:

Adaptive Reinforcement Learning, Real-Time Opinion Mining, Social Media Big Data, LSTM, RoBERTa, Deep Q-Network, Online Decision-Making, Sentiment Analysis, Misinformation Detection

Abstract [English]

The surge of user-generated content on social media has made these platforms important sources of public opinion, policy discussion, and market intelligence. Deriving actionable insights from such data requires models that are both accurate and adaptive to fast-changing linguistic styles, misinformation campaigns, and current events. However, typical supervised learning models may fail to capture variation across time and domains, because data distributions in social media are nonstationary. To address this problem, we present an Adaptive Online Reinforcement Learning (AORL) framework for online opinion mining and big-data-based decision-making. The framework combines deep sequential models for sentiment understanding with reinforcement learning agents that maintain an adaptive state over time.
In particular, we consider three architectures: (i) a bidirectional LSTM for contextual sentiment classification, (ii) a Deep Q-Network (DQN) for automated dialog policy learning that takes as input the ensemble embeddings produced by the LSTM, and (iii) a RoBERTa-DQN hybrid that combines transformer-based contextual embeddings with the flexibility of adaptive online learning. Experiments on large-scale Twitter streams show that AORL achieves competitive classification accuracy while remaining robust to drift and emerging trends. In addition, the reinforcement signals are aligned with high-level decision-making goals, which enables applications in real-time crisis analysis, financial market news analysis, and fake news control. This work represents a unique effort to operationalize online reinforcement learning in the big data social media domain, paving the way toward scalable, adaptive, and credible opinion mining systems.
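
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a reinforcement learning agent that consumes embedding vectors and keeps adapting online. It is not the authors' method: a linear Q-function stands in for the DQN, hand-made two-dimensional vectors stand in for LSTM or RoBERTa embeddings, and the reward scheme (+1 for a correct label, -1 otherwise) is an assumption. All names (`LinearQAgent`, `run_stream`, `POS`, `NEG`) are hypothetical.

```python
import random


class LinearQAgent:
    """Linear Q-function over embedding vectors (simplified stand-in for a DQN)."""

    def __init__(self, dim, n_actions, lr=0.1, gamma=0.0, epsilon=0.1, seed=0):
        # One weight row per action; Q(s, a) is the dot product w[a] . s.
        self.w = [[0.0] * dim for _ in range(n_actions)]
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)

    def q_values(self, state):
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def act(self, state, greedy=False):
        # Epsilon-greedy: explore with probability epsilon unless greedy=True.
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.w))
        q = self.q_values(state)
        return q.index(max(q))

    def update(self, state, action, reward, next_state=None):
        # One-step temporal-difference update toward reward + gamma * max Q(s').
        target = reward
        if next_state is not None:
            target += self.gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        for i, si in enumerate(state):
            self.w[action][i] += self.lr * td_error * si
        return td_error


def run_stream(agent, stream):
    """Process (embedding, label) pairs one at a time, learning from reward only."""
    for state, label in stream:
        action = agent.act(state)
        reward = 1.0 if action == label else -1.0  # assumed reward scheme
        agent.update(state, action, reward)


# Toy "embeddings" (assumption): two orthogonal vectors for two kinds of posts.
POS, NEG = [1.0, 0.0], [0.0, 1.0]

agent = LinearQAgent(dim=2, n_actions=2, epsilon=0.2, seed=0)

# Phase 1: POS posts carry label 1, NEG posts carry label 0.
run_stream(agent, [(POS, 1), (NEG, 0)] * 200)
before = (agent.act(POS, greedy=True), agent.act(NEG, greedy=True))

# Phase 2: simulated drift, the meaning of the two embeddings flips.
run_stream(agent, [(POS, 0), (NEG, 1)] * 200)
after = (agent.act(POS, greedy=True), agent.act(NEG, greedy=True))

print("greedy policy before drift:", before)
print("greedy policy after drift:", after)
```

On this toy stream the greedy policy follows the labels in each phase, which illustrates why an agent that keeps updating from reward can track a shifting distribution where a frozen classifier cannot. A real system would replace the linear model with a neural Q-network and the toy vectors with learned embeddings.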
