ENHANCING CYBERBULLYING DETECTION USING ENSEMBLE LEARNING AND EMBEDDINGS

Prashant Agrawal; Awanit Kumar; Arun Kumar Tripathi

doi:10.29121/shodhkosh.v5.i1.2024.3194

Authors

Prashant Agrawal, Research Scholar, Department of Computer Science and Engineering, Sangam University, Rajasthan, India
Awanit Kumar, Assistant Professor, Department of Computer Science and Engineering, Sangam University, Rajasthan, India
Arun Kumar Tripathi, Professor, Department of Computer Applications, KIET Group of Institutions, New Delhi, India

Keywords:

Cyber Bullying Detection, Ensemble Learning, Universal Sentence Encoder, Deep Learning, Machine Learning, Text Classification

Abstract [English]

Cyberbullying is a significant challenge in online environments, and detecting and mitigating it accurately requires advanced techniques. This paper introduces a novel approach that combines ensemble learning with sentence embeddings to improve cyberbullying detection. The proposed framework integrates various classifiers, including deep learning models, decision trees, random forests, and logistic regression, with Universal Sentence Encoder embeddings for semantic text representation. The study uses a labeled dataset sourced from offensive language databases, which is preprocessed and split into training and testing sets. Hyperparameters of the traditional classifiers are optimized with grid search, while a deep learning model is trained to identify complex patterns in cyberbullying content. Ensemble learning then combines the predictions of the individual models to improve overall detection performance and generalization. The approach is evaluated using metrics such as accuracy and confusion matrices, and it outperforms the individual models. The results indicate that the ensemble learning framework significantly improves the accuracy of cyberbullying detection, contributing to the growing body of research on online safety and machine learning applications in digital platforms.
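The combination and evaluation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the predictions are combined, so hard majority voting is an assumption here; the function names and the toy labels are hypothetical and are not the paper's data or results. Labels are assumed to be 1 for bullying and 0 for non-bullying.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting.

    Assumption: an odd number of models and binary labels, so no ties occur.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # Pick the label predicted by the largest number of models.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy, hand-made labels (hypothetical; chosen only to show the mechanics).
y_true = [1, 0, 1, 1, 0, 0]
model_predictions = [
    [1, 0, 0, 1, 0, 1],  # stand-in for a decision tree
    [1, 1, 1, 1, 0, 0],  # stand-in for a random forest
    [0, 0, 1, 1, 0, 0],  # stand-in for logistic regression
]

y_ensemble = majority_vote(model_predictions)

for i, preds in enumerate(model_predictions):
    print(f"model {i} accuracy: {accuracy(y_true, preds):.3f}")
print(f"ensemble accuracy: {accuracy(y_true, y_ensemble):.3f}")
print("ensemble confusion matrix:", confusion_matrix(y_true, y_ensemble))
```

In this contrived example each stand-in model errs on a different sample, so the vote corrects every individual mistake. Real classifiers often make correlated errors, in which case voting helps less.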
