Offensive Language Detection Using Soft Voting Ensemble Model

  • Brillian Fieri, Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia
  • Derwin Suhartono, Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia
Keywords: Offensive Language, Text Classification, Voting Classifier, Ensemble Model


Offensive language has become an increasingly severe problem as internet and social media use has grown. Such language can be used to attack an individual or a specific group. Automatic moderation based on machine learning can help detect and filter it for users who want that protection. This study aims to improve the performance of a soft voting classifier for offensive language detection by experimenting with different combinations of its estimators. The models were applied to a Twitter dataset that was augmented using several augmentation techniques. Features were extracted using Term Frequency-Inverse Document Frequency (TF-IDF), sentiment analysis, and GloVe embeddings. Two types of soft voting models were studied. The machine learning-based model performed best with Random Forest, Decision Tree, Logistic Regression, Naïve Bayes, and AdaBoost as its estimators. The deep learning-based model performed best with a Convolutional Neural Network (CNN), a Bidirectional Long Short-Term Memory (BiLSTM) network, and a Bidirectional Gated Recurrent Unit (BiGRU) network. The results show that the soft voting classifiers outperformed the classic machine learning and deep learning models on both the original and the augmented datasets.
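
To make the idea of soft voting concrete, the following is a minimal sketch of a machine learning-based soft voting classifier over TF-IDF features. It is not the authors' implementation: the use of scikit-learn, the toy sentences and labels, and all hyperparameters are assumptions made for illustration, and the sentiment and GloVe features mentioned above are omitted. Soft voting averages the class probabilities predicted by each estimator and selects the class with the highest mean probability.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy data for illustration only (assumption); NOT the study's Twitter dataset.
texts = [
    "you are an idiot",
    "shut up you fool",
    "nobody likes you, loser",
    "what a stupid thing to say",
    "have a great day",
    "thanks for your help",
    "see you at the meeting tomorrow",
    "the weather is lovely today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = offensive, 0 = not offensive

# The five estimator types named in the abstract; hyperparameters are assumptions.
estimators = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", MultinomialNB()),
    ("ada", AdaBoostClassifier(random_state=0)),
]

# TF-IDF features feed a soft voting ensemble of the estimators above.
model = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(estimators=estimators, voting="soft"),
)
model.fit(texts, labels)

new_texts = ["you fool", "thanks, have a lovely day"]
proba = model.predict_proba(new_texts)  # mean class probabilities per text
pred = model.predict(new_texts)         # class with the highest mean probability

# Soft voting by hand: average each fitted estimator's predicted probabilities.
voter = model[-1]
features = model[:-1].transform(new_texts)
manual_proba = np.mean(
    [est.predict_proba(features) for est in voter.estimators_], axis=0
)

print(proba)
print(pred)
```

The manual average in `manual_proba` should match `proba`, which is the defining difference from hard (majority) voting, where each estimator contributes only its predicted label.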


How to Cite
Fieri, B. and Suhartono, D. 2023. Offensive Language Detection Using Soft Voting Ensemble Model. MENDEL. 29, 1 (Jun. 2023), 1-6. DOI: