Bridging the Data Gap for Sexism Detection in Twitter: A Semi-Supervised Approach
Authors: Adeep Hande, Shubham Agarwal
Abstract:
This paper presents a study on identifying sexism in online texts using state-of-the-art deep learning models based on BERT. We experimented with different feature sets and model architectures and evaluated their performance using precision, recall, F1 score, and accuracy. We also explored pseudolabeling as a way to improve model performance. In our experiments, the best-performing models were based on BERT, and the multilingual model achieved an F1 score of 0.83. Pseudolabeling significantly improved the performance of the BERT-based models, which achieved their best results when it was applied. Our findings suggest that BERT-based models with pseudolabeling hold great promise for accurately identifying sexism in online texts.
Keywords: large language models, semi-supervised learning, sexism detection, data sparsity
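As an illustration of the pseudolabeling step mentioned in the abstract, the sketch below shows one round of the general technique: train on the labeled data, predict on unlabeled texts, and add only the confident predictions to the training set before retraining. The abstract does not give the confidence threshold, the number of rounds, or the selection rule, so the 0.8 threshold, the function names, and the toy data are all assumptions. A tiny Naive Bayes classifier stands in for a BERT-based model so that the example runs with the standard library alone; it is not the authors' model.

```python
import math
from collections import Counter


def train(texts, labels):
    """Fit a tiny multinomial Naive Bayes model (a stand-in for a BERT classifier).

    Assumes binary labels (1 = sexist, 0 = not sexist) with both classes present.
    """
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    return counts, priors, vocab


def predict_proba(model, text):
    """Return P(label = 1 | text) using add-one smoothing."""
    counts, priors, vocab = model
    log_scores = {}
    for label in (0, 1):
        total = sum(counts[label].values()) + len(vocab)
        score = math.log(priors[label] / sum(priors.values()))
        for token in text.lower().split():
            if token in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[label][token] + 1) / total)
        log_scores[label] = score
    # Normalize the two log scores into a probability for class 1.
    peak = max(log_scores.values())
    exp0 = math.exp(log_scores[0] - peak)
    exp1 = math.exp(log_scores[1] - peak)
    return exp1 / (exp0 + exp1)


def pseudolabel(labeled_texts, labels, unlabeled_texts, threshold=0.8):
    """Run one pseudolabeling round; the threshold value is an assumption."""
    model = train(labeled_texts, labels)
    new_texts, new_labels = list(labeled_texts), list(labels)
    for text in unlabeled_texts:
        p = predict_proba(model, text)
        if p >= threshold:          # confidently positive
            new_texts.append(text)
            new_labels.append(1)
        elif p <= 1 - threshold:    # confidently negative
            new_texts.append(text)
            new_labels.append(0)
        # Low-confidence texts stay unlabeled and are not used for retraining.
    return train(new_texts, new_labels), new_texts, new_labels


# Hypothetical toy data, for illustration only.
labeled = ["women belong in the kitchen", "nice weather today"]
gold = [1, 0]
unlabeled = ["women belong kitchen", "weather nice", "hello there"]

model, texts, pseudo = pseudolabel(labeled, gold, unlabeled)
```

Keeping only confident predictions is a common design choice in pseudolabeling because wrong pseudolabels would otherwise be reinforced when the model is retrained; in practice the threshold is tuned on held-out data.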