An Analysis of Lexical and Grammatical Gender Bias in German Bidirectional Encoder Representations from Transformers Networks

Authors: Freya Thießen, Johannes Schrumpf

Abstract:

Gender bias in natural language processing neural networks based on the Transformer architecture has been the focus of recent research. So far, primarily language models trained on the English language have been investigated and found to possess biased representations with regard to gender. Linguistic analysis suggests that, owing to semantic and grammatical differences between German and English, BERT networks trained on German-language material may exhibit different gender bias properties than English BERT networks. This study investigates the impact of lexical and grammatical forms of gender information on bias in German-BERT, a BERT network trained for natural language processing of the German language. Through an analysis of the principal components of German-BERT embeddings, we show that gender bias exists in German-BERT in the presence of grammatical gender information and lexical gender stereotypes.

Keywords: artificial intelligence, ethical machine learning, gender bias, German language-specific bias, natural language processing
