Measuring Gender and Racial Bias in Natural Language Inference

GRiN: Evaluating Gender and Racial Bias in Natural Language Inference

Abstract

We introduce a template-based framework for evaluating gender and racial bias in the natural language inference (NLI) task. We call the dataset used in our framework GRiN (Gender and Racial bias in Natural Language Inference). In this work, we define bias as an overgeneralized relation between a target (e.g., gender, race) and an attribute (e.g., occupation). To measure such bias, we design sentence pairs of three types. The first two types are templates that pair a single bias-neutral premise illustrating an attribute with target-specific hypotheses. The third type consists of sentences from crowdsourced stereotype benchmarks (CrowS-Pairs, StereoSet) converted into NLI format to preserve their natural context. Our bias evaluation metrics are the overall accuracy and the standard deviation of predictions across sentence pairs. We demonstrate that NLI classifiers trained on the SNLI and MNLI corpora exhibit substantial gender and racial biases, regardless of their intrinsic biases.
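
The template-and-metric idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the GRiN dataset: the template wording, the target terms, the choice of "neutral" as the gold label, the reading of "prediction standard deviation" as the spread of the entailment probability across target-specific hypotheses, and the hypothetical `toy_predict_proba` function standing in for a real NLI classifier.

```python
from statistics import pstdev

# Hypothetical templates (not from GRiN): a bias-neutral premise describing an
# occupation (the attribute), and a hypothesis that differs only in the target.
PREMISE = "The {occupation} is reading a report."
HYPOTHESIS = "The {target} is reading a report."


def build_pairs(occupation, targets):
    """Pair one bias-neutral premise with one hypothesis per target term."""
    premise = PREMISE.format(occupation=occupation)
    return [(premise, HYPOTHESIS.format(target=t)) for t in targets]


def evaluate(pairs, predict_proba, gold="neutral"):
    """Return (accuracy, spread) over a group of premise-hypothesis pairs.

    Assumption: the gold label is "neutral", because the premise says nothing
    about the target. Assumption: spread is the population standard deviation
    of the entailment probability across the target-specific hypotheses.
    """
    probs = [predict_proba(premise, hypothesis) for premise, hypothesis in pairs]
    preds = [max(dist, key=dist.get) for dist in probs]
    accuracy = sum(pred == gold for pred in preds) / len(preds)
    spread = pstdev([dist["entailment"] for dist in probs])
    return accuracy, spread


def toy_predict_proba(premise, hypothesis):
    """Hypothetical stand-in for an NLI classifier, deliberately biased."""
    if "woman" in hypothesis.split():
        return {"entailment": 0.2, "neutral": 0.7, "contradiction": 0.1}
    return {"entailment": 0.6, "neutral": 0.3, "contradiction": 0.1}


pairs = build_pairs("doctor", ["man", "woman"])
accuracy, spread = evaluate(pairs, toy_predict_proba)
print(f"accuracy={accuracy:.2f}, spread={spread:.2f}")
```

Under these assumptions, an unbiased classifier would predict the gold label for every target (high accuracy) and assign similar probabilities across targets (spread near zero), so lower accuracy together with a larger spread would indicate stronger bias.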