Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System
Conference paper, Year: 2022

Abstract

Multiple studies have shown that existing NMT systems exhibit a form of "gender bias". As a result, MT output appears to err more often on feminine forms and to amplify social gender misrepresentations, which is potentially harmful to users and practitioners of these technologies. This paper continues this line of investigation and reports results obtained with a new test set under strictly controlled conditions. This setting allows us to better understand the multiple inner mechanisms that cause these biases, which include the linguistic expression of gender, the unbalanced distribution of masculine and feminine forms in the language, the modelling of morphological variation, and the training process dynamics. To counterbalance these effects, we formulate several proposals and notably show that modifying the training loss can effectively mitigate such biases.
Main file
source_transfert.pdf (302.37 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03912438, version 1 (24-12-2022)

Identifiers

  • HAL Id: hal-03912438, version 1

Cite

Guillaume Wisniewski, Lichao Zhu, Nicolas Ballier, François Yvon. Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System. BlackboxNLP 2022, Dec 2022, Abu Dhabi, United Arab Emirates. ⟨hal-03912438⟩