Gender issues in Machine Translation: an unsolved problem?
Johanna Monti
2020-01-01
Abstract
Machine Translation (MT) is one of the most widely used Artificial Intelligence applications on the Internet: it is so widespread in online services of various types that users sometimes do not realize that they are using the results of an automatic translation process. From social networks like Facebook or Twitter, to online selling platforms like eBay or TripAdvisor, search engines like Google and instant messaging systems like Skype, machine translation can break linguistic barriers in various ways and for a number of different purposes: translating texts or web sites, making online searches and communicating in real time with people who speak different languages. In spite of the remarkable progress achieved in this field over the last twenty years, thanks to increased computing power and advanced technologies in the field of Natural Language Processing (NLP), machine translation systems, even the most widely used ones on the net such as Google Translate, still have high error rates. Many challenges remain, mainly linked to the complexity and ambiguity of natural language and the obstacles posed by the translation process itself. In state-of-the-art MT systems, whether based on linguistic data like Systran, on statistical approaches like Google Translate or on the recent neural approach, the translation of gender still represents a recurrent source of mistranslations: incorrect gender attribution to proforms (personal pronouns, relative pronouns, among others), reproduction of gender stereotypes and overuse of male pronouns are among the most frequent problems. Translation problems connected with gender have been widely discussed by the feminist approach to translation, whereas in MT theory they have been taken into account by only a very limited number of contributions.
It is considered a negligible problem: there is, for instance, no specific entry for gender issues in the most comprehensive collection of studies on MT, the website of John Hutchins. The issues of gender translation in MT are mainly studied with reference to anaphora resolution, agreement problems (adjective-noun and verb-noun agreement) or problems related to translation between morphologically rich languages (fusional, agglutinative and polysynthetic languages) and morphologically poor languages (isolating languages), but there is no holistic approach to gender biases in MT which takes into account the linguistic, social and cultural aspects linked to this topic. One reason this problem is underestimated is that MT developers mainly evaluate their systems with automatic metrics (BLEU, METEOR, etc.), which do not detect specific linguistic problems such as the correct translation of the gender of pronouns, nouns and adjectives, and therefore the correct transposition of meaning. Another reason is that in the currently dominant MT paradigms, namely phrase-based MT and neural MT, the results for specific linguistic problems cannot be addressed and improved, since translations are obtained on the basis of statistical computations which calculate the most probable translations for a word or a phrase in bilingual text corpora. Only very recently did Londa Schiebinger, who supervises the Gendered Innovations project at Stanford University, carry out a study on the performance of Google Translate. In her contribution to gender issues in MT, titled Machine Translation: Analyzing Gender, she easily demonstrates that the algorithms used by the well-known MT system lead to sexist language. My contribution provides an overview of the most recent studies on this topic and addresses gender issues in the field of Machine Translation, mainly in relation to the translation of grammatical, natural and social gender.
A few case studies are also provided by comparing the translations of four state-of-the-art MT systems, representing the different theoretical approaches to MT, to gain insight into their strengths and weaknesses as far as translating gender is concerned.

File | Type | License | Size | Format
---|---|---|---|---
Johanna Monti - ch 34 - The Routledge Handbook of Translation, Feminism and Gender (2).pdf (authorized users only) | Post-print document | Publisher's copyright | 1.41 MB | Adobe PDF