When Multiwords Go Bad in Machine Translation

MONTI, JOHANNA
2013-01-01

Abstract

This paper addresses the impact of multiword translation errors in machine translation (MT). We have analysed translations of multiwords in the OpenLogos rule-based system (RBMT) and in the Google Translate statistical system (SMT) for the English-French, English-Italian, and English-Portuguese language pairs. Our study shows that, for distinct reasons, multiwords remain a problematic area for MT independently of the approach, and require adequate linguistic quality evaluation metrics founded on a systematic categorization of errors by MT expert linguists. We propose an empirically-driven taxonomy for multiwords, and highlight the need for the development of specific corpora for multiword evaluation. Finally, the paper presents the Logos approach to multiword processing, illustrating how semantico-syntactic rules contribute to multiword translation quality.
2013
English
Monti Johanna, Mitkov Ruslan, Corpas Pastor Gloria, Seretan Violeta
Workshop Proceedings for: Multi-word Units in Machine Translation and Translation Technologies (Organised at the 14th Machine Translation Summit)
contribution
Machine Translation Summit XIV
26
33
8
978-3-9524207-4-4
The European Association for Machine Translation
CH-4123 Allschwil
SWITZERLAND
Scientific committee
2-6 September 2013
Nice - France
International
Machine Translation; Multiword; MT Evaluation
4
Monti, Johanna; Barreiro, Anabela; Oroliac, Brigitte; Batista, Fernando
open
273
info:eu-repo/semantics/conferenceObject
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in Conference Proceedings
Files in this product:
File Size Format
Proceedings_MT_Summit_2013_Workshop_on_M.pdf

open access

Type: Post-print document
License: PUBLIC - Public with Copyright
Size: 2.9 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11574/170127