Multiword units translation evaluation in machine translation: another pain in the neck?

Recent studies have highlighted that the translation of Multiword Units (MWUs) by Machine Translation (MT) is still an open challenge, whatever the adopted approach (statistical, rule-based or example-based). The difficulties in automatically translating this recurrent, complex and varied lexical phenomenon originate from its lexical, syntactic, semantic, pragmatic and/or statistical, but also translational, idiomaticity. It is widely acknowledged that, in order to achieve significant improvements in Machine Translation and translation technologies, it is important to develop resources that can be used for both Statistical Machine Translation (SMT) training and evaluation purposes. There is therefore a need to develop linguistic resources, mainly parallel corpora annotated with MWUs, which can help improve MT quality, in particular as regards the translation of MWUs in context and of discontinuous MWUs. In this paper, we analyse the state of the art concerning MWU-aware MT evaluation metrics and the availability of both benchmarking resources and annotation guidelines and procedures.

Johanna Monti; Amalia Todirascu

2016-01-01

Year: 2016
ISBN: 978-2-9700736-9-7
Appears in types: 4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
WorkshopProceedings_def.pdf	Open access	Post-print document	Public (with copyright)	9.1 MB	Adobe PDF
MUMTTT2015 - Copertina.pdf	Open access	Post-print document	Public (with copyright)	915.78 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11574/172676
