Language Resources for Italian: towards the Development of a Corpus of Annotated Italian Multiword Expressions
MONTI, JOHANNA
2016-01-01
Abstract
This paper describes the first resource annotated for multiword expressions (MWEs) in Italian. Two versions of this dataset have been prepared: the first with a fast markup list of out-of-context MWEs, and the second with an in-context annotation, where the MWEs are entered with their contexts. The paper also discusses annotation issues and reports the inter-annotator agreement for both types of annotations. Finally, the results of the first exploitation of the new resource, namely the automatic extraction of Italian MWEs, are presented.

File | Size | Format
---|---|---
CLiCit2016-51_Taslimipoor_Desantis_Cherchi_etal.pdf (open access; type: post-print document; license: PUBLIC - Public with Copyright) | 347.09 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.