Information Extraction for Ontology Population Tasks. An Application to the Italian Archaeological Domain

Di Buono, Maria Pia

In recent years, many approaches to the Information Extraction (IE) task have been developed. Some of these are concept-based systems, which use a reduced number of features to represent semantic content. Others are based on term representation. More recent techniques involve ontology-based approaches; indeed, “ontologies reflect the structure of the domain and constrain the potential interpretations of terms” [1]. In this paper we present ongoing research, based on the Lexicon-Grammar (LG) framework, which aims at improving Term Extraction (TE) in the archaeological domain. We intend to demonstrate how our language formalization technique can be applied to the processing of unstructured texts for both entity recognition and domain ontology population tasks. Starting from the assumption that a coherent and consistent formal description of language is indispensable for achieving a correct semantic representation of any knowledge domain, this study focuses on a different approach to content analysis and IE.