


Göteborgs universitet publications

Developing an interlingual translation lexicon using WordNets and Grammatical Framework

Authors and department:
Shafqat Virk (Department of Computer Science and Engineering (GU)); K. V. S. Prasad (Department of Computer Science and Engineering (Chalmers), Chalmers); Aarne Ranta (Department of Computer Science and Engineering, Computing Science (GU)); Krasimir Angelov (Department of Computer Science and Engineering, Computing Science (GU))
Published in:
Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing
Conference paper, peer-reviewed
Abstract:
The Grammatical Framework (GF) offers perfect translation between controlled subsets of natural languages. For example, an abstract syntax for a set of sentences in school mathematics serves as the interlingua between the corresponding sentences in, say, English and Hindi. GF “resource grammars” specify how to say something in English or Hindi; these are reused with “application grammars” that specify what can be said (mathematics, tourist phrases, etc.). More recent work on robust parsing and parse-tree disambiguation allows GF to parse arbitrary English text. We report here an experiment to linearise the resulting tree directly to other languages (e.g., Hindi and German); that is, we use a language-independent resource grammar as the interlingua. We focus particularly on the last part of the translation system: the interlingual lexicon and word sense disambiguation (WSD). We improved the quality of the wide-coverage interlingual translation lexicon by using the Princeton and Universal WordNet data. We then integrated an existing WSD tool and replaced the usual GF-style lexicons, which give one target word per source word, with the WordNet-based lexicons. These new lexicons and WSD improve the quality of translation in most cases, as we show by examples. Both WordNets and WSD in general are well known, but this is the first use of these tools with GF.
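
The difference between a lexicon that gives one target word per source word and a sense-keyed lexicon combined with WSD can be illustrated with a minimal Python sketch. It is not the authors' implementation and uses neither GF nor the Princeton or Universal WordNet data: the sense identifiers, glosses, and target words are toy data invented for illustration, the names `Sense`, `pick_sense`, and `translate_word` are hypothetical, and the gloss-overlap disambiguator is a simplified Lesk-style stand-in for the existing WSD tool mentioned in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    """One word sense acting as a language-independent lexicon key (toy data)."""
    sense_id: str       # hypothetical interlingual identifier
    gloss: str          # short definition used by the toy disambiguator
    translations: dict  # language code -> target word


# One target word per source word: context cannot change the output.
ONE_TO_ONE = {"bank": {"de": "Bank"}}

# Sense-keyed lexicon: each sense carries its own target words.
# All entries are illustrative assumptions, not data from the paper.
SENSES = {
    "bank": [
        Sense("bank_money",
              "financial institution that accepts deposits and lends money",
              {"de": "Bank", "hi": "बैंक"}),
        Sense("bank_river",
              "sloping land beside a river or lake",
              {"de": "Ufer", "hi": "किनारा"}),
    ]
}


def pick_sense(word, context):
    """Pick the sense whose gloss shares the most words with the context.

    Simplified Lesk-style overlap; ties go to the first listed sense.
    """
    context_words = set(context.lower().split())
    return max(SENSES[word],
               key=lambda s: len(context_words & set(s.gloss.split())))


def translate_word(word, context, lang):
    """Translate a word through its disambiguated sense."""
    return pick_sense(word, context).translations[lang]


if __name__ == "__main__":
    for sentence in ("we walked along the river bank",
                     "she deposits money at the bank"):
        print(sentence, "->", translate_word("bank", sentence, "de"))
```

With the one-to-one table, "bank" maps to the same German word in both sentences; the sense-keyed lookup can return a different target word once the context selects a different sense.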
Subject (based on Högskoleverket's classification of research subjects):
Computer and Information Sciences ->
Language Technology (Computational Linguistics)
Keywords:
Grammatical Framework, WordNet, lexicons, word sense disambiguation
Chalmers Areas of Advance:
Information and Communication Technology
Record created:
2014-09-10 12:03
Record modified:
2015-02-05 11:50

