Language-consistent Open Relation Extraction from Multilingual Text Corpora

Master Thesis (2019)
Author(s)

T. Harting (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Lofi – Mentor (TU Delft - Web Information Systems)

Geert Jan Houben – Graduation committee member (TU Delft - Web Information Systems)

W.P. Brinkman – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Tom Harting
Publication Year
2019
Language
English
Graduation Date
12-07-2019
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Open Relation Extraction (ORE) aims to find arbitrary relation tuples between entities in unstructured texts. Although recent research efforts achieve state-of-the-art results on the ORE task by utilizing neural-network-based models, these works focus solely on the English language. Methods have been proposed to tackle the ORE task for multiple languages, yet these fail to exploit relation patterns that are consistent across languages. Moreover, they require additional data to train translators, hindering efficient extension to new languages. In this work, we introduce the Language-consistent Open Relation Extraction Model (LOREM). By adding a language-consistent component to the current state-of-the-art open relation extraction model, we enable it to exploit information from multiple languages. Since we remove all dependencies on language-specific knowledge and external NLP tools such as translators, it is relatively easy to extend our model to new languages. An extensive evaluation on five languages shows that LOREM outperforms state-of-the-art monolingual and cross-lingual open relation extractors. Moreover, experiments on low- and even no-resource languages indicate that LOREM generalizes to languages other than those it is trained on.
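To give a rough intuition for how a language-individual model and a shared, language-consistent component could be combined, the sketch below blends per-token tag distributions from two taggers with a simple weighted sum. This is a minimal illustration under assumptions of my own: the tag scheme, function names, and weighting are hypothetical and are not taken from the thesis implementation.

```python
# Minimal sketch (not the thesis implementation): combining a per-language
# tagger with a shared, language-consistent tagger. The tag set, function
# names, and the weighted-sum scheme are illustrative assumptions.
import numpy as np

TAGS = ["O", "B-REL", "I-REL"]  # hypothetical relation-tagging scheme


def combine_tag_scores(individual_scores: np.ndarray,
                       consistent_scores: np.ndarray,
                       alpha: float = 0.5) -> np.ndarray:
    """Blend per-token tag distributions from a language-individual model
    and a language-consistent model with a weighted sum, then renormalize."""
    assert individual_scores.shape == consistent_scores.shape
    combined = alpha * individual_scores + (1.0 - alpha) * consistent_scores
    return combined / combined.sum(axis=-1, keepdims=True)


# Example: 4 tokens, 3 tags; in practice these scores would come from
# trained neural taggers rather than hard-coded arrays.
individual = np.array([[0.80, 0.10, 0.10],
                       [0.20, 0.70, 0.10],
                       [0.30, 0.20, 0.50],
                       [0.90, 0.05, 0.05]])
consistent = np.array([[0.70, 0.20, 0.10],
                       [0.10, 0.80, 0.10],
                       [0.20, 0.10, 0.70],
                       [0.85, 0.10, 0.05]])

combined = combine_tag_scores(individual, consistent, alpha=0.5)
predicted = [TAGS[i] for i in combined.argmax(axis=-1)]
print(predicted)  # e.g. ['O', 'B-REL', 'I-REL', 'O']
```

In such a setup, a language without training data could be handled by relying only on the shared component (here, setting alpha to 0), which is one plausible reading of how a language-consistent model generalizes to unseen languages.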
