Language-consistent Open Relation Extraction from Multilingual Text Corpora

Master Thesis (2019)
Author(s)

T. Harting (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

C. Lofi – Mentor (TU Delft - Web Information Systems)

Geert Jan Houben – Graduation committee member (TU Delft - Web Information Systems)

W.P. Brinkman – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Tom Harting
Publication Year
2019
Language
English
Graduation Date
12-07-2019
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Open Relation Extraction (ORE) aims to find arbitrary relation tuples between entities in unstructured texts. Although recent research efforts achieve state-of-the-art results on the ORE task by utilizing neural-network-based models, these works focus solely on the English language. Methods have been proposed to tackle the ORE task for multiple languages, yet these fail to exploit relation patterns that are consistent across languages. Moreover, they require additional data to train translators, hindering efficient extension to new languages. In this work, we introduce the Language-consistent Open Relation Extraction Model (LOREM). By adding a language-consistent component to the current state-of-the-art open relation extraction model, we enable it to exploit information from multiple languages. Since we remove all dependencies on language-specific knowledge and external NLP tools such as translators, it is relatively easy to extend our model to new languages. An extensive evaluation on five languages shows that LOREM outperforms state-of-the-art monolingual and cross-lingual open relation extractors. Moreover, experiments on low- and even no-resource languages indicate that LOREM generalizes to languages other than those it is trained on.
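To give a rough intuition for how a language-individual model and a shared, language-consistent component could be combined, the sketch below blends per-token tag distributions from two taggers with a simple weighted sum. This is a minimal illustration under assumptions of my own: the tag scheme, function names, and weighting are hypothetical and are not taken from the thesis implementation.

```python
# Minimal sketch (not the thesis implementation): combining a per-language
# tagger with a shared, language-consistent tagger. The tag set, function
# names, and the weighted-sum scheme are illustrative assumptions.
import numpy as np

TAGS = ["O", "B-REL", "I-REL"]  # hypothetical relation-tagging scheme


def combine_tag_scores(individual_scores: np.ndarray,
                       consistent_scores: np.ndarray,
                       alpha: float = 0.5) -> np.ndarray:
    """Blend per-token tag distributions from a language-individual model
    and a language-consistent model with a weighted sum, then renormalize."""
    assert individual_scores.shape == consistent_scores.shape
    combined = alpha * individual_scores + (1.0 - alpha) * consistent_scores
    return combined / combined.sum(axis=-1, keepdims=True)


# Example: 4 tokens, 3 tags; in practice these scores would come from
# trained neural taggers rather than hard-coded arrays.
individual = np.array([[0.80, 0.10, 0.10],
                       [0.20, 0.70, 0.10],
                       [0.30, 0.20, 0.50],
                       [0.90, 0.05, 0.05]])
consistent = np.array([[0.70, 0.20, 0.10],
                       [0.10, 0.80, 0.10],
                       [0.20, 0.10, 0.70],
                       [0.85, 0.10, 0.05]])

combined = combine_tag_scores(individual, consistent, alpha=0.5)
predicted = [TAGS[i] for i in combined.argmax(axis=-1)]
print(predicted)  # e.g. ['O', 'B-REL', 'I-REL', 'O']
```

In such a setup, a language without training data could be handled by relying only on the shared component (here, setting alpha to 0), which is one plausible reading of how a language-consistent model generalizes to unseen languages.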
