LOREM: Language-consistent Open Relation Extraction from Unstructured Text

Conference Paper (2020)
Author(s)

Tom Harting (Student TU Delft)

Sepideh Mesbah (TU Delft - Web Information Systems)

C. Lofi (TU Delft - Web Information Systems)

Research Group
Web Information Systems
Copyright
© 2020 Tom Harting, S. Mesbah, C. Lofi
DOI related publication
https://doi.org/10.1145/3366423.3380252
Publication Year
2020
Language
English
Pages (from-to)
1830-1838
ISBN (electronic)
978-1-4503-7023-3
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We introduce a Language-consistent multi-lingual Open Relation Extraction Model (LOREM) for finding relation tuples of any type between entities in unstructured text. LOREM does not rely on language-specific knowledge or on external NLP tools such as translators or PoS-taggers; instead, it exploits information and structures that are consistent across languages. This not only allows our model to be extended to new languages with limited training effort, but also boosts performance for any single language. An extensive evaluation on 5 languages shows that LOREM outperforms state-of-the-art mono-lingual and cross-lingual open relation extractors. Moreover, experiments on languages with little or no training data indicate that LOREM generalizes to languages other than those it is trained on.
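To make the task concrete, open relation extraction is commonly framed as sequence tagging: each token is labeled as part of the first entity, the relation phrase, or the second entity, and a tuple is assembled from the tagged spans. The sketch below is purely illustrative and is not LOREM's actual implementation; the tag scheme (`E1`/`REL`/`E2` with BIO prefixes) and the function name are assumptions for the example.

```python
# Illustrative sketch (not LOREM's code): open relation extraction yields
# (entity1, relation, entity2) tuples from raw text. We assume a sequence
# tagger has already assigned each token a BIO-style tag, where
# E1 = first entity, REL = relation phrase, E2 = second entity.

def assemble_tuple(tokens, tags):
    """Collect tokens by tag group into an (e1, rel, e2) tuple."""
    spans = {"E1": [], "REL": [], "E2": []}
    for token, tag in zip(tokens, tags):
        # Strip the B-/I- prefix; 'O' tokens lie outside any span.
        label = tag.split("-")[-1]
        if label in spans:
            spans[label].append(token)
    return (" ".join(spans["E1"]),
            " ".join(spans["REL"]),
            " ".join(spans["E2"]))

tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw", "."]
tags   = ["B-E1", "I-E1", "B-REL", "I-REL", "I-REL", "B-E2", "O"]
print(assemble_tuple(tokens, tags))
# → ('Marie Curie', 'was born in', 'Warsaw')
```

Because the tags carry no relation-type inventory, this formulation is "open": any phrase in the sentence can serve as the relation, which is what lets a tagger trained on one language transfer to others with consistent structure.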