Title: Coner: A Collaborative Approach for Long-Tail Named Entity Recognition in Scientific Publications

Author:
Vliegenthart, Daniel (Student TU Delft; National Institute of Informatics)
Mesbah, S. (TU Delft Web Information Systems)
Lofi, C. (TU Delft Web Information Systems)
Aizawa, Akiko (National Institute of Informatics)
Bozzon, A. (TU Delft Web Information Systems)

Contributor:
Doucet, Antoine (editor)
Isaac, Antoine (editor)
Golub, Koraljka (editor)
Aalberg, Trond (editor)
Jatowt, Adam (editor)

Date: 2019

Abstract: Named Entity Recognition (NER) for rare long-tail entities, such as those often found in domain-specific scientific publications, is a challenging task, as the extensive training and test data needed for fine-tuning NER algorithms are typically lacking. Recent approaches presented promising solutions that train NER algorithms in an iterative, weakly supervised fashion, thus limiting human interaction to providing only a small set of seed terms. Such approaches rely heavily on heuristics to cope with the limited size of the training data. As these heuristics are prone to failure, the overall achievable performance is limited. In this paper, we therefore introduce a collaborative approach which incrementally incorporates human feedback on the relevance of extracted entities into the training cycle of such iterative NER algorithms. This approach, called Coner, still allows new domain-specific rare long-tail NER extractors to be trained at low cost, but with ever-increasing performance while the algorithm is actively used in an application.

To reference this document use: http://resolver.tudelft.nl/uuid:e3d634e4-6ba9-4ab4-b80a-ffaefc527091
DOI: https://doi.org/10.1007/978-3-030-30760-8_1
Publisher: Springer, Cham
ISBN: 978-3-030-30759-2
Source: Digital Libraries for Open Knowledge: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Proceedings
Event: 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, 2019-09-09 → 2019-09-12, Oslo, Norway
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 0302-9743, 11799 LNCS
Part of collection: Institutional Repository
Document type: conference paper
Rights: © 2019 Daniel Vliegenthart, S. Mesbah, C. Lofi, Akiko Aizawa, A. Bozzon
Files: PDF 2019TPDL_Coner.pdf (465.89 KB)