SmartPub

A Platform for Long-Tail Entity Extraction from Scientific Publications

Conference Paper (2018)

Author(s)

Sepideh Mesbah (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Alessandro Bozzon (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Christoph Lofi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Geert-Jan Houben (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Research Group

Web Information Systems

Information Extraction

DOI related publication

https://doi.org/10.1145/3184558.3186976 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:df14aed1-ba15-48df-94ce-f947e3ab5cdb

Publication Year

2018

Language

English

Pages (from-to)

191-194

ISBN (electronic)

978-1-4503-5640-4

Event

WWW 2018 (2018-04-23 - 2018-04-27), Lyon, France

Collections

Institutional Repository

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This demo presents SmartPub, a novel web-based platform that supports the exploration and visualization of shallow meta-data (e.g., author list, keywords) and deep meta-data from scientific publications. Deep meta-data are long-tail named entities: entities that are rare and often relevant only in a specific knowledge domain. The platform collects documents from different sources (e.g., DBLP and arXiv) and extracts domain-specific named entities from the text of the publications using Named Entity Recognizers (NERs), which we can train with minimal human supervision even for rare entity types. The platform further enables interaction with the crowd for filtering or for generating training data, and provides extended visualization and exploration capabilities. SmartPub will be demonstrated using a sample collection of scientific publications from the computer science domain and will address the entity types Dataset (i.e., a dataset presented or used in a publication) and Methods (i.e., algorithms used to create, enrich, or analyse a dataset).

Files

39999403_p191_mesbah.pdf

(pdf | 1.82 Mb)