REMA

Graph embeddings-based relational schema matching

Abstract

Schema matching is the process of capturing correspondences between attributes of different datasets, and it is one of the most important prerequisite steps for analyzing heterogeneous data collections. State-of-the-art schema matching algorithms that use simple schema- or instance-based similarity measures struggle to find matches beyond the trivial cases. Semantics-based algorithms require domain-specific knowledge encoded in a knowledge graph or an ontology. As a result, schema matching remains a largely manual process performed by a few domain experts. In this paper, we present the Relational Embeddings MAtcher, or REMA for short. REMA is a novel schema matching approach that captures the semantic similarity of attributes using relational embeddings: a technique that embeds database rows, columns, and schema information into multidimensional vectors that can reveal semantic similarity. This paper communicates our latest findings and demonstrates REMA's potential with a preliminary experimental evaluation.
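
To make the idea of matching attributes through their vectors concrete, the following is a minimal sketch and not the REMA algorithm itself. It assumes each attribute already has an embedding vector, and it pairs every source attribute with the target attribute whose vector is most similar. The toy vectors, the attribute names, the use of cosine similarity, the 0.8 threshold, and the helper names `cosine_similarity` and `match_attributes` are all illustrative assumptions; the abstract does not specify how similarity is computed.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_u * norm_v)


def match_attributes(source_vectors, target_vectors, threshold=0.8):
    """Pair each source attribute with its most similar target attribute.

    Hypothetical helper: the threshold and the greedy best-match rule are
    assumptions made for illustration, not taken from the paper.
    """
    matches = {}
    for src_name, src_vec in source_vectors.items():
        best_name, best_score = None, threshold
        for tgt_name, tgt_vec in target_vectors.items():
            score = cosine_similarity(src_vec, tgt_vec)
            if score >= best_score:
                best_name, best_score = tgt_name, score
        if best_name is not None:
            matches[src_name] = (best_name, best_score)
    return matches


# Made-up 3-dimensional embeddings; real relational embeddings would be
# learned from the rows, columns, and schema information of the datasets.
source_vectors = {
    "cust_name": [0.9, 0.1, 0.0],
    "zip": [0.0, 0.2, 0.9],
}
target_vectors = {
    "client": [0.8, 0.2, 0.1],
    "postal_code": [0.1, 0.1, 0.95],
    "price": [0.1, 0.9, 0.1],
}

matches = match_attributes(source_vectors, target_vectors)
for src_name, (tgt_name, score) in matches.items():
    print(f"{src_name} -> {tgt_name} (similarity {score:.2f})")
```

In this toy setup, `cust_name` pairs with `client` and `zip` pairs with `postal_code`, although the attribute names share no tokens, which is the kind of non-trivial match that purely name-based similarity measures tend to miss.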