An exploratory journey to combine schema matchers for better relevance prediction

Abstract

The rate of data growth has increased exponentially over the past decade, highlighting modern organizations' need for data discovery systems. Several (automated) schema matching approaches have been proposed to find related data, each exploiting different parts of schema information (e.g. data type, data distribution, column name). However, research has shown that single schema matching techniques fail to match schemas effectively, whilst combinatorial schema matching systems show more promise. The introduction of combinatorial schema matching systems raises new challenges regarding matcher selection and combining strategies. This research explores different techniques for determining the importance of each matcher in a combinatorial schema matching system, by determining the weight of each matcher with each technique and comparing the techniques through a comprehensive evaluation.
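
To make the idea of matcher weights concrete, the following is a minimal sketch of one possible combining strategy, a weighted average of per-matcher similarity scores. The function name `combine_scores`, the matcher names, and all numeric values are hypothetical and chosen for illustration only; they are not taken from this research, which may combine matchers differently.

```python
from typing import Dict


def combine_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted average of per-matcher similarity scores.

    `scores` maps a matcher name to its similarity score for one column pair;
    `weights` maps the same names to that matcher's (assumed) importance.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(weights[name] * score for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical similarity scores for a single column pair from three matchers.
scores = {"column_name": 0.8, "data_type": 1.0, "distribution": 0.4}

# Hypothetical weights; finding good values is the problem the abstract describes.
weights = {"column_name": 0.5, "data_type": 0.2, "distribution": 0.3}

combined = combine_scores(scores, weights)
print(f"combined relevance score: {combined:.2f}")
```

Under this assumption, a higher weight lets a matcher contribute more to the combined relevance score, so comparing weighting techniques amounts to comparing how well the resulting combined scores predict related data.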