An exploratory journey to combine schema matchers for better relevance prediction
W.H. Wang (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Katsifodimos – Mentor (TU Delft - Web Information Systems)
Geert-Jan Houben – Graduation committee member (TU Delft - Web Information Systems)
Lydia Chen – Graduation committee member
Andra Ionescu – Mentor (TU Delft - Web Information Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The rate of data growth has increased exponentially over the past decade, highlighting modern organizations' need for data discovery systems. Several (automated) schema matching approaches have been proposed to find related data, each exploiting different parts of the schema information (e.g. data type, data distribution, column name). However, research has shown that single schema matching techniques fail to match schemas effectively, whilst combinatorial schema matching systems show more promise. Combinatorial schema matching systems in turn introduce new challenges: which matchers to select and how to combine them. This research explores different techniques for determining the importance of each matcher in a combinatorial schema matching system by determining a weight for each matcher and comparing the results through a comprehensive evaluation.
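To make the idea of per-matcher weights concrete, the following is a minimal sketch of one possible combining strategy, a weighted average of matcher scores. It is an illustration only, not the method evaluated in this work: the function name `combine_scores`, the matcher names, and all scores and weights are hypothetical.

```python
from typing import Dict


def combine_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted average of per-matcher similarity scores.

    `scores` maps a matcher name to its similarity score for one column pair;
    `weights` maps the same names to the importance assigned to each matcher.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(weights[name] * score for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical similarity scores from three matchers for a single column pair.
scores = {"column_name": 0.9, "data_type": 1.0, "distribution": 0.4}

# Hypothetical weights; choosing these values is the problem the abstract describes.
weights = {"column_name": 0.5, "data_type": 0.2, "distribution": 0.3}

combined = combine_scores(scores, weights)
print(f"combined relevance score: {combined:.2f}")
```

Under this sketch, a matcher with a larger weight pulls the combined score toward its own judgement, so the choice of weights directly determines which column pairs are ranked as related.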