Reproducing state-of-the-art schema matching algorithms

Abstract

Schema matching has been an active research topic for over 20 years. Consequently, many schema matching solutions have been proposed to address problems such as creating unified knowledge bases or mediated schemas, data translation, data discovery, and data curation. Such a wide variety of schema matching algorithms calls for a benchmarking system that can evaluate to what extent a solution is appropriate for a given problem. However, creating such a benchmark requires open-source algorithms, which are not widely available in the data management community. One solution to this problem is to reproduce the algorithms, although the reproducibility crisis suggests that the majority of existing research cannot be reproduced. These circumstances determined the goal of this research: conducting a reproducibility study of state-of-the-art schema matching algorithms. This study supports the development of schema matching and highlights the issues that affect the ability to reproduce the algorithms or their results. Moreover, we implement the selected algorithms and benchmark them in an industry case study.