Repository hosted by TU Delft Library

The semantic snake charmer search engine: A tool to facilitate data science in high-tech industry domains

Publication files not online.

Author: Grappiolo, C. · Gerwen, M.J.A.M. van · Verhoosel, J.P.C. · Somers, L.
Type: article
Date: 2019
Publisher: Association for Computing Machinery, Inc
Source: CHIIR 2019 - Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, 4th ACM SIGIR Conference on Information Interaction and Retrieval, CHIIR 2019, 10 March 2019 through 14 March 2019, 355-359
Identifier: 865905
ISBN: 9781450360258
Keywords: Document Classification · Human-computer Collaboration · Natural Language Processing · Reinforcement Learning · Search Engine · Semantic Graph · Data mining · Data Science · Embedded systems · Graphic methods · Information retrieval systems · Knowledge based systems · Learning algorithms · Machine learning · Semantic Web · Semantics · Core competencies · Domain knowledge · High tech industry

Abstract

The booming popularity of data science is also affecting high-tech industries. However, since these usually have different core competencies - building cyber-physical systems rather than, e.g., machine learning or data mining algorithms - delving into data science by domain experts such as system engineers or architects might be more cumbersome than expected. In order to help domain experts delve into data science, we designed the Semantic Snake Charmer (SSC), a domain knowledge-based search engine for Jupyter Notebooks. SSC is composed of three modules: (1) a human-machine cooperative module to identify internal documentation which contains the most relevant domain knowledge, (2) a natural language processing module capable of transforming relevant documentation into several semantic graph types, and (3) a reinforcement-learning-based search engine which learns, given user feedback, the best mapping between input queries and the semantic graph type to rely on. We believe SSC can be a fundamental asset to allow the easy landing of data science in industrial domains. © 2019 Copyright held by the owner/author(s).
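
Illustrative sketch (not taken from the paper): module (3) of the abstract learns from user feedback which semantic graph type to rely on for a given query. The abstract does not name the learning algorithm, so the example below swaps in a simple epsilon-greedy bandit purely as an assumption. The class name `GraphTypeSelector`, the graph type labels, the query category, and the reward scale are all hypothetical.

```python
import random
from collections import defaultdict


class GraphTypeSelector:
    """Epsilon-greedy bandit with one arm per semantic graph type.

    Hypothetical stand-in for the abstract's module (3); the actual
    SSC algorithm is not described in the abstract.
    """

    def __init__(self, graph_types, epsilon=0.1, seed=None):
        self.graph_types = list(graph_types)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Per query category: number of feedback events and mean reward per graph type.
        self.counts = defaultdict(lambda: {g: 0 for g in self.graph_types})
        self.values = defaultdict(lambda: {g: 0.0 for g in self.graph_types})

    def select(self, query_category):
        """Pick a graph type: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.graph_types)
        vals = self.values[query_category]
        # Ties are broken in favour of the earliest listed graph type.
        return max(self.graph_types, key=lambda g: vals[g])

    def update(self, query_category, graph_type, reward):
        """Fold one piece of user feedback into the running mean reward."""
        self.counts[query_category][graph_type] += 1
        n = self.counts[query_category][graph_type]
        old = self.values[query_category][graph_type]
        self.values[query_category][graph_type] = old + (reward - old) / n


if __name__ == "__main__":
    # Hypothetical graph type labels and query category, for illustration only.
    selector = GraphTypeSelector(["co_occurrence", "dependency", "taxonomy"], seed=0)
    chosen = selector.select("sensor_calibration")
    selector.update("sensor_calibration", chosen, reward=1.0)  # user marked result useful
    print(chosen, selector.values["sensor_calibration"])
```

In this sketch a reward of 1.0 stands for a result the user marked as useful and 0.0 for one they rejected; a real system would need a richer query representation than a single category label.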