Dark Data as the New Challenge for Big Data Science and the Introduction of the Scientific Data Officer

Abstract

Many studies in big data focus on the uses of data available to researchers, leaving unaddressed the data that sits on the servers but of which researchers are unaware. We call this dark data, and in this article we present and discuss it in the context of high-performance computing (HPC) facilities. To this end, we provide statistics from a major HPC facility in Europe, the High-Performance Computing Center Stuttgart (HLRS). We also propose a new position tailor-made for coping with dark data and general data management. We call it the scientific data officer (SDO), and we distinguish it from other standard positions in HPC facilities such as chief data officers, system administrators, and security officers. In order to understand the role of the SDO in HPC facilities, we discuss two kinds of responsibilities, namely, technical responsibilities and ethical responsibilities. While the former are intended to characterize the position, the latter raise concerns about, and propose solutions to, the control and authority that the SDO would acquire.