Data-driven Recovery of Incomplete Geotechnical Dataset Using Low-rank Matrix Completion
Z. Guan (TU Delft - Geo-engineering)
Yu Wang (The Hong Kong University of Science and Technology)
Kok-Kwang Phoon (Singapore University of Technology and Design)
Abstract
Real geotechnical data from a typical site can be characterized as MUSIC-X (i.e., Multivariate, Uncertain, Unique, Sparse, Incomplete, and potentially Corrupted, with X denoting spatial/temporal variability). One of the key challenges in developing site-specific statistical models for multiple geotechnical properties (i.e., Multivariate) is that values from different tests are missing (i.e., Incomplete) at various depths/locations. This raises a critical question in geotechnical site investigation: how can the missing values in real geotechnical datasets be recovered from the available measurements by leveraging the underlying structure of those datasets? Geotechnical properties are not only cross-correlated with one another but also auto-correlated across depths, which suggests that multivariate geotechnical datasets may have a simple underlying structure with only a limited number of important features/patterns. Building on this observation, this study proposes a novel, data-driven method for predicting missing values by low-rank matrix completion. The proposed method exploits the auto- and cross-correlation structures of different test data. Missing values are then recovered using a singular value thresholding algorithm, and a k-fold cross-validation strategy is employed to determine the level of measurement noise. The method is illustrated and validated using a real geotechnical dataset. The results indicate that the proposed method can provide reliable predictions.
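To make the recovery step concrete, the following is a minimal Python sketch of low-rank matrix completion by iterative soft-thresholding of singular values, with the threshold chosen by k-fold cross-validation over the observed entries. It is an illustration under stated assumptions, not the authors' implementation: the iteration shown is the Soft-Impute style variant of singular value thresholding, the function names (`svt_shrink`, `complete_matrix`, `choose_lambda`), the candidate thresholds, and the synthetic rank-2 data are all hypothetical, and the abstract does not specify how rows and columns of the data matrix are arranged (here rows stand in for depths and columns for test types).

```python
import numpy as np


def svt_shrink(Z, lam):
    """Soft-threshold the singular values of Z by lam (assumed helper name)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s = np.maximum(s - lam, 0.0)  # small singular values are set to zero
    return (U * s) @ Vt


def complete_matrix(M, observed, lam, n_iter=500, tol=1e-6):
    """Fill unobserved entries of M with a low-rank estimate.

    M        : 2-D array; entries where `observed` is False are ignored
    observed : boolean mask of measured entries
    lam      : singular value threshold (larger means smoother, lower rank)
    """
    X = np.zeros_like(M, dtype=float)
    M_obs = np.where(observed, M, 0.0)  # NaNs at missing entries are dropped
    for _ in range(n_iter):
        # Keep measurements where available, current estimate elsewhere.
        Z = np.where(observed, M_obs, X)
        X_new = svt_shrink(Z, lam)
        change = np.linalg.norm(X_new - X) / max(np.linalg.norm(X), 1e-12)
        X = X_new
        if change < tol:
            break
    return X


def choose_lambda(M, observed, candidates, k=5, seed=0):
    """Pick the threshold with the lowest k-fold held-out squared error."""
    rng = np.random.default_rng(seed)
    idx = np.argwhere(observed)  # (row, col) pairs of measured entries
    rng.shuffle(idx)             # shuffle the pairs in place
    folds = np.array_split(idx, k)
    scores = []
    for lam in candidates:
        err = 0.0
        for fold in folds:
            rows, cols = fold[:, 0], fold[:, 1]
            train = observed.copy()
            train[rows, cols] = False  # hide this fold during fitting
            X = complete_matrix(M, train, lam)
            err += np.sum((X[rows, cols] - M[rows, cols]) ** 2)
        scores.append(err)
    return candidates[int(np.argmin(scores))]


# Synthetic stand-in for a depth-by-test data matrix (assumption: rank 2).
rng = np.random.default_rng(42)
truth = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 10))
noisy = truth + 0.05 * rng.normal(size=truth.shape)
observed = rng.random(truth.shape) < 0.75
M = np.where(observed, noisy, np.nan)

candidates = [0.01, 0.1, 1.0, 5.0]
lam = choose_lambda(M, observed, candidates)
X = complete_matrix(M, observed, lam)
rel_err = np.linalg.norm((X - truth)[~observed]) / np.linalg.norm(truth[~observed])
print(f"chosen threshold: {lam}, relative error on missing entries: {rel_err:.3f}")
```

In this sketch the cross-validated threshold plays the role of the noise-level parameter mentioned above: a larger threshold discards more of the small singular values, which are the ones most likely to be dominated by measurement noise.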