Symbolic regression and feature construction with GP-GOMEA applied to radiotherapy dose reconstruction of childhood cancer survivors
Marco Virgolin (Centrum Wiskunde & Informatica (CWI), TU Delft - Algorithmics)
Tanja Alderliesten (Amsterdam UMC)
Arjan Bel (Amsterdam UMC)
Cornelis Witteveen (TU Delft - Algorithmics)
P.A.N. Bosman (TU Delft - Algorithmics, Centrum Wiskunde & Informatica (CWI))
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The recently introduced Gene-pool Optimal Mixing Evolutionary Algorithm for Genetic Programming (GP-GOMEA) has been shown to find much smaller solutions of equally high quality compared to other state-of-the-art GP approaches. This is an interesting aspect as small solutions better enable human interpretation. In this paper, an adaptation of GP-GOMEA to tackle real-world symbolic regression is proposed, in order to find small yet accurate mathematical expressions, and with an application to a problem of clinical interest. For radiotherapy dose reconstruction, a model is sought that captures anatomical patient similarity. This problem is particularly interesting because while features are patient-specific, the variable to regress is a distance, and is defined over patient pairs. We show that on benchmark problems as well as on the application, GP-GOMEA outperforms variants of standard GP. To find even more accurate models, we further consider an evolutionary meta learning approach, where GP-GOMEA is used to construct small, yet effective features for a different machine learning algorithm. Experimental results show how this approach significantly improves the performance of linear regression, support vector machines, and random forest, while providing meaningful and interpretable features.