M. Virgolin

15 records found

Conference paper (2021) - Tom Den Ottelander, Arkadiy Dushatskiy, Marco Virgolin, Peter A.N. Bosman
Neural Architecture Search (NAS), i.e., the automation of neural network design, has gained much popularity in recent years with increasingly complex search algorithms being proposed. Yet, solid comparisons with simple baselines are often missing. At the same time, recent retrospective studies have found many new algorithms to be no better than random search (RS). In this work we consider the use of a simple Local Search (LS) algorithm for NAS. We particularly consider a multi-objective NAS formulation, with network accuracy and network complexity as two objectives, as understanding the trade-off between these two objectives is arguably among the most interesting aspects of NAS. The proposed LS algorithm is compared with RS and two evolutionary algorithms (EAs), as these are often heralded as being ideal for multi-objective optimization. To promote reproducibility, we create and release two benchmark datasets, named MacroNAS-C10 and -C100, containing 200K saved network evaluations for two established image classification tasks, CIFAR-10 and CIFAR-100. Our benchmarks are designed to be complementary to existing benchmarks, especially in that they are better suited for multi-objective search. We additionally consider a version of the problem with a much larger architecture space. While we find and show that the considered algorithms explore the search space in fundamentally different ways, we also find that LS substantially outperforms RS and even performs nearly as well as state-of-the-art EAs. We believe that this provides strong evidence that LS is truly a competitive baseline for NAS against which new NAS algorithms should be benchmarked. ...
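The two-objective formulation mentioned in this abstract (maximize accuracy, minimize complexity) can be illustrated with a Pareto-dominance filter. The following is only a minimal sketch and not the paper's LS algorithm; the names `dominates` and `pareto_front`, the tuple layout, and the sample numbers are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical record: (accuracy, complexity). Higher accuracy is better,
# lower complexity is better. Units are illustrative only.
Evaluation = Tuple[float, float]


def dominates(a: Evaluation, b: Evaluation) -> bool:
    """Return True if `a` is no worse than `b` in both objectives
    and strictly better in at least one."""
    acc_a, cost_a = a
    acc_b, cost_b = b
    no_worse = acc_a >= acc_b and cost_a <= cost_b
    strictly_better = acc_a > acc_b or cost_a < cost_b
    return no_worse and strictly_better


def pareto_front(points: List[Evaluation]) -> List[Evaluation]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up evaluations, not taken from any benchmark.
evaluations = [(0.90, 5.0), (0.85, 2.0), (0.80, 3.0), (0.92, 9.0), (0.90, 6.0)]
front = pareto_front(evaluations)
print(front)
```

The quadratic-time filter is chosen for readability; any search method, whether local search, random search, or an EA, could be compared by the non-dominated set it returns.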
Journal article (2020) - Marco Virgolin, Ziyuan Wang, Tanja Alderliesten, Peter A.N. Bosman
Purpose: Current phantoms used for the dose reconstruction of long-term childhood cancer survivors lack individualization. We design a method to predict highly individualized abdominal three-dimensional (3-D) phantoms automatically. Approach: We train machine learning (ML) models to map (2-D) patient features to 3-D organ-at-risk (OAR) metrics upon a database of 60 pediatric abdominal computed tomographies with liver and spleen segmentations. Next, we use the models in an automatic pipeline that outputs a personalized phantom given the patient's features, by assembling 3-D imaging from the database. A step to improve phantom realism (i.e., avoid OAR overlap) is included. We compare five ML algorithms, in terms of predicting OAR left-right (LR), anterior-posterior (AP), inferior-superior (IS) positions, and surface Dice-Sørensen coefficient (sDSC). Furthermore, two existing human-designed phantom construction criteria and two additional control methods are investigated for comparison. Results: Different ML algorithms result in similar test mean absolute errors: ∼8 mm for liver LR, IS, and spleen AP, IS; ∼5 mm for liver AP and spleen LR; ∼80% for abdomen sDSC; and ∼60% to 65% for liver and spleen sDSC. One ML algorithm (GP-GOMEA) performs significantly best for 6 of the 9 metrics. The control methods and the human-designed criteria in particular perform generally worse, sometimes substantially (+5 mm error for spleen IS, -10% sDSC for liver). The automatic step to improve realism generally results in limited metric accuracy loss, but fails in one case (out of 60). Conclusion: Our ML-based pipeline leads to phantoms that are significantly and substantially more individualized than currently used human-designed criteria. ...
Doctoral thesis (2020) - Marco Virgolin
Machine learning is impacting modern society at large, thanks to its increasing potential to efficiently and effectively model complex and heterogeneous phenomena. While machine learning models can achieve very accurate predictions in many applications, they are not infallible. In some cases, machine learning models can deliver unreasonable outcomes. For example, deep neural networks for self-driving cars have been found to provide wrong steering directions based on the lighting conditions of street lanes (e.g., due to cloudy weather). In other cases, models can capture and reflect unwanted biases that were concealed in the training data. For example, deep neural networks used to predict likely jobs and social status of people based on their pictures were found to consistently discriminate based on gender and ethnicity; this was later attributed to human bias in the labels of the training data. ...
Journal article (2020) - Marco Virgolin, Tanja Alderliesten, Peter A.N. Bosman
Feature construction can substantially improve the accuracy of Machine Learning (ML) algorithms. Genetic Programming (GP) has been proven to be effective at this task by evolving non-linear combinations of input features. GP additionally has the potential to improve ML explainability since explicit expressions are evolved. Yet, in most GP works the complexity of evolved features is not explicitly bounded or minimized, though this is arguably key for explainability. In this article, we assess to what extent GP still performs favorably at feature construction when constructing features that are (1) of small-enough number, to enable visualization of the behavior of the ML model; (2) of small-enough size, to enable interpretability of the features themselves; (3) of sufficient informative power, to retain or even improve the performance of the ML algorithm. We consider a simple feature construction scheme using three different GP algorithms, as well as random search, to evolve features for five ML algorithms, including support vector machines and random forest. Our results on 21 datasets pertaining to classification and regression problems show that constructing only two compact features can be sufficient to rival the use of the entire original feature set. We further find that a modern GP algorithm, GP-GOMEA, performs best overall. These results, combined with examples that we provide of readable constructed features and of 2D visualizations of ML behavior, lead us to positively conclude that GP-based feature construction still works well when explicitly searching for compact features, making it extremely helpful to explain ML models. ...
Journal article (2020) - M. Virgolin, Z. Wang, B.V. Balgobind, I.W.E.M. van Dijk, J. Wiersma, P.S. Kroon, G.O. Janssens, M. van Herk, P.A.N. Bosman, More authors...
To study radiotherapy-related adverse effects, detailed dose information (3D distribution) is needed for accurate dose-effect modeling. For childhood cancer survivors who underwent radiotherapy in the pre-CT era, only 2D radiographs were acquired, thus 3D dose distributions must be reconstructed from limited information. State-of-the-art methods achieve this by using 3D surrogate anatomies. These can however lack personalization and lead to coarse reconstructions. We present and validate a surrogate-free dose reconstruction method based on Machine Learning (ML). Abdominal planning CTs (n = 142) of recently-treated childhood cancer patients were gathered, their organs at risk were segmented, and 300 artificial Wilms' tumor plans were sampled automatically. Each artificial plan was automatically emulated on the 142 CTs, resulting in 42,600 3D dose distributions from which dose-volume metrics were derived. Anatomical features were extracted from digitally reconstructed radiographs simulated from the CTs to resemble historical radiographs. Further, patient and radiotherapy plan features typically available from historical treatment records were collected. An evolutionary ML algorithm was then used to link features to dose-volume metrics. Besides 5-fold cross validation, a further evaluation was done on an independent dataset of five CTs each associated with two clinical plans. Cross-validation resulted in mean absolute errors ≤ 0.6 Gy for organs completely inside or outside the field. For organs positioned at the edge of the field, mean absolute errors ≤ 1.7 Gy for Dmean, ≤ 2.9 Gy for D2cc, and ≤ 13% for V5Gy and V10Gy were obtained, without systematic bias. Similar results were found for the independent dataset. To conclude, we proposed a novel organ dose reconstruction method that uses ML models to predict dose-volume metric values given patient and plan features. Our approach is not only accurate, but also efficient, as the setup of a surrogate is no longer needed. ...
Journal article (2020) - Ziyuan Wang, Marco Virgolin, Peter A.N. Bosman, Koen F. Crama, Brian V. Balgobind, Arjan Bel, Tanja Alderliesten
Performing large-scale three-dimensional radiation dose reconstruction for patients requires a large amount of manual work. We present an image processing-based pipeline to automatically reconstruct radiation dose. The pipeline was designed for childhood cancer survivors that received abdominal radiotherapy with anterior-to-posterior and posterior-to-anterior field set-up. First, anatomical landmarks are automatically identified on two-dimensional radiographs. Second, these landmarks are used to derive parameters to emulate the geometry of the plan on a surrogate computed tomography. Finally, the plan is emulated and used as input for dose calculation. For qualitative evaluation, 100 cases of automatic and manual plan emulations were assessed by two experienced radiation dosimetrists in a blinded comparison. The two radiation dosimetrists approved 100%/100% and 92%/91% of the automatic/manual plan emulations, respectively. Similar approval rates of 100% and 94% hold when the automatic pipeline is applied on another 50 cases. Further, quantitative comparisons resulted in on average <5 mm difference in plan isocenter/borders, and <0.9 Gy in organ mean dose (prescribed dose: 14.4 Gy) calculated from the automatic and manual plan emulations. No statistically significant difference in terms of dose reconstruction accuracy was found for most organs at risk. Ultimately, our automatic pipeline results are of sufficient quality to enable effortless scaling of dose reconstruction data generation. ...
Conference paper (2020) - Marco Virgolin, Ziyuan Wang, Tanja Alderliesten, Peter A.N. Bosman
The advent of Machine Learning (ML) is proving extremely beneficial in many healthcare applications. In pediatric oncology, retrospective studies that investigate the relationship between treatment and late adverse effects still rely on simple heuristics. To capture the effects of radiation treatment, treatment plans are typically simulated on virtual surrogates of patient anatomy called phantoms. Currently, phantoms are built to represent categories of patients based on reasonable yet simple criteria. This often results in phantoms that are too generic to accurately represent individual anatomies. We present a novel approach that combines imaging data and ML to build individualized phantoms automatically. We design a pipeline that, given features of patients treated in the pre-3D planning era when only 2D radiographs were available, as well as a database of 3D Computed Tomography (CT) imaging with organ segmentations, uses ML to predict how to assemble a patient-specific phantom. Using 60 abdominal CTs of pediatric patients between 2 and 6 years of age, we find that our approach delivers significantly more representative phantoms compared to using current phantom building criteria, in terms of shape and location of two considered organs (liver and spleen), and shape of the abdomen. Furthermore, as interpretability is often central to trusting ML models in medical contexts, among other ML algorithms we consider the Gene-pool Optimal Mixing Evolutionary Algorithm for Genetic Programming (GP-GOMEA), which learns readable mathematical expression models. We find that the readability of its output does not compromise prediction performance, as GP-GOMEA delivered the best performing models. ...
Conference paper (2019) - Marco Virgolin, Tanja Alderliesten, Peter A.N. Bosman
Semantic Backpropagation (SB) is a recent technique that promotes effective variation in tree-based genetic programming. The basic idea of SB is to provide information on what output is desirable for a specified tree node, by propagating the desired root-node output back to the specified node using inversions of functions encountered along the way. Variation operators then replace the subtree located at the specified node with a tree for which the output is closest to the desired output, by searching in a pre-computed library. In this paper, we propose two contributions to enhance SB specifically for symbolic regression, by incorporating the principles of Keijzer's Linear Scaling (LS). In particular, we show how SB can be used in synergy with the scaled mean squared error, and we show how LS can be adopted within library search. We test our adaptations using the well-known variation operator Random Desired Operator (RDO), comparing to its baseline implementation, and to traditional crossover and mutation. Our experimental results on real-world datasets show that SB enhanced with LS substantially improves the performance of RDO, resulting in overall the best performance among all tested GP algorithms. ...
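Linear scaling, as referred to in this abstract, fits an intercept a and slope b to a candidate expression's outputs by least squares before the error is measured, so the search can focus on the shape of the function. The sketch below shows only that closed-form fit and the resulting scaled mean squared error; it does not cover semantic backpropagation or library search, and the function names and sample data are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def linear_scaling(outputs: Sequence[float], targets: Sequence[float]) -> Tuple[float, float]:
    """Least-squares intercept a and slope b such that a + b * output fits the targets."""
    n = len(outputs)
    mean_f = sum(outputs) / n
    mean_y = sum(targets) / n
    var_f = sum((f - mean_f) ** 2 for f in outputs)
    if var_f == 0.0:
        # Constant output: the best fit is the target mean with zero slope.
        return mean_y, 0.0
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(outputs, targets))
    b = cov / var_f
    a = mean_y - b * mean_f
    return a, b


def scaled_mse(outputs: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error after applying the optimal linear scaling."""
    a, b = linear_scaling(outputs, targets)
    return sum((y - (a + b * f)) ** 2 for f, y in zip(outputs, targets)) / len(outputs)


# Made-up data: the targets are exactly 3 + 2 * output.
outputs = [1.0, 2.0, 3.0, 4.0]
targets = [5.0, 7.0, 9.0, 11.0]
a, b = linear_scaling(outputs, targets)
error = scaled_mse(outputs, targets)
print(a, b, error)
```

Because the fit is closed-form, it adds only a linear pass over the data per evaluated expression.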
Journal article (2019) - Ziyuan Wang, Brian V. Balgobind, Marco Virgolin, Irma W.E.M. Van Dijk, Jan Wiersma, Cécile M. Ronckers, Peter A.N. Bosman, Arjan Bel, Tanja Alderliesten
In retrospective radiation treatment (RT) dosimetry, a surrogate anatomy is often used for patients without 3D CT. To gain insight into which aspects of a surrogate anatomy are crucial for accurate dose reconstruction, we investigated the relation of patient characteristics and internal anatomical features with deviations in reconstructed organ dose using surrogate patients' CT scans. Abdominal CT scans of 35 childhood cancer patients (age: 2.1-5.6 yr; 17 boys, 18 girls) undergoing RT during 2004-2016 were included. Based on whether an intact right or left kidney is present in the CT scan, two groups were formed each containing 24 patients. From each group, four CTs associated with Wilms' tumor RT plans with an anterior-posterior - posterior-anterior field setup were selected as references. For each reference, a 2D digitally reconstructed radiograph was computed from the reference CT to simulate a 2D radiographic image and dose reconstruction was performed on the other CTs in the respective group. Deviations in organ mean dose (DEmean) of the reconstructions versus the references were calculated, as were deviations in patient characteristics (i.e. age, height, weight) and in anatomical features including organ volume, location (in 3D), and spatial overlaps. Per reference, Pearson's correlation coefficients between deviations in DEmean and patient characteristics/features were studied. Deviation in organ locations and DEmean for the liver, spleen, and right kidney were moderately correlated (R 2 ...
Conference paper (2019) - Ziyuan Wang, Marco Virgolin, Peter A.N. Bosman, Brian V. Balgobind, Arjan Bel, Tanja Alderliesten
3D dose reconstruction for radiotherapy (RT) is the estimation of the 3D radiation dose distribution patients received during RT. Big dose reconstruction data is needed to accurately model the relationship between the dose and onset of adverse effects, to ultimately gain insights and improve today's treatments. Dose reconstruction is often performed by emulating the original RT plan on a surrogate anatomy for dose estimation. This is especially essential for historically treated patients with long-term follow-up, as solely 2D radiographs were used for RT planning, and no 3D imaging was acquired for these patients. Performing dose reconstruction for a large group of patients requires a large amount of manual work, where the geometry of the original RT plan is emulated on the surrogate anatomy, by visually comparing the latter with the original 2D radiograph of the patient. This is a labor-intensive process that for practical use needs to be automated. This work presents an image-processing pipeline to automatically emulate plans on surrogate computed tomography (CT) scans. The pipeline was designed for childhood cancer survivors that historically received abdominal RT with anterior-to-posterior and posterior-to-anterior RT field set-up. First, anatomical landmarks are automatically identified on 2D radiographs. Next, these landmarks are used to derive parameters needed to finally emulate the plan on a surrogate CT. Validation was performed by an experienced RT planner, visually assessing 12 cases of automatic plan emulations. Automatic emulations were approved 11 out of 12 times. This work paves the way to effortless scaling of dose reconstruction data generation. ...
Journal article (2018) - Eric Medvet, Marco Virgolin, Mauro Castelli, Peter Bosman, Ivo Gonçalves, Tea Tušar
Evolutionary algorithms (EAs) have proven to be effective in tackling problems in many different domains. However, users are often required to spend a significant amount of effort in fine-tuning the EA parameters in order to make the algorithm work. In principle, visualization tools may be of great help in this laborious task, but current visualization tools are either EA-specific, and hence hardly available to all users, or too general to convey detailed information. In this work, we study the Diversity and Usage map (DU map), a compact visualization for analyzing a key component of every EA, the representation of solutions. In a single heat map, the DU map visualizes for entire runs how diverse the genotype is across the population and to which degree each gene in the genotype contributes to the solution. We demonstrate the generality of the DU map concept by applying it to six EAs that use different representations (bit and integer strings, trees, ensembles of trees, and neural networks). We present the results of an online user study about the usability of the DU map, which confirm the suitability of the proposed tool and provide important insights on our design choices. By providing a visualization tool that can be easily tailored by specifying the diversity (D) and usage (U) functions, the DU map aims at being a powerful analysis tool for EA practitioners, making EAs more transparent and hence lowering the barrier for their use. ...
Journal article (2018) - Marco Virgolin, Irma W.E.M. van Dijk, Jan Wiersma, Cécile M. Ronckers, Cees Witteveen, Arjan Bel, Peter A.N. Bosman
Purpose: The aim of this study is to establish the first step toward a novel and highly individualized three-dimensional (3D) dose distribution reconstruction method, based on CT scans and organ delineations of recently treated patients. Specifically, the feasibility of automatically selecting the CT scan of a recently treated childhood cancer patient who is similar to a given historically treated child who suffered from Wilms’ tumor is assessed. Methods: A cohort of 37 recently treated children between 2- and 6-yr old are considered. Five potential notions of ground-truth similarity are proposed, each focusing on different anatomical aspects. These notions are automatically computed from CT scans of the abdomen and 3D organ delineations (liver, spleen, spinal cord, external body contour). The first is based on deformable image registration, the second on the Dice similarity coefficient, the third on the Hausdorff distance, the fourth on pairwise organ distances, and the last is computed by means of the overlap volume histogram. The relationship between typically available features of historically treated patients and the proposed ground-truth notions of similarity is studied by adopting state-of-the-art machine learning techniques, including random forest. Also, the feasibility of automatically selecting the most similar patient is assessed by comparing ground-truth rankings of similarity with predicted rankings. Results: Similarities (mainly) based on the external abdomen shape and on the pairwise organ distances are highly correlated (Pearson rp ≥ 0.70) and are successfully modeled with random forests based on historically recorded features (pseudo-R2 ≥ 0.69). In contrast, similarities based on the shape of internal organs cannot be modeled. For the similarities that random forest can reliably model, an estimation of feature relevance indicates that abdominal diameters and weight are the most important. 
Experiments on automatically selecting similar patients lead to coarse, yet quite robust results: the most similar patient is retrieved only 22% of the time; however, the error in worst-case scenarios is limited, with the fourth most similar patient being retrieved. Conclusions: Results demonstrate that automatically selecting similar patients is feasible when focusing on the shape of the external abdomen and on the position of internal organs. Moreover, whereas the common practice in phantom-based dose reconstruction is to select a representative phantom using age, height, and weight as discriminant factors for any treatment scenario, our analysis on abdominal tumor treatment for children shows that the most relevant features are weight and the anterior–posterior and left–right abdominal diameters. ...
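One of the similarity notions named in this abstract is the Dice similarity coefficient, which for two segmentations A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for organ masks represented as sets of voxel coordinates; that representation, the function name `dice_coefficient`, and the toy cubes are assumptions made for illustration and say nothing about how the study computed it.

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def dice_coefficient(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity: 2 * |A intersect B| / (|A| + |B|), in the range [0, 1]."""
    if not a and not b:
        # Convention assumed here: two empty masks are treated as identical.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy example: a 2x2x2 cube and the same cube shifted by one voxel along x.
cube_a = {(x, y, z) for x in range(2) for y in range(2) for z in range(2)}
cube_b = {(x + 1, y, z) for (x, y, z) in cube_a}

score = dice_coefficient(cube_a, cube_b)
print(score)
```

A value of 1 means the masks coincide and 0 means they share no voxel; the shifted cubes overlap in half of their voxels.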
Conference paper (2018) - Marco Virgolin, Tanja Alderliesten, Arjan Bel, Cees Witteveen, Peter A.N. Bosman
The recently introduced Gene-pool Optimal Mixing Evolutionary Algorithm for Genetic Programming (GP-GOMEA) has been shown to find much smaller solutions of equally high quality compared to other state-of-the-art GP approaches. This is an interesting aspect as small solutions better enable human interpretation. In this paper, an adaptation of GP-GOMEA to tackle real-world symbolic regression is proposed, in order to find small yet accurate mathematical expressions, and with an application to a problem of clinical interest. For radiotherapy dose reconstruction, a model is sought that captures anatomical patient similarity. This problem is particularly interesting because while features are patient-specific, the variable to regress is a distance, and is defined over patient pairs. We show that on benchmark problems as well as on the application, GP-GOMEA outperforms variants of standard GP. To find even more accurate models, we further consider an evolutionary meta learning approach, where GP-GOMEA is used to construct small, yet effective features for a different machine learning algorithm. Experimental results show how this approach significantly improves the performance of linear regression, support vector machines, and random forest, while providing meaningful and interpretable features. ...
Other (2017) - Marco Virgolin, Tanja Alderliesten, Cees Witteveen, P.A.N. Bosman
The Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA) is a recently introduced model-based EA that has been shown to be capable of outperforming state-of-the-art alternative EAs in terms of scalability when solving discrete optimization problems. One of the key aspects of GOMEA's success is a variation operator that is designed to extensively exploit linkage models by effectively combining partial solutions. Here, we bring the strengths of GOMEA to Genetic Programming (GP), introducing GP-GOMEA. Under the hypothesis of having little problem-specific knowledge, and in an effort to design easy-to-use EAs, GP-GOMEA requires no parameter specification. On a set of well-known benchmark problems we find that GP-GOMEA outperforms standard GP while being on par with more recently introduced, state-of-the-art EAs. We furthermore introduce Input-space Entropy-based Building-block Learning (IEBL), a novel approach to identifying and encapsulating relevant building blocks (subroutines) into new terminals and functions. On problems with an inherent degree of modularity, IEBL can contribute to compact solution representations, providing a large potential for knock-on effects in performance. On the difficult, but highly modular Even Parity problem, GP-GOMEA+IEBL obtains excellent scalability, solving the 14-bit instance in less than 1 hour. ...
Conference paper (2016) - Marco Virgolin, Irma van Dijk, Jan Wiersma, Cecile Ronckers, Cees Witteveen, Coen Rasch, Arjan Bel, Tanja Alderliesten, Peter Bosman