Machine learning for the prediction of pseudorealistic pediatric abdominal phantoms for radiation dose reconstruction

Journal Article (2020)
Author(s)

Marco Virgolin (Centrum Wiskunde & Informatica (CWI))

Ziyuan Wang (Amsterdam UMC)

Tanja Alderliesten (Vrije Universiteit Amsterdam, Amsterdam UMC, TU Delft - Algorithmics)

Peter Bosman (Centrum Wiskunde & Informatica (CWI), TU Delft - Algorithmics)

Research Group
Algorithmics
Copyright
© 2020 M. Virgolin, Ziyuan Wang, T. Alderliesten, P.A.N. Bosman
DOI related publication
https://doi.org/10.1117/1.JMI.7.4.046501
Publication Year
2020
Language
English
Issue number
4
Volume number
7
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Purpose: Current phantoms used for the dose reconstruction of long-term childhood cancer survivors lack individualization. We design a method to automatically predict highly individualized three-dimensional (3-D) abdominal phantoms.

Approach: We train machine learning (ML) models to map (2-D) patient features to 3-D organ-at-risk (OAR) metrics, using a database of 60 pediatric abdominal computed tomography scans with liver and spleen segmentations. We then use the models in an automatic pipeline that, given a patient's features, outputs a personalized phantom by assembling 3-D imaging from the database. The pipeline includes a step to improve phantom realism (i.e., to avoid OAR overlap). We compare five ML algorithms in terms of predicting OAR left-right (LR), anterior-posterior (AP), and inferior-superior (IS) positions, and the surface Dice-Sørensen coefficient (sDSC). For comparison, we also investigate two existing human-designed phantom construction criteria and two additional control methods.

Results: The different ML algorithms yield similar test mean absolute errors: ∼8 mm for liver LR and IS and for spleen AP and IS; ∼5 mm for liver AP and spleen LR; ∼80% for abdomen sDSC; and ∼60% to 65% for liver and spleen sDSC. One ML algorithm (GP-GOMEA) performs significantly best for 6/9 metrics. The control methods and, in particular, the human-designed criteria generally perform worse, sometimes substantially (+5 mm error for spleen IS, -10% sDSC for liver). The automatic step to improve realism generally causes only a limited loss of metric accuracy, but it fails in one case (out of 60).

Conclusion: Our ML-based pipeline leads to phantoms that are significantly and substantially more individualized than those built with the currently used human-designed criteria.
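The two kinds of evaluation quantity named in the abstract, a mean absolute error on organ positions and a Dice-Sørensen overlap score, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the numbers are made up for illustration, and the overlap function is the plain volumetric Dice-Sørensen coefficient on voxel sets, a simpler relative of the surface variant (sDSC) reported in the paper.

```python
from typing import Sequence, Set, Tuple

# A voxel is identified by its integer (x, y, z) index. This representation
# is an assumption made for the sketch, not taken from the article.
Voxel = Tuple[int, int, int]


def mean_absolute_error(predicted: Sequence[float],
                        measured: Sequence[float]) -> float:
    """Average of |predicted - measured| over all patients (e.g., in mm)."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) for p, m in zip(predicted, measured))
    return total / len(predicted)


def dice_coefficient(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Volumetric Dice-Sørensen coefficient: 2|A ∩ B| / (|A| + |B|).

    Note: this is NOT the surface Dice (sDSC) used in the article, which
    compares organ surfaces rather than whole volumes.
    """
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


# Illustrative (made-up) organ positions in mm for three patients.
predicted_mm = [102.0, 95.0, 110.0]
measured_mm = [110.0, 90.0, 99.0]
mae = mean_absolute_error(predicted_mm, measured_mm)

# Two tiny illustrative segmentations that share two of their three voxels.
organ_a = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
organ_b = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
dsc = dice_coefficient(organ_a, organ_b)
```

Here the absolute differences are 8, 5, and 11 mm, so `mae` is 8.0 mm, and the two voxel sets share 2 of 3 voxels each, so `dsc` is 2·2/6 ≈ 0.67.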

Files

046501_1.pdf
(pdf | 5.81 Mb)
License info not available