Classification of clinically significant prostate cancer on multi-parametric MRI

A validation study comparing deep learning and radiomics

Journal Article (2022)
Author(s)

J.M. Castillo (Erasmus MC)

Muhammad Arif (Erasmus MC)

M.P.A. Starmans (Erasmus MC)

W.J. Niessen (TU Delft - ImPhys/Computational Imaging, TU Delft - ImPhys/Medical Imaging, Erasmus MC)

C.H. Bangma (Erasmus MC)

Ivo G. Schoots (Erasmus MC)

J.F. Veenland (Erasmus MC)

Department
Biomechanical Engineering
Copyright
© 2022 J.M. Castillo, M. Arif, M.P.A. Starmans, W.J. Niessen, C.H. Bangma, Ivo G. Schoots, J.F. Veenland
DOI related publication
https://doi.org/10.3390/cancers14010012
Publication Year
2022
Language
English
Issue number
1
Volume number
14
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve the detection of clinically significant prostate cancer (PCa). Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess how well the performance of these methods generalizes, evaluation on various external data sets is crucial. While deep-learning and radiomics approaches have been compared on a single data set from one center, a comparison of their performances on data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis across various patient cohorts. We included data from two consecutive patient cohorts from our own center (n = 371 patients) and two external sets: one publicly available patient cohort (n = 195 patients) and one containing data from patients of two hospitals (n = 79 patients). For all patients, mpMRI data, radiologist tumor delineations, and pathology reports were collected. One of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. Model performance was assessed in terms of the area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets, compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. Our radiomics model, which was based on delineated regions, thus provided a more accurate tool for significant-PCa classification in the three unseen test sets than the fully automated deep-learning model.
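
To illustrate the evaluation setup described in the abstract, the sketch below shows how a classifier trained on one cohort can be scored with AUC both in internal cross-validation and on an unseen external cohort. This is a minimal illustration under stated assumptions, not the authors' pipeline: the random-forest classifier, the feature dimensionality, and the randomly generated feature matrices and labels are hypothetical placeholders standing in for extracted radiomics features and pathology-confirmed labels.

    # Minimal sketch (not the authors' pipeline): train on one cohort, report
    # internal cross-validation AUC, then score an unseen external cohort.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)

    # Hypothetical feature matrices: rows = patients/lesions, columns = radiomics
    # features (e.g., intensity, shape, texture descriptors from mpMRI).
    X_train = rng.normal(size=(271, 100))       # training cohort (n = 271)
    y_train = rng.integers(0, 2, size=271)      # 1 = significant PCa, 0 = not
    X_external = rng.normal(size=(195, 100))    # one unseen external cohort
    y_external = rng.integers(0, 2, size=195)

    model = make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=500, random_state=0),
    )

    # Internal cross-validation AUC on the training cohort only.
    cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
    print(f"internal CV AUC: {cv_auc.mean():.2f} +/- {cv_auc.std():.2f}")

    # Fit once on the full training cohort, then evaluate on the external set.
    model.fit(X_train, y_train)
    external_auc = roc_auc_score(y_external, model.predict_proba(X_external)[:, 1])
    print(f"external test AUC: {external_auc:.2f}")

With real data, the gap between the internal cross-validation AUC and the external-test AUC is the generalizability signal the study examines; the placeholder random features above will simply yield AUCs near 0.5.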