Incidental pulmonary embolism (IPE) is a clinically significant yet frequently underdiagnosed finding in routine contrast-enhanced computed tomography (CT). This study proposes a two-stage detection framework that integrates patch-level classification with case-level inference through heatmap reconstruction. A ResNet10-based classifier was trained on cropped three-dimensional patches using five-fold cross-validation across multiple patch sizes. The ensemble of models trained with 48 × 48 × 48 patches achieved the best overall performance, reaching an area under the ROC curve (AUC) of 0.9646, an accuracy of 94.3%, an F1 score of 94.3%, a precision of 93.3%, a sensitivity of 95.4%, and a specificity of 93.1% on the independent test set.
For case-level inference, prediction probabilities from sliding-window traversal were aggregated into heatmaps and evaluated using two strategies: a max strategy and a top-K mean clustering strategy. Grid search revealed that both strategies achieved moderate sensitivity but markedly low specificity, reflecting the strong influence of patch-level false positives on case decisions. The top-K mean strategy offered a slightly better balance than the max strategy, though both remain highly vulnerable to spurious predictions.
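The two aggregation strategies can be illustrated with a minimal sketch. It assumes the heatmap is a three-dimensional NumPy array of patch-level probabilities, and it reads "top-K mean" as the mean of the K highest heatmap values, omitting any clustering step. The function names, the value of K, and the 0.5 threshold are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np


def case_score_max(heatmap: np.ndarray) -> float:
    """Case score under the max strategy: the single highest probability."""
    return float(heatmap.max())


def case_score_topk_mean(heatmap: np.ndarray, k: int) -> float:
    """Case score as the mean of the k highest probabilities (assumed reading)."""
    flat = heatmap.ravel()
    k = max(1, min(k, flat.size))  # keep k within the valid range
    # np.partition places the k largest values in the last k positions.
    top_k = np.partition(flat, flat.size - k)[flat.size - k:]
    return float(top_k.mean())


# Toy heatmap: an embolus-free volume with one spurious high-probability voxel.
heatmap = np.zeros((4, 4, 4))
heatmap[1, 2, 3] = 0.99

threshold = 0.5  # hypothetical decision threshold
max_positive = case_score_max(heatmap) >= threshold
topk_positive = case_score_topk_mean(heatmap, k=8) >= threshold
```

In this toy case the max strategy flags the volume as positive on the strength of a single voxel, while the top-K mean does not. This is consistent with the slightly better balance reported for the top-K mean strategy, although several clustered false positives would still raise its score.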
These findings confirm the effectiveness of patch-level deep learning models for IPE detection but highlight the intrinsic difficulty of robust case-level inference. The results underscore the need for improved spatial context modeling, false positive suppression, and anatomically informed aggregation strategies to enhance the clinical applicability of automated IPE detection.