Modelling X-Ray photon transport through a transformer-based neural network in computed tomography forward projection


Abstract

Radiotherapy is one of the main treatments for cancer and relies heavily on CT images to calculate radiation dose. As radiotherapy research moves towards adaptive treatments that aim to calculate these doses at real-time speeds while maintaining high precision, a need has emerged for accurate CT imaging at comparable speeds. Currently, the best-performing CT image reconstruction methods are iterative reconstruction (IR) methods, which suffer from slow reconstruction speed. Faster methods produce artefacts because they rely on simplified physics models.

Recently, the Dose Transformer Algorithm (DoTA) [47], [48] and improved DoTA (iDoTA) [49] have been shown to calculate radiation therapy dose successfully by modelling particle transport in 3D with a neural network. By implementing a Transformer architecture [62], DoTA is able to capture the relationship between elements in a 3D CT volume while processing it as an input sequence. This results in an accurate prediction of particle transport, while significantly reducing computation times compared to other methods.

In this work, a neural network based on the DoTA architecture is presented. It predicts projection data from CT input by modelling X-ray photon transport. The network processes 2D CT images as a sequence of 1D lines. The ground truth data consist of Monte Carlo projections of cylindrical water phantoms with inserts composed of five different materials.

The predictions are compared with Monte Carlo projections, with raytracing projections generated with the ASTRA Toolbox [45], and with the output of a Two-Angle Convolution (TAC) network [11]. The average normalised root-mean-square error (NRMSE) of the Transformer predictions was 0.725%, compared to 2.20% for the raytracer and 1.09% for TAC. The Transformer was able to predict projections for unseen types of geometries and intensity values. Due to bias in the training data, however, it does not generalise well to input phantoms with an unseen outer shape.
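
For readers unfamiliar with the metric, the following is a minimal sketch of how an NRMSE percentage of this kind could be computed. It assumes that the RMSE is normalised by the value range of the reference projection; the abstract does not state which normalisation was used, so this choice, the function name `nrmse_percent`, and the toy values are illustrative assumptions only.

```python
import math


def nrmse_percent(prediction, reference):
    """RMSE between two equal-length sequences, divided by the range of the
    reference values and expressed in percent.

    Assumption: range normalisation. Other conventions (for example dividing
    by the mean of the reference) are also in common use.
    """
    if len(prediction) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all detector pixels.
    mse = sum((p - r) ** 2 for p, r in zip(prediction, reference)) / len(reference)
    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference has zero range, NRMSE is undefined")
    return 100.0 * math.sqrt(mse) / value_range


# Hypothetical toy data: a reference projection and a prediction that is
# offset by a constant 0.2 at every detector pixel.
reference = [0.0, 1.0, 2.0, 4.0]
prediction = [0.2, 1.2, 2.2, 4.2]
error = nrmse_percent(prediction, reference)
```

With a different normalisation the absolute percentages change, so values are only comparable between methods when the same definition is applied to all of them.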

Two phantoms were reconstructed using the network within an IR algorithm. The highest contrast-to-noise ratio (CNR) values achieved with the Transformer and the raytracer are similar, both for low-contrast regions (6.88 and 8.28 for the raytracer compared to 7.10 and 7.35 for the Transformer) and for high-contrast regions (37.40 and 41.94 for the raytracer compared to 39.01 and 39.80 for the Transformer). Convergence based on low-contrast CNR is faster for the raytracer (39 and 34 iterations, compared to 41 and 41 iterations for the Transformer). The Transformer performs significantly better than the raytracer with respect to beam-hardening artefacts. The IR algorithm has not been tuned for use with the Transformer, which suggests that higher performance is obtainable with adjustments such as a different backprojector or different values for the correction factors used in the algorithm.
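
As an illustration of the CNR figures quoted above, the sketch below computes a contrast-to-noise ratio under one common definition: the absolute difference between the mean of a region of interest and the mean of a background region, divided by the standard deviation of the background. The abstract does not specify which definition was used, so the formula, the function name `cnr`, and the pixel values are assumptions for illustration.

```python
import statistics


def cnr(roi_values, background_values):
    """Contrast-to-noise ratio: |mean(ROI) - mean(background)| / std(background).

    Assumption: noise is taken as the population standard deviation of the
    background region. Other definitions combine the noise of both regions.
    """
    contrast = abs(statistics.fmean(roi_values) - statistics.fmean(background_values))
    noise = statistics.pstdev(background_values)
    if noise == 0:
        raise ValueError("background has zero noise, CNR is undefined")
    return contrast / noise


# Hypothetical pixel values sampled from an insert and from the surrounding water.
insert_pixels = [10.0, 10.0, 10.0, 10.0]
water_pixels = [1.0, 3.0, 1.0, 3.0]
ratio = cnr(insert_pixels, water_pixels)
```

Tracking such a value per iteration is one way to obtain the kind of convergence comparison reported here, namely the iteration count at which the CNR peaks.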

Limitations in prediction quality are likely related to factors outside the model itself, such as biases in the input data and resolution loss due to interpolation of the input data. Once its prediction speed is optimised, the CT Transformer model has the potential to replace conventional forward projections in IR methods, achieving Monte Carlo-level accuracy in a fraction of the computation time.