A Comprehensive Evaluation of Deep Learning Design for Synthetic CT Generation

S Olberg1,2*, J Chun1,3, B Choi1,3, I Park1,3, H Kim1, J Kim3, S Mutic1, O Green1, J Park1,2,3, (1) Department of Radiation Oncology, Washington University in St. Louis, St. Louis, MO, (2) Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO, (3) Department of Radiation Oncology, Yonsei University College of Medicine, Seoul, KR


(Sunday, 7/12/2020)   [Eastern Time (GMT-4)]

Room: AAPM ePoster Library

Purpose: As deep learning (DL)-based approaches grow increasingly popular for a variety of tasks, applications specific to radiation oncology have drawn considerable attention. With the proliferation of MRI-guided radiotherapy (MR-IGRT) platforms, DL-based solutions to the unique challenges of an MR-IGRT workflow are continuously being proposed. Often, however, the design choices involved in employing a DL model are not the focus of the discussion, which contributes to the black-box nature of these models. The present study evaluates common design choices to support informed decisions when applying DL-based algorithms to synthetic CT (sCT) generation for an MRI-only RT workflow.

Methods: With the generative adversarial network (GAN) as the primary framework, comparisons were made in three areas: 1) the network architecture of the generator (pix2pix vs. FC-DenseNet); 2) the information delivered to the discriminator (GAN vs. conditional GAN [cGAN]); and 3) the additional loss function (L1 vs. L2). The peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) were used to evaluate the quality of sCT outputs.
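As an illustration of the two quality metrics named above, the following is a minimal sketch in plain Python. It is not the authors' evaluation code. It assumes images are flattened into equal-length sequences of pixel values, that `data_range` is the span of possible intensities, and that SSIM is computed once over the whole image rather than over local sliding windows as most library implementations do. The function names and the constants `k1` and `k2` (set to the commonly used 0.01 and 0.03) are choices made for this sketch.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must contain the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(data_range ** 2 / mse)


def global_ssim(reference, estimate, data_range, k1=0.01, k2=0.03):
    """Single-window SSIM computed over the entire image (a simplification)."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must contain the same number of pixels")
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(estimate) / n
    var_x = sum((r - mu_x) ** 2 for r in reference) / n
    var_y = sum((e - mu_y) ** 2 for e in estimate) / n
    cov = sum((r - mu_x) * (e - mu_y) for r, e in zip(reference, estimate)) / n
    # Stabilizing constants that keep the ratio defined for flat regions.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Higher values of both metrics indicate closer agreement between the sCT and the reference CT; SSIM equals 1 only when the two images are identical.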

Results: The comparatively more complex structure of the FC-DenseNet resulted in SSIM values approximately 5% higher than those of pix2pix-based models. Similarly, cGAN frameworks yielded moderate increases over their GAN counterparts. Finally, models utilizing L2 loss demonstrated the poorest overall performance, generating outputs characterized by gridding artifacts.
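For readers unfamiliar with the L1 and L2 terms compared here, the sketch below shows how such a term is typically added to a generator's adversarial loss. It is an assumption-laden illustration, not the authors' training code: the abstract does not state the weighting or the exact form of the adversarial term, so `adversarial_term`, `weight`, and all function names are hypothetical, and images are again treated as flat sequences of pixel values.

```python
def l1_loss(target, output):
    """Mean absolute error between target CT and generated sCT pixels."""
    return sum(abs(t - o) for t, o in zip(target, output)) / len(target)


def l2_loss(target, output):
    """Mean squared error between target CT and generated sCT pixels."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def generator_loss(adversarial_term, target, output, weight, norm="l1"):
    """Adversarial term plus a weighted pixel-wise term (hypothetical form)."""
    if norm == "l1":
        pixel_term = l1_loss(target, output)
    elif norm == "l2":
        pixel_term = l2_loss(target, output)
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    return adversarial_term + weight * pixel_term
```

Because the L2 term squares each error, it penalizes large deviations far more heavily than small ones, whereas the L1 term grows linearly with the error.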

Conclusion: The plug-and-play nature of many popular DL algorithms has lessened the focus on the impact that particular design decisions can have on task-specific performance. These results indicate that, in general, cGANs should be preferred over GANs, and advanced network architectures should be used, in order to improve the quality of generated sCTs.

