
Deep Learning Model for the Detection of Setup Errors in Planar X-Ray Images

J Lamb1*, R Petragallo1, D Low1, (1) University of California, Los Angeles, Los Angeles, CA


(Tuesday, 7/16/2019) 7:30 AM - 9:30 AM

Room: Stars at Night Ballroom 2-3

Purpose: To develop a deep learning-based image review algorithm that rivals human performance for detecting setup errors in radiotherapy guidance images.

Methods: ExacTrac x-ray and DRR image pairs acquired from 1517 patients in over 14,000 treatment sessions at our institution from 2014 to 2017 were used for this study. Transfer learning was used to adapt the well-known deep learning model AlexNet to the present task. The last two fully-connected layers of AlexNet were retrained. Synthetic errors were created from each x-ray-DRR pair by shifting the DRR by fixed increments of 1 cm in all 4 cardinal directions, followed by cropping to ensure an equal field. Thus each x-ray-DRR pair provided one “no-error” instance and 9 “error” instances, and a total of over 42,000 images were available in the test and training datasets. The data set was divided into 70% training and 30% validation sets. The model was implemented in MATLAB v2018b. Training required approximately 75 hours using a single GPU. The error detection sensitivity of the model was compared to a human observer sensitivity extracted from the number of misalignment delivery errors discovered retrospectively and reported in our institution’s clinical incident reporting database.
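The shift-and-crop step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The function name `make_shifted_pairs`, the shift given in pixels (`shift_px`, which would have to be derived from the 1 cm shift and the image pixel spacing), and the choice of a common central crop window are all assumptions made for illustration. The sketch generates only the unshifted pair plus one shift per cardinal direction and does not attempt to reproduce the nine error instances per pair stated above.

```python
import numpy as np


def make_shifted_pairs(xray, drr, shift_px):
    """Build one no-error pair and four shifted-DRR pairs from a 2D image pair.

    Hypothetical helper: every pair is cropped to the same central window so
    that all instances cover an equal field, whatever the shift direction.
    Returns a list of (name, xray_crop, drr_crop, label), label 1 = error.
    """
    if xray.shape != drr.shape:
        raise ValueError("x-ray and DRR must have the same shape")
    h, w = xray.shape
    m = shift_px
    if m <= 0 or 2 * m >= min(h, w):
        raise ValueError("shift must be positive and leave a non-empty crop")

    # (row offset, column offset) applied to the DRR content, in pixels.
    offsets = {
        "none": (0, 0),
        "up": (-m, 0),
        "down": (m, 0),
        "left": (0, -m),
        "right": (0, m),
    }

    # The x-ray is never shifted; it is only cropped to the central window.
    xray_crop = xray[m:h - m, m:w - m]

    pairs = []
    for name, (dy, dx) in offsets.items():
        # Shifting the DRR by (dy, dx) and then cropping the central window is
        # the same as reading the window from the unshifted DRR at (-dy, -dx).
        drr_crop = drr[m - dy:h - m - dy, m - dx:w - m - dx]
        pairs.append((name, xray_crop, drr_crop, int(name != "none")))
    return pairs
```

Cropping every instance by the full shift margin on all sides keeps the no-error and error images the same size, so a classifier cannot separate them from image dimensions or padding alone.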

Results: The deep learning algorithm achieved 97.8% accuracy, 98.8% sensitivity, and 87.9% specificity for detecting translational errors of 1 cm. Institutional error reporting data suggest that setup image misalignments greater than 2 cm occur at a per-fraction rate of 0.1%, corresponding to a per-therapist error detection sensitivity of 97%, assuming all patients require positioning refinement based on setup images.
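For reference, the reported metrics follow the standard confusion-matrix definitions, sketched below in Python. The second function shows one possible reading of how a 0.1% per-fraction rate could map to a 97% per-therapist sensitivity: it assumes that two therapists check each fraction independently and that an error is delivered only when both miss it. The abstract does not state the number of observers, so `n_observers=2` is an assumption, as are both function names.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn count "error" images that were detected/missed;
    tn/fp count "no-error" images that were passed/flagged.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def per_observer_sensitivity(undetected_rate, n_observers=2):
    """Per-observer sensitivity s solving (1 - s) ** n_observers = undetected_rate.

    Assumes independent observers and that every fraction needs correction,
    so the undetected rate equals the probability that all observers miss.
    """
    return 1.0 - undetected_rate ** (1.0 / n_observers)


# Illustrative counts only; these are not data from the study.
example = classification_metrics(tp=90, fn=10, tn=80, fp=20)

# Under the two-observer assumption, a 0.1% undetected rate gives about 0.968.
human_sensitivity = per_observer_sensitivity(0.001, n_observers=2)
```

Under the same assumptions, a single observer would need a sensitivity of 99.9% to produce the same 0.1% rate, which is why the number of redundant checks matters when comparing the model with human review.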

Conclusion: Initial results of using transfer learning to re-train established classification networks for detection of setup errors are promising and may be sufficient to provide additional safety checks backing up human observers. Performance may not yet be sufficient to replace redundant checks performed by humans, but improvements are foreseeable.

Funding Support, Disclosures, and Conflict of Interest: Research supported by AHRQ Grant 1R01HS026486-01

