
A Deep Sequential Learning Architecture for Xerostomia Prediction in Parotid Glands Using CBCT and Rigid-Registered Dose Images

H Tseng1*, B Rosen2, JT Chien3, M Mierzwa4, R Ten Haken5, I El Naqa6, (1) University of Michigan, Ann Arbor, Ann Arbor, MI, (2) University of Michigan, Ann Arbor, MI, (3) National Chiao Tung University, Hsinchu, Taiwan, (4) University of Michigan, Ann Arbor, MI, (5) University of Michigan, Ann Arbor, MI, (6) University of Michigan, Ann Arbor, MI


(Tuesday, 7/16/2019) 10:00 AM - 10:30 AM

Room: Exhibit Hall | Forum 6

Purpose: To construct an architecture that receives serial cone-beam CTs (CBCTs) and co-registered dose images to model xerostomia in head and neck (H/N) cancer radiotherapy patients. We further investigate whether this approach can make robust predictions without introducing prior domain knowledge, which eliminates explicit feature selection and avoids inter- and intra-registration errors between images and dose, while accounting for temporal associations.

Methods: Daily CBCT and rigid-registered dose images of the parotid glands were used as input to a time-resolution classifier. The proposed model directly “observes” these inputs, mimicking natural human decision-making, to predict xerostomia. The classifier combines two deep-learning architectures: 3D convolutional neural networks (3D-CNNs) and a Markov recurrent neural network (Markov-RNN). Specifically, two 3D-CNNs first receive and recognize 3D image patterns and convert the enriched 3D image and dose information into a single time series. The Markov-RNN then serves as a time-resolution classifier that decodes the temporal sequence produced by the 3D-CNNs. Combining the 3D-CNNs with the Markov-RNN enabled resolution of complex temporal behavior: multiple RNNs were run simultaneously to characterize different structures in the data, and Markov transitions between the RNNs determined when and which RNN channel should be considered. The parameters of both models were trained jointly to optimize classification rates.
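The data flow described in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give layer sizes, kernel shapes, or the exact form of the Markov transitions, so every function name, shape, and parameter below is hypothetical. Each 3D-CNN is reduced to one convolution with ReLU and global average pooling, and the Markov-RNN is simplified to a fixed transition matrix that softly mixes the RNN channels rather than selecting one. Training is omitted; the sketch shows only a forward pass.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def conv3d_features(volume, kernels):
    """Toy stand-in for a 3D-CNN (assumption): one valid 3D convolution,
    ReLU, then global average pooling.
    volume: (D, H, W); kernels: (F, kd, kh, kw). Returns an (F,) vector."""
    windows = sliding_window_view(volume, kernels.shape[1:])
    responses = np.einsum("xyzabc,fabc->fxyz", windows, kernels)
    return np.maximum(responses, 0.0).mean(axis=(1, 2, 3))


def encode_fractions(cbct_series, dose_series, cbct_kernels, dose_kernels):
    """Run one encoder on each CBCT and one on each dose volume, then
    concatenate per fraction, giving a single (T, features) time series."""
    return np.stack([
        np.concatenate([conv3d_features(c, cbct_kernels),
                        conv3d_features(d, dose_kernels)])
        for c, d in zip(cbct_series, dose_series)
    ])


def markov_rnn_predict(features, W_in, W_rec, bias, trans_logits, w_out):
    """Simplified Markov-RNN forward pass (assumption: soft channel mixing).
    W_in: (k, h, d), W_rec: (k, h, h), bias: (k, h),
    trans_logits: (k, k), w_out: (h,)."""
    k, h_dim, _ = W_in.shape
    hidden = np.zeros(h_dim)
    belief = np.full(k, 1.0 / k)                 # uniform start over channels
    transition = softmax(trans_logits, axis=1)   # each row sums to 1
    for x in features:
        # Every RNN channel proposes its own next hidden state.
        candidates = np.tanh(W_in @ x + W_rec @ hidden + bias)  # (k, h)
        # The Markov step updates how much weight each channel receives.
        belief = belief @ transition
        hidden = belief @ candidates
    prob = 1.0 / (1.0 + np.exp(-(w_out @ hidden)))  # sigmoid output
    return float(prob), belief
```

In this sketch, `encode_fractions` plays the role of the two 3D-CNNs and `markov_rnn_predict` the role of the time-resolution classifier. A trained model would learn the kernels, the recurrent weights, and the transition logits jointly.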

Results: Using CBCT and dose images of up to 35 fractions from 91 H/N patients, this methodology, with 4 RNNs and two sets of 2 3D-CNN channels, predicted xerostomia (grade 1 or higher) at 1 year with an average AUC of 0.61 (95% CI: 0.48-0.82) on 5-fold cross-validation.
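For readers unfamiliar with the metric, the sketch below shows one common way to compute an AUC and average it over cross-validation folds. It is an illustration only: the abstract does not describe how the folds were formed or how the confidence interval was obtained, and the function names and the `fold_ids` argument are hypothetical.

```python
import numpy as np


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count one half)."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def mean_fold_auc(labels, scores, fold_ids):
    """Average the AUC over held-out folds; fold_ids assigns each patient
    to the fold in which that patient was held out (assumption)."""
    labels, scores, fold_ids = map(np.asarray, (labels, scores, fold_ids))
    per_fold = [auc(labels[fold_ids == f], scores[fold_ids == f])
                for f in np.unique(fold_ids)]
    return float(np.mean(per_fold)), per_fold
```

An AUC of 0.5 corresponds to chance-level ranking. With small folds the per-fold estimates vary widely, which is consistent with the wide interval reported above.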

Conclusion: This deep-learning-based composite architecture can process spatial and temporal information for sequential learning tasks while skipping complex image processing. Our preliminary results suggest potential for predicting xerostomia without the labor required by traditional radiomic extraction, feature selection, and deformable registration. However, further improvement is warranted, and external validation is needed.

Funding Support, Disclosures, and Conflict of Interest: NIH funding: 8704258


Dose Response, Image Analysis, Radiation Therapy


TH- response assessment: CT imaging-based
