Deep Learning-Based Auto-Segmentation of Swallowing and Chewing Structures

A Iyer*, M Thor, R Haq, J Deasy, A Apte, Memorial Sloan Kettering Cancer Center, New York, NY


(Thursday, 7/16/2020) 10:30 AM - 12:30 PM [Eastern Time (GMT-4)]

Room: Track 4

Purpose: Delineating swallowing and chewing structures on head and neck (H&N) CT scans is essential in radiotherapy (RT) treatment planning to reduce the incidence of radiation-induced dysphagia, trismus, and speech dysfunction. Automating this process would reduce manual effort and yield reproducible results, but accurate auto-segmentation is challenging because of the morphological complexity of the structures involved and the low soft-tissue contrast of CT.

Methods: We trained deep learning models on 194 H&N CT scans from our institution (IRB#16-142, 16-1488) to segment the masseters (left and right), medial pterygoids (left and right), larynx, and pharyngeal constrictor muscle using DeepLabV3+ with a ResNet-101 backbone. Models were trained sequentially, so that prior segmentations guided the localization of each subsequent structure group. Additionally, an ensemble of models was developed using contextual information from three views (axial, coronal, and sagittal), providing robustness to unintended failures of the individual models. Output probability maps were averaged, and each voxel was assigned the label of the class with the highest combined probability.
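The fusion step described above (average the per-view probability maps, then take the most probable class per voxel) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_views`, the array layout `(n_classes, depth, height, width)`, and the toy probabilities are all assumptions, and the maps are assumed to have already been resampled onto a common voxel grid.

```python
import numpy as np


def fuse_views(prob_maps):
    """Average per-view probability maps and label each voxel by argmax.

    prob_maps: list of arrays, each shaped (n_classes, D, H, W) and
    already aligned to the same voxel grid (an assumption of this sketch).
    Returns an integer label volume shaped (D, H, W).
    """
    stacked = np.stack(prob_maps, axis=0)   # (n_views, n_classes, D, H, W)
    mean_prob = stacked.mean(axis=0)        # combined probability per class
    return mean_prob.argmax(axis=0)         # class with highest combined probability


# Toy volume with 2 classes and 2 voxels; the values are illustrative only.
axial = np.array([[[[0.9, 0.6]]], [[[0.1, 0.4]]]])
coronal = np.array([[[[0.8, 0.3]]], [[[0.2, 0.7]]]])
sagittal = np.array([[[[0.7, 0.2]]], [[[0.3, 0.8]]]])

labels = fuse_views([axial, coronal, sagittal])
```

In the second voxel the axial model favors class 0 but the other two views favor class 1, so the averaged map assigns class 1. This is the sense in which an ensemble can absorb the failure of a single view.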

Results: The Dice similarity coefficient (DSC) and 95th percentile Hausdorff distance (HD95), computed on a hold-out set of 24 CT scans, were 0.87±0.02 and 0.35±0.1 cm for the masseters; 0.81±0.03 and 0.42±0.1 cm for the medial pterygoids; 0.83±0.04 and 0.47±0.2 cm for the larynx; and 0.67±0.07 and 0.5±0.18 cm for the constrictor. Mean dose, previously identified as a possible factor in radiation-induced complications, was computed from both the automated and the expert segmentations on the hold-out scans. No statistically significant differences between the two were found using the Wilcoxon signed-rank test at the 5% significance level, except for the constrictor muscle.
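For reference, the DSC reported above is twice the overlap between two masks divided by the sum of their sizes. A minimal sketch on binary masks follows; the function name `dice` and the toy masks are hypothetical, and the convention of returning 1.0 when both masks are empty is an assumption of this sketch rather than something stated in the abstract.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom


# Toy 2D masks; real use would pass 3D label volumes for one structure.
auto_mask = [[1, 1, 0],
             [0, 1, 0]]
expert_mask = [[1, 1, 0],
               [0, 0, 1]]

score = dice(auto_mask, expert_mask)  # overlap of 2 voxels, 3 + 3 voxels in total
```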

Conclusion: We developed deep learning models for segmenting swallowing and chewing structures on CT. The resulting segmentations could potentially be applied in treatment planning to limit complications following RT for H&N cancer. The segmentation models developed in this work are distributed through the open-source platform CERR, accessible at

Funding Support, Disclosures, and Conflict of Interest: This work was partially funded by NIH grant 1R01CA198121 and NIH/NCI Cancer Center Support grant P30 CA008748.


Segmentation, CT


IM/TH- Image Segmentation Techniques: Machine Learning
