Identifying Oropharyngeal Clinical Target Volumes Delineation Patterns From Peer-Reviewed Clinical Delineations Via Cascade 3D Fully-Convolutional Networks

C Cardenas1*, J Yang1, A Mohamed1, C Fuller1, B Beadle2, A Garden1, L Court1, (1) University of Texas MD Anderson Cancer Center, Houston, TX, (2) Stanford University, Stanford, CA


(Tuesday, 7/16/2019) 7:30 AM - 9:30 AM

Room: Stars at Night Ballroom 2-3

Purpose: Clinical target volume (CTV) delineation remains a subjective, difficult, and time-consuming task in head-and-neck radiotherapy. The purpose of this study was to develop a 3D-cascade convolutional neural network to estimate physician delineation patterns and to produce clinically acceptable high-risk (CTV1), intermediate-risk (CTV2), and low-risk (CTV3) CTVs.

Methods: CT scans and treatment plans from 279 head-and-neck radiotherapy patients were retrospectively collected for this study; these patients were split into training (n=168), cross-validation (n=56), and test (n=55) sets. Gross tumor volume (GTV) and CTV contours were peer-reviewed by a group of head-and-neck radiation oncologists (median=4) prior to radiotherapy. A 3D-cascade network was designed to sequentially auto-delineate CTV1, CTV2, and then CTV3, mirroring the physician contouring workflow. The network employs a separate 3D fully-convolutional network to predict each CTV, with the output of each network used as input for the following network (i.e., the CTV1 ground truth/prediction is used as input for CTV2 training/testing, and CTV1 and CTV2 are used as inputs for CTV3). In addition, the CT scan and GTV contour mask were included as inputs for each model. The network was trained end-to-end using a cross-entropy loss (for each model) and a batch size of 1. During training, hyper-parameter optimization was performed by assessing the trained model’s performance on the cross-validation set. Dice similarity coefficient (DSC) and mean surface distance (MSD) were used to compare auto-delineations with clinical ground truths on the test set.
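The cascade's input wiring can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `stack_inputs`, `run_cascade`, and `dummy_stage` are hypothetical, the channel ordering is an assumption, and the stage models are placeholder callables standing in for the trained 3D fully-convolutional networks. At training time the abstract indicates that ground-truth CTVs would replace the earlier-stage predictions.

```python
import numpy as np

def stack_inputs(ct, gtv, prior_ctvs):
    """Stack the CT, the GTV mask, and any earlier CTV masks as channels."""
    channels = [ct.astype(np.float32), gtv.astype(np.float32)]
    channels += [m.astype(np.float32) for m in prior_ctvs]
    return np.stack(channels, axis=0)  # shape: (C, D, H, W)

def run_cascade(ct, gtv, models):
    """Run the stages in order; each stage sees all earlier predictions.

    `models` is a list of callables mapping a (C, D, H, W) array to a
    (D, H, W) mask. They are stand-ins for the trained networks.
    """
    predictions = []
    for model in models:
        x = stack_inputs(ct, gtv, predictions)
        predictions.append(np.asarray(model(x)).astype(bool))
    return predictions

def dummy_stage(x):
    """Placeholder 'network' (assumption): union of GTV and prior masks."""
    return x[1:].max(axis=0) > 0

# Toy volume: a small GTV inside an empty CT.
ct = np.zeros((4, 8, 8))
gtv = np.zeros((4, 8, 8), dtype=bool)
gtv[1:3, 2:5, 2:5] = True

ctv1, ctv2, ctv3 = run_cascade(ct, gtv, [dummy_stage] * 3)
```

With this wiring the CTV1 stage receives two channels (CT, GTV), the CTV2 stage three, and the CTV3 stage four.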

Results: On the test set, median DSC values were 0.822 (range: 0.581-0.891), 0.763 (range: 0.364-0.859), and 0.736 (range: 0.602-0.806), and median MSD values were 2.8 mm (range: 1.9-4.2 mm), 4.5 mm (range: 2.9-17.6 mm), and 4.2 mm (range: 2.9-6.4 mm) for CTV1, CTV2, and CTV3, respectively. The largest disagreements between predictions and ground truth were observed for CTV2. All patients’ predictions were completed in <2 min.
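For readers unfamiliar with the two metrics, the sketch below shows how DSC and MSD could be computed from a pair of binary masks. It is illustrative only: the abstract does not state which MSD variant was used, so the symmetric average of the two directed mean distances is an assumption, as are the function names and the 6-connected surface definition. The brute-force distance matrix is suitable only for small volumes.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels with a background 6-neighbor."""
    m = np.pad(mask.astype(bool), 1)  # padding makes volume edges count as surface
    interior = m.copy()
    for axis in range(m.ndim):
        interior &= np.roll(m, 1, axis) & np.roll(m, -1, axis)
    idx = np.argwhere(m & ~interior) - 1  # undo the padding offset
    return idx * np.asarray(spacing, dtype=float)

def mean_surface_distance(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric mean surface distance in the units of `spacing` (e.g. mm)."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Toy example: two 2x2x2 cubes offset by one voxel along the first axis.
truth = np.zeros((6, 6, 6), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros((6, 6, 6), dtype=bool)
pred[2:4, 1:3, 1:3] = True

dsc = dice(truth, pred)
msd = mean_surface_distance(truth, pred, spacing=(1.0, 1.0, 1.0))
```

Passing the CT voxel spacing as `spacing` yields MSD in millimetres, which matches the units reported above.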

Conclusion: We developed a 3D-cascade network to auto-delineate high-risk, intermediate-risk, and low-risk CTVs for oropharyngeal radiotherapy patients. Our results are consistent with the inter-observer variability reported in the literature. Ongoing clinical assessment will determine the clinical acceptability of these auto-delineations.

Funding Support, Disclosures, and Conflict of Interest: We gratefully acknowledge the support of the Nvidia Corporation with the donation of the Tesla K40 GPU used for the present research.


Computer Vision, CT, Segmentation


IM/TH- image segmentation: CT
