
Atrous Convolution and Spatial Pyramid Pooling for More Accurate Tumor Segmentation in MR Images

K Men*, P Boimel, J Janopaul-Naylor, H Zhong, M Huang, H Geng, C Cheng, Y Fan, J Plastaras, E Ben-Josef, Y Xiao, University of Pennsylvania, Philadelphia, PA


(Thursday, 8/2/2018) 10:00 AM - 12:00 PM

Room: Karl Dean Ballroom C

Purpose: Tumor segmentation with deep learning methods must assign a category to every pixel in the image. However, the repeated pooling and striding operations in state-of-the-art convolutional neural networks (CNNs) reduce feature resolution. In addition, tumors vary in size across patients, and images vary in resolution. To further improve accuracy, we trained a novel CNN with atrous convolution and a spatial pyramid pooling module (ASPP-CNN) that can extract denser, multi-scale feature maps.
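The resolution trade-off described above can be illustrated with a minimal one-dimensional sketch. It is not the network used in this work: the function names `atrous_conv1d` and `effective_kernel_size`, the toy signal, and the zero "same" padding are all assumptions made for illustration. It shows that a strided convolution halves the number of output samples, while an atrous (dilated) convolution keeps the full resolution and still covers a wider input extent.

```python
def atrous_conv1d(signal, kernel, rate=1, stride=1):
    """1-D cross-correlation with zero 'same' padding (illustrative only).

    rate   -- spacing between kernel taps (rate=1 is an ordinary convolution)
    stride -- step between output positions (stride > 1 downsamples)
    """
    k = len(kernel)
    span = (k - 1) * rate              # distance from the first to the last tap
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    out = []
    for start in range(0, len(signal), stride):
        out.append(sum(kernel[j] * padded[start + j * rate] for j in range(k)))
    return out


def effective_kernel_size(k, rate):
    """Input extent covered by a k-tap kernel whose taps are `rate` samples apart."""
    return k + (k - 1) * (rate - 1)


signal = [float(i) for i in range(16)]   # hypothetical toy input
kernel = [1.0, 1.0, 1.0]

strided = atrous_conv1d(signal, kernel, rate=1, stride=2)  # resolution halved
atrous = atrous_conv1d(signal, kernel, rate=2, stride=1)   # resolution kept

print(len(signal), len(strided), len(atrous))
print(effective_kernel_size(3, 1), effective_kernel_size(3, 2))
```

With rate 2, the three-tap kernel spans five input samples without any extra weights, which is the sense in which atrous convolution enlarges the receptive field while keeping the feature map dense.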

Methods: We modified the last ResNet-101 block with cascaded atrous convolution to extract a high-resolution feature map while maintaining large receptive fields. We also adopted a parallel four-level pyramid pooling module consisting of one 1×1 convolution and three 3×3 convolutions with atrous rates = (1,4,8,12) to capture features at four scales. The multi-scale features were fused to create the final feature map, and a convolution layer following these layers generated the segmentation. We used T2-MRI data from 80 patients with delineated rectal tumors. Histogram normalization was applied to the rectal regions to ensure consistent image intensity and contrast. We performed 10-fold cross-validation to evaluate performance. The Dice similarity coefficient (DSC) and the Hausdorff distance (HD) were used to quantify segmentation accuracy.
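The two accuracy metrics named above can be sketched as follows. This is a minimal illustration rather than the evaluation code used in this study: the helper names, the toy masks, and the pixel spacing are hypothetical, and the HD here is the plain symmetric maximum over all foreground pixels, since the abstract does not state whether a surface-based or percentile variant was used.

```python
import math


def mask_to_points(mask):
    """Return the (row, col) coordinates of the foreground pixels of a 2-D binary mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between two non-empty point sets.

    spacing -- assumed pixel size (row, col), e.g. in mm
    """
    def directed(p, q):
        # Largest distance from a point in p to its nearest neighbour in q.
        return max(
            min(math.hypot((r1 - r2) * spacing[0], (c1 - c2) * spacing[1])
                for r2, c2 in q)
            for r1, c1 in p
        )
    return max(directed(a, b), directed(b, a))


# Hypothetical 6x6 example: a 4x4 reference square and a prediction
# shifted one pixel to the right.
reference = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
predicted = [[1 if 1 <= r <= 4 and 2 <= c <= 5 else 0 for c in range(6)] for r in range(6)]

ref_pts = mask_to_points(reference)
pred_pts = mask_to_points(predicted)

print(dice(ref_pts, pred_pts))       # overlap-based score between 0 and 1
print(hausdorff(ref_pts, pred_pts))  # worst-case boundary disagreement, in pixels
```

The DSC rewards overlap, while the HD reports the single worst mismatch, so the two metrics together describe both the bulk agreement and the largest local error of a contour.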

Results: Performance was compared with that of ResNet-101. The ASPP-CNN was more accurate than ResNet-101, with a higher DSC (0.78±0.08 vs. 0.76±0.10) and a lower HD (5.5±1.0 vs. 5.6±1.2 mm). The training and testing speeds of the two networks were comparable.

Conclusion: Atrous convolution and spatial pyramid pooling, which extract high-resolution features with large receptive fields and capture multi-scale context, improve the accuracy of rectal tumor segmentation. Further fine-tuning of the ASPP-CNN with more testing and validation is warranted.

Funding Support, Disclosures, and Conflict of Interest: This project was supported by U24CA180803 (IROC) from the National Cancer Institute (NCI). The authors report no conflicts of interest with this study.


Segmentation, Modeling, MRI


IM/TH- image segmentation: MRI
