Room: AAPM ePoster Library
Purpose: To improve the performance of the 3D U-Net by replacing its conventional up-sampling (de-convolution) layers with a Pixel Deconvolutional Network (PDN), which accounts for spatial relationships among adjacent pixels.
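The artifact problem PDN addresses comes from the uneven kernel overlap of strided de-convolution: adjacent output pixels are produced by different numbers of kernel elements. As a minimal sketch (1-D case with assumed kernel size 3 and stride 2, not parameters reported in this abstract), counting how many kernel elements touch each output pixel makes the alternating overlap pattern visible:

```python
import numpy as np

def overlap_counts(in_len, kernel=3, stride=2):
    """Count how many kernel elements contribute to each output pixel
    of a 1-D transposed convolution. Uneven counts between neighboring
    pixels are the source of checkerboard artifacts."""
    out_len = (in_len - 1) * stride + kernel
    counts = np.zeros(out_len, dtype=int)
    for i in range(in_len):
        # each input pixel spreads its value over `kernel` output pixels
        counts[i * stride : i * stride + kernel] += 1
    return counts

print(overlap_counts(5))  # interior alternates 1, 2, 1, 2, ... -> checkerboard
```

PDN sidesteps this by generating the intermediate up-sampled feature maps sequentially, so later pixels are conditioned directly on their already-generated neighbors rather than computed independently.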
Methods: The 3D U-Net was developed to segment neuronal structures with outstanding performance, but it suffers from checkerboard artifacts caused by the uneven kernel overlap pattern of the de-convolution operation. The hypothesis of this study is that the quality of liver segmentation with the U-Net can be improved by the addition of PDN. Sixty-four anonymized, annotated plans were used as the training set and fed into two networks: (1) a 4-layer 3D U-Net, and (2) a 4-layer 3D U-Net+PDN. Focal loss was implemented instead of cross-entropy loss to guide training because it performs better on sparse (class-imbalanced) images. The remaining sixteen plans were used to test the performance of the two networks, with all model parameters taken from training. The Dice similarity coefficient and AUC (area under the ROC curve) of the segmented livers were calculated and compared between the two networks. All code was written in Python with TensorFlow on a laptop with an Intel i7-7700HQ CPU, 16 GB of memory, and an NVIDIA GTX 1050 Ti GPU with 4 GB of video memory.
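The focal loss down-weights easy, well-classified voxels so that the rare foreground (liver) voxels dominate the gradient, which is why it suits sparse masks better than plain cross-entropy. A minimal NumPy sketch of the binary form is below; the study used TensorFlow, and gamma=2.0 and alpha=0.25 are the common defaults from the focal loss paper, not values reported in this abstract:

```python
import numpy as np

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: cross-entropy scaled by (1 - p_t)**gamma,
    which suppresses the contribution of confidently correct voxels."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # p_t: predicted probability of the true class at each voxel
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)
    # alpha_t: class-balancing weight (foreground vs. background)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))
```

With gamma=0 and alpha=0.5 this reduces to (half the) ordinary binary cross-entropy, which makes the modulating effect of gamma easy to verify.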
Results: The mean Dice score on the testing set was 0.69 for U-Net and 0.76 for U-Net+PDN, and the AUC was 0.86 and 0.85, respectively. Training took 20 hours for U-Net and 26 hours for U-Net+PDN. Inference on the testing cases took 1-2 minutes for both networks.
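The two metrics reported above can be sketched as follows (function names and the rank-statistic form of AUC are illustrative choices, not taken from the study's code; the AUC version below ignores tied scores):

```python
import numpy as np

def dice(pred, truth, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2 * |intersection| / (|pred| + |truth|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return 2.0 * np.sum(pred & truth) / (pred.sum() + truth.sum() + eps)

def auc(scores, labels):
    """AUC via the Mann-Whitney rank statistic: the fraction of
    (positive, negative) voxel pairs the score ranks correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    return np.mean(pos[:, None] > neg[None, :])
```

Dice rewards spatial overlap of the hard segmentation, while AUC measures how well the raw probabilities separate liver from background, which is why the two can move in different directions, as seen in the results.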
Conclusion: With nearly identical AUC, U-Net+PDN outperforms the conventional U-Net for liver segmentation. The improvement is attributed to the PDN establishing direct relationships among adjacent pixels along the up-sampling path of the U-Net, which recovers more precise location information than conventional de-convolution alone.
Keywords: Deconvolution, Segmentation, Computer Vision