Room: Stars at Night Ballroom 2-3
Purpose: To investigate the effects of combining convolutional neural networks, using transfer learning and extensive data augmentation, on the SPIE-AAPM-NCI BreastPathQ cancer cellularity dataset.
Methods: We limit our dataset to the BreastPathQ Cancer Cellularity Challenge dataset. During the challenge (October to December 2018), our baseline network was trained using a modified VGG16 architecture with widely used simple geometric data augmentation (four rotations in 90° increments, flipping and mirroring), achieving a prediction probability (PK) score of 0.87561. In this work, we expand our network architecture by combining modified VGG16 and InceptionV3 networks to improve the overall network’s robustness by leveraging each network’s intrinsic advantages. We further augment the original dataset by rotating the input images in 1° increments. To maintain fidelity of the cellularity score from the training dataset, the pixels cut off by each rotation are shifted into the ‘dead space’ of the rotated image (see supporting document). Additional augmentation methods to be tested will include, but are not limited to, controlled dropout of pixels, in which entire rows or columns are replaced with dark pixels at different fixed spacings.
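As a minimal illustration of two of the augmentations described above, the sketch below generates the 90° rotation and mirror variants of an image patch and applies row/column dropout at a fixed spacing. It assumes patches are square NumPy arrays of shape (height, width, channels) and that "dark" means a pixel value of 0. The function names, the `spacing` and `offset` parameters, and the exact set of flips are hypothetical choices for illustration, not the authors' implementation. The 1° rotation with dead-space filling is described only in the supporting document and is not reproduced here.

```python
import numpy as np


def geometric_augmentations(image):
    """Return the 8 rotation/mirror variants of a (H, W, C) image.

    Assumption: the four 90-degree rotations, each with and without a
    left-right mirror, stand in for "rotations, flipping and mirroring".
    """
    variants = []
    for k in range(4):  # 0, 90, 180 and 270 degrees
        rotated = np.rot90(image, k=k, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored copy of this rotation
    return variants


def line_dropout(image, spacing, offset=0, drop_rows=True, drop_cols=True):
    """Return a copy with every `spacing`-th row and/or column set to 0.

    `spacing` and `offset` are illustrative parameters: rows/columns at
    indices offset, offset + spacing, offset + 2 * spacing, ... are darkened.
    """
    if spacing < 2:
        raise ValueError("spacing must be at least 2")
    out = image.copy()
    if drop_rows:
        out[offset::spacing, :] = 0
    if drop_cols:
        out[:, offset::spacing] = 0
    return out
```

Varying `spacing` and `offset` yields several dropout variants per patch, and the cellularity label of the source patch would be carried over unchanged to each variant.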
Results: Data augmentation by rotation increases the original dataset size by a factor of 360, and pixel dropout data augmentation further increases the dataset size and alleviates overfitting of the network. Our initial results with a limited rotation-augmented dataset show an improving trend in the PK score using only the original VGG16 architecture.
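For readers unfamiliar with the PK metric, the following sketch shows one common way to compute it, assuming PK denotes the prediction probability of Smith et al.: over all pairs of cases with different reference scores, it counts concordant, discordant, and prediction-tied pairs. The function name and the plain-list inputs are illustrative assumptions, and this is not the challenge's official scoring code.

```python
def prediction_probability(reference, predicted):
    """Prediction probability PK = (Pc + 0.5 * Ptx) / (Pc + Pd + Ptx).

    Pc, Pd: pairs ranked in the same / opposite order by the prediction.
    Ptx: pairs with tied predictions but different reference scores.
    Pairs tied in the reference are excluded from the count.
    """
    concordant = discordant = tied_pred = 0
    n = len(reference)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = reference[i] - reference[j]
            if d_ref == 0:
                continue  # tied in the reference: not informative
            d_pred = predicted[i] - predicted[j]
            if d_pred == 0:
                tied_pred += 1
            elif (d_ref > 0) == (d_pred > 0):
                concordant += 1
            else:
                discordant += 1
    total = concordant + discordant + tied_pred
    if total == 0:
        raise ValueError("no pairs with distinct reference scores")
    return (concordant + 0.5 * tied_pred) / total
```

Under this definition a score of 1.0 means every such pair is ranked correctly, and 0.5 corresponds to chance-level ranking.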
Conclusion: This network combination and data augmentation method may allow organizations with limited access to well-labeled data, limited computing power, and limited expertise in deep learning to create powerful predictive models by leveraging transfer learning. Increasing access to clear yet powerful predictive models through transfer learning and data augmentation can make deep learning easier to implement in a variety of presently unexplored clinical settings.
IM/TH- Image Analysis (Single modality or Multi-modality): Machine learning
kp7a@virginia.edu