A Comparison of Different Data Augmentation Methods in Isocitrate Dehydrogenase 1 (IDH1) Mutation Prediction

H Xiao1*, Z Chang2, (1) Duke Kunshan University, Kunshan, Jiangsu, (2) Duke University Medical Center, Durham, NC


(Tuesday, 7/16/2019) 9:30 AM - 10:00 AM

Room: Exhibit Hall | Forum 6

Purpose: To evaluate the effectiveness of different data augmentation methods in a deep learning model that predicts Isocitrate Dehydrogenase (IDH) mutations in patients with gliomas.

Methods: MR images of 103 glioma patients were selected from The Cancer Imaging Archive (TCIA), including T1w post-contrast, FLAIR, and T2w images; images of the same slice were assembled into one sample. This yielded 209 IDH-mutant and 356 IDH-wild-type glioma samples, from which training, validation, and test sets were generated. Five data augmentation methods (noise addition, rotation, translation, cropping, and mirroring) were each applied individually to both the training and validation sets in separate training runs. In addition, all five methods were applied together, each at a ratio of 20%, to test their combined performance. An experiment with duplicated images served as the control. The images of each sample were fed to separate input channels of Inception-ResNet, and predictions were based on the extracted features together with the patient's age at diagnosis. Prediction accuracy and area under the curve (AUC) were used to assess each augmentation method.
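The five augmentation methods and their combination can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the noise level, the restriction of rotation to multiples of 90 degrees, the zero-fill behavior, and the reading of "20% each" as picking one of the five methods with equal probability per sample are all assumptions made for illustration. Each sample is assumed to be an array of shape (H, W, 3) holding the T1w post-contrast, FLAIR, and T2w slices as channels.

```python
import numpy as np

def add_noise(img, rng, sigma=0.05):
    # Zero-mean Gaussian noise; sigma is an assumed value, not from the abstract.
    return img + rng.normal(0.0, sigma, size=img.shape)

def rotate90(img, k=1):
    # Rotate in the slice plane by k * 90 degrees (assumed simplification).
    return np.rot90(img, k=k, axes=(0, 1))

def translate(img, dy, dx):
    # Shift by (dy, dx) pixels; vacated pixels are filled with zeros.
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def crop(img, margin):
    # Central crop, then zero-pad back so the network input size is unchanged.
    h, w = img.shape[:2]
    cropped = img[margin:h - margin, margin:w - margin]
    pad = [(margin, margin), (margin, margin)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(cropped, pad)

def mirror(img):
    # Left-right flip of every channel.
    return img[:, ::-1]

def augment_mixed(samples, rng):
    # Assumed reading of "20% each": one of five methods per sample, p = 0.2.
    methods = [
        lambda x: add_noise(x, rng),
        lambda x: rotate90(x, k=int(rng.integers(1, 4))),
        lambda x: translate(x, int(rng.integers(-2, 3)), int(rng.integers(-2, 3))),
        lambda x: crop(x, margin=1),
        mirror,
    ]
    return [methods[int(rng.integers(0, len(methods)))](s) for s in samples]
```

Under this sketch, the duplication control corresponds to repeating each sample unchanged, so the augmented and control training sets can be kept the same size.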

Results: On the same training, validation, and test sets, the model trained on data augmented by duplication, cropping, translation, mirroring, rotation, and noise addition achieved accuracies of 81.6%, 79.6%, 83.7%, 85.7%, 91.8%, and 91.8%, respectively. Combining all five augmentation methods resulted in an accuracy of 81.6%, the same as the duplication control.
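Accuracy and AUC, the two metrics named in the Methods, can be computed from predicted probabilities as in the following sketch. The 0.5 decision threshold, the label coding (1 for IDH-mutant, 0 for IDH-wild-type), and the function names are assumptions; the abstract does not state how the metrics were implemented. The AUC here is the pairwise (Mann-Whitney) form, in which ties count as half.

```python
import numpy as np

def accuracy(y_true, y_prob, threshold=0.5):
    # Fraction of samples whose thresholded prediction matches the label.
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return float(np.mean(y_pred == y_true))

def auc(y_true, y_score):
    # Probability that a random positive scores higher than a random negative.
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg)))
```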

Conclusion: This study indicates that including ineffective data augmentation methods compromises prediction performance: the combination of all five methods (81.6%) performed worse than the best individual methods. Noise addition and rotation performed best (91.8% each) and may be of value in other clinical applications of machine learning.

Funding Support, Disclosures, and Conflict of Interest: Duke Kunshan University Equipment Budget


CAD, Brain, MRI


IM/TH- Image Analysis (Single modality or Multi-modality): Computer-aided decision support systems (detection, diagnosis, risk prediction, staging, treatment response assessment/monitoring, prognosis prediction)
