Computer-Aided Diagnosis for Wireless Capsule Endoscopy Based On Convolutional Neural Networks: Experimental Feasibility and Optimization On a Large Dataset

S Wang1*, Y Xing2, L Zhang3, H Gao4, H Zhang5, (1) Tsinghua University, Beijing, (2) Tsinghua University, Beijing, (3) Tsinghua University, Beijing, (4) Tsinghua University, Beijing, (5) Ankon Technologies Co, Ltd (Wuhan, Shanghai, China), Wuhan


(Wednesday, 7/17/2019) 10:15 AM - 12:15 PM

Room: 304ABC

Purpose: To verify the effectiveness of state-of-the-art CNN models in computer-aided diagnosis for wireless capsule endoscopy (WCE) using one of the largest ulcer WCE datasets, which consists of more than 1,400 patients’ videos (>23 million frames in total), and to investigate optimal choices for the training process.

Methods: A balanced dataset is constructed from all 1,200 patients’ videos, consisting of 24,839 ulcer images and 24,225 normal images, split into training (34,518), validation (4,743), and test (10,333) sets. ResNet-34 is trained and tested on this dataset. The impacts of input image size and train/test ratio are investigated. For the image size study, an input size of 224x224 (the original ImageNet size) is compared with 480x480 (the WCE video size). For the train/test ratio study, the validation and test sets are kept fixed while the training set is subsampled with two strategies: 1) extra-case sampling, i.e. reducing the number of cases, and 2) inter-case sampling, i.e. reducing the number of frames per case while keeping the number of cases unchanged. The train/test ratio is varied over 3.5:1 (all data), 2.3:1, 1.5:1, and 1:1.
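The two subsampling strategies described in Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each training frame is represented as a hypothetical `(case_id, frame_id)` pair and that subsampling is uniform at random, neither of which is stated in the abstract. The function names and the `keep_fraction` parameter are likewise illustrative.

```python
import random
from collections import defaultdict


def extra_case_sample(frames, keep_fraction, seed=0):
    """Extra-case sampling: keep every frame of a random subset of cases."""
    rng = random.Random(seed)
    cases = sorted({case_id for case_id, _ in frames})
    n_keep = max(1, round(len(cases) * keep_fraction))
    kept_cases = set(rng.sample(cases, n_keep))
    return [frame for frame in frames if frame[0] in kept_cases]


def inter_case_sample(frames, keep_fraction, seed=0):
    """Inter-case sampling: keep every case, but only a random subset
    of the frames within each case."""
    rng = random.Random(seed)
    by_case = defaultdict(list)
    for case_id, frame_id in frames:
        by_case[case_id].append((case_id, frame_id))
    sampled = []
    for case_id in sorted(by_case):
        case_frames = by_case[case_id]
        n_keep = max(1, round(len(case_frames) * keep_fraction))
        sampled.extend(rng.sample(case_frames, n_keep))
    return sampled


# Toy data (made up for illustration): 10 cases with 20 frames each.
frames = [(case, i) for case in range(10) for i in range(20)]
extra = extra_case_sample(frames, 0.5)
inter = inter_case_sample(frames, 0.5)
```

Both calls return roughly the same number of frames, but `extra` covers only half of the cases while `inter` still covers all of them, which is the contrast the train/test ratio study relies on.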

Results: The best test accuracy is 91.28% (ROC-AUC: 0.9676), obtained with a train/test ratio of 3.5:1 and 480x480 input images. Test accuracy with 480x480 input is 1.76% higher than with 224x224 input. For extra-case sampling (480x480 input), accuracy decreases by 1.35%, 1.11%, and 2.58% at train/test ratios of 2.3:1, 1.5:1, and 1:1, respectively. For inter-case sampling, accuracy decreases by 0.47%, 0.74%, and 1.15%, respectively.
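For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute accuracy and ROC-AUC for a binary ulcer/normal classifier. The abstract does not say how the metrics were computed, so the 0.5 decision threshold and the pairwise-ranking formulation of ROC-AUC are assumptions, and the labels and scores are made-up toy values.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive sample is scored
    higher than a random negative sample (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = ulcer, 0 = normal; scores are predicted ulcer probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of samples, so it is only suitable for small illustrations; on a test set of this size a rank-based implementation from an established library would be the practical choice.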

Conclusion: This work demonstrates the potential benefit of state-of-the-art CNNs for WCE diagnosis and verifies it on what is, to our knowledge, the largest ulcer dataset. External studies provide some guidance for future work. Accuracy drops more under extra-case sampling than under inter-case sampling, which implies that more videos provide more diversity. It also implies that frames within the same video may be redundant. Weighting factors in the loss term may be useful for further correction.

Funding Support, Disclosures, and Conflict of Interest: A research program on deep learning application in wireless capsule endoscopy with Ankon Technologies Co, Ltd (Wuhan, Shanghai, China) and National Key Scientific Instrument and Equipment Development Project under Grant No. 2013YQ160439 and the Zhangjiang National Innovation Demonstration Zone Special Development Fund under Grant No. ZJ2017-ZD-001.


Computer Vision

