Endometrial Cancer Risk Prediction and Stratification Using Personal Health Data

G Hart1*, B Nartowt1, W Muhammad1, Y Liang3, G Huang1,2, J Deng1,2, (1) Yale School of Medicine, Yale University, New Haven, CT, (2) Yale/New Haven Hospital, New Haven, CT, (3) Medical College of Wisconsin, Milwaukee, WI


(Sunday, 7/14/2019) 4:00 PM - 4:30 PM

Room: Exhibit Hall | Forum 5

Purpose: While the 5-year survival rate of endometrial cancer is relatively high (82%), the incidence and death rates are rising. Currently, screening methods (e.g., endometrial biopsy, pelvic ultrasound) are used primarily for women with a genetic predisposition (e.g., Lynch syndrome), and there is no effective means for population-based screening. It is therefore crucial to have a risk stratification tool, based on comprehensive personal health data, that can identify high-risk individuals who would benefit from screening. Such a stratification scheme could also help direct preventative interventions. This study developed a machine learning model capable of accurately predicting endometrial cancer risk from personal health data.

Methods: We used standard demographic and personal health data extracted from the PLCO Cancer Screening Trial. The data set included 78,215 women, who were followed for 13 years or until cancer diagnosis; 952 of them developed endometrial cancer within 5 years of enrolling. After splitting the data into training (70%) and testing (30%) sets, we tested logistic regression, decision tree, random forest, linear discriminant analysis, support vector machine, naive Bayes, and neural network (NN) models. The top-performing model was then used to stratify endometrial cancer risk into low-, medium-, and high-risk categories.
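The 70%/30% split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say whether the split was stratified by outcome, so the stratification, the function name `stratified_split`, the seed, and the synthetic labels are all assumptions made for the example. Stratifying is a common choice when cases are rare (here about 1.2% of participants), because it keeps the case rate similar in both sets.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) with a similar case rate in both sets.

    Hypothetical helper: the abstract only states a 70%/30% split.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    # Split each outcome class separately so rare cases land in both sets.
    for value in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == value]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels (1 = case, 0 = non-case) with a case rate of 1.2%,
# chosen to resemble 952 / 78,215; these are not trial data.
labels = [1] * 12 + [0] * 988
train_idx, test_idx = stratified_split(labels)

train_cases = sum(labels[i] for i in train_idx)
test_cases = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), train_cases, test_cases)
```

Each candidate model would then be fit on the training indices only and compared on the held-out test indices.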

Results: The NN outperformed the other methods with a test AUC of 0.88. Using the NN, we classified 57.2% of those who developed cancer within 5 years as high-risk, 41.8% as medium-risk, and 1.1% as low-risk. For those who did not develop cancer within 5 years, we classified 0.9%, 71.0%, and 28.2% as high-, medium-, and low-risk, respectively.
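To make the reported percentages more concrete, the sketch below converts them into approximate head counts. It rests on two assumptions that the abstract does not confirm: that the percentages apply to the full cohort (they may instead describe only the 30% test set), and that all 78,215 − 952 remaining women count as not having developed cancer within 5 years. The reported percentages are also rounded, so the counts are rough estimates only.

```python
# Figures taken from the abstract.
TOTAL_WOMEN = 78_215
CASES = 952
NON_CASES = TOTAL_WOMEN - CASES  # assumption: everyone else is a non-case

# Reported share of each group assigned to each risk tier, in percent.
case_pct = {"high": 57.2, "medium": 41.8, "low": 1.1}
non_case_pct = {"high": 0.9, "medium": 71.0, "low": 28.2}


def approx_counts(group_size, pct_by_tier):
    """Convert rounded percentages into approximate head counts."""
    return {tier: round(pct / 100 * group_size)
            for tier, pct in pct_by_tier.items()}


case_counts = approx_counts(CASES, case_pct)
non_case_counts = approx_counts(NON_CASES, non_case_pct)

# Approximate share of the high-risk tier that developed cancer,
# valid only under the assumptions stated above.
high_total = case_counts["high"] + non_case_counts["high"]
high_tier_case_share = case_counts["high"] / high_total

print(case_counts)
print(non_case_counts)
print(f"{high_tier_case_share:.2f}")
```

Under these assumptions the high-risk tier is small relative to the cohort yet contains more than half of the cases, which is the property that makes the stratification useful for targeting screening.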

Conclusion: Our results indicate that an NN based on personal health information can accurately discriminate between those at high risk of developing endometrial cancer and those who are not, offering a cost-effective and non-invasive way to stratify endometrial cancer risk for targeted screening and prevention.

