Room: Stars at Night Ballroom 2-3
Purpose: Clinical implementation of radiomics research is hindered by limited reproducibility. This study investigates differences in feature combinations correlated with radiation pneumonitis (RP) when features are calculated with different radiomics packages.
Methods: Pre- and post-treatment CT scans were acquired from 105 patients undergoing radiotherapy (RT) for esophageal cancer. Twenty patients developed RP grade ≥2. Regions-of-interest (ROIs) were randomly placed in normal lung tissue receiving a mean dose ≥30 Gy. The vector map obtained from deformable registration was used to anatomically match corresponding ROIs between time points. Eight radiomics features common among packages (mean, median, minimum, entropy, GLCM sum average, GLCM sum entropy, GLCM difference entropy, GLCM entropy) were extracted from each ROI using three radiomics packages: one in-house (“A1”) and two open-source (IBEX and Pyradiomics). Differences in feature values were calculated between pre- and post-RT ROIs and averaged across all ROIs for each patient. Logistic regression was performed to assess relationships between changes in each feature and RP development. Each remaining feature from the same package was then added in turn to the single-feature regression model, area-under-the-curve (AUC) values were calculated, and ANOVA determined whether the second feature significantly improved model fit. Significance was assessed at p<0.0009 (with multiple comparisons correction).
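As a rough illustration of the bookkeeping in the Methods, the sketch below averages a feature's change over one patient's matched ROIs, enumerates the two-feature models, and derives a corrected significance threshold. Several points are assumptions rather than details stated in the abstract: the change is taken as post-RT minus pre-RT, the 56 combinations are read as ordered (first feature, added feature) pairs of the eight features, and the correction is taken to be Bonferroni at a family-wise alpha of 0.05, which happens to reproduce a threshold near 0.0009. The function name `patient_delta` and the ROI values are hypothetical.

```python
from itertools import permutations
from statistics import mean

# Feature names as listed in the Methods.
FEATURES = [
    "mean", "median", "minimum", "entropy",
    "GLCM sum average", "GLCM sum entropy",
    "GLCM difference entropy", "GLCM entropy",
]


def patient_delta(pre_rois, post_rois):
    """Average change of one feature over a patient's matched ROIs.

    pre_rois and post_rois are hypothetical lists of feature values, one
    entry per ROI, ordered so that index i is the same matched ROI.
    The sign convention (post minus pre) is an assumption.
    """
    if len(pre_rois) != len(post_rois) or not pre_rois:
        raise ValueError("need the same non-zero number of pre and post ROIs")
    return mean(post - pre for pre, post in zip(pre_rois, post_rois))


# Ordered (first feature, added second feature) pairs: 8 * 7 = 56.
feature_pairs = list(permutations(FEATURES, 2))

# Assumed Bonferroni correction of an assumed family-wise alpha of 0.05.
ALPHA = 0.05
threshold = ALPHA / len(feature_pairs)

# Made-up values for one patient with three ROIs (illustration only).
delta = patient_delta([-820.0, -790.0, -805.0], [-760.0, -770.0, -745.0])
```

The per-patient `delta` values, one per feature and package, would then serve as predictors in the logistic regression; the model fitting and ANOVA steps are not reproduced here.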
Results: When a single feature was used, three first-order and three GLCM features could distinguish RP for all three packages. Of the 56 unique feature combinations, 40 significantly improved model fit over the first feature alone for all three packages and 9 did not significantly improve fit for any package; for the remaining 7 combinations, whether fit improved differed among packages. When a second feature was used, AUC values ranged from 0.63 to 0.76 for package A1, 0.65 to 0.76 for IBEX, and 0.64 to 0.76 for Pyradiomics.
Conclusion: Feature combinations used to classify patients with RP may differ when features are calculated with different radiomics packages.