Room: AAPM ePoster Library
Purpose: Mean absolute error (MAE) has become the default metric for evaluating synthetic CT (sCT) model performance. We investigate the limitations of using MAE alone as a model comparison tool and suggest simple ways to supplement MAE analysis for better model comparison.
Methods: Rigidly registered SPGR post-sim MR and CT images from 14 cranial SRS patients were used for this study. Sample sCT images were generated using a trained Pix2Pix patch-based model (pre-trained with 25 patient datasets). We calculated the effect of the comparison region on MAE. Since deformable registration itself is an open research topic, we also evaluated the effects of blurring and shifting the original CT images on the MAE relative to the unmodified CT images. As an additional reference, we calculated the MAE obtained when the entire patient is modeled as water.
Results: MAE can range from 85.4 HU to 103.4 HU for a single patient case, depending on the comparison region chosen and whether a (3,3,3) median blur is applied to the reference and sCT images. Shifting the original CT image by one voxel in each direction alone results in MAE = 90.6 ± 10.3 HU, while blurring the image alone (median box (5,5,5)) results in MAE = 34.6 ± 6 HU. Replacing all voxel values with that of water results in MAE = 241 ± 37 HU.
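The perturbation tests above can be illustrated with a minimal sketch. It is not the study's code: the helper names (`mae`, `median_blur`, `shift_one_voxel`), the toy phantom, the -500 HU body threshold, and the reading of "one voxel in each direction" as a simultaneous shift along all three axes are assumptions made for illustration. The median filter is written with NumPy only; `scipy.ndimage.median_filter` would be the more common choice in practice.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

WATER_HU = 0.0  # water is 0 HU by definition of the Hounsfield scale


def mae(reference, test, mask=None):
    """Mean absolute error in HU, optionally restricted to a boolean mask."""
    diff = np.abs(np.asarray(test, dtype=float) - np.asarray(reference, dtype=float))
    return float(diff.mean() if mask is None else diff[mask].mean())


def median_blur(volume, size=3):
    """Median filter over a cubic (size, size, size) box; size must be odd."""
    pad = size // 2
    padded = np.pad(volume, pad, mode="edge")  # replicate border voxels
    windows = sliding_window_view(padded, (size, size, size))
    return np.median(windows, axis=(-3, -2, -1))


def shift_one_voxel(volume):
    """Shift by one voxel along every axis; np.roll wraps at the borders."""
    return np.roll(volume, shift=(1, 1, 1), axis=(0, 1, 2))


if __name__ == "__main__":
    # Toy phantom (not patient data): air, a soft-tissue cube, a bone insert.
    rng = np.random.default_rng(0)
    ct = np.full((24, 24, 24), -1000.0)
    ct[4:20, 4:20, 4:20] = 40.0
    ct[10:14, 10:14, 10:14] = 1000.0
    body = ct > -500.0                     # one possible comparison region
    ct += rng.normal(0.0, 20.0, ct.shape)  # simulated image noise

    print("shifted, whole volume:", mae(ct, shift_one_voxel(ct)))
    print("shifted, body only:   ", mae(ct, shift_one_voxel(ct), mask=body))
    print("median (5,5,5):       ", mae(ct, median_blur(ct, 5)))
    print("all water:            ", mae(ct, np.full_like(ct, WATER_HU)))
```

Comparing the whole-volume and body-only values on the same shifted image shows how the choice of comparison region alone changes the reported MAE, even though the underlying images are identical.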
Conclusion: MAE is limited by registration quality, noise and artifacts in the original CT, and physics-based errors (MR geometric errors, distortions, and the inability to distinguish air, bone, and plastic from intensity values alone). However, it can still be used as an effective general tool to rank models within a single institutional study, because the datasets and comparison regions are consistent across models. Because datasets generally cannot be shared, opportunities to compare models on the same dataset are limited. Thus, difference images with clear scales, discussion of observed model limitations, and MAE for threshold-gated regions can be used to highlight the regions and tissues where improvements can be made and to assist future work.
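As one way to read "MAE for threshold-gated regions", the sketch below reports MAE separately for voxels whose reference CT value falls within a given HU window. The function name `gated_mae` and the gate boundaries (-200 HU and 200 HU) are hypothetical; the abstract does not specify which thresholds were used.

```python
import numpy as np

# Hypothetical HU gates, chosen for illustration only.
HU_GATES = {
    "air": (-np.inf, -200.0),
    "soft tissue": (-200.0, 200.0),
    "bone": (200.0, np.inf),
}


def gated_mae(ct, sct, gates=HU_GATES):
    """MAE (HU) per gate; gates are defined on the reference CT values."""
    ct = np.asarray(ct, dtype=float)
    sct = np.asarray(sct, dtype=float)
    abs_err = np.abs(sct - ct)
    result = {}
    for name, (low, high) in gates.items():
        mask = (ct >= low) & (ct < high)
        # An empty gate has no defined MAE, so report NaN instead of failing.
        result[name] = float(abs_err[mask].mean()) if mask.any() else float("nan")
    return result


if __name__ == "__main__":
    ct = np.array([-1000.0, 0.0, 1000.0])
    sct = np.array([-900.0, 10.0, 700.0])
    for tissue, value in gated_mae(ct, sct).items():
        print(f"{tissue}: {value:.1f} HU")
```

Gating on the reference CT rather than on the sCT keeps the voxel membership of each region fixed across models, so the per-tissue values remain comparable.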
Funding Support, Disclosures, and Conflict of Interest: This study was supported under a Master Research Agreement with Philips Health Care.