Room: AAPM ePoster Library
Purpose: There is a strong clinical need to evaluate different multi-criteria optimization (MCO) algorithms, including inverse optimization sampling algorithms and machine learning-based predictions. This study aims to develop and compare several similarity metrics for interpolated Pareto surfaces.
Methods: The most straightforward metric is the root-mean-square error (RMSE) evaluated between matched, sampled points on the Pareto surfaces, augmented by intra-simplex upsampling of the barycentric coordinates of the Pareto surface simplicial complexes. The second metric is the average projected distance (APD), which evaluates the displacements between the sampled points and computes their projections along the mean displacement. The third metric is the average nearest-point distance (ANPD), which numerically integrates point-to-simplex distances over the upsampled simplices of the Pareto surfaces. The metrics are compared on three criteria: their convergence rates under intra-simplex upsampling, the calculation times required to achieve convergence, and how meaningfully, in qualitative terms, they represent the underlying interpolated surfaces. For testing, several Pareto surfaces were constructed in three ways: abstractly, using inverse optimization, and using a previously developed prostate VMAT dose prediction MCO model.
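The three metrics can be sketched as follows. This is an illustrative reading of the descriptions above, not the authors' implementation. It assumes a two-objective Pareto front stored as a polyline (a chain of 1-simplices), uniform barycentric upsampling, a signed projection for the APD, and a one-directional ANPD approximated by an unweighted mean over the upsampled points. The function names `upsample_polyline`, `rmse`, `apd`, `point_to_segment`, and `anpd` are hypothetical.

```python
import numpy as np

def upsample_polyline(vertices, n):
    """Return n linearly interpolated points per segment (1-simplex).

    The parameter t plays the role of the barycentric coordinate.
    Assumption: uniform spacing; shared vertices appear twice.
    """
    t = np.linspace(0.0, 1.0, n)[:, None]
    segments = [(1.0 - t) * a + t * b
                for a, b in zip(vertices[:-1], vertices[1:])]
    return np.vstack(segments)

def rmse(p, q):
    """Root-mean-square Euclidean distance between matched points."""
    return float(np.sqrt(np.mean(np.sum((q - p) ** 2, axis=1))))

def apd(p, q):
    """Mean of the displacements projected onto the mean displacement.

    Assumption: signed projections are averaged; the abstract does not
    specify whether signed or absolute values are used.
    """
    d = q - p
    mean_d = d.mean(axis=0)
    norm = np.linalg.norm(mean_d)
    if norm == 0.0:
        return 0.0  # no net displacement, so no projection direction
    return float(np.mean(d @ (mean_d / norm)))

def point_to_segment(x, a, b):
    """Distance from point x to the segment with endpoints a and b."""
    ab = b - a
    t = np.clip(np.dot(x - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(x - (a + t * ab)))

def anpd(vertices_a, vertices_b, n):
    """Average nearest-point distance from surface A to surface B.

    Assumption: one-directional (A to B), with the integral replaced
    by an unweighted mean over the upsampled points of A.
    """
    pts = upsample_polyline(vertices_a, n)
    nearest = [min(point_to_segment(x, a, b)
                   for a, b in zip(vertices_b[:-1], vertices_b[1:]))
               for x in pts]
    return float(np.mean(nearest))
```

For example, with a front through (0, 1), (0.5, 0.5), (1, 0) and a copy shifted by 0.1 in both objectives, all three functions return about 0.141, since the shift is uniform and normal to the front. The metrics would be expected to diverge from one another once the two surfaces differ in shape or sampling.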
Results: Convergence to within 1% is typically achieved at approximately 50 samples per barycentric dimension for the RMSE and 100 for the ANPD. Reaching convergence takes approximately 50 milliseconds for the RMSE and 3 seconds for the ANPD, while the APD always requires much less than 1 millisecond. Additionally, the APD values closely resembled the ANPD limits, whereas the RMSE limits tended to differ more from them.
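A convergence sweep of this kind might be sketched as below. The abstract does not define how "within 1%" is measured, so this sketch assumes the reference limit is the metric value at the highest upsampling rate tested, and reports the smallest rate from which all later values stay inside the tolerance. The name `samples_to_converge` and its parameters are hypothetical.

```python
def samples_to_converge(metric, rates, tol=0.01):
    """Smallest rate from which the metric stays within tol of its limit.

    metric: callable mapping an upsampling rate (samples per barycentric
            dimension) to a metric value.
    rates:  increasing list of upsampling rates to test.
    Assumption: the value at the last rate stands in for the true limit.
    """
    values = [metric(n) for n in rates]
    reference = values[-1]
    converged_at = rates[-1]
    # Walk backward from the highest rate until a value leaves the band.
    for n, v in zip(reversed(rates), reversed(values)):
        if abs(v - reference) > tol * abs(reference):
            break
        converged_at = n
    return converged_at
```

In practice, `metric` would wrap an RMSE or ANPD evaluation at a given upsampling rate; the APD is reported above as not depending on such a rate.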
Conclusion: The ANPD is likely more meaningful than the RMSE and APD, as the ANPD’s point-to-simplex distance functions more closely represent the dissimilarity between the underlying interpolated surfaces than between the sampling points on those surfaces. However, in situations requiring high-speed evaluations, the APD may be more desirable due to its speed, its independence from a subjectively chosen intra-simplex upsampling rate, and its similarity to the ANPD limits.
Funding Support, Disclosures, and Conflict of Interest: This work is partially supported by NIH grant R01CA201212.