Room: Exhibit Hall
Purpose: Big data is quickly gaining momentum in medical physics, and with it comes the need to make large collections of patient data accessible. By state law, clinics are required to store patient data for several years after treatment. Because the storage space available for this data is limited, most clinics have implemented a system that converts these records into a compressed file known as a tape archive (TAR) file. Our goal is to create a quick and simple method for obtaining useful information from these compressed backup files.
Methods: A Python script was created to quickly extract predefined variables from the compressed TAR files created by the Pinnacle treatment planning system. The primary goal is the extraction of radiotherapy plan parameters. The extracted data are saved in JavaScript Object Notation (JSON) documents. Logic is applied throughout the code to perform calculations (e.g., ROI volumes, density of volumes) and to remove unused plan trials.
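The extraction step described in Methods can be illustrated with a minimal sketch. It is not the authors' script: the `Key = value;` line format, the `plan.Trial` member name, and the helper names `extract_values` and `build_demo_archive` are assumptions made for illustration, and a real archive may be laid out differently. The sketch reads matching members directly from the archive without unpacking it to disk and returns the requested values as a dictionary ready for JSON serialization.

```python
import io
import json
import re
import tarfile

# Assumed line format "Key = value;" (value optionally quoted); hypothetical.
LINE_RE = re.compile(r'^\s*(\w+)\s*=\s*"?([^";]*)"?\s*;\s*$')


def extract_values(tar_source, wanted_keys, member_suffix="plan.Trial"):
    """Return the first occurrence of each wanted key found in matching members.

    tar_source is a seekable binary file object; member_suffix is an assumed
    naming convention for the files that hold plan parameters.
    """
    found = {}
    # "r:*" lets tarfile detect plain, gzip, bz2, or xz archives on its own.
    with tarfile.open(fileobj=tar_source, mode="r:*") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(member_suffix):
                continue
            handle = archive.extractfile(member)
            for raw_line in handle:
                match = LINE_RE.match(raw_line.decode("latin-1"))
                if match and match.group(1) in wanted_keys:
                    # setdefault keeps the first value seen for each key.
                    found.setdefault(match.group(1), match.group(2).strip())
    return found


def build_demo_archive():
    """Build a tiny in-memory gzip-compressed TAR with made-up content."""
    text = b'Trial ={\n  Name = "Trial_1";\n  PrescriptionDose = 200;\n};\n'
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="Patient_1/Plan_0/plan.Trial")
        info.size = len(text)
        archive.addfile(info, io.BytesIO(text))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    values = extract_values(build_demo_archive(), {"Name", "PrescriptionDose"})
    print(json.dumps(values, indent=2))
```

Values are returned as strings; any calculation on them, such as the ROI volumes mentioned above, would require converting them to numbers and is outside the scope of this sketch.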
Results: The script can extract any defined value available within the TAR file. Modifying the logic could allow more complex outputs, depending on what the user would like to extract from the file. Extracted data can be quickly written to a simple text format or JSON for storage in databases and/or use in data mining applications. Extracting a TAR file with the proprietary software took a mean of 5 minutes, while the extraction script required 5-15 seconds for the same file on the same processor.
Conclusion: The script proved advantageous: it accelerates the retrieval of pertinent information and allows data to be loaded quickly into a database for further investigation.
Funding Support, Disclosures, and Conflict of Interest: Resources provided by Pinnacle.
Not Applicable / None Entered.