Proceedings Article | 24 March 2016
KEYWORDS: Lung cancer, Image analysis, Image segmentation, Cancer, Computed tomography, Lung, Computer-aided diagnosis, Computer aided diagnosis and therapy, New and emerging technologies, Quantitative analysis, Tumors, Machine learning, Surgery
Radiomics is an emerging technology that decodes tumor phenotype through quantitative analysis of image features computed from radiographic images. In this study, we applied the radiomics concept to investigate the association between CT image features of lung tumors, either quantitatively computed or subjectively rated by radiologists, and two genomic biomarkers, namely protein expression of the excision repair cross-complementing 1 (ERCC1) gene and of a regulatory subunit of ribonucleotide reductase (RRM1), in predicting disease-free survival (DFS) of lung cancer patients after surgery. An image dataset involving 94 patients was used. Among them, 20 had cancer recurrence within 3 years, while 74 remained disease-free. After tumor segmentation, 35 image features were computed from the CT images. Using the Weka data mining software package, we selected 10 non-redundant image features. Using the synthetic minority oversampling technique (SMOTE) to generate synthetic data that balance the case numbers in the two DFS groups (“yes” and “no”), together with a leave-one-case-out training/testing method, we optimized and compared a number of machine learning classifiers using (1) quantitative image (QI) features, (2) subjectively rated (SR) features, and (3) genomic biomarkers (GB). Data analyses showed relatively low correlation among the QI, SR and GB prediction results (Pearson correlation coefficients < 0.5, including between the ERCC1 and RRM1 biomarkers). Using the area under the ROC curve (AUC) as the assessment index, the QI-, SR- and GB-based classifiers yielded AUC = 0.89±0.04, 0.73±0.06 and 0.76±0.07, respectively, showing that all three feature types had predictive power (AUC > 0.5). Among them, the QI features yielded the highest performance.
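The evaluation scheme described above (SMOTE balancing, leave-one-case-out testing, AUC as the index) can be illustrated with a minimal sketch. It is not the study's implementation: the study used Weka and classifiers that the abstract does not name, whereas this sketch substitutes a simple nearest-centroid scorer, uses randomly generated two-feature toy data with the same 20/74 class split, and assumes that oversampling is applied only inside each training fold, a detail the abstract does not specify. All function names are hypothetical.

```python
import random

def smote(minority, n_new, k=3, rng=None):
    """SMOTE-style oversampling: each synthetic point is a random
    interpolation between a minority sample and one of its k nearest
    minority neighbours. Assumes at least two minority samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) form:
    the fraction of positive/negative pairs ranked correctly, ties = 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def leave_one_out_scores(X, y, rng=None):
    """Score every case with a model trained on all other cases.
    The stand-in model is nearest-centroid; a higher score means the
    held-out case lies closer to the positive-class centroid."""
    rng = rng or random.Random(0)
    scores = []
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        pos = [x for x, lab in train if lab == 1]
        neg = [x for x, lab in train if lab == 0]
        # Balance the two groups using the training fold only (assumption),
        # so the held-out case never influences the synthetic samples.
        if len(pos) < len(neg):
            pos = pos + smote(pos, len(neg) - len(pos), rng=rng)
        elif len(neg) < len(pos):
            neg = neg + smote(neg, len(pos) - len(neg), rng=rng)
        c_pos, c_neg = centroid(pos), centroid(neg)
        scores.append(dist(X[i], c_neg) - dist(X[i], c_pos))
    return scores

# Toy data only: 20 "recurrence" cases and 74 "disease-free" cases,
# drawn from two Gaussian clusters. These are not the study's features.
rng = random.Random(42)
X = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)] + \
    [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(74)]
y = [1] * 20 + [0] * 74
scores = leave_one_out_scores(X, y)
print(f"leave-one-out AUC on toy data: {auc(scores, y):.2f}")
```

Because each case is scored by a model that never saw it, the 94 held-out scores can be pooled into a single ROC analysis. Keeping the oversampling inside the loop is the more conservative choice, since synthetic points interpolated from a held-out case would otherwise leak into its own training set.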