Project ID: P202205130005
Application of Hybrid Machine Learning Features and Reliable Tensor Radiomics features in prediction of Survival Outcome in Lung Cancer
Deep Learning, Machine Learning, Image Processing
Applicants with the following expertise may apply for this project: 1- sufficient experience in medical image processing techniques; 2- sufficient experience in 3-D deep learning methods; 3- sufficient experience in techniques that fuse traditional and deep learning methods. Both Matlab and Python are acceptable, but Python is the better choice for applicants who plan to work on Google Colab.
Objectives: Radiomics is a major frontier in medical image analysis, enabling mining of high-dimensional data from images. Although radiomics features (RFs) are increasingly extracted via standardized radiomics software packages towards more reproducible research, employing different feature-generation hyperparameters, fusion techniques, and segmentation methods may still lead to variable RFs. As such, employing RFs that are robust to processing variations is another important step towards reproducible studies. The present work aims, specifically, to identify robust RFs that are less sensitive to different fusion techniques in lung cancer, where fused PET-CT imaging holds significant value, and to predict survival outcome via these tensor radiomics features and hybrid machine learning methods that link dimension reduction algorithms with multiple classifiers.
Methods: Data from over 200 patients with lung cancer are obtained from The Cancer Imaging Archive (TCIA). In the pre-processing step, PET images are first registered to CT, enhanced, normalized, and cropped. We employ multiple fusion techniques to generate multiple fused PET-CT images, and hence multiple flavours of each feature. Subsequently, 215 RFs are extracted from each region of interest in PET-only, CT-only, and the multiple fused PET-CT images through the standardized SERA radiomics package. Each version of a feature extracted from a given modality or fused image is called a flavour of that feature. The variability of the RFs across flavours is studied using the Intraclass Correlation Coefficient (ICC), with carefully selected parameters: two-way random effects, absolute agreement, and multiple raters/measurements. ICC>0.90, 0.75<ICC<0.90, 0.50<ICC<0.75, and ICC<0.50 are denoted as having excellent, good, moderate, and poor reliability, respectively.
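As an illustration of the ICC step described above, the following is a minimal Python sketch, not the project's actual implementation. It assumes that "multiple raters/measurements" refers to the average-measures form, often written ICC(2,k), computed from two-way ANOVA mean squares with patients as rows and flavours as columns. The function names and the toy matrix are hypothetical. Because the stated thresholds leave values exactly on a boundary unassigned, the sketch assigns them to the lower category, which is also an assumption.

```python
def two_way_mean_squares(x):
    """x: list of n rows (patients), each holding k values (flavours of one feature)."""
    n, k = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n * k)
    row_means = [sum(row) / k for row in x]
    col_means = [sum(x[i][j] for i in range(n)) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between flavours
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return msr, msc, mse, n, k


def icc_2k(x):
    """Two-way random effects, absolute agreement, average of k measurements."""
    msr, msc, mse, n, _ = two_way_mean_squares(x)
    return (msr - mse) / (msr + (msc - mse) / n)


def reliability(icc):
    """Map an ICC value to the four reliability levels used in the text.
    Assumption: a value exactly on a threshold falls into the lower level."""
    if icc > 0.90:
        return "excellent"
    if icc > 0.75:
        return "good"
    if icc > 0.50:
        return "moderate"
    return "poor"


# Hypothetical toy data: 6 patients (rows) x 4 flavours (columns) of one feature.
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
value = icc_2k(toy)
print(round(value, 2), reliability(value))  # 0.62 moderate
```

In practice this would be repeated for each of the 215 RFs, giving one ICC per feature across its flavours.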
Furthermore, considering the 95% confidence intervals of the ICC values, we categorize the features into seven groups, named by the reliability of the lower and upper bounds (lower bound-upper bound): i) poor-poor, ii) poor-moderate, iii) moderate-moderate, iv) moderate-good, v) good-good, vi) good-excellent, and vii) excellent-excellent. We then apply a threshold on these reliability categories to pre-select reliable features. Finally, we apply multiple hybrid machine learning systems (HMLSs) to the dataset of reliable and robust tensor radiomics features. Two routes are possible: i) convert the multiple flavours of each feature into a single feature via feature extraction algorithms (FEAs) such as PCA, pass the resulting features to feature selection algorithms (FSAs) to select the optimal ones, and apply the optimal combination of features to classifiers to predict survival outcome; or ii) select the best flavour of each feature via FSAs and then combine the selected flavours, reducing the dimension via FEAs, after which the dataset with optimal dimension is applied to classifiers to predict survival outcome.
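The confidence-interval grouping and threshold-based pre-selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the 95% confidence bounds are taken as already computed elsewhere, the feature names and bound values are made up for the example, and the choice of "good" as the minimum level for the lower bound is a hypothetical threshold, since the text does not fix one.

```python
LEVELS = ["poor", "moderate", "good", "excellent"]


def level(icc):
    """Reliability level of a single ICC value (same thresholds as in the text).
    Assumption: a value exactly on a threshold falls into the lower level."""
    if icc > 0.90:
        return "excellent"
    if icc > 0.75:
        return "good"
    if icc > 0.50:
        return "moderate"
    return "poor"


def ci_group(lower, upper):
    """Name a feature's group as '<level of lower bound>-<level of upper bound>'."""
    return f"{level(lower)}-{level(upper)}"


def preselect(ci_bounds, min_lower_level="good"):
    """Keep features whose lower confidence bound reaches the chosen level.
    ci_bounds maps a feature name to its (lower, upper) 95% CI bounds."""
    cutoff = LEVELS.index(min_lower_level)
    return [
        name
        for name, (lower, _upper) in ci_bounds.items()
        if LEVELS.index(level(lower)) >= cutoff
    ]


# Hypothetical confidence bounds for three illustrative features.
bounds = {
    "feature_a": (0.93, 0.97),
    "feature_b": (0.78, 0.92),
    "feature_c": (0.40, 0.60),
}
groups = {name: ci_group(lo, hi) for name, (lo, hi) in bounds.items()}
selected = preselect(bounds, min_lower_level="good")
print(groups)
print(selected)
```

Thresholding on the lower bound is the conservative choice, since a feature is kept only if even the pessimistic end of its interval is reliable. The flavours of the retained features would then feed either of the two routes (FEA followed by FSA, or FSA followed by FEA).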