The incidence of Type 2 Diabetes Mellitus (T2DM) has been increasing over the years. T2DM is a common lifestyle-related disease, and predicting its occurrence five years in advance could help patients change their lifestyle early and thereby prevent it. We investigate the feasibility of using radiomics features extracted from screening mammography images to predict the occurrence of T2DM, which could support prevention of the disease. The study used 110 positive samples (developed T2DM after five years) and 202 negative samples (did not develop T2DM after five years). The whole breast region was selected as the Region Of Interest (ROI) from which radiomics features were extracted. For every image, a mask was created using a modified threshold value derived from Otsu's binarization method to obtain a binary image of the breast. A total of 668 radiomics features were then extracted and analyzed with machine learning algorithms implemented in Python, namely Random Forest (RF), Gradient Boosting Classifier (GBC), and Light Gradient Boosting Machine (LGBM), which were chosen because they can give strong classification and prediction results. Five-fold cross-validation was carried out; the accuracy, sensitivity, specificity, and AUC were calculated for each algorithm, and hyperparameter tuning was applied to improve model performance. RF and GBC achieved good accuracy (approximately 70% or higher) but low sensitivity. LGBM reached an accuracy of almost 70%, with the highest sensitivity (43.9%) and a reasonable specificity (74.4%).
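
Because the classes are imbalanced (110 positive versus 202 negative samples), accuracy alone can look acceptable while sensitivity stays low. The following minimal sketch shows how accuracy, sensitivity, and specificity follow from the confusion-matrix counts, using only the class sizes reported above. The function name `binary_metrics` and the all-negative baseline are hypothetical illustrations, not part of the study's pipeline or its results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Hypothetical helper for illustration; 1 = developed T2DM, 0 = did not.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Class sizes as reported: 110 positive and 202 negative samples.
y_true = [1] * 110 + [0] * 202

# Assumed baseline (not a model from the study): label every case negative.
y_pred_all_negative = [0] * len(y_true)

baseline = binary_metrics(y_true, y_pred_all_negative)
print(baseline)
```

Under this assumed baseline the accuracy equals 202/312 (about 64.7%) while the sensitivity is zero, which is why sensitivity and specificity should be read alongside accuracy when comparing the classifiers.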