28 August 2023 Optimized modeling of anti-breast cancer drug candidates based on machine learning
Xinxin Hu, Jiayao Lv
Proceedings Volume 12724, Second International Conference on Biomedical and Intelligent Systems (IC-BIS 2023); 1272408 (2023) https://doi.org/10.1117/12.2687762
Event: Second International Conference on Biomedical and Intelligent Systems (IC-BIS 2023), 2023, Xiamen, China
Abstract
The research data are the ER antagonist records provided in Problem D of the Huawei Cup. We screened the molecular descriptor variables to construct a quantitative prediction model for the biological activity of the compounds against ER, together with classification prediction models for the Caco-2, CYP3A4, hERG, HOB, and MN properties of the compounds. We also determined the ranges of the molecular descriptors that lead to compounds with better bioactivity for ER inhibition and better ADMET properties, thereby providing data analysis and predictive models for breast cancer research. In the first step, we used a feature selection method to remove redundant variables and applied random forest to rank the variables in terms of relevance, selecting seven molecular descriptor variables: MDEC-23, maxHsOH, minHBa, minHsOH, minHBint4, C1SP2, and nHBAcc. In the second step, the pIC50 values in the training set were used as the dependent variable and the seven molecular descriptor variables as the independent variables, and extremely randomized trees were applied to construct a non-linear quantitative regression model relating the compounds to ER bioactivity. In the third step, we used different ensemble methods to construct classification prediction models for Caco-2, CYP3A4, hERG, HOB, and MN, respectively. In the fourth step, after eliminating the 1974 compounds with a sum of ADMET indicators less than 3, we inverse-solved the prediction model using the particle swarm optimization algorithm to obtain the maximum value of pIC50 and the optimal value of each molecular descriptor variable. The maximum pIC50 value is 8.9055, and the corresponding values of the molecular descriptors C1SP2, minHBa, minHBint4, minHsOH, maxHsOH, nHBAcc, and MDEC-23 are 0, -7.75978169, -3.42842314, 9.46164889, 0.51061438, 2.63674280, and 80, respectively.
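The inverse-solving idea in the fourth step can be illustrated with a minimal particle swarm optimization sketch in plain Python. This is not the authors' implementation: the objective `toy_pic50`, the three-dimensional search box `BOUNDS`, the peak location `TOY_OPTIMUM`, and all swarm parameters are hypothetical stand-ins chosen for illustration. In the paper's setting the objective would presumably be the fitted extremely randomized trees model evaluated on the seven descriptors.

```python
import random

def pso_maximize(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize objective(x) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Start particles uniformly inside the box, with zero velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]          # personal best scores
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the coordinate to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical stand-in for a trained pIC50 predictor: a smooth surface
# whose peak value of 8.0 sits at TOY_OPTIMUM inside BOUNDS.
TOY_OPTIMUM = [2.0, -1.0, 40.0]
BOUNDS = [(0.0, 5.0), (-8.0, 0.0), (0.0, 80.0)]

def toy_pic50(x):
    # Each deviation is scaled by its range so all dimensions weigh equally.
    return 8.0 - sum(((xi - ci) / (hi - lo)) ** 2
                     for xi, ci, (lo, hi) in zip(x, TOY_OPTIMUM, BOUNDS))

best_x, best_val = pso_maximize(toy_pic50, BOUNDS)
print("best descriptors:", best_x)
print("best predicted value:", best_val)
```

A tree-based predictor is piecewise constant, so on a real model the swarm would typically find a region of equally good descriptor values rather than a unique point. The box bounds would be taken from the descriptor ranges observed in the retained compounds.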
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xinxin Hu and Jiayao Lv "Optimized modeling of anti-breast cancer drug candidates based on machine learning", Proc. SPIE 12724, Second International Conference on Biomedical and Intelligent Systems (IC-BIS 2023), 1272408 (28 August 2023); https://doi.org/10.1117/12.2687762
KEYWORDS
Data modeling
Random forests
Machine learning
Breast cancer
Decision trees
Education and training
Cancer
