Quantifying input data drift in medical machine learning models by detecting change-points in time-series data
3 April 2024
Abstract
Devices enabled by artificial intelligence (AI) and machine learning (ML) are being introduced for clinical use at an accelerating pace. In a dynamic clinical environment, these devices may encounter conditions different from those they were developed for. The statistical mismatch between the data used for training and initial testing and the data seen in production is often referred to as data drift. Detecting and quantifying data drift is important for ensuring that an AI model performs as expected in clinical environments. A drift detector signals when the performance has changed and corrective action is needed. In this study, we investigate how a change in the performance of an AI model due to data drift can be detected and quantified using a cumulative sum (CUSUM) control chart. To study the properties of CUSUM, we first simulate different scenarios that change the performance of an AI model. We simulate a sudden change in the mean of the performance metric at a change-point (change day) in time. The task is to detect the change quickly while raising few false alarms before the change-point, which may be caused by the statistical variation of the performance metric over time. Subsequently, we simulate data drift by denoising images from the Emory Breast Imaging Dataset (EMBED) after a pre-defined change-point. We detect the change-point by studying the pre- and post-change specificity of a mammographic CAD algorithm. Our results indicate that, with an appropriate choice of parameters, CUSUM is able to quickly detect relatively small drifts with a small number of false-positive alarms.
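To make the detection task in the abstract concrete, the following is a minimal sketch of a two-sided tabular CUSUM applied to a daily performance metric with a sudden mean shift at a change day. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cusum_detect`, `simulate_metric`), the allowance `k`, the threshold `h`, the Gaussian noise model, and all numeric values are hypothetical, and the in-control mean and standard deviation are assumed to be known in advance.

```python
import random


def cusum_detect(values, mu0, sigma, k=0.5, h=4.0):
    """Return the index of the first CUSUM alarm, or None if no alarm is raised.

    mu0, sigma: assumed in-control mean and standard deviation of the metric.
    k: allowance (slack) in units of sigma; h: decision threshold in units of sigma.
    Both k and h are illustrative defaults, not values taken from the paper.
    """
    s_hi = 0.0  # accumulates evidence of an upward shift in the mean
    s_lo = 0.0  # accumulates evidence of a downward shift in the mean
    for t, x in enumerate(values):
        z = (x - mu0) / sigma  # standardized deviation from the in-control mean
        s_hi = max(0.0, s_hi + z - k)
        s_lo = max(0.0, s_lo - z - k)
        if s_hi > h or s_lo > h:
            return t
    return None


def simulate_metric(n_days, change_day, pre_mean, post_mean, sd, seed=0):
    """Simulate a daily metric whose mean shifts abruptly at change_day.

    Gaussian day-to-day variation is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [
        rng.gauss(pre_mean if day < change_day else post_mean, sd)
        for day in range(n_days)
    ]


if __name__ == "__main__":
    # Hypothetical scenario: daily specificity drops from 0.90 to 0.87 on day 60.
    series = simulate_metric(
        n_days=120, change_day=60, pre_mean=0.90, post_mean=0.87, sd=0.02
    )
    alarm_day = cusum_detect(series, mu0=0.90, sigma=0.02)
    print("first alarm day:", alarm_day)
```

In this sketch, an alarm before the change day would count as a false alarm, and the gap between the change day and the first alarm after it is the detection delay. Raising `h` trades fewer false alarms for a longer delay, which is the parameter choice the abstract refers to.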
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Smriti Prathapan, Ravi K. Samala, Nathan Hadjiyski, Pierre-François D’Haese, Fabien Maldonado, Phuong Nguyen, Yelena Yesha, and Berkman Sahiner "Quantifying input data drift in medical machine learning models by detecting change-points in time-series data", Proc. SPIE 12927, Medical Imaging 2024: Computer-Aided Diagnosis, 129270E (3 April 2024); https://doi.org/10.1117/12.3008771
KEYWORDS
Data modeling

Artificial intelligence

Performance modeling

Mammography

Computer simulations

Machine learning

Detection theory
