Age-related macular degeneration (AMD) is a significant health burden that can lead to irreversible vision loss in the elderly population. Accurate classification of optical coherence tomography (OCT) images is vital to computer-aided diagnosis (CAD) of AMD. Most CAD studies focus on improving classification results but overlook the fact that a classifier may predict the correct label for the wrong reasons, i.e., it outputs the correct label while attending to the wrong region. To address this limitation, we propose a human-in-the-loop OCT image classification scheme that allows users to provide feedback on model explanations during training. We integrate a custom loss function that combines expert annotations of the OCT images with the model's explanations, so the model learns both to classify the images and to align its explanations with the annotated ground truth. Our results indicate that the proposed method improves explanation accuracy over the baseline model by 85% while maintaining a classification accuracy above 95%.
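The custom loss described above pairs a classification term with an explanation-agreement term. The abstract does not give its exact form, so the sketch below is an illustrative assumption: cross-entropy for the label plus a Dice-style overlap penalty between a (hypothetical) normalized explanation heatmap and the expert's binary annotation mask, weighted by a coefficient `lam`.

```python
import numpy as np

def explanation_loss(heatmap, mask):
    """Penalty for model attention falling outside the expert-annotated region.

    Both inputs are 2D arrays; the score is 1 minus the histogram
    intersection of the two distributions, so identical maps give 0
    and disjoint maps give 1. (Illustrative choice, not the paper's.)
    """
    h = heatmap / (heatmap.sum() + 1e-8)   # normalize attention to sum to 1
    m = mask / (mask.sum() + 1e-8)         # normalize annotation mask
    overlap = np.minimum(h, m).sum()       # intersection in [0, 1]
    return 1.0 - overlap

def combined_loss(probs, label, heatmap, mask, lam=0.5):
    """Classification term plus weighted explanation term (sketch)."""
    ce = -np.log(probs[label] + 1e-8)      # cross-entropy on the true class
    return ce + lam * explanation_loss(heatmap, mask)
```

Training against such a loss drives the network to keep its attention inside the annotated region while still fitting the class labels, which is the mechanism the abstract's "learns both to classify the images and the explanations" refers to.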
Age-related macular degeneration is the leading cause of vision deterioration among older adults. The appropriate treatment for this retinal disease depends largely on the late-stage type (wet or geographic atrophy), so correct type classification is crucial to ensuring the best patient outcome. Previous studies have demonstrated high classification accuracy on medical images and used saliency maps to add explainability to opaque deep learning models. However, these explanations have revealed a tendency to base classification decisions on irrelevant information. Our proposed deep learning model allows domain experts to correct model behavior during training through direct annotation of regions of interest (ROIs) and integrates these annotations into the learning process, while matching the classification accuracy of non-interactive models on retinal optical coherence tomography (OCT) scans. Filters are applied regionally to the original OCT image based on the annotations and the Grad-CAM-highlighted regions. Four interactive classification methods are introduced and compared against a non-interactive CNN with the same overall architecture. Three of the four methods selectively filter image regions with weighted pairs of enhancement and blurring filters; the fourth uses ROI maps to focus the feature maps' attention on the expert-annotated region(s). All overlap scores measuring agreement between human annotations and model attention exceeded the non-interactive CNN baseline, with two of the interactive methods doubling the overlap score and a third tripling it.
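The paired enhancement/blurring filtering could be sketched as follows. The paper's exact filters and weights are not given in the abstract, so this is a minimal assumption: unsharp masking as the enhancement filter inside the annotated ROI and a mean blur everywhere else, with a hypothetical strength parameter `alpha`.

```python
import numpy as np

def box_blur(img, k=3):
    """Simple k-by-k mean filter via a padded sliding sum (illustrative)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def roi_filter(image, roi_mask, alpha=1.5):
    """Enhance inside the expert ROI, suppress detail outside it.

    Unsharp masking sharpens the annotated region, while the mean blur
    degrades everything else, steering the CNN toward the ROI.
    """
    blurred = box_blur(image)
    sharpened = image + alpha * (image - blurred)   # unsharp masking
    return np.where(roi_mask > 0, sharpened, blurred)
```

The effect is that the downstream CNN sees high-frequency detail only where the expert marked pathology, which biases both its prediction and its Grad-CAM attention toward the annotated region.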