Embedded processing architectures are often integrated into devices to enable novel functionality in cost-effective medical systems. To integrate neural networks into medical equipment, these models require specialized optimizations that prepare them for a high-efficiency, power-constrained environment. In this paper, we investigate the feasibility of quantized networks with limited memory for the detection of Barrett's neoplasia. An EfficientNet-Lite1 + DeepLabv3 architecture is proposed and trained with a quantization-aware training scheme to obtain an 8-bit integer-based model. The performance of the quantized model is comparable to that of float32-precision models. We show that the quantized model, with a memory footprint of only 5 MB, reaches the same 95% Area Under the Curve (AUC) as a full-precision U-Net architecture that is 10× larger. We have also optimized the segmentation head for efficiency and reduced the output to a resolution of 32×32 pixels. The results show that this resolution captures sufficient segmentation detail to reach a DICE score of 66.51%, which is comparable to the full floating-point model. The proposed lightweight approach is also energy-efficient, since it can be executed in real time on a 2-Watt Coral Edge TPU. The resulting low power consumption of the lightweight Barrett's esophagus neoplasia detection and segmentation system enables direct integration into standard endoscopic equipment.
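As a minimal illustration of the quantization-aware training scheme described above, the PyTorch sketch below fake-quantizes a small stand-in segmentation network during training and converts it to int8 afterwards. The model, layer sizes, and qconfig choice are assumptions for demonstration only, not the paper's exact EfficientNet-Lite1 + DeepLabv3 pipeline.

```python
# Hypothetical sketch of quantization-aware training (QAT) in PyTorch;
# TinySegNet is a small stand-in for the paper's segmentation model.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Stand-in for the EfficientNet-Lite1 + DeepLabv3 segmentation model."""
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # fp32 -> int8 entry
        self.dequant = torch.ao.quantization.DeQuantStub()  # int8 -> fp32 exit
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # single-channel segmentation logits
        )

    def forward(self, x):
        x = self.quant(x)
        x = self.body(x)
        return self.dequant(x)

model = TinySegNet().train()
# Fake-quantize weights and activations during training so the network
# learns to tolerate 8-bit rounding errors.
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)

# ... normal training loop on (image, mask) batches goes here ...

# After training, fold the observers into a genuine int8 model.
model.eval()
int8_model = torch.ao.quantization.convert(model)
```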
Computer-Aided Diagnosis (CADx) systems for the characterization of Narrow-Band Imaging (NBI) videos of suspected lesions in Barrett's Esophagus (BE) can assist endoscopists during endoscopic surveillance. The real clinical value of such CADx systems lies in real-time analysis of endoscopic videos inside the endoscopy suite, which places demands on robust decision making and insightful classifications that match clinical opinion. In this paper, we propose a lightweight int8-quantized neural network architecture, supplemented with an efficient stability function on the output, for real-time classification of NBI videos. The proposed int8 architecture has a low memory footprint (4.8 MB), enabling operation on a range of edge devices and even existing endoscopy equipment. Moreover, the stability function ensures robust inclusion of temporal information from the video to provide a continuously stable video classification. The algorithm is trained, validated, and tested with a total of 3,799 images and 284 videos of in total 598 patients, collected from 7 international centers. Several stability functions are evaluated, some of them clinically inspired by down-weighting low-confidence predictions. For the detection of early BE neoplasia, the proposed algorithm achieves 92.8% accuracy, 95.7% sensitivity, and 91.4% specificity, while only 5.6% of the videos remain without a final video classification. This work shows a robust, lightweight, and effective deep learning-based CADx system for accurate automated real-time endoscopic video analysis, suited for embedding in clinical endoscopy practice.
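The stability function itself is not specified in detail here, but a plausible minimal sketch (an assumption, not the paper's exact formulation) is a weighted aggregation over per-frame predictions that down-weights low-confidence frames and abstains when the evidence stays inconclusive:

```python
# Minimal sketch of a clinically inspired stability function (assumed form):
# per-frame predictions are pooled over the video, low-confidence frames are
# down-weighted, and the label is withheld when the margin is too small.
import numpy as np

def stable_video_label(frame_probs, conf_floor=0.6, decision_margin=0.2):
    """frame_probs: (T, 2) array of per-frame [non-neoplastic, neoplastic] probabilities."""
    probs = np.asarray(frame_probs)
    conf = probs.max(axis=1)
    # Drop frames whose top-class confidence is below the floor.
    weights = np.where(conf >= conf_floor, conf, 0.0)
    if weights.sum() == 0:
        return None  # no usable frames -> no video classification
    pooled = (probs * weights[:, None]).sum(axis=0) / weights.sum()
    # Require a clear margin between classes, otherwise abstain.
    if abs(pooled[1] - pooled[0]) < decision_margin:
        return None
    return int(pooled.argmax())

# Example: a short clip where most frames confidently suggest neoplasia.
clip = [[0.1, 0.9], [0.2, 0.8], [0.45, 0.55], [0.15, 0.85]]
print(stable_video_label(clip))  # -> 1 (neoplastic)
```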
The majority of the encouraging experimental results published on AI-based endoscopic Computer-Aided Detection (CAD) systems have not yet been reproduced in clinical settings, mainly due to the highly curated datasets used throughout the experimental phase of the research. In a realistic clinical environment, these high image-quality standards cannot be guaranteed, and CAD system performance may degrade. While several studies have presented impressive outcomes with Frame Informativeness Assessment (FIA) algorithms, the current state of the art implies sequential use of FIA and CAD systems, affecting the time performance of both algorithms. Since these algorithms are often trained on similar datasets, we hypothesise that part of the learned feature representations can be leveraged by both systems, enabling a more efficient implementation. This paper explores this case for early Barrett's cancer detection by integrating the FIA algorithm within the CAD system. Sharing the weights between the two tasks reduces the number of parameters from 16 to 11 million and the number of floating-point operations from 502 to 452 million. Due to the lower complexity of the architecture, the proposed model achieves inference times up to 2× faster than the state-of-the-art sequential implementation, while retaining the classification performance.
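A hedged sketch of the weight-sharing idea follows: a single backbone (ResNet18 is used here purely as a stand-in, since the paper's exact architecture is not reproduced) feeds both an FIA head and a CAD head, so the shared features are computed once per frame:

```python
# Illustrative multi-task model: one shared encoder, two lightweight heads.
# Layer sizes and the backbone choice are assumptions for demonstration.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SharedFIACAD(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        # Keep everything up to (and including) global pooling as the shared encoder.
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        self.fia_head = nn.Linear(512, 2)  # informative vs non-informative
        self.cad_head = nn.Linear(512, 2)  # neoplastic vs non-neoplastic

    def forward(self, x):
        feats = self.encoder(x).flatten(1)  # shared computation, done once
        return self.fia_head(feats), self.cad_head(feats)

model = SharedFIACAD().eval()
frame = torch.randn(1, 3, 224, 224)
fia_logits, cad_logits = model(frame)
# Downstream logic can skip the CAD decision when FIA flags the frame
# as non-informative, without running a second network.
```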
Gastroenterologists are estimated to misdiagnose up to 25% of esophageal adenocarcinomas in Barrett's Esophagus patients. This prompts the need for more sensitive and objective tools to aid clinicians with lesion detection. Artificial Intelligence (AI) can make examinations more objective and thereby help to mitigate observer dependency. Since these models are trained on good-quality endoscopic video frames to attain high efficacy, high-quality images are also needed at inference time. We therefore aim to develop a framework that distinguishes good image quality through a-priori informativeness classification, leading to high inference robustness. We show that informativeness can be maintained over the temporal domain using recurrent neural networks, yielding a higher performance on non-informativeness detection than classifying individual images. Furthermore, we find that Gradient-weighted Class Activation Mapping (Grad-CAM) can better localize informativeness within a frame. We have developed a customized ResNet18 feature extractor with three classifiers: a Fully-Connected (FC), a Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU) classifier. Experimental results are based on 4,349 frames from 20 pullback videos of the esophagus. Our results demonstrate that the algorithm achieves performance comparable to the current state of the art. The FC and LSTM classifiers both reach an F1 score of 91%. We found that the LSTM-based Grad-CAMs represent the origin of non-informativeness best, as 85% of the images were found to highlight the correct area.
The benefit of our novel implementation for endoscopic informativeness classification is that it is trained end-to-end, incorporates the spatiotemporal domain in the decision making for robustness, and makes the model's decisions insightful through the use of Grad-CAMs.
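To make the described design concrete, the sketch below (with assumed layer sizes and clip lengths) combines a ResNet18 feature extractor with an LSTM classifier that aggregates frame features over time, mirroring the spatiotemporal decision making described above:

```python
# Illustrative sketch of a ResNet18 + LSTM informativeness classifier;
# hidden size, clip length, and input resolution are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TemporalInformativeness(nn.Module):
    def __init__(self, hidden=256, num_classes=2):
        super().__init__()
        cnn = resnet18(weights=None)
        self.features = nn.Sequential(*list(cnn.children())[:-1])  # 512-d per frame
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, clips):                   # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        f = self.features(clips.flatten(0, 1))  # run the CNN on all frames at once
        f = f.flatten(1).view(b, t, 512)        # restore the (B, T, feature) layout
        out, _ = self.lstm(f)                   # accumulate temporal context
        return self.classifier(out[:, -1])      # label informed by the whole clip

model = TemporalInformativeness().eval()
logits = model(torch.randn(2, 8, 3, 224, 224))  # two clips of 8 frames each
```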