Embedded processing architectures are often integrated into devices to realize novel functions in a cost-effective medical system. To integrate neural networks into medical equipment, these models require specialized optimizations that prepare them for a high-efficiency, power-constrained environment. In this paper, we investigate the feasibility of quantized networks with limited memory for the detection of Barrett's neoplasia. An EfficientNet-Lite1+DeepLabV3 architecture is proposed, which is trained using a quantization-aware training scheme to obtain an 8-bit integer-based model. The performance of the quantized model is comparable with that of float32-precision models. We show that the quantized model, with only 5 MB of memory, reaches the same performance of 95% Area Under the Curve (AUC) as a full-precision U-Net architecture that is 10× larger. We have also optimized the segmentation head for efficiency and reduced the output to a resolution of 32×32 pixels. The results show that this resolution captures sufficient segmentation detail to reach a DICE score of 66.51%, which is comparable to the full floating-point model. The proposed lightweight approach also makes the model energy-efficient, since it can be executed in real time on a 2-Watt Coral Edge TPU. The resulting low power consumption of the lightweight Barrett's esophagus neoplasia detection and segmentation system enables direct integration into standard endoscopic equipment.
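The 8-bit integer conversion described above rests on affine quantization of weights and activations. As a minimal illustrative sketch (not the paper's actual pipeline; the scale and zero-point values below are hypothetical), the per-tensor quantize/dequantize mapping can be written as:

```python
def quantize(x, scale, zero_point):
    # Affine int8 quantization: q = round(x / scale) + zero_point,
    # clamped to the int8 range [-128, 127].
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    # Map the int8 value back to float: x ~ (q - zero_point) * scale.
    return (q - zero_point) * scale
```

During quantization-aware training, this round-trip (quantize then dequantize) is simulated in the forward pass so the network learns weights that survive the precision loss.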
Computer-Aided Diagnosis (CADx) systems for the characterization of Narrow-Band Imaging (NBI) videos of suspected lesions in Barrett's Esophagus (BE) can assist endoscopists during endoscopic surveillance. The real clinical value of such CADx systems lies in real-time analysis of endoscopic videos inside the endoscopy suite, which places demands on robust decision making and on insightful classifications that match clinical opinion. In this paper, we propose a lightweight int8-based quantized neural network architecture, supplemented with an efficient stability function on the output, for real-time classification of NBI videos. The proposed int8 architecture has a low memory footprint (4.8 MB), enabling operation on a range of edge devices and even existing endoscopy equipment. Moreover, the stability function ensures robust inclusion of temporal information from the video to provide a continuously stable video classification. The algorithm is trained, validated, and tested with a total of 3,799 images and 284 videos from 598 patients, collected from 7 international centers. We experiment with several stability functions, some of them clinically inspired, such as down-weighting low-confidence predictions. For the detection of early BE neoplasia, the proposed algorithm achieves a performance of 92.8% accuracy, 95.7% sensitivity, and 91.4% specificity, while only 5.6% of the videos remain without a final video classification. This work shows a robust, lightweight, and effective deep learning-based CADx system for accurate automated real-time endoscopic video analysis, suited for embedding in clinical endoscopy practice.
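One way to read the stability-function idea is as a confidence-gated aggregation of per-frame predictions: a video-level label is emitted only when recent frames agree decisively, and otherwise the classification is withheld. The sketch below is a hypothetical illustration of this principle (the function name, threshold, and window size are assumptions, not the paper's actual formulation):

```python
def stable_video_label(frame_probs, threshold=0.75, window=10):
    # Average the most recent per-frame neoplasia probabilities and only
    # commit to a video-level label when the mean confidence is decisive;
    # otherwise withhold the classification (return None).
    recent = frame_probs[-window:]
    mean = sum(recent) / len(recent)
    if mean >= threshold:
        return "neoplastic"
    if mean <= 1.0 - threshold:
        return "non-dysplastic"
    return None
```

Withholding a label on ambiguous sequences is what produces the reported 5.6% of videos without a final classification, traded against higher accuracy on the videos that do receive one.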
The most common type of cancer is the carcinoma: a cancer that originates from the epithelial tissue lining the outer surfaces of organs. To detect carcinomas at an early stage, techniques with small sampling volumes are required. Single Fiber Reflectance spectroscopy (SFR) is a promising technique for detecting early-stage carcinomas, since it has a measurement volume on the order of hundreds of microns. SFR uses a single fiber to emit and collect broadband light. The model from Kanick et al., which relates SFR reflectance to tissue optical properties, provided accurate results only for tissue with a modified Henyey-Greenstein phase function. However, in many tissues other types of phase functions have been measured. We have developed a new model that relates SFR reflectance to the scattering and absorption properties of tissue and provides accurate results for a large range of tissue phase functions. SFR measurements fall into the sub-diffuse regime. We therefore describe the SFR reflectance as a diffuse plus a semi-ballistic component. An accurate description of the diffuse SFR signal requires double integration of the spatially resolved reflectance over the fiber surface. Using approaches from Geometric Probability, we have derived the first analytic solution for the diffuse contribution to SFR. For the semi-ballistic contribution, we introduce a new phase-function-dependent parameter, psb. We will use the model to derive optical properties from SFR measurements performed endoscopically in patients with Barrett's esophagus. These patients are at increased risk of developing esophageal adenocarcinoma and therefore undergo regular endoscopic surveillance. When detected at an early stage, endoscopic treatment is possible, thereby avoiding extensive surgery.
Based on the concept of field cancerization, we investigated whether SFR could be used as a tool to identify which patients are developing esophageal adenocarcinoma.
Over the past few decades, primarily developed countries have witnessed an increased incidence of esophageal adenocarcinoma (EAC). Screening and surveillance of Barrett's esophagus (BE), a condition known to increase the probability of developing EAC, can significantly improve survival rates. This is because early-stage dysplasia in BE can be treated effectively, while each subsequent stage complicates successful treatment and seriously reduces survival rates. This study proposes a convolutional neural network-based algorithm that classifies images of BE visualized with White Light Endoscopy (WLE) as either dysplastic or non-dysplastic. To this end, we use only the pixels surrounding the dysplastic region, while excluding the pixels covering the dysplastic region itself. The phenomenon whereby the diagnosis of a patient can be determined from tissue other than the clearly observable diseased area is termed the field effect. With its potential to identify missed lesions, it may prove to be a helpful innovation in the screening and surveillance process of BE. A test for statistically significant differences between the distribution of the algorithm's classifications on unseen data and the distribution obtained by random classification indicates the presence of the field effect in WLE.
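The core of the field-effect setup is excluding the lesion pixels themselves before classification. A minimal sketch of that masking step on a nested-list "image" might look as follows (function name and fill value are hypothetical; the actual study presumably works on full-resolution WLE frames):

```python
def mask_lesion(image, lesion_mask, fill=0):
    # Replace every pixel inside the delineated lesion with a neutral fill
    # value, so the classifier only sees the surrounding, seemingly
    # normal-appearing mucosa.
    return [[fill if m else p for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, lesion_mask)]
```

Any signal the classifier then extracts must come from the tissue around the lesion, which is exactly what a positive field-effect finding requires.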
Routine surveillance endoscopies are currently used to detect dysplasia in patients with Barrett's Esophagus (BE). However, most of these procedures are performed by non-expert endoscopists in community hospitals, leading to many missed dysplastic lesions, which can progress into advanced esophageal adenocarcinoma if left untreated [1]. In recent years, several successful algorithms have been proposed for the detection of cancer in BE using high-quality overview images. This work addresses the first steps towards clinical application on endoscopic surveillance videos. Several challenges are identified that occur when moving from image-based to video-based analysis. (1) It is shown that algorithms trained on high-quality overview images do not naively transfer to endoscopic videos, due to, e.g., non-informative frames. (2) Video quality is shown to be an important factor in algorithm performance; specifically, temporal localization performance is highly correlated with video quality. (3) When moving to real-time algorithms, the additional compute necessary to address the challenges in videos becomes a burden on the computational budget. However, in addition to challenges, videos also bring new opportunities unavailable to current image-based methods, such as the inclusion of temporal information. This work shows that, once the above challenges are addressed, a multi-frame approach increases performance compared to a naive single-image method.
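A simple way to picture a multi-frame approach that also copes with non-informative frames is a quality-weighted aggregation of per-frame scores. The sketch below is purely illustrative (the function and the idea of using quality weights as aggregation weights are assumptions, not the paper's method):

```python
def video_score(frame_scores, quality_weights):
    # Quality-weighted mean of per-frame lesion scores: low-quality frames
    # (e.g. blurred or non-informative) contribute little or nothing.
    total = sum(quality_weights)
    if total == 0:
        return None  # no usable frames in this segment
    return sum(s * w for s, w in zip(frame_scores, quality_weights)) / total
```

This makes the dependence on video quality explicit: when most frames carry near-zero weight, the aggregate rests on very little evidence, matching the observed correlation between video quality and performance.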
Gastroenterologists are estimated to misdiagnose up to 25% of esophageal adenocarcinomas in Barrett's Esophagus patients. This prompts the need for more sensitive and objective tools to aid clinicians with lesion detection. Artificial Intelligence (AI) can make examinations more objective and will therefore help to mitigate observer dependency. Since these models are trained with good-quality endoscopic video frames to attain high efficacy, high-quality images are also needed for inference. Therefore, we aim to develop a framework that distinguishes good image quality through a-priori informativeness classification, leading to high inference robustness. We show that we can maintain informativeness over the temporal domain using recurrent neural networks, yielding a higher performance on non-informativeness detection than classifying individual images. Furthermore, we find that using Gradient-weighted Class Activation Maps (Grad-CAM) enables better localization of informativeness within a frame. We have developed a customized ResNet18 feature extractor with 3 classifiers: a Fully-Connected (FC), a Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU) classifier. Experimental results are based on 4,349 frames from 20 pullback videos of the esophagus. Our results demonstrate that the algorithm achieves performance comparable with the current state-of-the-art. The FC and LSTM classifiers each reach an F1 score of 91%. We found that the Grad-CAMs based on the LSTM classifier best represent the origin of non-informativeness, as 85% of the images highlight the correct area.
The benefit of our novel implementation for endoscopic informativeness classification is that it is trained end-to-end, incorporates the spatiotemporal domain in the decision making for robustness, and makes the model's decisions insightful through the use of Grad-CAMs.
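The recurrent classifiers above carry informativeness evidence across frames through a gated hidden state. As a toy illustration of the gating mechanism (a scalar GRU cell with hypothetical hand-set weights, not the trained model), one update step is:

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x, wz, uz, wr, ur, wh, uh):
    # One scalar GRU step: the update gate z and reset gate r decide how
    # much of the previous hidden state h to keep versus the new candidate
    # state computed from the current frame feature x.
    z = _sigmoid(wz * x + uz * h)
    r = _sigmoid(wr * x + ur * h)
    h_cand = math.tanh(wh * x + uh * (r * h))
    return (1.0 - z) * h + z * h_cand
```

It is this gated carry-over of state that lets a sequence model smooth out single noisy frames instead of judging each frame in isolation.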
Volumetric Laser Endomicroscopy (VLE) is a promising balloon-based imaging technique for detecting early neoplasia in Barrett's Esophagus. In particular, Computer-Aided Detection (CAD) techniques show great promise compared to medical doctors, who cannot reliably find disease patterns in the noisy VLE signal. However, an essential pre-processing step for the CAD system is tissue segmentation. At present, tissue is segmented manually, which does not scale to the entire VLE scan consisting of 1,200 frames of 4,096 × 2,048 pixels. Furthermore, the current CAD methods cannot use the VLE scans to their full potential, as only a small segment of the esophagus is selected for further processing, while an automated segmentation system yields significantly more available data. This paper explores the possibility of automatically segmenting relevant tissue in VLE scans using FusionNet and a domain-specific loss function. The contribution of this work is threefold. First, we propose a tissue segmentation algorithm for VLE scans. Second, we introduce a weighted ground truth that exploits the signal-to-noise-ratio characteristics of the VLE data. Third, we compare our algorithm's segmentations against those of two additional VLE experts. The results show that our algorithm's annotations are indistinguishable from the expert annotations, so the algorithm can be used as a preprocessing step for further classification of the tissue.
Volumetric laser endomicroscopy (VLE) is an advanced imaging system offering a promising solution for the detection of early Barrett’s esophagus (BE) neoplasia. BE is a known precursor lesion for esophageal adenocarcinoma and is often missed during regular endoscopic surveillance of BE patients. VLE provides a circumferential scan of near-microscopic resolution of the esophageal wall up to 3-mm depth, yielding a large amount of data that is hard to interpret in real time. In a preliminary study on an automated analysis system for ex vivo VLE scans, novel quantitative image features were developed for two previously identified clinical VLE features predictive for BE neoplasia, showing promising results. This paper proposes a novel quantitative image feature for a missing third clinical VLE feature. The novel gland-based image feature called “gland statistics” (GS), is compared to several generic image analysis features and the most promising clinically-inspired feature “layer histogram” (LH). All features are evaluated on a clinical, validated data set consisting of 88 non-dysplastic BE and 34 neoplastic in vivo VLE images for eight different widely-used machine learning methods. The new clinically-inspired feature has on average superior classification accuracy (0.84 AUC) compared to the generic image analysis features (0.61 AUC), as well as comparable performance to the LH feature (0.86 AUC). Also, the LH feature achieves superior classification accuracy compared to the generic image analysis features in vivo, confirming previous ex vivo results. Combining the LH and the novel GS features provides even further improvement of the performance (0.88 AUC), showing great promise for the clinical utility of this algorithm to detect early BE neoplasia.
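The AUC values used to compare the GS, LH, and generic features have a simple rank-based interpretation: the probability that a randomly chosen neoplastic image scores higher than a randomly chosen non-dysplastic one. A minimal sketch of that computation (illustrative only; the study itself evaluates eight machine-learning methods on the clinical data set):

```python
def empirical_auc(pos_scores, neg_scores):
    # Empirical AUC: the fraction of (positive, negative) pairs in which
    # the positive sample receives the higher score (ties count as 0.5).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))
```

Reading 0.84 vs. 0.61 AUC through this lens: the clinically inspired feature ranks a neoplastic image above a non-dysplastic one in 84% of pairs, against 61% for the generic features.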