The majority of the encouraging experimental results published on AI-based endoscopic Computer-Aided Detection (CAD) systems have not yet been reproduced in clinical settings, mainly due to the highly curated datasets used throughout the experimental phase of the research. In a realistic clinical environment, the necessary high image-quality standards cannot be guaranteed, and CAD system performance may degrade. While several studies have previously presented impressive outcomes with Frame Informativeness Assessment (FIA) algorithms, the current state of the art implies sequential use of FIA and CAD systems, affecting the time performance of both algorithms. Since these algorithms are often trained on similar datasets, we hypothesise that part of the learned feature representations can be leveraged for both systems, enabling a more efficient implementation. This paper explores this case for early Barrett's cancer detection by integrating the FIA algorithm within the CAD system. Sharing the weights between the two tasks reduces the number of parameters from 16 to 11 million and the number of floating-point operations from 502 to 452 million. Due to the lower complexity of the architecture, the proposed model achieves inference times up to 2 times faster than the state-of-the-art sequential implementation while retaining the classification performance.
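To make the weight-sharing idea concrete, the following is a minimal PyTorch sketch of a single backbone feeding two task-specific heads, one for frame informativeness and one for lesion classification. The backbone choice (ResNet18), head sizes, and class counts are illustrative assumptions, not the exact architecture reported above.

```python
# Minimal sketch of sharing one convolutional backbone between frame
# informativeness assessment (FIA) and lesion classification (CAD).
# Backbone, head sizes and class counts are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

class SharedFiaCadModel(nn.Module):
    def __init__(self, num_cad_classes: int = 2):
        super().__init__()
        resnet = models.resnet18(weights=None)
        # Shared feature extractor: all layers up to global average pooling.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        feat_dim = resnet.fc.in_features
        # Lightweight task-specific heads reuse the same features.
        self.fia_head = nn.Linear(feat_dim, 2)                 # informative vs. non-informative
        self.cad_head = nn.Linear(feat_dim, num_cad_classes)   # e.g. neoplastic vs. non-dysplastic

    def forward(self, x):
        feats = self.backbone(x).flatten(1)
        return self.fia_head(feats), self.cad_head(feats)

model = SharedFiaCadModel()
fia_logits, cad_logits = model(torch.randn(1, 3, 224, 224))
# At inference, frames scored as non-informative by fia_logits can skip
# any further, more expensive post-processing of cad_logits.
```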
Computer-aided detection (CAD) approaches have shown promising results for early esophageal cancer detection using Volumetric Laser Endomicroscopy (VLE) imagery. However, the relatively slow and computationally costly tissue segmentation employed in these approaches hampers their clinical applicability. In this paper, we propose to reframe the 2D tissue segmentation problem as a 1D tissue boundary detection problem. Instead of using an encoder-decoder architecture, we propose to follow the tissue boundary using a Recurrent Neural Network (RNN), exploiting the spatio-temporal relations within VLE frames. We demonstrate near state-of-the-art performance using 18 times fewer floating-point operations, enabling real-time execution in clinical practice.
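A minimal sketch of this reformulation, assuming each image column (A-line) of the VLE frame is treated as one timestep of a recurrent network that regresses the depth of the tissue boundary; the layer types and sizes are illustrative, not the exact configuration used in the paper.

```python
# Minimal sketch of 1D tissue-boundary detection: columns of the frame are
# fed to a GRU as a sequence, and one boundary depth is predicted per column.
# Layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class BoundaryRNN(nn.Module):
    def __init__(self, column_height: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Linear(column_height, hidden)    # per-column (A-line) embedding
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                 # boundary depth per column

    def forward(self, frame):                 # frame: (batch, height, width)
        cols = frame.permute(0, 2, 1)         # columns become timesteps: (batch, width, height)
        h, _ = self.rnn(torch.relu(self.embed(cols)))
        return self.head(h).squeeze(-1)       # (batch, width): one boundary position per A-line

model = BoundaryRNN(column_height=256)
depths = model(torch.randn(1, 256, 512))      # downsampled VLE frame, one depth per column
```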
Barrett's Esophagus (BE) is a precursor of esophageal adenocarcinoma, one of the most lethal forms of cancer. Volumetric Laser Endomicroscopy (VLE) is a relatively new technology used for early detection of abnormal cells in BE by imaging the inner tissue layers of the esophagus. Computer-Aided Detection (CAD) shows great promise in analyzing VLE frames thanks to advances in deep learning. However, a full VLE scan produces 1,200 frames of 4,096 × 2,048 pixels, making automated pre-processing to extract the tissue of interest necessary. This paper explores an object-detection approach for tissue detection in VLE scans. We show that this can be achieved in real time with very low inference time, using single-stage object detectors such as YOLO. Our best-performing model achieves a mean average precision of 98.23% for bounding boxes correctly predicting the tissue of interest. Additionally, we have found that the tiny YOLO with Partial Residual Networks architecture further reduces the inference time by a factor of 10, while sacrificing less than 1% of accuracy. The proposed method not only segments the tissue of interest in real time without any latency, but also achieves this efficiently using limited GPU resources, rendering it attractive for embedded applications. Our paper is the first to introduce object detection as a new approach for VLE-data tissue segmentation and paves the way for real-time VLE-based detection of early cancer in BE.
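The following sketch illustrates, under assumed box formats and thresholds, how a single-stage detector's output could be used downstream: the predicted box is scored against a ground-truth box with IoU (the basis of the reported mean average precision) and the tissue of interest is cropped for further CAD processing. It is not the detection model itself.

```python
# Minimal sketch of post-processing a predicted bounding box from a
# single-stage detector: crop the tissue of interest and score the
# prediction against ground truth with IoU. Box format (x1, y1, x2, y2)
# and threshold are assumptions for illustration.
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def crop_tissue(frame: np.ndarray, box) -> np.ndarray:
    """Extract the detected tissue region for downstream CAD analysis."""
    x1, y1, x2, y2 = map(int, box)
    return frame[y1:y2, x1:x2]

frame = np.zeros((2048, 4096), dtype=np.uint16)           # one VLE frame
pred, gt = (100, 300, 3900, 1200), (120, 280, 3880, 1250)
tissue = crop_tissue(frame, pred)
print(f"IoU = {iou(pred, gt):.2f}")                        # counted as correct above e.g. 0.5
```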
Over the past few decades, primarily developed countries have witnessed an increased incidence of esophageal adenocarcinoma (EAC). Screening and surveillance of Barrett's esophagus (BE), which is known to augment the probability of developing EAC, can significantly improve survival rates. This is because early-stage dysplasia in BE can be treated effectively, while each subsequent stage complicates successful treatment and seriously reduces survival rates. This study proposes a convolutional neural network-based algorithm that classifies images of BE visualized with White Light Endoscopy (WLE) as either dysplastic or non-dysplastic. To this end, we use only the pixels surrounding the dysplastic region, while excluding the pixels covering the dysplastic region itself. The phenomenon whereby the diagnosis of a patient can be determined from tissue other than the clearly observable diseased area is termed the field effect. With its potential to identify missed lesions, it may prove to be a helpful innovation in the screening and surveillance process of BE. A statistical significance test indicates the presence of the field effect in WLE when comparing the distribution of the algorithm's classifications of unseen data with the distribution obtained by a random classification.
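A minimal sketch of the data preparation implied by this field-effect experiment, assuming an expert-delineated lesion mask: pixels covering the dysplastic region are removed before classification so that only the surrounding tissue is presented to the network. The mask handling and fill value are illustrative assumptions.

```python
# Minimal sketch of excluding the dysplastic region itself: the annotated
# lesion pixels are overwritten so the classifier only sees surrounding
# tissue. Fill value and mask format are assumptions for illustration.
import numpy as np

def mask_lesion(image: np.ndarray, lesion_mask: np.ndarray, fill: int = 0) -> np.ndarray:
    """Return a copy of the WLE image with the dysplastic region removed."""
    masked = image.copy()
    masked[lesion_mask.astype(bool)] = fill   # classifier never sees the lesion itself
    return masked

image = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
lesion_mask = np.zeros((512, 512), dtype=np.uint8)
lesion_mask[200:300, 200:300] = 1             # expert-delineated dysplastic area
surroundings_only = mask_lesion(image, lesion_mask)
```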
Routine surveillance endoscopies are currently used to detect dysplasia in patients with Barrett's Esophagus (BE). However, most of these procedures are performed by non-expert endoscopists in community hospitals, leading to many missed dysplastic lesions, which can progress into advanced esophageal adenocarcinoma if left untreated [1]. In recent years, several successful algorithms have been proposed for the detection of cancer in BE using high-quality overview images. This work addresses the first steps towards clinical application on endoscopic surveillance videos. Several challenges are identified that occur when moving from image-based to video-based analysis. (1) It is shown that algorithms trained on high-quality overview images do not naively transfer to endoscopic videos due to, e.g., non-informative frames. (2) Video quality is shown to be an important factor in algorithm performance. Specifically, temporal localization performance is highly correlated with video quality. (3) When moving to real-time algorithms, the additional compute necessary to address the challenges in videos will become a burden on the computational budget. However, in addition to challenges, videos also bring new opportunities not available in current image-based methods, such as the inclusion of temporal information. This work shows that a multi-frame approach increases performance compared to a naive single-image method when the above challenges are addressed.
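One simple way to exploit temporal information, sketched below under an assumed window size and threshold, is to smooth per-frame classifier scores over neighbouring frames before thresholding. This is only an illustration of a multi-frame aggregation step, not necessarily the method used in this work.

```python
# Minimal sketch of multi-frame aggregation: per-frame lesion scores from an
# image-based classifier are averaged over a temporal window before
# thresholding, so isolated noisy frames do not dominate the video-level
# output. Window size and threshold are illustrative assumptions.
import numpy as np

def temporal_smooth(frame_scores: np.ndarray, window: int = 5) -> np.ndarray:
    """Moving average of per-frame classifier scores along the video."""
    kernel = np.ones(window) / window
    return np.convolve(frame_scores, kernel, mode="same")

scores = np.array([0.1, 0.2, 0.9, 0.15, 0.8, 0.85, 0.9, 0.2])  # raw per-frame scores
smoothed = temporal_smooth(scores)
detections = smoothed > 0.5                                     # per-frame decision after smoothing
```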
Gastroenterologists are estimated to misdiagnose up to 25% of esophageal adenocarcinomas in Barrett's Esophagus patients. This prompts the need for more sensitive and objective tools to aid clinicians with lesion detection. Artificial Intelligence (AI) can make examinations more objective and will therefore help to mitigate observer dependency. Since these models are trained with good-quality endoscopic video frames to attain high efficacy, high-quality images are also needed at inference. Therefore, we aim to develop a framework that is able to distinguish good image quality by a priori informativeness classification, which leads to high inference robustness. We show that we can maintain informativeness over the temporal domain using recurrent neural networks, yielding a higher performance on non-informativeness detection compared to classifying individual images. Furthermore, we also find that by using Gradient-weighted Class Activation Mapping (Grad-CAM), we can better localize informativeness within a frame. We have developed a customized ResNet18 feature extractor with three classifiers: a Fully-Connected (FC), a Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU) classifier. Experimental results are based on 4,349 frames from 20 pullback videos of the esophagus. Our results demonstrate that the algorithm achieves performance comparable to the current state-of-the-art. The FC and LSTM classifiers both reach an F1 score of 91%. We found that the Grad-CAMs based on the LSTM classifier best represent the origin of non-informativeness, as 85% of the images were found to highlight the correct area.
The benefit of our novel implementation for endoscopic informativeness classification is that it is trained end-to-end, incorporates the spatiotemporal domain in the decision making for robustness, and makes the decisions of the model insightful with the use of Grad-CAMs.
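A minimal PyTorch sketch of the described design, assuming illustrative feature and hidden sizes: a ResNet18 feature extractor produces per-frame features that an LSTM classifier aggregates over a temporal window; Grad-CAM would then be computed on the convolutional layers of the backbone.

```python
# Minimal sketch of a ResNet18 feature extractor followed by an LSTM
# classifier for temporal informativeness classification. Hidden size,
# window length and class count are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

class InformativenessLSTM(nn.Module):
    def __init__(self, hidden: int = 256, num_classes: int = 2):
        super().__init__()
        resnet = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(resnet.children())[:-1])  # CNN backbone
        self.lstm = nn.LSTM(resnet.fc.in_features, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, clip):                       # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.features(clip.flatten(0, 1)).flatten(1).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.classifier(out[:, -1])         # decision for the last frame in the window

model = InformativenessLSTM()
logits = model(torch.randn(1, 8, 3, 224, 224))     # 8-frame temporal window
```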
Symmetric design of encoder-decoder networks is common in deep learning. For almost all segmentation problems, the output segmentation is vastly less complex than the input image. However, the effect of decoder size on segmentation performance has not been investigated in the literature. This work investigates the effect of reducing the decoder size on binary segmentation performance in a medical imaging application. To this end, we propose a methodology to reduce the size of the decoder in encoder-decoder networks, where residual skip connections are employed in combination with a 1x1 convolution instead of concatenations (as employed by U-Net) to achieve models with an asymmetric design. The results on the ISIC2017 dataset show that the number of trainable parameters in the decoder can be reduced by up to a factor of 100 compared to standard U-Net, while retaining segmentation performance. Additionally, the reduced number of trainable decoder parameters in the proposed models leads to inference times up to 3 times faster than standard U-Net.
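A minimal sketch of the proposed skip connection, with illustrative channel counts: encoder features are projected by a 1x1 convolution and added residually to the upsampled decoder features, instead of being concatenated as in U-Net, which lets the decoder operate with far fewer channels.

```python
# Minimal sketch of one slim decoder block: a 1x1 projection of the encoder
# skip features is added residually instead of concatenated, so the decoder
# channel count stays small. Channel sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SlimDecoderBlock(nn.Module):
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.skip_proj = nn.Conv2d(skip_ch, out_ch, kernel_size=1)  # 1x1 projection instead of concat
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x) + self.skip_proj(skip)      # residual addition keeps channel count low
        return self.conv(x)

block = SlimDecoderBlock(in_ch=64, skip_ch=256, out_ch=16)
out = block(torch.randn(1, 64, 32, 32), torch.randn(1, 256, 64, 64))  # -> (1, 16, 64, 64)
```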
Volumetric Laser Endomicroscopy (VLE) is a promising balloon-based imaging technique for detecting early neoplasia in Barrett's Esophagus. In particular, Computer-Aided Detection (CAD) techniques show great promise compared to medical doctors, who cannot reliably find disease patterns in the noisy VLE signal. However, an essential pre-processing step for the CAD system is tissue segmentation. At present, tissue is segmented manually, but this is not scalable to the entire VLE scan, which consists of 1,200 frames of 4,096 × 2,048 pixels. Furthermore, the current CAD methods cannot use the VLE scans to their full potential, as only a small segment of the esophagus is selected for further processing, while an automated segmentation system results in significantly more available data. This paper explores the possibility of automatically segmenting relevant tissue in VLE scans using FusionNet and a domain-specific loss function. The contribution of this work is threefold. First, we propose a tissue segmentation algorithm for VLE scans. Second, we introduce a weighted ground truth that exploits the signal-to-noise-ratio characteristics of the VLE data. Third, we compare our algorithm's segmentation against two additional VLE experts. The results show that our algorithm's annotations are indistinguishable from the expert annotations, and therefore the algorithm can be used as a pre-processing step for further classification of the tissue.
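A minimal sketch of a weighted ground truth for segmentation, assuming a simple depth-based weight map: each pixel's contribution to the loss is scaled by a confidence weight derived from the VLE signal quality, so low-SNR regions count less. The specific weighting scheme here is an assumption for illustration, not the exact one used in the paper.

```python
# Minimal sketch of a pixel-weighted segmentation loss: low-SNR pixels
# (e.g. deeper in the tissue) contribute less. The depth-based weight map
# is an illustrative assumption.
import torch
import torch.nn.functional as F

def weighted_segmentation_loss(logits, target, pixel_weights):
    """Binary cross-entropy where each pixel is scaled by a confidence weight."""
    loss = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    return (loss * pixel_weights).mean()

logits = torch.randn(1, 1, 128, 256)
target = torch.randint(0, 2, (1, 1, 128, 256)).float()
# Example weight map: deeper pixels (lower SNR in VLE) weighted less.
depth_weights = torch.linspace(1.0, 0.2, 128).view(1, 1, 128, 1).expand_as(target)
loss = weighted_segmentation_loss(logits, target, depth_weights)
```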
In current clinical practice, the resectability of pancreatic ductal adenocarcinoma (PDA) is determined subjectively by a physician, which is an error-prone procedure. In this paper, we present a method for automated determination of the resectability of PDA from a routine abdominal CT, to reduce such decision errors. The tumor features are extracted from a group of patients with both hypo- and iso-attenuating tumors, of which 29 were resectable and 21 were not. The tumor contours are supplied by a medical expert. We present an approach that uses intensity, shape, and texture features to determine tumor resectability. The best classification results are obtained with a fine Gaussian SVM and the L0 feature selection algorithm. Compared to expert predictions made on the same dataset, our method achieves better classification results. We obtain significantly better results on correctly predicting non-resectability (+17%) compared to an expert, which is essential for patient treatment (negative predictive value). Moreover, our predictions of resectability exceed expert predictions by approximately 3% (positive predictive value).
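A minimal scikit-learn sketch of such a classification pipeline, with synthetic data: handcrafted intensity, shape, and texture features are fed to an RBF (Gaussian) SVM. Scikit-learn has no built-in L0 feature selection, so a univariate selector stands in for that step; all feature counts and parameters are illustrative assumptions.

```python
# Minimal sketch of a feature-based resectability classifier: standardize
# handcrafted features, select a subset, and fit an RBF-kernel SVM.
# SelectKBest is only a stand-in for the L0 feature selection step.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X = np.random.rand(50, 40)          # 50 tumors, 40 intensity/shape/texture features (synthetic)
y = np.random.randint(0, 2, 50)     # 1 = resectable, 0 = non-resectable (synthetic labels)

pipeline = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),   # placeholder for the L0 feature selection step
    SVC(kernel="rbf"),              # Gaussian-kernel SVM
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f}")
```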