Video bronchoscopy is routinely performed for biopsies of lung tissue suspected of cancer, for monitoring of COPD patients, and for clarification of acute respiratory problems in intensive care units. Navigating complex bronchial trees is particularly challenging and physically demanding, requiring extensive physician experience. This paper addresses the automatic segmentation of bronchial orifices in bronchoscopy videos. Deep learning-based approaches to this task are currently hampered by the lack of readily available ground truth segmentation data. We therefore present a data-driven pipeline consisting of k-means clustering followed by a compact marker-based watershed algorithm, which generates airway instance segmentation maps from given depth images. In this way, these traditional algorithms serve as weak supervision for training a shallow CNN directly on RGB images, based solely on a phantom dataset. We evaluate the generalization capabilities of this model on two in-vivo datasets covering 250 frames from 21 different bronchoscopies. We demonstrate that the model transfers its knowledge to the unseen in-vivo domain, reaching an average error of 4.35 vs. 7.98 pixels for the detected centroids of airway segmentations at an image resolution of 128 × 128. Our quantitative and qualitative results indicate that, in the context of video bronchoscopy, phantom data and weak supervision using non-learning-based approaches make it possible to gain a semantic understanding of airway structures.
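A minimal sketch of the kind of non-learning pipeline described above, assuming a depth image in which bronchial orifices appear as locally deep basins; the function name, cluster count, and erosion-based marker generation are illustrative choices, not the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def airway_instances(depth, n_clusters=3):
    """depth: (H, W) float array, larger values = deeper (toward orifices)."""
    h, w = depth.shape
    # 1) k-means on depth values separates deep orifice regions from walls.
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
        depth.reshape(-1, 1)).reshape(h, w)
    deep_cluster = np.argmax([depth[labels == c].mean()
                              for c in range(n_clusters)])
    orifice_mask = labels == deep_cluster
    # 2) Compact markers: erode each connected component to its core.
    markers, _ = ndi.label(ndi.binary_erosion(orifice_mask, iterations=3))
    # 3) Marker-based watershed on inverted depth floods each orifice basin.
    instances = watershed(-depth, markers, mask=orifice_mask)
    return instances  # (H, W) int map, one label per detected airway
```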
AI guidance for compression ultrasound is difficult for 2D segmentation networks, which produce inconsistent labels across frames. Registration can help, but classical untrained approaches cannot handle the large deformations and noisy background motion, and convolutional models such as VoxelMorph do not reach the required robust accuracy. Meanwhile, large deformations are typically estimated with multi-warp networks built around "correlation layers", but these are resource-intensive and not easily deployable on end devices in a clinical context. We propose to replace the "correlation layer" with a differentiable convex optimisation block and to train the convolutional feature backbone end-to-end for improved performance.
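The paper's differentiable convex optimisation block is not spelled out here, so the following is only a hedged stand-in illustrating the interface it replaces: a cost volume over candidate displacements, made differentiable with a soft-argmin so the feature backbone can be trained end-to-end. Names, the temperature, and sign conventions are illustrative.

```python
import torch
import torch.nn.functional as F

def soft_argmin_displacement(feat_fixed, feat_moving, radius=4, temp=10.0):
    """feat_*: (B, C, H, W) features from a shared convolutional backbone."""
    b, c, h, w = feat_fixed.shape
    disps = range(-radius, radius + 1)
    costs, offsets = [], []
    for dy in disps:
        for dx in disps:
            # candidate displacement via shifting (border wrap ignored here)
            shifted = torch.roll(feat_moving, shifts=(dy, dx), dims=(2, 3))
            costs.append(((feat_fixed - shifted) ** 2).sum(1))  # (B, H, W)
            offsets.append((dx, dy))
    cost = torch.stack(costs, dim=1)                      # (B, D, H, W)
    weights = F.softmax(-temp * cost, dim=1)              # differentiable argmin
    offsets = torch.tensor(offsets, dtype=torch.float32)  # (D, 2)
    flow = torch.einsum('bdhw,dk->bkhw', weights, offsets)
    return flow  # (B, 2, H, W) dense displacement field, channels (dx, dy)
```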
Purpose: Image registration is the process of aligning images and is a fundamental task in medical image analysis. While many image analysis tasks, such as image segmentation, are now handled almost entirely with deep learning and exceed the accuracy of conventional algorithms, currently available deformable image registration methods often remain conventional. Deep learning methods for medical image registration have recently reached the accuracy of conventional algorithms, but they are often based on a weakly supervised learning scheme that uses multilabel image segmentations during training, and the creation of such detailed annotations is very time-consuming.
Approach: We propose a weakly supervised learning scheme for deformable image registration. By calculating the loss function based only on bounding box labels (a code sketch follows this abstract), we are able to train an image registration network for large-displacement deformations without using densely labeled images. We evaluate our model on interpatient three-dimensional abdominal CT and MRI scans.
Results: The results show an improvement of ∼10% (for CT images) and 20% (for MRI images) in comparison to the unsupervised method. When taking into account the reduced annotation effort, the performance also exceeds that of weakly supervised training using detailed image segmentations.
Conclusion: We show that the performance of image registration methods can be enhanced with little annotation effort using our proposed method.
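A hedged PyTorch sketch of the bounding-box weak supervision idea from the Approach section: only box masks are compared, via a soft Dice loss, after warping with the predicted displacement field. The warp conventions, names, and the Dice formulation are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """img: (1, C, D, H, W); flow: (1, 3, D, H, W) displacements in
    normalized [-1, 1] coordinates, channel order (x, y, z)."""
    _, _, d, h, w = img.shape
    zz, yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, d), torch.linspace(-1, 1, h),
        torch.linspace(-1, 1, w), indexing='ij')
    grid = torch.stack((xx, yy, zz), dim=-1)[None]        # (1, D, H, W, 3)
    return F.grid_sample(img, grid + flow.permute(0, 2, 3, 4, 1),
                         align_corners=True)

def box_to_mask(box, shape):
    """box: (z1, y1, x1, z2, y2, x2); shape: (D, H, W) -> binary mask."""
    m = torch.zeros(shape)
    z1, y1, x1, z2, y2, x2 = box
    m[z1:z2, y1:y2, x1:x2] = 1.0
    return m

def weak_box_loss(flow, moving_boxes, fixed_boxes, shape):
    """flow: (1, 3, D, H, W) field predicted by the registration network."""
    loss = 0.0
    for mb, fb in zip(moving_boxes, fixed_boxes):
        m = box_to_mask(mb, shape)[None, None]   # (1, 1, D, H, W)
        f = box_to_mask(fb, shape)[None, None]
        warped = warp(m, flow)                   # warped moving box mask
        inter = (warped * f).sum()
        dice = 2 * inter / (warped.sum() + f.sum() + 1e-6)
        loss = loss + (1 - dice)                 # soft Dice on box masks only
    return loss / len(moving_boxes)
```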
We demonstrate a new approach for blind domain adaptation by employing classic feature descriptors as a first step in a deep learning pipeline. One advantage of our approach over other domain adaptation methods is that no target domain data are required. Therefore, the trained models perform well on a multitude of different datasets as opposed to one specific target dataset. We test our approach on the task of abdominal CT and MR organ segmentation and transfer the models from the training dataset to multiple other CT and MR datasets. We show that modality independent neighborhood descriptors applied prior to a DeepLab segmentation pipeline can yield high accuracies when the model is applied on other datasets including those with a different imaging modality.
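A simplified 2D sketch of the idea behind modality independent neighborhood descriptors (MIND) computed prior to segmentation; the published descriptor uses patch distances over a larger neighborhood and a principled noise estimate, so this minimal version only conveys the intensity-to-descriptor conversion that makes CT and MR inputs look alike to the network.

```python
import torch
import torch.nn.functional as F

def mind_2d(img, patch=3, eps=1e-6):
    """img: (B, 1, H, W) -> (B, 4, H, W) descriptor, one channel per neighbor."""
    pad = patch // 2
    kernel = torch.ones(1, 1, patch, patch) / patch ** 2
    feats = []
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        diff2 = (img - torch.roll(img, (dy, dx), dims=(2, 3))) ** 2
        ssd = F.conv2d(diff2, kernel, padding=pad)  # patch-averaged distance
        feats.append(ssd)
    d = torch.cat(feats, dim=1)
    v = d.mean(dim=1, keepdim=True) + eps           # local variance estimate
    return torch.exp(-d / v)                        # descriptors in (0, 1]

# The 4-channel descriptors are then fed into the DeepLab pipeline in place
# of raw intensities, making the input similar across modalities and datasets.
```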
The Human BioMolecular Atlas Program (HuBMAP) provides an opportunity to contextualize findings from the cellular to the organ-system level. Constructing an atlas target is the primary endpoint for generalizing anatomical information across scales and populations. An initial target organ of HuBMAP is the kidney, and arterial-phase contrast-enhanced computed tomography (CT) provides distinctive appearance and anatomical context for the internal substructure of the kidney, such as the renal cortex, medulla, and pelvicalyceal system. Owing to the confounding effects of demographics and kidney morphology across large-scale imaging surveys, substantial variation appears in internal substructure morphometry, and intensity contrast varies with imaging protocol. Such variability makes it harder to localize the anatomical features of the kidney substructure in a well-defined spatial reference for clinical analysis. To stabilize the localization of kidney substructures in the face of this variability, we propose a high-resolution CT kidney substructure atlas template. Briefly, we introduce a deep learning preprocessing technique to extract the volumetric region of interest of the abdomen and then perform a deep supervised registration pipeline to stably adapt the anatomical context of the internal kidney substructure. To generate and evaluate the atlas template, arterial-phase CT scans of 500 control subjects are de-identified and registered to the atlas template with a complete end-to-end pipeline. With stable registration to the abdominal wall and the kidneys, the internal substructure of both the left and right kidney is consistently localized in the high-resolution atlas space. The average atlas template demonstrates the contextual details of the internal structure and generalizes the morphological variation of the substructure across patients.
Opportunistic screening of the spine using routine 3D-CT scans could be beneficial for early diagnosis and prevention of osteoporosis and degenerative diseases. In clinical practice, software that alerts radiologists to signs of osteoporosis and degenerative deformities should be accurate and robust despite limited computational resources. We explore a light-weight alternative to existing vertebrae segmentation and labelling algorithms in our two-stage diagnosis pipeline and evaluate the efficiency and robustness of our proposed deep learning method. In the first stage, semantic segmentation and labelling of the vertebrae are performed using a low-complexity 3D-CNN. Our efficient architecture combines a MobileNetv2 backbone with DeepLab for segmentation, and we optimise the network architecture for efficiency by applying the compound scaling idea of EfficientNet. The first stage of our model is trained and evaluated on the public VerSe dataset, resulting in a multi-label Dice score of 75% with less than 0.05 s inference time on a GPU. The segmentation outcome can further be used to extract the centre coordinates of each vertebra, which a second 3D-CNN classifies into normal vertebrae, degenerative deformities, and osteoporotic fractures. Our preliminary results on the DiagnostikBilanz dataset, using centre coordinates, yield an F1-score of 76%. Our fully automatic pipeline achieves an F1-score of 74.6%, an improvement of 7% compared to the pipeline using nnUNet for segmentation in the first stage. Our method provides a light-weight solution to assist radiologists in differentiating osteoporotic fractures from degenerative deformities in opportunistic spine screening in CT scans. In particular, it makes it possible to incorporate segmentation information into the challenging differentiation of degenerative deformities from osteoporotic fractures.
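The compound scaling idea mentioned above can be summarized in a few lines: depth, width, and input resolution are scaled jointly by a single coefficient phi. The base values and the alpha/beta/gamma coefficients below are the EfficientNet defaults, not necessarily the paper's settings.

```python
def compound_scale(phi, base_depth=16, base_width=32, base_res=128,
                   alpha=1.2, beta=1.1, gamma=1.15):
    depth = round(base_depth * alpha ** phi)    # number of layers/blocks
    width = round(base_width * beta ** phi)     # channels per layer
    res = round(base_res * gamma ** phi)        # input patch edge length
    return depth, width, res

# e.g. phi = -1 shrinks the network for faster inference on limited hardware
print(compound_scale(-1))
```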
Medical image registration has for many years been dominated by techniques that rely on expert annotations while not taking advantage of widely available unlabelled data. Deep unsupervised architectures utilize this unlabelled data to model the anatomically induced patterns in a dataset. The Deformable Auto-encoder (DAE), an unsupervised group-wise registration technique, has been used to generate a deformed reconstruction of an input image and subsequently a global template that captures the deformations in a medical dataset. DAEs, however, have a significant weakness in propagating global information across long-range dependencies, which can affect registration performance both quantitatively and qualitatively. Our proposed method captures valuable knowledge across the whole spatial dimension using an attention mechanism. We present the Deformable Auto-encoder Attention ReLU Network (DA-AR-Net), an integration of the Attention ReLU (AReLU), an attention-based activation function, into the DAE framework. The template image is detached from the deformation field by encoding the spatial information into two separate latent code representations. Each latent code is followed by a separate decoder network, while only a single encoder is used for feature encoding. Our DA-AR-Net is formalized after an extensive and systematic search across various hyperparameters: the initial setting of the learnable parameters of AReLU, the appropriate positioning of AReLU, latent code dimensions, and batch size. Our best architecture shows a significant improvement of 42% in MSE over previous DAEs, and a 32% reduction is attained while generating visually sharp global templates.
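For reference, a PyTorch sketch of the AReLU activation integrated here: a clamped learnable slope attends to negative inputs, and a sigmoid-gated learnable gain amplifies positive ones. The initial parameter values are illustrative, since the paper reports searching over them.

```python
import torch
import torch.nn as nn

class AReLU(nn.Module):
    def __init__(self, alpha=0.9, beta=2.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x):
        a = torch.clamp(self.alpha, 0.01, 0.99)   # attention on negatives
        b = 1.0 + torch.sigmoid(self.beta)        # amplification of positives
        return torch.where(x >= 0, b * x, a * x)
```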
Robust groupwise registration methods are important for the analysis of large medical image datasets. We build upon the concept of deforming autoencoders, which decouple shape and appearance to represent anatomical variability in a robust and plausible manner. In this work we propose a deep learning model that is trained to generate templates and deformation fields. It employs a joint encoder block that provides latent representations for both shape and appearance, followed by two independent shape and appearance decoder paths. The model achieves image reconstruction by warping the template provided by the appearance decoder with the warping field estimated by the shape decoder. By restricting the embedding to a low-dimensional latent code, we are able to obtain meaningful deformable templates. Our objective function ensures smooth and realistic deformation fields. It contains an invertibility loss term, which is novel for deforming autoencoders and induces backward consistency: warping the reconstructed image with the deformation field should ideally result in the template, and warping the template with the reversed deformation field should ideally produce the reconstructed image. We demonstrate the potential of our approach for two- and three-dimensional medical image data by training and evaluating it on labeled MRI brain scans. We show that adding the inverse consistency penalty to the objective function leads to improved and more robust registration results. When evaluated on unseen data with expert labels for accuracy estimation, our three-dimensional model achieves substantially increased Dice scores, by 5 percentage points.
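A hedged 2D sketch of the invertibility loss described above: the reconstruction warped with the forward field should match the template, and the template warped with the reversed field should match the reconstruction. Negating the field is used here as a simple approximation of the inverse, and the warp helper is a standard grid_sample-based spatial transformer; neither is necessarily the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """img: (B, C, H, W); flow: (B, 2, H, W) displacements in normalized coords."""
    b, _, h, w = img.shape
    yy, xx = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing='ij')
    grid = torch.stack((xx, yy), dim=-1)[None].expand(b, -1, -1, -1)
    return F.grid_sample(img, grid + flow.permute(0, 2, 3, 1),
                         align_corners=True)

def invertibility_loss(template, reconstruction, flow):
    forward = warp(reconstruction, flow)   # should match the template
    backward = warp(template, -flow)       # negation approximates the inverse
                                           # for small, smooth deformations
    return F.mse_loss(forward, template) + F.mse_loss(backward, reconstruction)
```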
A major goal of lung cancer screening is to identify individuals with particular phenotypes that are associated with a high risk of cancer. Identifying relevant phenotypes is complicated by variation in body position and body composition. In the brain, standardized coordinate systems (e.g., atlases) have enabled separate consideration of local features from gross/global structure. To date, no analogous standard atlas has been presented to enable spatial mapping and harmonization in chest computed tomography (CT). In this paper, we propose a thoracic atlas built upon a large low-dose CT database with no screening-detected malignancy (age 46-79 years, mean 64.9 years). The application validity of the developed atlas is evaluated in terms of discriminative capability for different anatomic phenotypes, including body mass index (BMI), chronic obstructive pulmonary disease (COPD), and coronary artery calcification (CAC).
The Human BioMolecular Atlas Program (HuBMAP) seeks to create a molecular atlas of the human body at the cellular level to spur interdisciplinary innovations across spatial and temporal scales. While the preponderance of effort is allocated to cellular- and molecular-scale mapping, differentiating and contextualizing findings within tissues, organs, and systems is essential for the HuBMAP effort. The kidney is an initial organ target of HuBMAP, and a framework (or atlas) for integrating information across scales is needed for visualizing and integrating information; however, no abdominal atlas is currently available in the public domain. Substantial variation exists in healthy kidneys with sex, body size, and imaging protocol. By integrating clinical archives for secondary research use, we are able to build atlases based on a diverse population and clinically relevant protocols. In this study, we created a computed tomography (CT) phase-specific atlas for the abdomen, optimized for the kidney. A two-stage registration pipeline was used: the abdominal volume of interest, extracted via body part regression, was registered to a high-resolution CT target, with affine and non-rigid registration applied hierarchically to all scans. To generate and evaluate the atlas, multiphase CT scans of 500 control subjects (age: 15-50, 250 males, 250 females) were registered to the atlas target through the complete pipeline. Registration of the abdominal body and kidneys is shown to be stable by the variance map computed from the resulting average template. Both the left and right kidney are consistently localized in the high-resolution target space, demonstrating sharp anatomical detail across each phase. We illustrate the applicability of the atlas template for integrating across normal kidney variation from 64 cm³ to 302 cm³.
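A hedged SimpleITK sketch of such a two-stage hierarchical pipeline: affine alignment followed by non-rigid B-spline refinement toward the atlas target. All parameters are illustrative defaults, and the body-part-regression cropping is assumed to have already been applied to the inputs.

```python
import SimpleITK as sitk

def register_to_atlas(moving, atlas):
    """moving, atlas: sitk.Image volumes (float pixel type)."""
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=32)
    reg.SetInterpolator(sitk.sitkLinear)
    # Stage 1: affine registration of the abdominal volume of interest.
    reg.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    reg.SetInitialTransform(sitk.CenteredTransformInitializer(
        atlas, moving, sitk.AffineTransform(3),
        sitk.CenteredTransformInitializerFilter.GEOMETRY), inPlace=False)
    affine = reg.Execute(atlas, moving)
    moving_aff = sitk.Resample(moving, atlas, affine, sitk.sitkLinear, 0.0)
    # Stage 2: non-rigid B-spline refinement toward the high-resolution atlas.
    bspline = sitk.BSplineTransformInitializer(atlas, [8, 8, 8])
    reg.SetInitialTransform(bspline, inPlace=True)
    reg.SetOptimizerAsLBFGSB(numberOfIterations=100)
    deformable = reg.Execute(atlas, moving_aff)
    return sitk.Resample(moving_aff, atlas, deformable, sitk.sitkLinear, 0.0)
```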
The treatment of age-related macular degeneration (AMD) requires continuous eye exams using optical coherence tomography (OCT). The need for treatment is determined by the presence or change of disease-specific OCT-based biomarkers. Therefore, the monitoring frequency has a significant influence on the success of AMD therapy. However, the monitoring frequency of current treatment schemes is not individually adapted to the patient and is therefore often insufficient. While a higher monitoring frequency would have a positive effect on treatment success, in practice it can only be achieved with a home monitoring solution. One of the key requirements of a home monitoring OCT system is computer-aided diagnosis to automatically detect and quantify pathological changes using specific OCT-based biomarkers. In this paper, for the first time, retinal scans of a novel self-examination low-cost full-field OCT (SELF-OCT) are segmented using a deep learning-based approach. A convolutional neural network (CNN) is utilized to segment the total retina as well as pigment epithelial detachments (PED). It is shown that the CNN-based approach can segment the retina with high accuracy, whereas the segmentation of the PED proves to be challenging. In addition, a convolutional denoising autoencoder (CDAE), which has previously learned retinal shape information, refines the CNN prediction. It is shown that the CDAE refinement can correct segmentation errors caused by artifacts in the OCT image.
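A minimal sketch of the refinement idea: a small convolutional denoising autoencoder that takes the CNN's per-class probability maps and returns refined logits, correcting artifact-induced errors using learned shape information. The architecture and class count are stand-ins, not the paper's model.

```python
import torch.nn as nn

class CDAE(nn.Module):
    def __init__(self, classes=3):  # e.g. background, retina, PED (assumed)
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(classes, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, classes, 4, stride=2, padding=1))

    def forward(self, cnn_probs):                    # (B, classes, H, W)
        return self.decode(self.encode(cnn_probs))   # refined logits
```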
The quality of segmentation of organs and pathological tissues has improved significantly in recent years through deep learning approaches, which are typically trained with full supervision. These fully supervised methods require an immense amount of fully labeled training data, which is costly to generate, especially in medicine. To overcome this issue, weakly supervised training methods are used, because they do not need fully labeled ground truth data. Weakly supervised learning has already gained importance for object localization and has recently become increasingly important for the segmentation of pathological tissues as well. However, currently available approaches still require additional anatomical information. In this paper, we present a weakly supervised segmentation method that needs neither ground truth segmentations as input nor additional anatomical information. Our method consists of three classification networks in the sagittal, axial, and coronal directions that decide whether a slice contains the structure to be segmented. We then use the class activation maps of the classification outputs to generate a combined segmentation. Our network was trained for the challenging task of pancreas segmentation on the publicly available TCIA pancreas dataset, reaching slice-wise Dice scores of up to 0.86 and an overall Dice score of up to 0.53.
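One way the described CAM fusion could look in code; the `classifier.cam` method, the multiplicative fusion, and the threshold are hypothetical illustrations of combining per-slice class activation maps from the three orthogonal networks, not the authors' implementation.

```python
import numpy as np

def cam_volume(volume, classifier, axis):
    """Stack per-slice CAMs (assumed resized to slice shape) along one axis."""
    cams = [classifier.cam(np.take(volume, i, axis=axis))  # hypothetical API
            for i in range(volume.shape[axis])]
    return np.stack(cams, axis=axis)

def fuse_cams(volume, nets):  # nets = (sagittal, axial, coronal) classifiers
    fused = np.ones_like(volume, dtype=np.float32)
    for axis, net in enumerate(nets):
        cam = cam_volume(volume, net, axis)
        fused *= cam / (cam.max() + 1e-6)   # require agreement of all views
    return fused > 0.5 ** 3                 # illustrative threshold
```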
Deformable image registration, a key component of motion correction in medical imaging, needs to be efficient and provide plausible spatial transformations that reliably approximate the biological aspects of complex human organ motion. Standard approaches, such as Demons registration, mostly use Gaussian regularization of organ motion, which, though computationally efficient, rules out application to intrinsically more complex organ motions, such as sliding interfaces. We propose regularization of motion based on supervoxels, which provides an integrated discontinuity-preserving prior for motions such as sliding. More precisely, we replace Gaussian smoothing with fast, structure-preserving guided filtering to provide efficient, locally adaptive regularization of the estimated displacement field. We illustrate the approach by applying it to estimate sliding motions at lung and liver interfaces on challenging four-dimensional computed tomography (CT) and dynamic contrast-enhanced magnetic resonance imaging datasets. The results show that guided filter-based regularization improves the accuracy of lung and liver motion correction compared to Gaussian smoothing. Furthermore, our framework achieves state-of-the-art results on a publicly available CT liver dataset.
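A basic single-scale guided filter applied per displacement component, as a sketch of replacing Gaussian smoothing with structure-preserving, locally adaptive regularization; the authors' supervoxel-based variant is more elaborate, so the radius, epsilon, and the choice of the fixed image as guide are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=7, eps=1e-3):
    size = 2 * radius + 1
    mean_g = uniform_filter(guide, size)
    mean_s = uniform_filter(src, size)
    cov_gs = uniform_filter(guide * src, size) - mean_g * mean_s
    var_g = uniform_filter(guide * guide, size) - mean_g ** 2
    a = cov_gs / (var_g + eps)              # edge-aware local linear model
    b = mean_s - a * mean_g
    return uniform_filter(a, size) * guide + uniform_filter(b, size)

def regularize_displacement(fixed, flow, radius=7, eps=1e-3):
    """flow: (H, W, 2); smooth each component, guided by the fixed image,
    so smoothing stops at image edges such as sliding organ interfaces."""
    return np.stack([guided_filter(fixed, flow[..., k], radius, eps)
                     for k in range(flow.shape[-1])], axis=-1)
```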
Most radiologists prefer an upright orientation of the anatomy in a digital X-ray image for consistency and quality reasons. In almost half of clinical cases, the anatomy is not upright oriented, which is why the images must be digitally rotated by radiographers. Earlier work has shown that automated orientation detection achieves small error rates but requires specially designed algorithms for individual anatomies. In this work, we propose a novel approach that avoids time-consuming feature engineering by means of Residual Neural Networks (ResNets), which extract generic low-level and high-level features and provide promising solutions for medical imaging. Our method uses the learned representations to estimate the orientation via linear regression and can be further improved by fine-tuning selected ResNet layers. The method was evaluated on 926 hand X-ray images and achieves a state-of-the-art mean absolute error of 2.79°.
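A minimal torchvision sketch of the described setup: a pretrained ResNet as a generic feature extractor with a linear regression head for the angle, with a selected layer optionally unfrozen for fine-tuning. The layer choice and the plain-angle target are assumptions (periodicity handling, e.g. regressing sine/cosine, is omitted).

```python
import torch.nn as nn
from torchvision import models

def build_orientation_regressor(finetune_last_block=True):
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in backbone.parameters():
        p.requires_grad = False             # start from frozen generic features
    if finetune_last_block:
        for p in backbone.layer4.parameters():
            p.requires_grad = True          # fine-tune high-level features
    backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # angle in degrees
    return backbone
```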
Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training datasets with dense organ annotations and vantage point trees to classify voxels in unseen images based on the similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 is achieved for the CT segmentation of liver, spleen, and kidneys; the mean score for the MR delineation of bladder, bones, prostate, and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are, for a nearest neighbor method, surprisingly accurate, robust, and data- and time-efficient.
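A conceptual sketch of the voxel classification step; scikit-learn's BallTree with the Hamming metric stands in for the vantage point tree, and the k-nearest-neighbour majority vote is an illustrative choice rather than the paper's exact scheme.

```python
import numpy as np
from sklearn.neighbors import BallTree

def classify_voxels(train_feats, train_labels, test_feats, k=5):
    """train_feats/test_feats: (N, D)/(M, D) binary arrays of voxel features;
    train_labels: (N,) organ indices for the training voxels."""
    tree = BallTree(train_feats, metric='hamming')   # metric-tree stand-in
    _, idx = tree.query(test_feats, k=k)             # k nearest training voxels
    neighbour_labels = train_labels[idx]              # (M, k)
    # majority vote over the neighbours assigns each unseen voxel a label
    return np.array([np.bincount(row).argmax() for row in neighbour_labels])
```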