Lung segmentation supports essential functionality in computer-aided detection and diagnosis using chest radiographs. In this research, we present a novel bifurcation approach to lung segmentation that separates the radiograph along the spinal column and trains separate networks for the right and left lungs. Results from the right-lung and left-lung networks are then merged to form the overall lung segmentation. We utilize the DeepLabV3+ network with a ResNet50 backbone for both the left- and right-lung networks. Results are presented for publicly available datasets, namely the Shenzhen dataset and the Japanese Society of Radiological Technology (JSRT) dataset. Our proposed bifurcation approach achieved an overall accuracy of 98.8% and an Intersection over Union (IoU) of 0.977 for a set of 100 cases from the Shenzhen dataset. We conducted an additional robustness study of this method using a hold-out methodology: training on a private dataset and testing on the independent JSRT dataset comprising 140 cases. Our algorithm achieved an overall IoU of 0.945, demonstrating its efficacy relative to other whole-lung models and setting a new benchmark for future research.
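The following is a minimal sketch of the bifurcation idea described above: split the radiograph at the spinal column, segment each half with its own network, and merge the two half-masks. Since torchvision ships DeepLabV3 (not DeepLabV3+), it is used here as a stand-in for the paper's DeepLabV3+ with ResNet50 backbone; the fixed midline split and all function names are illustrative assumptions.

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

def build_lung_net():
    # One output channel: lung vs. background.
    return deeplabv3_resnet50(weights=None, num_classes=1)

left_net, right_net = build_lung_net().eval(), build_lung_net().eval()

@torch.no_grad()
def segment_bifurcated(cxr: torch.Tensor, midline: int) -> torch.Tensor:
    """cxr: (1, 3, H, W) radiograph; midline: column index of the spinal column."""
    left_half, right_half = cxr[..., :midline], cxr[..., midline:]
    left_mask = torch.sigmoid(left_net(left_half)["out"]) > 0.5
    right_mask = torch.sigmoid(right_net(right_half)["out"]) > 0.5
    # Merge the two half-masks back into a whole-lung mask.
    return torch.cat([left_mask, right_mask], dim=-1)
```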
Automatic identification of wildfires has attracted great interest over the past decade. Early detection of fire can help minimize disasters and assist decision makers in planning mitigation methods. In this paper, we annotate and utilize a drone imagery dataset with each pixel marked as: (a) Burning, (b) Burned, and (c) Unburnt. The dataset comprises 22 videos (138,390 frames), of which only a subset of 481 frames (~20 frames from each video) is annotated for segmentation. In addition, the entire suite of frames is categorized as either “Smoke” or “No-Smoke”. We implement the DeepLab-v3+ architecture to accurately segment affected regions as “Burned”, “Burning”, and “Unburnt”. We adopt a transfer learning-based architecture using an established Xception network to detect smoke within each frame and thereby identify regions that can affect the performance of the proposed segmentation approach. Our segmentation algorithm achieves a mean accuracy of 97% and a mean Jaccard index of 0.93 on three test videos comprising 24,666 frames across all categories. Our classification algorithm achieves 92% accuracy for identifying smoke in those test frames.
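Below is a minimal sketch of the transfer-learning smoke classifier described above: an ImageNet-pretrained Xception backbone with a new binary head for “Smoke” vs. “No-Smoke”. The input size, frozen backbone, and learning rate are illustrative assumptions rather than the paper's settings.

```python
import tensorflow as tf

# Pretrained Xception backbone with global-average-pooled features.
base = tf.keras.applications.Xception(
    include_top=False, weights="imagenet", input_shape=(299, 299, 3), pooling="avg")
base.trainable = False  # freeze pretrained features initially

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of "Smoke"
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```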
The video captioning problem consists of describing a short video clip with natural language. Existing solutions tend to rely on extracting features from frames or sets of frames with pretrained and fixed Convolutional Neural Networks (CNNs). Traditionally, the CNNs are pretrained on the ImageNet-1K (IN1K) classification task. The features are then fed into a sequence-to-sequence model to produce the text description output. In this paper, we propose using Facebook's ResNeXt Weakly Supervised Learning (WSL) CNNs as fixed feature extractors for video captioning. These CNNs are trained on billion-scale weakly supervised datasets constructed from Instagram image-hashtag pairs and then fine-tuned on IN1K. Whereas previous works use complicated architectures or multimodal features, we demonstrate state-of-the-art performance on the Microsoft Video Description (MSVD) dataset and competitive results on the Microsoft Research-Video to Text (MSR-VTT) dataset using only the frame-level features from the new CNNs and a basic Transformer as a sequence-to-sequence model. Moreover, our results validate that CNNs pretrained with weak supervision can effectively transfer to tasks other than classification. Finally, we present results for a number of IN1K feature extractors and discuss the relationship between IN1K accuracy and video captioning performance. Code will be made available at https://github.com/flauted/OpenNMT-py.
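A minimal sketch of using the ResNeXt WSL models as fixed frame-level feature extractors, as described above; the hub entry point is the public facebookresearch/WSL-Images release, while the pooling and preprocessing details are assumptions.

```python
import torch

# Load a billion-scale weakly supervised ResNeXt and drop its IN1K head.
model = torch.hub.load("facebookresearch/WSL-Images", "resnext101_32x8d_wsl")
model.fc = torch.nn.Identity()
model.eval()

@torch.no_grad()
def frame_features(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, 3, 224, 224) preprocessed video frames -> (T, 2048) features
    to be fed into the sequence-to-sequence (Transformer) captioning model."""
    return model(frames)
```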
In-situ Laser Powder Bed Fusion (LPBF) sensor packages seek to enable both the commercial and Department of Defense (DoD) supply chains via process monitoring for qualification and machine feedback. Automated material identification and geometric segmentation would be valuable for LPBF process monitoring. In this paper, various segmentation approaches are presented and discussed to determine the best approach. Deep learning methods to classify the materials as either AlSi10Mg or IN718 are then presented. Diverse videos (in terms of shape, size, structure, and camera angle) of both materials are captured and labeled as either AlSi10Mg (24,357 frames) or IN718 (9,222 frames). A given frame can contain single or multiple parts of a material. The segmentation approach is applied to extract each part, yielding 121,036 images. The dataset is randomly split into groups of 72%, 8%, and 20% for training, validation, and testing, respectively. Classification performance using the proposed Convolutional Neural Network (CNN), as well as transfer learning approaches based on established networks such as AlexNet, ResNet, and SqueezeNet, is studied. An overall accuracy of 99.6% is obtained on a set of 24,214 test images. In addition, the efficacy of the proposed classification model is demonstrated by testing the algorithm on a completely different variant (in terms of shape, size, structure, or camera angle) of either material. Class activation mapping results of these networks are presented, yielding insight into the networks' decisions and assisting manufacturers in their decision-making process.
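As a hedged sketch of the transfer-learning branch above, one of the established backbones can be adapted to the two-class material problem (AlSi10Mg vs. IN718); the choice of ResNet18, the frozen-backbone strategy, and the hyperparameters are illustrative assumptions only.

```python
import torch
import torch.nn as nn
from torchvision import models

net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in net.parameters():
    p.requires_grad = False                    # reuse pretrained features
net.fc = nn.Linear(net.fc.in_features, 2)      # new head: AlSi10Mg vs. IN718

optimizer = torch.optim.Adam(net.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```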
Purpose: Diabetic retinopathy is the leading cause of blindness, affecting over 93 million people. An automated clinical retinal screening process would be highly beneficial and provide a valuable second opinion for doctors worldwide. A computer-aided system to detect and grade the retinal images would enhance the workflow of endocrinologists.
Approach: For this research, we make use of a publicly available dataset comprising 3662 images. We present a hybrid machine learning architecture to detect and grade the level of diabetic retinopathy (DR) severity. We also present and compare simple transfer learning-based approaches using established networks such as AlexNet, VGG16, ResNet, Inception-v3, NASNet, DenseNet, and GoogLeNet for DR detection. For the grading stage (mild, moderate, proliferative, or severe), we present an approach that combines various convolutional neural networks with principal component analysis for dimensionality reduction and a support vector machine classifier. We study the performance of these networks under different preprocessing conditions.
Results: We compare these results with various existing state-of-the-art approaches, which include single-stage architectures. We demonstrate that this architecture is more robust to limited training data and class imbalance. We achieve an accuracy of 98.4% for DR detection and an accuracy of 96.3% for distinguishing severity of DR, thereby setting a benchmark for future research efforts using a limited set of training images.
Conclusions: Results obtained using the proposed approach serve as a benchmark for future research efforts. We demonstrate as a proof-of-concept that an automated detection and grading system could be developed with a limited set of images and labels. This type of independent architecture for detection and grading could be deployed, as needed, in areas with a scarcity of trained clinicians.
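The grading stage described in the Approach can be sketched as follows: deep features from a pretrained CNN are reduced with principal component analysis and classified with a support vector machine. The backbone features, number of components, and SVM kernel here are illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_grading_stage(cnn_features: np.ndarray, grades: np.ndarray):
    """cnn_features: (N, D) features from a pretrained CNN;
    grades: severity labels (e.g., mild, moderate, severe, proliferative)."""
    clf = make_pipeline(StandardScaler(), PCA(n_components=100), SVC(kernel="rbf"))
    clf.fit(cnn_features, grades)
    return clf
```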
Approximately two million pediatric deaths occur every year due to pneumonia. Detection and diagnosis of pneumonia play an important role in reducing these deaths. Chest radiography is one of the most commonly used modalities to detect pneumonia. In this paper, we propose a novel two-stage deep learning architecture to detect pneumonia and classify its type in chest radiographs. This architecture contains one network to classify images as either normal or pneumonic, and another deep learning network to classify the type as either bacterial or viral. We study and compare the performance of various stage-one networks such as AlexNet, ResNet, VGG16, and Inception-v3 for detection of pneumonia. For these networks, we employ transfer learning to exploit the wealth of information available from prior training. For the second stage, we find that transfer learning with these same networks tends to overfit the data. For this reason, we propose a simpler CNN architecture for classification of pneumonic chest radiographs and show that it overcomes the overfitting problem. We further enhance the performance of our system in a novel way by incorporating lung segmentation using a U-Net architecture. We make use of a publicly available dataset comprising 5856 images (1583 normal, 4273 pneumonic). Among the pneumonic cases, 2780 are identified as bacterial and the rest belong to the viral category. We test our proposed algorithms on a set of 624 images and achieve an area under the receiver operating characteristic curve of 0.996 for pneumonia detection. We also achieve an accuracy of 97.8% for classification of pneumonic chest radiographs, thereby setting a new benchmark for both detection and diagnosis. We believe the proposed two-stage classification of chest radiographs for pneumonia detection and its diagnosis would enhance the workflow of radiologists.
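A minimal sketch of the two-stage decision flow described above: a U-Net lung mask is applied as preprocessing, stage one flags a radiograph as normal or pneumonic, and only pneumonic cases reach stage two for the bacterial/viral call. The models, thresholds, and masking strategy below are placeholders, not the paper's trained networks.

```python
import torch

@torch.no_grad()
def classify_cxr(image, stage1, stage2, segment_lungs):
    """image: (1, C, H, W) radiograph; segment_lungs: U-Net returning a binary mask."""
    masked = image * segment_lungs(image)              # restrict analysis to lung fields
    if torch.sigmoid(stage1(masked)).item() < 0.5:     # stage 1: normal vs. pneumonic
        return "normal"
    p_viral = torch.sigmoid(stage2(masked)).item()     # stage 2: bacterial vs. viral
    return "viral" if p_viral >= 0.5 else "bacterial"
```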
Identifying defective builds early on during Additive Manufacturing (AM) processes is a cost-effective way to reduce scrap and ensure that machine time is utilized efficiently. In this paper, we present an automated method to classify 3D-printed polymer parts as either good or defective based on images captured during Fused Filament Fabrication (FFF), using independent machine learning and deep learning approaches. Either approach could be useful for manufacturers and hobbyists alike. Machine learning is implemented via Principal Component Analysis (PCA) and a Support Vector Machine (SVM), whereas deep learning is implemented using a Convolutional Neural Network (CNN). We capture videos of the FFF process on a small selection of polymer parts and label each frame as good or defective (2674 good frames and 620 defective frames). We divide this dataset for holdout validation, using 70% of the images from each class for training and leaving the rest for blind testing. We obtain an overall accuracy of 98.2% and 99.5% for the classification of polymer parts using machine learning and deep learning techniques, respectively.
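The machine-learning branch above can be sketched as flattened frames reduced with PCA and classified as good or defective with an SVM, under the 70/30 class-wise holdout split described; the image preprocessing and PCA dimensionality are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def train_defect_classifier(frames: np.ndarray, labels: np.ndarray) -> float:
    """frames: (N, H, W) grayscale FFF frames; labels: 0 = good, 1 = defective."""
    X = frames.reshape(len(frames), -1).astype(np.float32) / 255.0
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, train_size=0.7, stratify=labels, random_state=0)
    pca = PCA(n_components=50).fit(X_tr)          # dimensionality reduction
    clf = SVC(kernel="rbf").fit(pca.transform(X_tr), y_tr)
    return clf.score(pca.transform(X_te), y_te)   # blind-test accuracy
```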
Plasmodium is a parasitic protozoan that causes malaria in humans. Computer-aided detection of Plasmodium is a research area attracting great interest. In this paper, we study the performance of various machine learning and deep learning approaches for the detection of Plasmodium in cell images from digital microscopy. We make use of a publicly available dataset composed of 27,558 cell images with equal instances of parasitized (contains Plasmodium) and uninfected (no Plasmodium) cells. We randomly split the dataset into groups of 80% and 20% for training and testing purposes, respectively. We apply color constancy and spatially resample all images to a particular size depending on the classification architecture implemented. We propose a fast Convolutional Neural Network (CNN) architecture for the classification of cell images. We also study and compare the performance of transfer learning algorithms developed based on well-established network architectures such as AlexNet, ResNet, VGG-16, and DenseNet. In addition, we study the performance of the bag-of-features model with a Support Vector Machine for classification. The overall probability of a cell image containing Plasmodium is determined based on the average of the probabilities provided by all the CNN architectures implemented in this paper. Our proposed algorithm achieved an overall accuracy of 96.7% on the testing dataset and an area under the Receiver Operating Characteristic (ROC) curve of 0.994 for 2756 parasitized cell images. This type of automated classification of cell images would enhance the workflow of microscopists and provide a valuable second opinion.
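The final probability-averaging step described above can be sketched as follows; the individual models are placeholders for the trained CNN and transfer-learning networks in the paper.

```python
import torch

@torch.no_grad()
def ensemble_parasitized_probability(cell_image: torch.Tensor, models) -> float:
    """Average the parasitized probability over all trained CNNs (placeholders)."""
    probs = [torch.sigmoid(m(cell_image)).item() for m in models]
    return sum(probs) / len(probs)
```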
The comet assay is a technique used to assess DNA damage in individual cells. The extent of the damage is indicated by the ratio between the amount of DNA in the tail of the comet and the amount in the head. This assessment is typically made by an operator manually analyzing the images, a process that is inefficient and time consuming. Researchers have previously used machine learning techniques to automate this process, but these required manual feature extraction. In some cases, deep learning was applied, but only for damage classification. We have successfully applied Convolutional Neural Networks (CNNs) to achieve automated quantification of DNA damage from comet images. Typically, deep learning techniques such as CNNs require large amounts of labeled training data, which may not always be available. We demonstrate that by applying deep transfer learning, state-of-the-art results can be obtained in the detection of DNA damage, even with a limited number of comet images.
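One way to realize the deep transfer learning described above is to replace the classifier of a pretrained backbone with a single regression output that predicts the DNA damage quantity (e.g., percent DNA in the tail); the choice of ResNet18 and of a mean-squared-error loss is an illustrative assumption.

```python
import torch.nn as nn
from torchvision import models

# Pretrained backbone; fine-tuned on operator-scored comet images.
net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
net.fc = nn.Linear(net.fc.in_features, 1)  # regression head: damage quantity
criterion = nn.MSELoss()
```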
We study the performance of a computer-aided detection (CAD) system for lung nodules in computed tomography (CT) as a function of slice thickness. In addition, we propose and compare three different training methodologies for utilizing nonhomogeneous thickness training data (i.e., composed of cases with different slice thicknesses). These methods are (1) aggregate training using the entire suite of data at their native thickness, (2) homogeneous subset training that uses only the subset of training data that matches each testing case, and (3) resampling all training and testing cases to a common thickness. We believe this study has important implications for how CT is acquired, processed, and stored. We make use of 192 CT cases acquired at a thickness of 1.25 mm and 283 cases at 2.5 mm. These data are from the publicly available Lung Nodule Analysis 2016 dataset. In our study, CAD performance at 2.5 mm is comparable with that at 1.25 mm and is much better than at higher thicknesses. Also, resampling all training and testing cases to 2.5 mm provides the best performance among the three training methods compared in terms of accuracy, memory consumption, and computational time.
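Training method (3) above, resampling every case to a common slice thickness, can be sketched as a simple interpolation along the slice axis; the 2.5 mm target matches the study, while linear interpolation via scipy is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import zoom

def resample_thickness(volume: np.ndarray, native_mm: float, target_mm: float = 2.5):
    """volume: (slices, H, W) CT stack acquired at native_mm slice thickness."""
    factor = native_mm / target_mm            # e.g., 1.25 mm -> 2.5 mm gives 0.5
    return zoom(volume, (factor, 1.0, 1.0), order=1)
```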