Human epidermal growth factor receptor 2 (HER2) serves as a prognostic and predictive biomarker for breast cancer. Recently, a growing number of studies have evaluated the feasibility of determining HER2 status from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) through data-driven deep learning methods, taking advantage of the ubiquitous availability of H&E WSIs. One of the main challenges with these data-driven methods is the need for large-scale datasets with high-quality annotations, which can be expensive to curate. Therefore, in this study, we explored both region-of-interest (ROI)-based supervised and attention-based multiple-instance-learning (MIL) weakly supervised methods for predicting HER2 status on H&E WSIs, to evaluate whether avoiding labor-intensive tumor annotation compromises the final prediction performance. The ROI-based method used an Inception-v3 network along with an aggregation step to combine patch-level predictions into a WSI-level prediction. The attention-based MIL methods explored an ImageNet-pretrained ResNet, an H&E-image-pretrained ResNet, and an H&E-image-pretrained vision transformer (ViT) as encoders for WSI-level HER2 prediction. Experiments were carried out on N = 355 WSIs available in the public domain, with HER2 status determined by immunohistochemistry (IHC) and in situ hybridization (ISH) and with annotations of breast invasive carcinoma. The dataset was split into training/validation/test sets at an 80/10/10 ratio. Our results demonstrate that the attention-based ViT MIL method reaches accuracy similar to that of the ROI-based method on the independent test set (AUC of 0.79 (95% CI: 0.63-0.95) versus 0.88 (95% CI: 0.63-0.9), respectively), and thus reduces the burden of labor-intensive annotations. Furthermore, the attention mechanism enhances the interpretability of the results and offers insights into the reliability of the predictions.
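As a concrete illustration of the attention-based MIL aggregation described above, here is a minimal PyTorch sketch of gated attention pooling (after Ilse et al., 2018). The feature dimension, hidden size, and class count are illustrative assumptions, not the exact architecture used in the study; the patch embeddings would come from one of the encoders named above.

```python
# Minimal sketch of gated attention-based MIL pooling; sizes are illustrative.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=768, hidden_dim=256, n_classes=2):
        super().__init__()
        # Gated attention: per-patch scores from tanh and sigmoid branches.
        self.attn_V = nn.Linear(feat_dim, hidden_dim)
        self.attn_U = nn.Linear(feat_dim, hidden_dim)
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, feats):            # feats: (n_patches, feat_dim)
        a = self.attn_w(torch.tanh(self.attn_V(feats)) *
                        torch.sigmoid(self.attn_U(feats)))   # (n_patches, 1)
        a = torch.softmax(a, dim=0)      # attention weights over patches
        slide_feat = (a * feats).sum(0)  # weighted slide-level embedding
        return self.classifier(slide_feat), a  # logits + interpretable weights

# Example: 500 ViT patch embeddings from one WSI -> slide-level HER2 logits.
logits, attn = AttentionMIL()(torch.randn(500, 768))
```

The returned attention weights are what make the prediction interpretable: patches with high weights can be mapped back onto the WSI for review.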
In this study, we developed a method to detect anomalies in histology slides containing tissues sourced from multiple organs of rats. In the nonclinical phase of drug development, candidate drugs are typically tested on animals such as rats, and a postmortem assessment is conducted based on human evaluation of histology slides. Findings in these slides manifest as anomalous departures from expectation on whole-slide images (WSIs). Our proposed method makes use of a StyleGAN2 and a ResNet-based encoder to identify anomalies in WSIs. Using these models, we train an image reconstruction pipeline only on an anomaly-free ('normal') dataset. We then use this pipeline to identify anomalies based on reconstruction quality as measured by the Structural Similarity Index (SSIM). Our experiments were carried out on 54 WSIs across 40 different organ types and achieved a patch-level classification accuracy of 88%.
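To make the scoring step concrete, here is a hedged sketch of SSIM-based anomaly detection. The reconstruct callable stands in for the trained StyleGAN2/ResNet-encoder pipeline, and the threshold value is a placeholder that would in practice be calibrated on held-out normal patches.

```python
# Sketch of SSIM-based anomaly scoring on reconstructed patches.
import numpy as np
from skimage.metrics import structural_similarity

def anomaly_score(patch: np.ndarray, reconstruction: np.ndarray) -> float:
    """Lower SSIM -> worse reconstruction -> more anomalous."""
    ssim = structural_similarity(patch, reconstruction,
                                 channel_axis=-1, data_range=1.0)
    return 1.0 - ssim

def classify_patch(patch, reconstruct, threshold=0.35):
    # reconstruct() is a stand-in for the generator trained only on
    # 'normal' tissue; threshold is illustrative, not the paper's value.
    return anomaly_score(patch, reconstruct(patch)) > threshold
```

The intuition is that a generator trained only on normal tissue reconstructs normal patches faithfully but fails on anomalous ones, which SSIM then exposes.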
Three-dimensional lesion segmentation is required for the analysis of radiomic features and lesion growth kinetics. In clinical trials, radiologists apply the Response Evaluation Criteria in Solid Tumors (RECIST) by manually annotating the long and short diameters of a lesion on a single 2D axial slice (the RECIST slice) where the lesion appears largest. We developed a novel approach that leverages RECIST annotations to segment lesions in 3D on CT scans. We start with bounding box and center point prompts derived from the RECIST long and short diameters on the RECIST slice. We then iteratively perform prompted segmentation with the Segment Anything Model (SAM) on off-RECIST slices in the superior and inferior directions until all slices are segmented. To optimize the performance of SAM, we fine-tuned its mask decoder. In addition, it is crucial to detect where the lesion disappears in the superior and inferior directions to prevent over-segmentation. We therefore developed a multi-task framework for lesion existence classification and segmentation, and compared parallel and cascaded variants of the framework. We used an internal dataset consisting of 2053 and 200 3D lesions for fine-tuning of the SAM decoder and for testing, respectively. Baseline SAM, SAM with fine-tuning, SAM with parallel multi-task fine-tuning, and SAM with cascaded multi-task fine-tuning achieved Dice scores of 0.4745±0.2138, 0.7136±0.1277, 0.6985±0.1312, and 0.7239±0.1321, respectively. Our experiments show that multi-task learning is an effective way to perform 3D segmentation with SAM, and that the cascaded framework performs better than the parallel framework.
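The slice-by-slice propagation can be sketched with the public segment-anything API as follows. The checkpoint name is the released ViT-B file; prompts_from_mask is a hypothetical helper, and stopping on an empty mask is a crude stand-in for the paper's learned existence classifier.

```python
# Schematic sketch of RECIST-prompted slice-by-slice propagation with SAM.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def segment_lesion_3d(volume, recist_slice, box, center):
    """volume: sequence of HxWx3 uint8 slices (CT pre-windowed to RGB);
    box: length-4 XYXY array; center: (2,) xy point, both on RECIST slice."""
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
    predictor = SamPredictor(sam)

    def segment(z, box, point):
        predictor.set_image(volume[z])
        mask, _, _ = predictor.predict(
            point_coords=point[None, :], point_labels=np.array([1]),
            box=box, multimask_output=False)
        return mask[0]

    def prompts_from_mask(mask):
        # Hypothetical helper: derive next-slice prompts from current mask.
        ys, xs = np.nonzero(mask)
        return (np.array([xs.min(), ys.min(), xs.max(), ys.max()]),
                np.array([xs.mean(), ys.mean()]))

    masks = {recist_slice: segment(recist_slice, box, center)}
    for step in (1, -1):                  # toward inferior, then superior
        z, prev = recist_slice + step, masks[recist_slice]
        while 0 <= z < len(volume) and prev.any():
            # The paper's multi-task existence classifier decides when the
            # lesion ends; an empty predicted mask is a simplified proxy.
            prev = segment(z, *prompts_from_mask(prev))
            if not prev.any():
                break
            masks[z] = prev
            z += step
    return masks
```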
Micro-CT imaging enables noninvasive and longitudinal assessment of mouse lung pathology in genetically engineered lung cancer models, which is crucial for evaluating the effectiveness of potential therapeutics. However, manual lung analysis is time-consuming, and an automated workflow is needed. We present a strategy to optimize a deep learning-based workflow for lung tumor analysis using limited annotations. A 2D U-Net model (M1) was trained for chest cavity segmentation using an existing dataset with lung, heart, and vasculature segmentations from wild-type mice (n = 10) and chest cavity segmentations of mice with lung tumors (n = 5). M1 then generated chest cavity segmentations for 20 additional tumor-burdened mice. Next, non-rigid registration aligned wild-type segmentations with tumor-burdened lung scans (n = 25) using the chest cavity mask predicted by M1. Subsequently, M1 was fine-tuned, and a heart segmentation model (M2) was trained with 10 wild-type and 25 tumor-burdened lung scans. The heart segmentation was then subtracted from the chest segmentation, and a threshold-based algorithm (-1000 to -300 a.u.) was applied to reveal the functional lung volume. Finally, tumor segmentation was estimated by subtracting the functional lung and heart volumes from the chest cavity volume in a cohort of tumor-burdened mice. The resulting workflow provides "chest", "heart", "functional lung", and "tumor plus vasculature" segmentations for quantification and visualization. The models generate segmentations in approximately 13 seconds per mouse, with high accuracy (Dice ratios: 0.96 for chest cavity, 0.90 for heart). This workflow enables longitudinal monitoring of tumor progression, supporting applications in oncology drug discovery.
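The subtraction and thresholding steps reduce to simple mask arithmetic; a minimal NumPy sketch under the stated -1000 to -300 a.u. window follows (array names are illustrative).

```python
# Sketch of the threshold step: functional lung = voxels inside the
# (chest minus heart) region falling in the -1000 to -300 a.u. window.
import numpy as np

def functional_lung_mask(ct, chest_mask, heart_mask, lo=-1000, hi=-300):
    """ct: intensity volume; masks: boolean volumes from M1 and M2."""
    region = chest_mask & ~heart_mask            # chest cavity without heart
    return region & (ct >= lo) & (ct <= hi)      # aerated-lung intensity band

def tumor_plus_vasculature(chest_mask, heart_mask, lung_mask):
    # Residual volume: chest cavity minus heart minus functional lung.
    return chest_mask & ~heart_mask & ~lung_mask
```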
High-content screening (HCS) has catalyzed drug development by enabling fast, large-scale, and reproducible testing of changes in cellular in vitro models in response to different types of perturbations. One HCS approach, known as Cell Painting (CP), applies morphological profiling to images of cells perturbed with different treatments to quantitatively assess complex biological changes. Profiling stages of macrophage polarization, in particular, enables new drug discovery under disease-relevant conditions. To analyze cell images at the single-cell level, deep learning algorithms - in addition to classical image segmentation methods - may be used to perform single-cell cropping for accurate and fast detection of individual cells. While classical watershed segmentation with a Gaussian mixture model (GMM) was first implemented for robust single-cell detection in the CP workflow, its performance is sometimes compromised when cells are clumped. A deep learning-based cell segmentation method called Cellpose was later proposed as an alternative means of cell localization, but it comes at the cost of a longer runtime for HCS. In this study, we demonstrate the use of YOLOv5, a fast deep learning object detection algorithm, which yields cell detection performance comparable to the other two methods (with comparable IoU scores on HCS macrophage nuclei images) while improving detection in high-cell-density regions and running approximately 2x faster. We compare the accuracy and speed of the YOLOv5 model with those of the current watershed/GMM method and Cellpose for macrophage nucleus detection in the context of investigating drug activity, demonstrating its value in extracting coordinates for the single-cell cropping needed in deep learning-based phenotypic profiling in HCS.
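A hedged sketch of the detection-to-coordinates step is below, using the public YOLOv5 torch.hub interface; the weight file name and confidence threshold are illustrative placeholders, not the study's trained model.

```python
# Sketch of YOLOv5 inference for nucleus detection; box centers become the
# crop coordinates used for single-cell cropping.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="nuclei_yolov5.pt")   # hypothetical weights
model.conf = 0.25                                  # illustrative threshold

def nucleus_centers(image):
    """Return (x, y) centers of detected nuclei for single-cell cropping."""
    det = model(image).xyxy[0]          # (n, 6): x1, y1, x2, y2, conf, cls
    return [((x1 + x2) / 2, (y1 + y2) / 2)
            for x1, y1, x2, y2, *_ in det.tolist()]
```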
Tumor mutation burden (TMB) is an important biomarker for predicting response to anti-PD-1 immunotherapies. Studies have shown that a higher level of TMB (TMB-H) is associated with higher response rates to immunotherapies in patients with various types of advanced solid tumors. However, the measurement of TMB depends on whole-exome sequencing (WES), which is an expensive assay and not always available in standard clinical oncology settings. In this work, we assess the feasibility of predicting TMB-H from hematoxylin and eosin (H&E)-stained histopathology images, which are routinely produced in clinical oncology. Using an Inception-v3 convolutional neural network (CNN) as a baseline feature extractor, we compare adding a multi-layer perceptron (MLP) and a squeeze-and-excitation (SE) network on top of the baseline CNN. Training from random initialization and fine-tuning from pretrained weights are also compared. Experiments are conducted on the H&E whole-slide images (WSIs) of the melanoma dataset of The Cancer Genome Atlas (TCGA). Results from a 4-fold cross-validation show that the highest average area under the receiver operating characteristic curve (AUC) is 0.589, which implies that predicting TMB from H&E WSIs for melanoma remains a challenging problem warranting further investigation.
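For reference, the SE module placed on top of the CNN features can be written in a few lines of PyTorch (Hu et al., 2018); the channel count below is illustrative rather than the study's exact configuration.

```python
# Minimal squeeze-and-excitation (SE) block over CNN feature maps.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels=2048, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                  # x: (B, C, H, W) feature map
        w = self.fc(x.mean(dim=(2, 3)))    # squeeze: global average pooling
        return x * w[:, :, None, None]     # excite: channel-wise reweighting
```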
In the pharmaceutical industry, micro-CT images of Dutch-Belted rabbit fetuses have been used for the assessment of compound-induced skeletal abnormalities in developmental and reproductive toxicology (DART) studies. In the automated approach proposed to assess the morphology of each bone, localization and segmentation of each vertebral bone are critical tasks. In this work, we extend our previous work on the localization of cervical vertebrae to the entire spine, following a multivariate regression framework based on a 3D convolutional neural network (CNN). We also introduce a multi-task 3D CNN for the segmentation of each vertebral bone, in which features at the most compact level are processed with two additional convolution layers with max pooling to generate a classification of whether the patch contains a complete vertebra. This multi-task mechanism ensures that only complete vertebrae are segmented. Experimenting on 345 rabbit fetuses with an 80/10/10 training/validation/testing ratio, we achieved successful localization on 94.3% of the cases (i.e., a median bone-by-bone localization error under 5 voxels over the entire spine) and an average Dice similarity coefficient (DSC) of 0.80 between automated and ground-truth segmentations on the testing set.
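The multi-task idea, a classification branch grafted onto the network's most compact features, can be sketched as below; layer widths and depths are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of the completeness-classification branch on bottleneck features.
import torch
import torch.nn as nn

class CompletenessHead(nn.Module):
    """Two conv + max-pool stages -> 'patch holds a complete vertebra' logit."""
    def __init__(self, in_ch=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, 128, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(128, 64, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(64, 1))

    def forward(self, bottleneck):         # e.g. (B, 256, D, H, W)
        return self.net(bottleneck)

# Training would combine a segmentation loss (e.g. Dice) with a binary
# cross-entropy loss on this completeness logit.
```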
In pharmaceutical research, optical coherence tomography (OCT) has been used for the assessment of diseases such as age-related macular degeneration (AMD) and retinal pigment epithelial (RPE) atrophy in animals in preclinical studies. To measure the thickness of the total retina and of individual retinal layers on these OCT images, accurate segmentation is necessary, which is known to be a labor-intensive and error-prone task, especially on images of diseased animals with significant retinal distortion. Herein we perform automated segmentation of retinal layers on the OCT images of rodent subjects using deep convolutional neural networks (CNNs). Based on a U-Net architecture, we segment the three most important retinal layers using models trained with three different strategies: training from scratch, transfer learning, and continued training from a model pretrained on a different animal cohort. To compare the three strategies, the models are trained and tested on OCT scans of rodent subjects, and the segmentation results are compared against manually corrected delineations using the Dice similarity coefficient (DSC) as the measure of accuracy. Results show that although all three strategies lead to similar performance, transfer learning and continued training are effective in accelerating the training process, while continued training generates the most accurate results, which are also the most plausible on visual inspection.
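The three strategies differ only in how the model is initialized; a hedged sketch follows, assuming a U-Net implementation that exposes an encoder attribute, with checkpoint file names as placeholders.

```python
# Sketch of the three initialization strategies for U-Net training.
import torch

def build_model(strategy, unet_factory):
    model = unet_factory()
    if strategy == "scratch":
        pass                                          # random initialization
    elif strategy == "transfer":
        # Initialize only the encoder from pretrained weights, then train
        # the full network on the target OCT cohort.
        state = torch.load("pretrained_encoder.pth")  # hypothetical file
        model.encoder.load_state_dict(state)
    elif strategy == "continued":
        # Start from a full model trained on a different animal cohort
        # and continue training on the new cohort.
        model.load_state_dict(torch.load("other_cohort_unet.pth"))
    return model
```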
In developmental and reproductive toxicology (DART) studies, high-throughput micro-CT imaging of Dutch-Belted rabbit fetuses has been used as a method for the assessment of compound-induced skeletal abnormalities. Since visual inspection of the micro-CT images by DART scientists is a time- and resource-intensive task, an automatic strategy was proposed to localize, segment, label, and evaluate each bone of the skeleton in a testing environment. However, due to a lack of robustness in this bone localization approach, failures in localizing certain bones on the critical path while traversing the skeleton, e.g., the cervical vertebrae, could lead to localization errors for other bones downstream. Herein an approach based on deep convolutional neural networks (CNNs) is proposed to automatically localize each cervical vertebra, represented by its center. For each center, a 3D probability map with Gaussian decay is computed, with the center itself at the maximum. For cervical vertebrae C1 to C7, the 7 resulting probability volumes are stacked to form a 4-dimensional array. A deep CNN with a 3D U-Net architecture is used to estimate these probability maps from the CT images given as input. A post-processing scheme is then applied to find all regions with positive response, eliminate false positives using a point-based registration method, and provide locations and labels for the 7 cervical vertebrae. Experiments were carried out on a dataset of 345 rabbit fetus micro-CT volumes. The images were randomly divided into training/validation/testing sets at an 80/10/10 ratio. Results demonstrated a 94.3% success rate for localization and labeling on the testing dataset of 35 images, and for all successful cases the average bone-by-bone localization error was 0.84 voxel.
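Constructing the 7-channel regression target is straightforward; a minimal NumPy sketch is given below, with the Gaussian width sigma as an illustrative assumption.

```python
# Sketch of the 7-channel Gaussian-decay target for C1-C7 center regression.
import numpy as np

def center_probability_map(shape, center, sigma=5.0):
    """3D map equal to 1 at the bone center, decaying with a Gaussian."""
    zz, yy, xx = np.indices(shape)
    d2 = ((zz - center[0]) ** 2 + (yy - center[1]) ** 2
          + (xx - center[2]) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def stacked_targets(shape, centers_c1_to_c7):
    # (7, D, H, W): one channel per vertebra, the 4-D array regressed
    # by the 3D U-Net.
    return np.stack([center_probability_map(shape, c)
                     for c in centers_c1_to_c7])
```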
In the development of treatments for cardiovascular diseases, short-axis cardiac cine MRI is important for the assessment of various structural and functional properties of the heart. Cardiac properties including ventricle dimensions, stroke volume, and ejection fraction can be extracted from accurate segmentation of the left ventricle (LV) myocardium in these images. One of the most advanced segmentation approaches is based on fully convolutional neural networks (FCNs) and can successfully segment individual cardiac cine MRI slices. However, the temporal dependency between slices acquired at neighboring time points is not exploited. Here, building on our previously proposed FCN structure, we propose a new algorithm to segment the LV myocardium in porcine short-axis cardiac cine MRI by incorporating convolutional long short-term memory (Conv-LSTM) to leverage this temporal dependency. Instead of processing each slice independently as in a conventional CNN-based approach, the Conv-LSTM architecture captures the dynamics of cardiac motion over time. In a leave-one-out experiment on 8 porcine specimens (3,600 slices), the proposed approach achieved an average mean Dice similarity coefficient (DSC) of 0.84, a Hausdorff distance (HD) of 6.35 mm, and an average perpendicular distance (APD) of 1.09 mm against manual segmentations, improving on our previous FCN-based approach (average mean DSC = 0.84, HD = 6.78 mm, APD = 1.11 mm). Qualitatively, our model showed robustness against low image quality and complications in the surrounding anatomy owing to its ability to capture the dynamics of cardiac motion.
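For readers unfamiliar with Conv-LSTM, a minimal cell (after Shi et al., 2015) is sketched below; channel sizes are illustrative, and the cell would sit between the FCN's feature extractor and its segmentation head rather than replace them.

```python
# Minimal ConvLSTM cell carrying cardiac motion across time points.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # One convolution produces all four gate pre-activations at once.
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.conv(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g                 # cell state carries temporal memory
        h = o * c.tanh()                  # hidden state feeds the decoder
        return h, (h, c)

# Per time point t: h, state = cell(features_t, state), so the segmentation
# head sees features informed by neighboring frames.
```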
In developing treatments for cardiovascular diseases, short-axis cine MRI has been used as a standard technique for understanding the global structural and functional characteristics of the heart, e.g., ventricle dimensions, stroke volume, and ejection fraction. To conduct an accurate assessment, heart structures need to be segmented from the cine MRI images with high precision, which can be a laborious task when performed manually. Herein a fully automatic framework is proposed for the segmentation of the left ventricle from slices of short-axis cine MRI scans of porcine subjects using a deep learning approach. For training the deep learning models, which generally requires a large set of data, a public database of human cine MRI scans is used. Experiments on the 3,150 cine slices of 7 porcine subjects show that, comparing the automatic and manual segmentations, the mean slice-wise Dice coefficient is about 0.930, the point-to-curve error is 1.07 mm, and the mean slice-wise Hausdorff distance is around 3.70 mm, which demonstrates the accuracy and robustness of the proposed inter-species translational approach.
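The slice-wise metrics reported above can be computed as in the following sketch; the boundary point sets and pixel spacing are assumed inputs.

```python
# Sketch of the slice-wise evaluation metrics (Dice and Hausdorff).
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """a, b: binary masks for one slice."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hausdorff_mm(pts_a, pts_b, spacing_mm=1.0):
    # Symmetric Hausdorff distance between two contours in millimeters;
    # pts_* are (n, 2) arrays of boundary pixel coordinates.
    d = max(directed_hausdorff(pts_a, pts_b)[0],
            directed_hausdorff(pts_b, pts_a)[0])
    return d * spacing_mm
```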
Antong Chen, Ashleigh Bone, Catherine D. Hines, Belma Dogdas, Tamara Montgomery, Maria Michener, Christopher Winkelmann, Soheil Ghafurian, Laura Lubbers, John Renger, Ansuman Bagchi, Jason Uslaner, Colena Johnson, Hatim Zariwala
Intracranial microdialysis is used for sampling neurochemicals and large peptides, along with their metabolites, from the interstitial fluid (ISF) of the brain. The ability to perform this in nonhuman primates (NHPs), e.g., rhesus macaques, could improve the prediction of the pharmacokinetic (PK) and pharmacodynamic (PD) action of drugs in humans. However, microdialysis in rhesus brains is not as routinely performed as in rodents. One challenge is that precise intracranial probe placement in NHP brains is difficult due to the richness of the anatomical structure and the variability of brain size and shape across animals. Also, repeatable and reproducible ISF sampling from the same animal is highly desirable when combined with cognitive behaviors or other longitudinal study endpoints. Toward that end, we have developed a semi-automatic, flexible neurosurgical method employing MR and CT imaging to (a) derive coordinates for permanent guide cannula placement in mid-brain structures and (b) fabricate a customized recording chamber, implanted above the skull, for enclosing and safeguarding access to the cannula for repeated experiments. To place the intracranial guide cannula in each subject, the entry points in the skull and the depth in the brain were derived using co-registered MR and CT images. The anterior/posterior (A/P) and medial/lateral (M/L) rotation in the pose of the animal was corrected in the 3D image to appropriately represent the pose used in the stereotactic frame. An array of implanted fiducial markers was used to transform stereotactic coordinates to the images. The recording chamber was custom-fabricated using computer-aided design (CAD) such that it would fit the contours of the individual skull with minimal error. The chamber also helped guide the cannula through the entry points and down a trajectory into the depth of the brain. We validated our method in four animals, and our results indicate an average cannula placement error of 1.20 ± 0.68 mm from the targeted positions. The approach employed here for coordinate derivation, surgical implantation, and post-implant validation is built on traditional access to surgical and imaging methods without the need for intra-operative imaging. The validation of our method lends support to its wider application in most nonhuman primate laboratories with onsite MR and CT imaging capabilities.
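Mapping stereotactic-frame coordinates into image space from an implanted fiducial array is, at its core, a least-squares rigid fit between corresponding point sets; a standard Kabsch-style sketch is shown below (the source does not specify the exact algorithm used, so treat this as one common choice).

```python
# Sketch of a fiducial-based rigid fit: image ~= R @ frame + t.
import numpy as np

def rigid_fit(frame_pts, image_pts):
    """frame_pts, image_pts: (n, 3) corresponding fiducial positions."""
    cf, ci = frame_pts.mean(0), image_pts.mean(0)
    H = (frame_pts - cf).T @ (image_pts - ci)      # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                             # proper rotation, det = +1
    return R, ci - R @ cf

def to_image_space(target_frame_xyz, R, t):
    """Map a stereotactic-frame target (e.g., cannula tip) into the image."""
    return R @ np.asarray(target_frame_xyz) + t
```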
Intracranial delivery of recombinant DNA and neurochemical analysis in nonhuman primates (NHPs) require precise targeting of various brain structures via imaging-derived coordinates in stereotactic surgeries. To attain targeting precision, surgical planning needs to be done on preoperative three-dimensional (3D) CT and/or MR images in which the animal's head is fixed in a pose identical to the pose used during the stereotactic surgery. Matching the image to the pose in the stereotactic frame can be done manually by detecting key anatomical landmarks on the 3D MR and CT images, such as the ear canals and the ear-bar zero position. This is not only time-intensive but also prone to error due to the varying initial poses in the images, which affects both landmark detection and rotation estimation. We introduce a fast, reproducible, and semi-automatic method to detect the stereotactic coordinate system in the image and correct the pose. The method begins with a rigid registration of the subject images to an atlas and proceeds to detect the anatomical landmarks through a sequence of optimization, deformable registration, and multimodal registration algorithms. The results showed precision similar to that of manual pose correction (a maximum difference of 1.71 in average in-plane rotation).
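The initial subject-to-atlas rigid registration could be set up as in the following SimpleITK sketch; the metric, optimizer, and parameter values are illustrative choices, not necessarily those used in the paper.

```python
# Sketch of an initial rigid (6-DOF) registration of subject to atlas.
import SimpleITK as sitk

def rigid_to_atlas(subject, atlas):
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
    reg.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    reg.SetInterpolator(sitk.sitkLinear)
    init = sitk.CenteredTransformInitializer(
        atlas, subject, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)
    reg.SetInitialTransform(init, inPlace=False)
    return reg.Execute(atlas, subject)   # transform mapping subject -> atlas
```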
In vivo gene delivery in the central nervous system of nonhuman primates (NHPs) is an important approach for gene therapy and for developing animal models of human disease. To achieve more accurate delivery of genetic probes, precise stereotactic targeting of brain structures is required. However, even with assistance from multi-modality 3D imaging techniques (e.g., MR and CT), precise targeting is often challenging due to difficulties in identifying deep brain structures, e.g., the striatum, which consists of multiple substructures, and the nucleus basalis of Meynert (NBM), which often lack clear boundaries relative to supporting anatomical landmarks. Here we demonstrate a 3D-image-based intracranial stereotactic approach applied toward reproducible intracranial targeting of the bilateral NBM and striatum of rhesus macaques. For the targeting, we discuss the feasibility of an atlas-based automatic approach. Delineated originally on a high-resolution 3D histology-MR atlas set, the NBM and the striatum are located on the MR image of a rhesus subject through affine and nonrigid registrations. The atlas-based targeting of the NBM was compared with targeting conducted manually by an experienced neuroscientist. Based on the targeting, the trajectories and entry points for delivering the genetic probes to the targets were established on the CT images of the subject after rigid registration. The accuracy of the targeting was assessed quantitatively by comparing NBM locations obtained automatically and manually, and demonstrated qualitatively via post mortem analysis of slices labeled via Evans Blue infusion and immunohistochemistry.
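Once the affine plus nonrigid transform from atlas to subject is recovered, the atlas delineations are carried over by resampling; a short SimpleITK sketch of this label-propagation step follows (the transform object is assumed to come from a registration such as the one above).

```python
# Sketch of propagating atlas delineations (NBM, striatum) onto a subject MR.
import SimpleITK as sitk

def propagate_labels(atlas_labels, subject_mr, atlas_to_subject_tx):
    # Nearest-neighbor interpolation keeps integer label values intact.
    return sitk.Resample(atlas_labels, subject_mr, atlas_to_subject_tx,
                         sitk.sitkNearestNeighbor, 0,
                         atlas_labels.GetPixelID())
```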
High-throughput micro-CT imaging has been used in our laboratory to evaluate fetal skeletal morphology in developmental toxicology studies. Currently, the volume-rendered skeletal images are visually inspected, and observed abnormalities are reported for compounds in development. To improve the efficiency and reduce the human error of this evaluation, we implemented a framework to automate the process. The framework starts by dividing the skull into regions of interest and then measuring various geometrical characteristics. Normal/abnormal classification of the bone segments is performed by identifying statistical outliers. In pilot experiments using rabbit fetal skulls, the majority of skeletal abnormalities were detected successfully in this manner. However, some shape-based abnormalities are relatively subtle and thereby difficult to identify using geometrical features alone. To address this problem, we introduced a model-based approach and applied it to the squamosal bone. We provide details of this active shape model (ASM) strategy for the identification of squamosal abnormalities and show that the method improved the sensitivity of detecting squamosal-related abnormalities from 0.48 to 0.92.
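The statistical-outlier step can be illustrated with a robust z-score over geometric features; the cutoff and the use of median/MAD statistics below are illustrative choices, since the source does not specify the exact outlier test.

```python
# Sketch of outlier-based normal/abnormal classification of a bone segment.
import numpy as np

def is_abnormal(features, normal_features, z_thresh=3.0):
    """features: (d,) geometric feature vector for one bone segment;
    normal_features: (n, d) matrix from control (normal) fetuses."""
    med = np.median(normal_features, axis=0)
    mad = np.median(np.abs(normal_features - med), axis=0) + 1e-9
    z = 0.6745 * np.abs(features - med) / mad   # robust z-score per feature
    return bool(np.any(z > z_thresh))           # flag any out-of-range feature
```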
Automatic segmentation of the parotid glands in head and neck CT images for IMRT planning has drawn attention in recent years. Although previous approaches have achieved substantial success by reaching high overall volume-wise accuracy, suboptimal segmentations are observed on the interior boundary of the gland, where the contrast is poor against the adjacent muscle groups. Herein we propose to use a constrained active shape model with landmark uncertainty to improve the segmentation in this area. Results obtained using this method are compared with results obtained using a regular active shape model through a leave-one-out experiment.
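The landmark-uncertainty idea amounts to a weighted least-squares projection of candidate boundary points onto the shape modes, so that unreliable landmarks (e.g., at the low-contrast muscle interface) count less; a sketch under that interpretation follows (the exact weighting scheme used in the paper is not shown here).

```python
# Sketch of a constrained ASM fit with per-landmark uncertainty weights.
import numpy as np

def constrained_shape_fit(y, mean_shape, P, weights):
    """y, mean_shape: (3n,) stacked landmark coordinates;
    P: (3n, m) shape eigenvector matrix; weights: (3n,) confidences."""
    W = np.diag(weights)                  # low weight = uncertain landmark
    A = P.T @ W @ P
    b = P.T @ W @ (y - mean_shape)
    coeffs = np.linalg.solve(A, b)        # weighted projection onto modes
    return mean_shape + P @ coeffs        # shape constrained to the model
```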
Segmenting the thyroid gland in head and neck CT images for IMRT treatment planning is of great importance. In this work, we evaluate and compare multi-atlas methods for segmenting this structure. The methods we evaluate range from using a single average atlas representative of the population to selecting one atlas based on three similarity measures. We also compare ways to combine segmentation results obtained with several atlases, i.e., majority voting and STAPLE, a commonly used method for combining multiple segmentations. We show that the best results are obtained when several atlases are combined. We also show that, with our data sets, STAPLE does not lead to the best results.
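Majority-vote label fusion is the simpler of the two combination schemes and reduces to a few lines; STAPLE would be applied to the same per-atlas masks (e.g., via SimpleITK's STAPLEImageFilter).

```python
# Sketch of majority-vote fusion of per-atlas binary segmentations.
import numpy as np

def majority_vote(masks):
    """masks: list of binary arrays, one per registered atlas."""
    votes = np.sum(masks, axis=0)
    # A voxel is labeled when at least half the atlases agree
    # (ties count as foreground here).
    return votes >= (len(masks) / 2.0)
```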
Segmenting the lymph node regions in head and neck CT images has been a challenging topic in the area of medical image segmentation. The method proposed herein implements an atlas-based technique constrained by an active shape model (ASM) to segment the level II, III, and IV lymph nodes as one structure. A leave-one-out evaluation study performed on 15 data sets shows that the results obtained with this technique are better than those obtained with a pure atlas-based segmentation method, in particular in regions of poor contrast.
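For completeness, the leave-one-out protocol used in this and the preceding evaluations follows a simple pattern; segment and dice below are stand-ins for the atlas+ASM pipeline and the overlap metric.

```python
# Skeleton of a leave-one-out evaluation over 15 data sets.
def leave_one_out(datasets, segment, dice):
    scores = []
    for i, test_case in enumerate(datasets):
        atlases = datasets[:i] + datasets[i + 1:]   # remaining cases as atlases
        pred = segment(test_case.image, atlases)
        scores.append(dice(pred, test_case.ground_truth))
    return scores
```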