Breast density assessment is an important part of breast cancer risk assessment, as density is known to correlate with risk. Mammograms are typically assessed for density by multiple expert readers; however, interobserver variability can be high. Meanwhile, automatic breast density assessment tools are becoming more prevalent, particularly those based on artificial intelligence. We evaluate one such method against expert readers. A cohort of 1329 women attending screening was used to compare agreement between two expert readers drawn from a pool of 19, and between a single such reader and a deep learning based model. Whilst the mean differences for the two experiments were statistically similar, the limits of agreement between the AI method and a single reader were substantially narrower, at +SD 21 (95% CI: 20.07, 22.13) and -SD 22 (95% CI: -22.95, -20.90), against +SD 31 (95% CI: 28.91, 33.09) and -SD 28 (95% CI: -30.09, -25.91) between two expert readers. Additionally, the absolute intraclass correlation coefficients (two-way random, multiple measures) were 0.86 (95% CI: 0.85, 0.88) between the AI method and a reader and 0.77 (95% CI: 0.75, 0.80) between the two readers, a statistically significant difference. Our AI-driven breast density assessment tool therefore has better inter-observer agreement with a randomly selected expert reader than two expert readers (drawn from a pool) have with one another. Additionally, the automatic method has similar inter-view agreement to experts and maintains consistency across density quartiles. Deep learning enabled density methods can offer a solution to the reader bias issue and provide consistent density scores.
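As a concrete illustration of the two agreement statistics reported above, the sketch below computes Bland-Altman limits of agreement (mean difference ± 1.96 SD) and a two-way random, multiple-measures intraclass correlation coefficient, ICC(2,k), following Shrout and Fleiss. This is a minimal Python sketch on synthetic scores, not the study's code; the 0-100 density scale and noise levels are illustrative assumptions.

```python
import numpy as np

def limits_of_agreement(a, b):
    """Bland-Altman mean difference and +/- 1.96 SD limits of agreement."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    mean_diff = diff.mean()
    sd = diff.std(ddof=1)
    return mean_diff, mean_diff + 1.96 * sd, mean_diff - 1.96 * sd

def icc_2k(ratings):
    """ICC(2,k) from an (n subjects x k raters) matrix, via two-way ANOVA
    mean squares (Shrout & Fleiss)."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # subjects
    ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # raters
    sse = ((x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + grand) ** 2).sum()
    ms_err = sse / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

# Illustrative use with synthetic scores on a 0-100 density scale:
rng = np.random.default_rng(0)
truth = rng.uniform(0, 100, size=1329)
reader = truth + rng.normal(0, 8, size=1329)
ai = truth + rng.normal(0, 6, size=1329)
print(limits_of_agreement(ai, reader))
print(icc_2k(np.column_stack([ai, reader])))
```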
The prevention and early detection of breast cancer hinge on precise prediction of individual breast cancer risk. Whilst well-established clinical risk factors can be used to stratify the population into risk groups, the addition of genetic information and breast density has been shown to improve prediction. Deep learning based approaches have been shown to automatically extract complex information from images. However, this is a challenging area of research, partly due to the lack of data within the field, and there is therefore scope for novel approaches. Our method uses Multiple Instance Learning in tandem with attention to make accurate, short-term risk predictions from full-sized mammograms taken prior to the detection of cancer. This approach ensures that small features such as calcifications are not lost in a downsizing process and that the whole mammogram is analysed effectively. An attention pooling mechanism is designed to highlight patches of increased importance and improve performance. We also use transfer learning to exploit a rich source of screen-detected cancers, evaluating whether a model trained to detect cancers in mammograms also allows us to predict risk in prior images. Our model achieves an AUC of 0.620 (0.585, 0.657) in cancer-free screening mammograms of women who went on to have a screen-detected or interval cancer between 5 and 55 months later, including when adjusting for common breast cancer risk factors. Additionally, our model is able to discriminate interval cancers at an AUC of 0.638 (0.572, 0.703), highlighting the potential for such a model to be used alongside national screening programmes.
Accurate prediction of individual breast cancer risk paves the way for personalised prevention and early detection. Whilst well-established clinical risk factors can be used to stratify the population into risk groups, the addition of genetic information and breast density has been shown to improve prediction. Machine learning enabled automatic risk prediction provides key advantages over existing methods, such as the ability to extract more complex information from mammograms. However, this is a challenging area of research, partly due to the lack of data within the field, and there is therefore scope for novel approaches. Our method uses Multiple Instance Learning in tandem with attention to make accurate, short-term risk predictions from full-sized mammograms taken prior to the detection of cancer. This approach ensures that small features such as calcifications are not lost in a downsizing process and that the whole mammogram is analysed effectively. An attention pooling mechanism is designed to highlight patches of increased importance and improve performance. Additionally, this increases the interpretability of our model, as important patches can be shown in a saliency map. We also use transfer learning to exploit a rich source of screen-detected cancers, evaluating whether a model trained to detect cancers in mammograms also allows us to predict risk in prior images. Our model achieves an AUC of 0.635 (0.600, 0.669) in cancer-free screening mammograms of women who went on to have a screen-detected or interval cancer between 5 and 55 months later, and an AUC of 0.804 (0.777, 0.830) in screen-detected cancers.
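The attention pooling described in the two abstracts above follows the general attention-based multiple instance learning recipe (e.g. Ilse et al., 2018): patch embeddings are scored, softmax-normalised, and combined into a weighted bag representation whose weights double as a saliency map. The PyTorch sketch below shows that mechanism under assumed dimensions; it is not the authors' implementation, and the encoder producing the patch features is left abstract.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, attn_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, 1)  # risk logit for the bag

    def forward(self, instance_feats):
        # instance_feats: (num_patches, feat_dim) embeddings of one mammogram
        scores = self.attention(instance_feats)      # (num_patches, 1)
        weights = torch.softmax(scores, dim=0)       # attention over patches
        bag = (weights * instance_feats).sum(dim=0)  # weighted bag embedding
        return self.classifier(bag), weights         # logit + saliency weights

# Illustrative use: 300 patch embeddings from a (hypothetical) CNN encoder.
feats = torch.randn(300, 512)
model = AttentionMIL()
logit, attn = model(feats)  # attn can be rendered as a patch saliency map
```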
KEYWORDS: Data modeling, Education and training, Transformers, Performance modeling, Visualization, Deep learning, Visual process modeling, Multiplexing, Image processing, Diagnostics
We established a translation dataset containing pixel-wise registered H&E and multiplexed immunohistochemistry (mIHC) staining images. Deep learning models were trained to translate H&E inputs into their corresponding mIHC images. Comparison experiments were carried out to validate translation performance between the TransUNet, U-Net and pix2pix models. We also compared the impact of different loss functions on model performance. The TransUNet model achieved an SSIM score of 0.862 with L1 loss and 0.805 with L2 loss, surpassing the U-Net and pix2pix models in both settings. This demonstrates the potential benefit of the Transformer module in stain translation tasks.
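For readers wanting the mechanics, the sketch below shows the two ingredients being compared: a supervised translation step trained under an L1 versus an L2 reconstruction loss, and SSIM scoring of a predicted mIHC image against its registered ground truth. The generator is a placeholder standing in for TransUNet, U-Net or pix2pix, and `train_step`/`ssim_score` are hypothetical helper names, not the paper's code.

```python
import torch
import torch.nn as nn
from skimage.metrics import structural_similarity

l1_loss = nn.L1Loss()   # tends to preserve sharper stain detail
l2_loss = nn.MSELoss()  # tends to over-smooth high-frequency structure

def train_step(generator, optimiser, he_batch, mihc_batch, loss_fn):
    """One supervised step on pixel-wise registered H&E -> mIHC pairs."""
    optimiser.zero_grad()
    pred = generator(he_batch)
    loss = loss_fn(pred, mihc_batch)
    loss.backward()
    optimiser.step()
    return loss.item()

def ssim_score(pred, target):
    """SSIM between predicted and real mIHC images, HxWxC arrays in [0, 1]."""
    return structural_similarity(pred, target, channel_axis=-1, data_range=1.0)
```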
Mammographic density is an important risk factor for breast cancer. In recent research, percentage density assessed visually using visual analogue scales (VAS) showed stronger risk prediction than existing automated density measures, suggesting that readers may recognize relevant image features not yet captured by hand-crafted algorithms. With deep learning, it may be possible to encapsulate this knowledge in an automatic method. We have built convolutional neural networks (CNNs) to predict density VAS scores from full-field digital mammograms. The CNNs are trained using whole-image mammograms, each labeled with the average VAS score of two independent readers. Each CNN learns a mapping between mammographic appearance and VAS score so that, at test time, it can predict the VAS score for an unseen image. Networks were trained using 67,520 mammographic images from 16,968 women, and for model selection we used a dataset of 73,128 images. Two case-control sets of contralateral mammograms of screen-detected cancers and prior images of women with cancers detected subsequently, matched to controls on age, menopausal status, parity, HRT and BMI, were used to evaluate performance on breast cancer prediction. In the case-control sets, odds ratios of cancer in the highest versus lowest quintile of percentage density were 2.49 (95% CI: 1.59 to 3.96) for screen-detected cancers and 4.16 (2.53 to 6.82) for priors, with matched concordance indices of 0.587 (0.542 to 0.627) and 0.616 (0.578 to 0.655), respectively. There was no significant difference between reader VAS and predicted VAS for the prior test set (likelihood ratio chi-square, p = 0.134). Our fully automated method shows promising results for cancer risk prediction and is comparable with human performance.
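A minimal sketch of the training setup this abstract describes, with an assumed backbone (the abstract does not name an architecture; ResNet-18 is used here only for illustration): a CNN with a single linear output regresses the two readers' average VAS score from a whole mammogram under a mean-squared-error loss.

```python
import torch
import torch.nn as nn
from torchvision import models

# Assumed backbone, adapted to single-channel mammograms and scalar output.
backbone = models.resnet18(weights=None)
backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
backbone.fc = nn.Linear(backbone.fc.in_features, 1)  # scalar VAS prediction

criterion = nn.MSELoss()
optimiser = torch.optim.Adam(backbone.parameters(), lr=1e-4)

def train_step(images, reader_mean_vas):
    """images: (B, 1, H, W); reader_mean_vas: (B,) average of two readers."""
    optimiser.zero_grad()
    pred = backbone(images).squeeze(1)
    loss = criterion(pred, reader_mean_vas)
    loss.backward()
    optimiser.step()
    return loss.item()
```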
Background: Mammographic density is an important risk factor for breast cancer. Recent research demonstrated that percentage density assessed visually using Visual Analogue Scales (VAS) showed stronger risk prediction than existing automated density measures, suggesting readers may recognise relevant image features not yet captured by automated methods.
Method: We have built convolutional neural networks (CNN) to predict VAS scores from full-field digital mammograms. The CNNs are trained using whole-image mammograms, each labelled with the average VAS score of two independent readers. They learn a mapping between mammographic appearance and VAS score so that at test time, they can predict VAS score for an unseen image. Networks were trained using 67,520 mammographic images from 16,968 women, and tested on a large dataset of 73,128 images and case-control sets of contralateral mammograms of screen-detected cancers and prior images of women with cancers detected subsequently, matched to controls on age, menopausal status, parity, HRT and BMI.
Results: Pearson's correlation coefficient between readers' and predicted VAS in the large dataset was 0.79 per mammogram and 0.83 per woman (averaging over all views). In the case-control sets, odds ratios of cancer in the highest vs lowest quintile of percentage density were 3.07 (95% CI: 1.97 to 4.77) for the screen-detected cancers and 3.52 (2.22 to 5.58) for the priors, with matched concordance indices of 0.59 (0.55 to 0.64) and 0.61 (0.58 to 0.65) respectively.
Conclusion: Our fully automated method demonstrated encouraging results which compare well with existing methods, including VAS.
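Both versions of this abstract report odds ratios of cancer in the highest versus lowest quintile of percentage density. Below is a minimal sketch of that computation; defining the quintile boundaries on the controls is a common convention assumed here, not stated in the abstract.

```python
import numpy as np

def top_vs_bottom_quintile_or(density, is_case):
    """Odds ratio of cancer in the highest vs lowest density quintile."""
    density, is_case = np.asarray(density, float), np.asarray(is_case, bool)
    cuts = np.quantile(density[~is_case], [0.2, 0.8])  # quintile bounds from controls
    low, high = density <= cuts[0], density >= cuts[1]
    a = (high & is_case).sum()   # cases, top quintile
    b = (high & ~is_case).sum()  # controls, top quintile
    c = (low & is_case).sum()    # cases, bottom quintile
    d = (low & ~is_case).sum()   # controls, bottom quintile
    return (a * d) / (b * c)
```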
Anna Maria Tsakiroglou, Sophie Fitzpatrick, Lilli Nelson, Catharine West, Kim Linton, Kang Zeng, Garry Ashton, Sue Astley, Richard Byers, Martin Fergie
Background: Observing the spatial pattern of tumour infiltrating lymphocytes in follicular lymphoma can lead to the development of promising novel biomarkers for survival prognosis. We have developed “Hypothesised Interactions Distribution” (HID) analysis to quantify the spatial heterogeneity of cell type interactions between lymphocytes in the tumour microenvironment. HID features were extracted to train a machine learning model for survival prediction, and their performance was compared to other architectural biomarkers. The scalability of the method was examined by observing interactions between cell types identified using 6-plexed immunofluorescent staining. Methods: Two follicular lymphoma datasets were used in this study: a tissue microarray with cores from patients, stained with CD69, CD3 and FOXP3 using multiplexed brightfield immunohistochemistry, and a second tissue microarray, stained with PD1, PDL1, CD4, FOXP3, CD68 and CD8 using immunofluorescence. Spectral deconvolution, nuclei segmentation and cell type classification were carried out, followed by extraction of features based on cell type interaction probabilities. Random Forest classifiers were built to assign patients to groups of differing overall survival, and the performance of the HID features was assessed. Results: HID features constructed over a range of interaction distances were found to significantly predict overall survival in both datasets (p = 0.0363, p = 0.0077). Interactions of specific phenotype pairs correlated with unfavourable prognosis could be identified, such as interactions between CD3+FOXP3+ cells and CD3+CD69+ cells. Conclusion: Further validation of HID demonstrates its potential for the development of clinical biomarkers in follicular lymphoma.
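A minimal sketch of the kind of spatial feature HID builds on: for each ordered pair of phenotypes, estimate the probability that a cell of one type has a neighbour of the other type within an interaction distance r, then train a Random Forest on the per-patient feature vectors. The phenotype dictionary, distance grid and forest size here are illustrative assumptions, not the published pipeline.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestClassifier

def interaction_probability(coords_a, coords_b, r):
    """Fraction of type-A cells with at least one type-B cell within r microns."""
    if len(coords_a) == 0 or len(coords_b) == 0:
        return 0.0
    tree = cKDTree(coords_b)
    counts = tree.query_ball_point(coords_a, r, return_length=True)
    return float(np.mean(counts > 0))

def hid_features(cells_by_type, radii=(10, 20, 30, 40, 50)):
    """cells_by_type: dict phenotype -> (n, 2) coordinates for one tissue core."""
    types = sorted(cells_by_type)
    return np.array([interaction_probability(cells_by_type[a], cells_by_type[b], r)
                     for a in types for b in types if a != b for r in radii])

# Per-patient feature vectors can then predict overall-survival group, e.g.:
# clf = RandomForestClassifier(n_estimators=500).fit(X_features, survival_group)
```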