Open Access
E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation
Diana Lin, Kareem A. Wahid, Benjamin E. Nelms, Renjie He, Mohamed A. Naser, Simon Duke, Michael V. Sherer, John P. Christodouleas, Abdallah S. R. Mohamed, Michael Cislo, James D. Murphy, Clifton D. Fuller, Erin F. Gillespie
Abstract

Purpose

Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement.

Approach

Participants who contoured ≥1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE_nonexpert ROIs were evaluated against STAPLE_expert contours using the Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE_nonexpert and STAPLE_expert. To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the IODSC_expert.
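The comparison described above can be sketched in code. The following is a minimal illustration, not the authors' pipeline: it computes the DSC between binary masks, and uses a simple majority vote as a crude stand-in for the full STAPLE algorithm (which additionally weights each rater by an estimated reliability; an implementation is available, e.g., via SimpleITK's STAPLE filter). The toy masks are invented for demonstration.

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

def majority_vote(masks: list) -> np.ndarray:
    """Simple consensus: a voxel is foreground if >50% of raters marked it.
    (A crude stand-in for STAPLE, which also estimates rater reliability.)"""
    stack = np.stack([m.astype(bool) for m in masks])
    return stack.mean(axis=0) > 0.5

# Toy example: three slightly shifted "nonexpert" masks vs. an "expert" mask.
expert = np.zeros((10, 10), dtype=bool)
expert[2:8, 2:8] = True
nonexperts = [np.roll(expert, shift, axis=0) for shift in (-1, 0, 1)]
consensus = majority_vote(nonexperts)
print(f"DSC(consensus, expert) = {dice(consensus, expert):.3f}")
```

In this toy case the vertical shifts cancel under majority voting, so the consensus recovers the expert mask exactly; the study's analysis applies the same DSC comparison to real multiobserver contours.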

Results

For all cases, the DSC values for STAPLE_nonexpert versus STAPLE_expert were higher than the comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to achieve the IODSC_expert acceptability criteria ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI.

Conclusions

Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating that of experts, suggesting that nonexpert segmentations are feasible, cost-effective inputs for AI development.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Diana Lin, Kareem A. Wahid, Benjamin E. Nelms, Renjie He, Mohamed A. Naser, Simon Duke, Michael V. Sherer, John P. Christodouleas, Abdallah S. R. Mohamed, Michael Cislo, James D. Murphy, Clifton D. Fuller, and Erin F. Gillespie "E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation," Journal of Medical Imaging 10(S1), S11903 (8 February 2023). https://doi.org/10.1117/1.JMI.10.S1.S11903
Received: 3 October 2022; Accepted: 2 January 2023; Published: 8 February 2023
KEYWORDS: Image segmentation, Breast, Radiation oncology, Diseases and disorders, Radiotherapy, Head, Lymph nodes
