Graham, Simon; Minhas, Fayyaz; Bilal, Mohsin; Ali, Mahmoud; Tsang, Yee Wah; Eastwood, Mark; Wahab, Noorul; Jahanifar, Mostafa; Hero, Emily; Dodd, Katherine; Sahota, Harvir; Wu, Shaobin; Lu, Wenqi; Azam, Ayesha; Benes, Ksenija; Nimir, Mohammed; Hewitt, Katherine; Bhalerao, Abhir; Robinson, Andrew; Eldaly, Hesham; Raza, Shan E Ahmed; Gopalakrishnan, Kishore; Snead, David and Rajpoot, Nasir (2023) Screening of normal endoscopic large bowel biopsies with interpretable graph learning: a retrospective study. Gut, 72 (9). pp. 1709-1721. ISSN 0017-5749
Published version available under a Creative Commons Attribution Non-Commercial licence.
Abstract
Objective To develop an interpretable artificial intelligence algorithm to rule out normal large bowel endoscopic biopsies, saving pathologist resources and helping with early diagnosis.

Design A graph neural network incorporating pathologist domain knowledge was developed to classify 6591 whole-slide images (WSIs) of endoscopic large bowel biopsies from 3291 patients (approximately 54% female, 46% male) as normal or abnormal (non-neoplastic and neoplastic) using clinically driven interpretable features. One UK National Health Service (NHS) site was used for model training and internal validation. External validation was conducted on data from two other NHS sites and one Portuguese site.

Results Model training and internal validation were performed on 5054 WSIs from 2080 patients, yielding an area under the receiver operating characteristic curve (AUC-ROC) of 0.98 (SD=0.004) and an area under the precision-recall curve (AUC-PR) of 0.98 (SD=0.003). The performance of the model, named Interpretable Gland-Graphs using a Neural Aggregator (IGUANA), was consistent in testing on 1537 WSIs from 1211 patients across three independent external datasets, with mean AUC-ROC=0.97 (SD=0.007) and AUC-PR=0.97 (SD=0.005). At a high-sensitivity operating threshold of 99%, the model can reduce the number of normal slides a pathologist must review by approximately 55%. IGUANA also provides an explainable output that highlights potential abnormalities in a WSI as a heatmap, along with numerical values associating the model's prediction with various histological features.

Conclusion The model achieved consistently high accuracy, showing its potential for optimising increasingly scarce pathologist resources. Explainable predictions can guide pathologists in their diagnostic decision-making and help boost their confidence in the algorithm, paving the way for its future clinical adoption.
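The abstract describes the classifier only at a high level: a graph neural network over gland-level features, pooled to a slide-level prediction. A minimal sketch of graph-level WSI classification in that spirit, assuming PyTorch Geometric, per-gland node features, and a simple two-layer GCN with mean pooling (IGUANA's actual architecture and feature set are not reproduced here):

```python
# Hypothetical sketch of a gland-graph classifier; illustrative only.
# Node features stand in for the paper's interpretable gland-level features.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GlandGraphNet(torch.nn.Module):
    def __init__(self, num_features: int, hidden: int = 64):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = torch.nn.Linear(hidden, 1)  # normal-vs-abnormal logit

    def forward(self, x, edge_index, batch):
        # Message passing over the gland graph, then slide-level pooling.
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)  # one vector per WSI
        return self.head(x).squeeze(-1)
```

The headline rule-out figure follows from fixing an operating point at 99% sensitivity and measuring how many normal slides score below it. A minimal sketch of that calculation with scikit-learn, on simulated scores (the paper's own thresholds and data are not reproduced):

```python
# Hypothetical sketch: choosing a rule-out operating point at 99% sensitivity.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Simulated slide-level abnormality scores: 1 = abnormal, 0 = normal.
y_true = rng.integers(0, 2, size=2000)
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.2, size=2000), 0, 1)

fpr, tpr, thresholds = roc_curve(y_true, y_score)
idx = np.argmax(tpr >= 0.99)  # first threshold reaching 99% sensitivity
specificity = 1 - fpr[idx]
print(f"threshold={thresholds[idx]:.3f}, sensitivity={tpr[idx]:.3f}, "
      f"normal slides ruled out={specificity:.1%}")
```

Under this framing, the specificity at the 99%-sensitivity threshold is exactly the fraction of normal slides that would not need pathologist review.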