Lung Cancer Biomarker Detection: AI Pathology Model

Real-Time EGFR Prediction from Whole Slide Images Using Deep Learning

Abstract

Accurate and timely determination of EGFR (Epidermal Growth Factor Receptor) mutation status is crucial for guiding treatment decisions in Non-Small Cell Lung Cancer (NSCLC). Current molecular testing methods, while accurate, can have significant turnaround times. Here, we present EAGLE, a deep learning model capable of predicting EGFR mutation status directly from whole slide images (WSIs) of hematoxylin and eosin (H&E) stained tissue. EAGLE achieves high accuracy, comparable to standard molecular testing, and enables real-time prediction, considerably accelerating the clinical workflow. We demonstrate the successful implementation of EAGLE within a clinical pipeline at Memorial Sloan Kettering Cancer Center (MSKCC), showcasing its potential for immediate impact on patient care.

1. Introduction

Lung cancer is the leading cause of cancer-related mortality worldwide, and Non-Small Cell Lung Cancer (NSCLC) accounts for the majority of cases. Targeted therapies, particularly those directed against Epidermal Growth Factor Receptor (EGFR) mutations, have dramatically improved outcomes for a significant subset of NSCLC patients.¹ However, the effectiveness of these therapies hinges on accurate and timely identification of EGFR mutations. Current standard-of-care testing relies on molecular assays such as polymerase chain reaction (PCR) or next-generation sequencing (NGS), which, while highly accurate, typically require several days to weeks to return results.² This delay can postpone the initiation of targeted therapy, potentially affecting patient prognosis.

The wealth of morphological information contained within whole slide images (WSIs) of H&E stained tissue presents an opportunity to develop computational methods for rapid, predictive biomarker assessment. Deep learning, particularly convolutional neural networks (CNNs), has shown remarkable success in analyzing medical images and extracting clinically relevant features.³,⁴ Here, we introduce EAGLE (EGFR assessment via Gradient-guided Learning Engine), a deep learning model designed for real-time EGFR mutation prediction directly from WSIs, integrated into a clinical pipeline for accelerated biomarker assessment.

2. Results

2.1. EAGLE Model Architecture and Training

EAGLE is a deep learning model built upon a transformer-based architecture, optimized for analyzing high-resolution WSI data. To address the computational challenges of processing such large images, we implemented a parallelized encoding strategy. The encoding process is distributed across 23 NVIDIA GPUs, each processing 96 tissue patches, which divides the GPU memory burden among the devices. The encoded patches are then aggregated on a separate GPU using Gradient-guided Model Aggregation (GMA) to produce the final classification loss. During backpropagation, gradients are distributed back to each encoding process for synchronized updates. We used 16-bit floating-point precision during patch encoding to enable larger batch sizes and accelerate training.
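
As a rough illustration of the sharding described above, the following Python sketch splits one training step's patches across the encoder GPUs and then reassembles the per-worker outputs in order for the aggregation step. The worker count and the number of patches per worker come from the text. The function names (`shard_patches`, `gather_embeddings`), the contiguous assignment of patches to workers, and the use of plain lists in place of GPU tensors are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

NUM_ENCODER_WORKERS = 23  # encoder GPUs, as stated in the text
PATCHES_PER_WORKER = 96   # tissue patches per encoder GPU, as stated in the text


def shard_patches(
    patch_ids: Sequence[int],
    num_workers: int = NUM_ENCODER_WORKERS,
    per_worker: int = PATCHES_PER_WORKER,
) -> List[List[int]]:
    """Split one step's patches into contiguous shards, one per worker.

    Contiguous assignment is an assumption; the text does not say how
    patches are assigned to GPUs. If fewer patches than the total capacity
    are supplied, the trailing shards are shorter or empty.
    """
    capacity = num_workers * per_worker
    if len(patch_ids) > capacity:
        raise ValueError(f"{len(patch_ids)} patches exceed capacity {capacity}")
    return [
        list(patch_ids[i * per_worker:(i + 1) * per_worker])
        for i in range(num_workers)
    ]


def gather_embeddings(shards: List[List[int]]) -> List[int]:
    """Concatenate per-worker outputs in worker order for the aggregator."""
    return [item for shard in shards for item in shard]


if __name__ == "__main__":
    shards = shard_patches(range(NUM_ENCODER_WORKERS * PATCHES_PER_WORKER))
    print(len(shards), "shards of", len(shards[0]), "patches")
```

Under these figures, a single step covers at most 23 × 96 = 2,208 patches, with the 24th GPU reserved for aggregation.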

The model was trained on 24 NVIDIA H100-80GB GPUs for 20 epochs, completing in approximately 9.28 hours. At inference, EAGLE can operate efficiently on a single NVIDIA RTX 3090 GPU with 24 GB of memory. The median processing time per slide during inference is 68 seconds, demonstrating its suitability for real-time clinical application. Deployment on lower-capacity hardware is possible, albeit with a trade-off between memory consumption and inference speed.
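
The compute figures above imply a few derived quantities, worked out in the short Python sketch below. The GPU count, training time, epoch count, and median per-slide time are taken from the text, and the monthly case volume of 90-110 is the one reported in Section 2.2. Treating the median as a per-slide cost and assuming one slide per case are simplifications made here for illustration, so the monthly figure is a rough estimate rather than a measured result.

```python
NUM_GPUS = 24                   # training GPUs, from the text
TRAIN_HOURS = 9.28              # wall-clock training time, from the text
EPOCHS = 20                     # training epochs, from the text
MEDIAN_SECONDS_PER_SLIDE = 68   # median inference time per slide, from the text

# Total training budget in GPU-hours (about 222.7).
gpu_hours = NUM_GPUS * TRAIN_HOURS

# Average wall-clock time per epoch in minutes (about 27.8).
minutes_per_epoch = TRAIN_HOURS * 60 / EPOCHS


def monthly_inference_hours(
    cases_per_month: int,
    seconds_per_slide: float = MEDIAN_SECONDS_PER_SLIDE,
) -> float:
    """Rough single-GPU inference time per month, assuming one slide per case."""
    return cases_per_month * seconds_per_slide / 3600


if __name__ == "__main__":
    print(f"GPU-hours for training: {gpu_hours:.2f}")
    print(f"Minutes per epoch: {minutes_per_epoch:.2f}")
    for cases in (90, 110):
        print(f"{cases} cases/month: {monthly_inference_hours(cases):.2f} h")
```

On these assumptions, a month of cases amounts to roughly two hours of single-GPU inference.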

2.2. Clinical Pipeline Implementation and Performance

We integrated EAGLE into a real-time clinical pipeline at MSKCC, designed to identify and process WSIs from primary LUAD (Lung Adenocarcinoma) specimens for EGFR prediction (Figure 3). MSKCC processes 90-110 NSCLC cases requiring EGFR testing each month. The pipeline uses two automated "watcher" applications running on an hourly cadence: one to identify newly scanned slides and another to identify lung cancer cases sent for molecular analysis. When a slide is matched to a relevant case, it is automatically transferred to the GPU compute infrastructure for immediate EAGLE inference. When multiple slides are available for a case, the first scanned WSI is used.
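
The matching step performed by the two watchers can be sketched as follows. This is a minimal illustration under stated assumptions: the `ScannedSlide` record, its field names, and the `select_slides_for_inference` function are hypothetical, and the sketch omits the primary-LUAD filtering and the transfer to the GPU infrastructure. It shows only the rule stated in the text, namely that a slide qualifies when its case has been sent for molecular analysis and that the earliest scanned slide is chosen per case.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class ScannedSlide:
    """Hypothetical record emitted by the slide watcher."""
    slide_id: str
    case_id: str
    scanned_at: datetime


def select_slides_for_inference(
    new_slides: Iterable[ScannedSlide],
    molecular_case_ids: Set[str],
) -> Dict[str, ScannedSlide]:
    """Map each case sent for molecular analysis to its earliest scanned slide."""
    selected: Dict[str, ScannedSlide] = {}
    for slide in new_slides:
        if slide.case_id not in molecular_case_ids:
            continue  # case was not sent for molecular analysis
        current = selected.get(slide.case_id)
        if current is None or slide.scanned_at < current.scanned_at:
            selected[slide.case_id] = slide
    return selected


if __name__ == "__main__":
    # Placeholder identifiers and timestamps, for illustration only.
    slides = [
        ScannedSlide("S2", "CASE-A", datetime(2024, 1, 1, 10, 0)),
        ScannedSlide("S1", "CASE-A", datetime(2024, 1, 1, 9, 0)),
        ScannedSlide("S3", "CASE-B", datetime(2024, 1, 1, 9, 30)),
    ]
    picked = select_slides_for_inference(slides, {"CASE-A"})
    print({case: s.slide_id for case, s in picked.items()})
```

In an hourly job, the output of a function like this would be the queue of slides handed to inference.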

During a silent trial, we collected data on EAGLE predictions, rapid test results, and MSK-IMPACT (MSK's comprehensive genomic profiling platform) results. Timestamps for four key events (rapid test accessioning, EAGLE prediction generation, rapid test result availability, and MSK-IMPACT result availability) were recorded to assess the performance of the EAGLE-assisted screening pipeline against the standard rapid test workflow. This allowed a direct comparison of turnaround times and of the potential for accelerated clinical decision-making.
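
One way to turn the four recorded timestamps into a turnaround comparison is sketched below. The `CaseTimeline` record, its field names, and the two helper functions are hypothetical, and the timestamps in the usage example are placeholders rather than trial data. The sketch defines EAGLE's lead time as the interval between the EAGLE prediction and the rapid test result, which is one plausible reading of the comparison described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Iterable


@dataclass(frozen=True)
class CaseTimeline:
    """Hypothetical per-case record of the four timestamps named in the text."""
    rapid_test_accessioned: datetime
    eagle_predicted: datetime
    rapid_test_resulted: datetime
    impact_resulted: datetime


def eagle_lead_time(t: CaseTimeline) -> timedelta:
    """Time by which the EAGLE prediction precedes the rapid test result."""
    return t.rapid_test_resulted - t.eagle_predicted


def median_lead_hours(timelines: Iterable[CaseTimeline]) -> float:
    """Median EAGLE lead time over a set of cases, in hours."""
    return median(eagle_lead_time(t).total_seconds() / 3600 for t in timelines)


if __name__ == "__main__":
    # Placeholder timestamps for illustration only; not trial data.
    start = datetime(2024, 1, 1, 8, 0)
    example = CaseTimeline(
        rapid_test_accessioned=start,
        eagle_predicted=start + timedelta(hours=1),
        rapid_test_resulted=start + timedelta(hours=21),
        impact_resulted=start + timedelta(days=14),
    )
    print(f"Lead time: {median_lead_hours([example]):.1f} h")
```

The same structure extends to other intervals, such as the time from accessioning to the MSK-IMPACT result.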

2.3. Software and Reporting Summary

The EAGLE model was developed using PyTorch (v.2.1.1+cu121), and the associated software pipelines were built with Python (v.3.8
