Revolutionizing Breast Cancer TIL Scoring: AI Models vs. Real-World Validation Challenges

by Catherine Williams - Chief Editor

AI Models Improve TIL Assessments in Breast Cancer

Researchers recently studied AI-based models for assessing tumor-infiltrating lymphocytes (TILs) in triple-negative breast cancer (TNBC). The goal was to see if these models provided better prognostic and analytical performance compared to manual methods.

Recent advancements in breast cancer treatments highlight the need for effective biomarker-based risk stratification. Accurate evaluation of TILs, specifically stromal TILs (sTILs), is crucial for treatment decisions. While guidelines exist for scoring TILs, variability between assessors remains an issue. This variability drives demand for automated systems that can improve scoring accuracy and deepen understanding of tumor-immune interactions.

Study Overview

The researchers examined 10 AI TIL assessment models using tissue samples from 106 women diagnosed with primary invasive TNBC. They divided these samples into training and testing sets, and used clinical data from a separate cohort of 215 patients for external validation.

The team built automated TIL scoring algorithms on the QuPath platform, employing three main classifier types: neural network (NN), K-nearest neighbor (KNN), and random trees (RT). They implemented a "human-in-the-loop" approach, in which models were iteratively retrained on accurate manual annotations of around 450 cells per image.
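QuPath classifiers are configured within its own environment, so as a rough illustration of the K-nearest-neighbor idea applied to cell classification, here is a minimal Python sketch using scikit-learn. All feature names and values below are invented for illustration, not taken from the study:

```python
# Illustrative sketch: labeling cells as lymphocyte vs. other from simple
# morphology features with K-nearest neighbors. Values are hypothetical.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical per-cell features: [nucleus area (um^2), circularity, stain intensity]
X_train = np.array([
    [18.0, 0.95, 0.80],  # small, round, dark nucleus -> lymphocyte
    [20.0, 0.92, 0.75],
    [55.0, 0.60, 0.40],  # large, irregular nucleus   -> tumor/other
    [60.0, 0.55, 0.35],
])
y_train = np.array(["lymphocyte", "lymphocyte", "other", "other"])

# Fit a KNN classifier; each new cell is labeled by majority vote
# among its 3 nearest annotated neighbors in feature space.
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Classify two new, unlabeled cells
X_new = np.array([[19.0, 0.93, 0.78], [58.0, 0.58, 0.38]])
print(clf.predict(X_new))  # -> ['lymphocyte' 'other']
```

The human-in-the-loop step amounts to correcting the model's labels on new cells and appending them to `X_train`/`y_train` before refitting.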

Researchers also included three advanced deep-learning models—CellViT, HoverNet, and Abousamra's model—to compare different techniques. Digital TIL scores were calculated using the easTILs metric for all models except Abousamra's, which used its own scoring method.
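The easTILs metric is, as commonly described, an area-based score: the summed area of detected lymphocytes expressed as a percentage of the stromal area. Treat that formula as an assumption here; this sketch only illustrates the shape of such a calculation:

```python
# Sketch of an easTILs-style digital TIL score: total lymphocyte area
# divided by stromal area, as a percentage. The exact published formula
# may differ; this is an illustrative assumption.
def eastils_score(lymphocyte_areas_mm2, stromal_area_mm2):
    """Percentage of the stromal compartment occupied by lymphocytes."""
    if stromal_area_mm2 <= 0:
        raise ValueError("stromal area must be positive")
    return 100.0 * sum(lymphocyte_areas_mm2) / stromal_area_mm2

# Toy example: 2000 detected lymphocytes of ~30 um^2 (3e-5 mm^2) each
# within 1.5 mm^2 of stroma.
areas = [3e-5] * 2000
print(round(eastils_score(areas, 1.5), 2))  # -> 4.0
```

Because the score is continuous rather than binned, it lends itself to the continuous prognostic analysis mentioned in the findings below.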

Findings

Seven QuPath-based models were developed; among them, RT10 and KNN10 produced varied TIL score distributions, while the NN models' distributions were more consistent. The deep-learning models CellViT and HoverNet produced narrower score distributions that diverged from manual scores.

RT10 correlated best with manual scores in internal validation, while KNN10 showed the weakest correlation. Correlations were generally lower in external validation, suggesting that larger training samples do not necessarily guarantee improved performance. Nonetheless, all models indicated potential prognostic value, especially when TIL scores were analyzed as continuous variables.
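Agreement between a model's digital scores and a pathologist's manual scores is typically summarized with a rank correlation. As a hedged illustration (the score values below are invented, not from the study), a Spearman correlation can be computed with SciPy:

```python
# Illustrative sketch: comparing digital vs. manual sTIL scores with
# Spearman rank correlation. All score values here are invented.
from scipy.stats import spearmanr

manual  = [5, 10, 20, 35, 50, 70, 80]   # pathologist sTIL %
digital = [4, 12, 18, 30, 55, 50, 85]   # model-derived score, one rank swapped

rho, p_value = spearmanr(manual, digital)
print(f"Spearman rho = {rho:.2f}")  # -> Spearman rho = 0.96
```

A drop in rho between internal and external cohorts, as reported here, is a standard signal that a model has not fully generalized beyond its training distribution.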

Conclusions

The research evaluated AI-derived TIL scores as prognostic markers for invasive disease-free survival (IDFS). Results showed moderate to good analytical performance across AI models, while highlighting a performance gap between internal and external cohorts.

For practical use in clinical settings, AI models must provide clear and understandable results. Large, diverse datasets are essential to standardize and validate these models, ensuring they can effectively support clinical decisions.
