Large language models (LLMs) are showing promise as tools to assist physicians, particularly in complex and specialized areas of medicine. A recent study published in the Chinese Medical Journal (Chin Med J (Engl)) investigated the potential of LLMs to support cardiologists in assessing rare, life-threatening cardiac diseases, specifically inherited cardiomyopathies.
The research team, based at the Xijing Hospital of Digestive Diseases, Fourth Military Medical University in China, focused on a critical gap in care: access to subspecialty cardiac expertise. Inherited cardiomyopathies, conditions affecting the heart muscle, often require specialized knowledge for accurate diagnosis and management. However, access to these specialists is limited globally, leading to potential delays in care and increased risk for patients.
To address this, researchers curated a real-world clinical dataset of patients suspected of having inherited cardiomyopathies. They then evaluated how LLMs could assist general cardiologists in assessing these cases. The study employed a randomized controlled trial design in which blinded subspecialist cardiologists reviewed clinical assessments produced by general cardiologists, both with and without the aid of an LLM called AMIE.
The results were encouraging. Subspecialists demonstrated a clear preference for the LLM-assisted assessments, finding that they contained fewer clinically significant errors (an 11.2% reduction) and omitted important content less often (a 19.6% reduction), while maintaining equivalent clinical reasoning quality.
General cardiologists who used AMIE also reported significant benefits. More than half (57.0%) found the system helpful in their assessments, and it helped them avoid missing clinically significant findings in 93.5% of cases. The LLM also reduced assessment time in over half of the cases (50.5%).
According to the authors, the study highlights the potential of LLMs to bridge unmet needs in genetic cardiovascular disease, and possibly in cardiac care more broadly. This is particularly relevant given the cardiology workforce crisis identified by the American College of Cardiology, where access to subspecialty care is a growing concern. The researchers noted that despite the presence of centers of excellence in states such as California and New York, 27 states in the US currently lack access to these specialized services.
This lack of access contributes to a significant number of undiagnosed cases. Estimates suggest that more than 60% of patients with hypertrophic cardiomyopathy (HCM) in the US remain undiagnosed, with rates likely higher globally. The potential for sudden cardiac death, a leading cause of mortality in young adults, further underscores the importance of timely and accurate diagnosis.
While the study demonstrates the feasibility of using LLMs in this setting, the researchers acknowledge limitations. The LLM system was limited to analyzing text-based reports rather than directly interpreting imaging data. They also noted that, while infrequent, the LLM occasionally generated hallucinations (factually incorrect statements), which were typically identified by the cardiologists when reviewing the assessments.
The study also builds on previous research in the field. A study published on medRxiv found that physicians using LLMs achieved a mean diagnostic reasoning score of 71.4% per case, compared with 42.6% for those without LLM assistance. This suggests that LLMs can significantly improve diagnostic accuracy, particularly when used as a support tool for clinicians.
The researchers emphasize that LLMs should not be viewed as replacements for physicians, but rather as tools to augment their expertise. They highlight the importance of physician oversight to identify and correct any inaccuracies generated by the LLM. The study also underscores the need for further research to evaluate the long-term impact of LLMs on patient outcomes and to address potential biases and inequities in access to care.
The study’s authors made their data openly available, hoping to facilitate further research and validation of their findings. They also created and validated a 10-domain evaluation rubric that can be used in future studies assessing LLM performance in clinical settings. This represents a significant step toward establishing a standardized framework for evaluating the clinical utility of these emerging technologies.
The findings suggest a future where LLMs can play a crucial role in expanding access to specialized medical care, particularly in areas where there is a shortage of experts. However, careful implementation and ongoing evaluation are essential to ensure that these tools are used safely and effectively to improve patient care.
