ChatGPT Health Triage Accuracy and Safety Risks in Urgent Care
A study published May 7, 2026, in Nature Medicine indicates that ChatGPT Health exhibits significant reliability gaps when providing triage advice, particularly at the clinical extremes of patient urgency.
The research found that while the AI tool demonstrated high accuracy when assessing moderately urgent conditions, it frequently failed to correctly categorize cases that were either very mild or critical emergencies.
These findings highlight substantial safety risks associated with the use of AI for urgent care decision-making, as the tool’s performance becomes unstable when dealing with the most and least severe medical scenarios.
Triage Accuracy and Clinical Extremes
The study identified a pattern of inconsistency in how ChatGPT Health handles different levels of medical urgency. While the tool is effective for cases of moderate urgency, its utility diminishes at both ends of the clinical spectrum.
In instances involving mild conditions, the AI frequently overtriaged patients. Overtriaging occurs when a tool suggests a higher level of care or more urgent intervention than is medically necessary for the symptoms presented.
Conversely, the tool frequently undertriaged emergencies. Undertriaging is a critical failure where life-threatening or urgent conditions are categorized as less severe, potentially leading to delays in essential medical treatment.
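The two error types defined above can be made concrete with a small sketch. It assumes each case has a clinician-assigned reference urgency and a tool-assigned urgency on the same ordinal scale. The four-level scale, the function names, and the sample cases are hypothetical and chosen for illustration only; they are not taken from the study.

```python
from collections import Counter

# Hypothetical ordinal urgency scale (not from the study): higher means more urgent.
LEVELS = {"self_care": 0, "routine": 1, "urgent": 2, "emergency": 3}


def classify(reference: str, predicted: str) -> str:
    """Label one triage decision relative to the clinician reference level."""
    diff = LEVELS[predicted] - LEVELS[reference]
    if diff > 0:
        return "overtriage"   # tool recommended more urgent care than needed
    if diff < 0:
        return "undertriage"  # tool recommended less urgent care than needed
    return "correct"


def triage_rates(pairs):
    """Return the share of correct, overtriaged, and undertriaged cases."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one case is required")
    counts = Counter(classify(ref, pred) for ref, pred in pairs)
    total = len(pairs)
    return {key: counts[key] / total
            for key in ("correct", "overtriage", "undertriage")}


# Made-up (reference, predicted) pairs for illustration; not study data.
cases = [
    ("self_care", "urgent"),   # mild case sent to urgent care: overtriage
    ("urgent", "urgent"),      # moderate case handled correctly
    ("emergency", "urgent"),   # emergency downgraded: undertriage
    ("routine", "routine"),    # correct
]
rates = triage_rates(cases)
print(rates)
```

Reporting the two rates separately, rather than folding them into a single accuracy figure, keeps the more dangerous failure (undertriage of emergencies) visible even when overall accuracy looks acceptable.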
Implications for Patient Safety
The tendency to undertriage emergencies poses the most immediate risk to patient safety. When an AI tool fails to recognize the severity of an emergency, it may discourage a patient from seeking immediate professional medical help, which can result in adverse health outcomes.
The issue of overtriaging mild cases also presents systemic challenges. Frequent overtriaging can lead to an increase in unnecessary emergency room visits and urgent care consultations, contributing to healthcare provider burnout and the inefficient allocation of limited medical resources.
Because these errors cluster at the clinical extremes, the reliability of AI-driven triage is questionable wherever high-stakes decisions are made without human oversight.
Broader Medical Context
The findings from the Nature Medicine report have implications across a wide range of medical disciplines. Accurate symptom triage is fundamental to managing many kinds of health crises, including those involving infectious, neurological, and metabolic diseases.
Accurately identifying signs and symptoms matters not only in general biomedicine but also in specialized areas such as cancer research and molecular medicine, where early and correct triage can significantly affect the course of patient care.
The study suggests that current AI triage tools may not be reliable enough for independent use in urgent care settings, given these safety risks at the extremes of patient urgency.
As policymakers continue to evaluate the integration of artificial intelligence into clinical workflows, the results underscore the need for rigorous validation so that AI tools do not compromise patient safety by miscategorizing urgent medical needs.
