AI Chatbots Struggle with Accuracy on Complex Theranostics Questions
AI Chatbots Show Promise, but Fall Short on Complex Cancer Treatment Questions
New research presented at the Radiological Society of North America (RSNA) annual meeting highlights the limitations of AI chatbots like ChatGPT-4 and Gemini when tackling complex medical topics.
While these chatbots excel at providing easy-to-understand answers to simple questions, they struggle with the nuances of intricate medical procedures like lutetium-177 (Lu-177) prostate-specific membrane antigen (PSMA)-617 therapy, also known as Pluvicto.
“They generally struggled with pre- and post-therapy instructions and also side effects,” said Dr. Gokce Belge Bilgin, a radiologist at the Mayo Clinic in Rochester, Minnesota, who led the study. “As a notable example, both claim that the most common side effect is allergic reaction, which is not that common in clinical practice.”
The rise of AI chatbots like ChatGPT and Gemini has revolutionized how people access information, including medical advice. However, their accuracy and reliability on complex subjects remain a concern.
Testing the Limits: 12 Key Questions
To assess the capabilities of these chatbots, researchers posed 12 common patient questions about Pluvicto therapy, a targeted radiation treatment for advanced prostate cancer. The questions covered topics ranging from how the therapy works to its potential side effects, cost, and availability.
Accuracy vs. Readability: A Balancing Act
ChatGPT-4 demonstrated slightly higher accuracy compared to Gemini, scoring 2.95 out of 4 on a standardized scale. However, Gemini’s responses were deemed more readable, achieving a score of 2.79 out of 3, compared to ChatGPT-4’s 2.94. Both chatbots scored similarly for conciseness.
Misleading Information: A Cause for Concern
Alarmingly, 17% of ChatGPT-4’s responses and 29% of Gemini’s responses were categorized as incorrect or partially correct. Gemini’s answers contained significantly more misleading information than ChatGPT-4’s, raising concerns about the potential for patient misunderstanding and poor decision-making.
Ethical Considerations: A New Frontier
Dr. Bilgin emphasized the need for caution when relying on AI chatbots for complex medical information. “AI chatbots like ChatGPT and Gemini are a promising step forward in making medical information more accessible,” she said. “However, they are not yet reliable enough to stand alone for complex topics, and there is still work to be done to ensure accuracy, safety, and trust.”
The study also highlights the emerging ethical challenges surrounding AI in healthcare, including patient data privacy and potential medicolegal issues. As AI technology continues to evolve, it is crucial to prioritize accuracy, transparency, and patient safety.
