philip lelyveld The world of entertainment technology

14 Apr 2026

AI remains lacking in clinical reasoning abilities, according to study of 21 large language models

Despite increasing use of artificial intelligence (AI) in health care, a new study led by Mass General Brigham researchers from the MESH Incubator shows that generative AI models continue to fall short in their clinical reasoning capabilities. ...

By asking 21 different large language models (LLMs) to play doctor in a series of clinical scenarios, the researchers showed that LLMs often fail at navigating diagnostic workups and coming up with a testable list of potential or "differential" diagnoses.

Though all tested LLMs arrived at a correct final diagnosis more than 90% of the time when provided with all pertinent information in a patient's case, they consistently performed poorly at the earlier, reasoning-driven steps of the diagnostic process, according to results published in JAMA Network Open.

"Despite continued improvements, off-the-shelf large language models are not ready for unsupervised clinical-grade deployment," said corresponding author Marc Succi, MD, executive director of the MESH Incubator at Mass General Brigham. ...

See the full story here: https://medicalxpress.com/news/2026-04-ai-lacking-clinical-abilities-large.html
