A New Milestone for ChatGPT
GPT-4 rivals doctors in many medical exams - and beats them in psychiatry
ChatGPT has hit a new milestone: It can now pass the Israeli medical board exams for four of the five core areas - and in many cases, it matches or even surpasses doctors.
Uriel Katz and colleagues compared the performance of GPT-3.5 and GPT-4 to that of 849 resident doctors who took the exams in 2022. The graph shows the key findings. (Note that OB/GYN refers to obstetrics and gynecology.)
As you can see, GPT-4 matched the meat doctors in general surgery and internal medicine, and beat them in psychiatry. It did worse overall in pediatrics and OB/GYN, but still did better than a large fraction of the humans. And although it didn’t pass the OB/GYN exam (65% is the pass mark), it came close.
Perhaps most importantly, though, GPT-4 did vastly better than its predecessor, GPT-3.5, suggesting rapid progress.
Here’s an excerpt from the paper, which was published on April 12 in The New England Journal of Medicine AI.
“This work shows a leap in the advancement of AI-based technology, in which LLMs reached physician-level performance on medical board examinations. This reaffirms previous works that have shown the evolutionary progress and enhanced performance from GPT-3.5 (November 2021) to GPT-4 (March 2022). Compared with the performance of 849 physicians who took the board medical examination in 2022, GPT-4 performed above the median physician in internal medicine and psychiatry and ranked above a considerable fraction of physicians in other disciplines. GPT-3.5 was inferior to nearly all physicians in every specialty except psychiatry. GPT-4 performance reached passing rate levels in all five core medical domains, whereas GPT-3.5 consistently fell short of the passing score across all disciplines…
“Given the maturity of this rapidly improving technology, the adoption of LLMs in clinical medical practice is imminent. Although the integration of AI poses challenges, the potential synergy between AI and physicians holds tremendous promise. This juncture represents an opportunity to reshape physician training and capabilities in tandem with the advancements in AI.”
You can read the full paper here.
Follow Steve on Twitter/X.