
The Carlat Psychiatry Podcast

Clear, engaging, and practical updates on clinical psychiatry.
Earn CME for listening to the podcast with a Podcast CME Subscription.
Listen for free here or on Apple Podcasts and Spotify.
General Psychiatry

AI in Practice

June 9, 2025
Kellie Newsome, PMHNP and Chris Aiken, MD

Chris Aiken, MD and Kellie Newsome, PMHNP have disclosed no relevant financial or other interests in any commercial companies pertaining to this educational activity.

Artificial intelligence holds a wealth of medical information. Here's how to use it wisely.


Publication Date: 06/09/2025

Duration: 18 minutes, 48 seconds


Transcript:

KELLIE NEWSOME: Artificial Intelligence is beating doctors at diagnosis. Today, we’ll show you how to add AI to your practice and how to use it wisely. Welcome to The Carlat Psychiatry Podcast, keeping psychiatry honest since 2003.

CHRIS AIKEN: I’m Chris Aiken, the editor-in-chief of The Carlat Psychiatry Report.

KELLIE NEWSOME: And I’m Kellie Newsome, a psychiatric NP and a dedicated reader of every issue.

CHRIS AIKEN: I remember a time in the late 1990s when the debate was expertise versus evidence. The question went something like this: Would a patient with leukemia get better care in the hands of a seasoned oncologist or a novice with an encyclopedic knowledge of every randomized controlled oncology trial? Today, we've invented such a savant. Its name is ChatGPT, and in a recent trial, it outperformed all the experts.

KELLIE NEWSOME: Ethan Goh and colleagues compared the diagnostic accuracy of 50 physicians against ChatGPT. Each was given six cases to diagnose. The physicians were correct 75% of the time, but ChatGPT outperformed them with 92% accuracy. The AI was version 4 of ChatGPT.

CHRIS AIKEN: OK, so the robot is beating the experts, but let's not give up hope. What if we combined the two, so that doctors used ChatGPT to enhance their diagnostic skills? The study looked at that too, but surprisingly, the doctors did no better when they used ChatGPT as a helper.

KELLIE NEWSOME: Those results – from 2024 – were confirmed this year in a similar study of 20 physicians. This one didn't use ChatGPT. It used AMIE, a specialized AI that is souped up for diagnostic acumen. AMIE was nearly twice as accurate as physicians working on their own – 59% vs 34%. 34% is pretty low, but these were difficult cases taken from the New England Journal of Medicine's clinicopathological conferences. In this study, the doctors did perform better with assistance. Allowing them to use textbooks and the internet nudged their accuracy from 34% to 36% – a small gain – and they improved more when they were allowed to use AI: all the way to 52%, a big jump, but still shy of the AI on its own at 59%.

CHRIS AIKEN: So far, AI is beating medical doctors at diagnosis. But what about psychiatry? Last month, Chang-Bae Bang and colleagues at Yonsei University in Korea tested the psychiatric diagnostic skills of four AI models. ChatGPT version 4 performed the best, so they tested it against psychiatric residents. Again, ChatGPT was the winner, with an F1 accuracy score of 63%, compared to 47% for the residents. And here's the hopeful part – when the residents were allowed to use ChatGPT, their accuracy improved to 60%. Close, but still shy of AI on its own at 63%.
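
For readers who don't use it often: the F1 score is the harmonic mean of precision and recall, a single figure that balances false alarms against missed diagnoses. Here is a minimal sketch of the arithmetic in Python; the counts are invented for illustration and are not data from the Bang study.

    # F1 score: harmonic mean of precision and recall.
    # These counts are made up for illustration; they are not from the study.
    true_positives = 63    # diagnoses the model asserted and got right
    false_positives = 20   # diagnoses it asserted incorrectly
    false_negatives = 17   # correct diagnoses it missed

    precision = true_positives / (true_positives + false_positives)  # ~0.76
    recall = true_positives / (true_positives + false_negatives)     # ~0.79
    f1 = 2 * precision * recall / (precision + recall)               # ~0.77

    print(f"precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")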

KELLIE NEWSOME: Let’s recap. AI is at least 30% more accurate than doctors in making diagnoses, but doctors can improve if they use AI in their practice, and in this episode, we’ll show you some practical steps to make that leap. But first, the CME quiz for this episode:

TRUE or FALSE: A 1972 experiment showed that doctors are less likely than the general public to be fooled by a lecture that presented nonsensical content in an authoritative manner.

CHRIS AIKEN: Most of these studies were done in general medicine, where the knowledge required to make a diagnosis is much deeper than in psychiatry, where diagnoses are largely symptom-driven. We're a few hundred years behind the rest of medicine in our diagnostic sophistication, still relying mainly on symptoms – but even when it comes to those DSM-based symptom checklists, we don't perform so well against AI, or against a paper-and-pencil screener. A few months ago, we reported in this podcast that a self-rated screener for bipolar disorder (a simple paper check-box form) had greater diagnostic accuracy than a routine clinical interview – that is, not a structured clinical interview. Part of the reason for this problem is bias. Psychiatric diagnoses cut a little closer to our humanity than medical diagnoses – they involve behaviors, they resemble personality traits, the stuff that makes us who we are – and if we are empathic toward our patients, and I hope we are, we don't want to see them as having schizophrenia or bipolar disorder, two diagnoses that are often underdiagnosed. And our patients don't want to see themselves that way. Better to have anxiety or ADHD.

KELLIE NEWSOME: Or autism. You know, I'm seeing a lot of psychotic patients these days diagnosed with autism. But psychosis is a symptom of schizophrenia – and, historically, autism was too. There's a lot of overlap between autism and the negative symptoms of schizophrenia, so much so that 100 years ago Eugen Bleuler came up with the four A's to describe schizophrenia: Association, as in loose associations – illogical and bizarre thoughts; Affect, as in flat affect; Ambivalence, as in conflicting emotions and indecision; and Autism. Bleuler used the term autism before it was popularized as a developmental disorder, but his description was not too far off from our modern idea of autism – a tendency to withdraw from reality and become preoccupied with one's own inner world.

CHRIS AIKEN: I agree. Schizophrenia, bipolar, substance use… most stigmatizing disorders are vastly underdiagnosed, so if you're seeing someone with autism who periodically develops florid psychosis, please reconsider schizophrenia. ChatGPT would not make such an error. Physicians are human, and we make typical human mistakes. For example, we defer to authority. In 1972, Myron Fox gave a lecture to a group of psychiatry residents on the mathematics of human behavior. The lecture was full of meaningless jargon, but Dr. Fox delivered it in an engaging, authoritative style. Afterward, the psychiatry residents gave the lecture high marks. They thought he was brilliant, but the lecture was complete gibberish, and Dr. Fox was a hired actor. It was all an experiment on the power of authority.

KELLIE NEWSOME: Yes, but the authority bias probably seeps into AI as well – with so many editorials and opinion pieces by prominent psychiatrists in its databank. And AI is probably not free of the commercial bias that slants the literature toward that which the industry decides to fund.

CHRIS AIKEN: Or publication bias – the tendency to publish positive results in place of negative ones.

KELLIE NEWSOME: But here are two biases AI is probably free of: AI doesn't feel the need to cover its legal ass or run a profitable practice. I hope. Here's another bias we are prone to, and I wonder where AI stands on this: conformity and the need to fit in. In 1951, psychologist Solomon Asch showed how the pressure to conform can cause people to give answers that are clearly wrong. Asch showed student volunteers at Swarthmore College a card with a line on it. Then he asked them to compare that line to three others and identify which was the same length. The answer was obvious, but when the students were surrounded by paid actors who gave the wrong answer, they doubted their own judgment and joined the chorus of nonsense. Similar experiments have shown that medical professionals, too, are prone to this conformity bias.

CHRIS AIKEN: We can overcome these biases by reaching out for help, and that's what this new crop of AI studies shows. Physicians perform better when they use textbooks and literature searches, and even better when they use AI. AI is basically an interactive textbook. It's informed by millions of pages of medical texts, and we interact with that vast knowledge base. And if we learn how to do it right, we'll be light years ahead of those who came before us. Let's look at who came before us because this is exciting stuff. We are on the verge of an information revolution in medicine, but it's not the first one.

KELLIE NEWSOME: OK, here's a brief history of revolutions in medical knowledge. 1440: Gutenberg invents the printing press. About 200 years later, in 1665, the first English-language scientific journal appears, from the Royal Society of London. That journal included medical papers, giving physicians access to experimental results for the first time. The first journal dedicated to medicine, Medical Essays and Observations, came out of Edinburgh in 1731.

CHRIS AIKEN: The Enlightenment was an exciting time, but let's be real here. I doubt that most physicians were relying on these journals for their practice, which was still driven by authority – specifically, the authority of Galen – and most medical arguments were judged by their alignment with Galen's work rather than by any empirical evidence. That is how we got bloodletting.

KELLIE NEWSOME: Still, medical journals continued to grow, even as Galen’s great bloodletting cure held sway. The New England Journal of Medicine arrived in 1812, followed by the Lancet in 1823, but as more journals appeared, we ran into a problem. How the heck could you find anything in them? If you had a clinical question like – does hypertension lead to heart disease – you’d have to read through 100 years of journals to find an answer. The conundrum inspired the next major leap – in 1879, the National Library of Medicine in the US started indexing medical journals so their contents could be searched.

CHRIS AIKEN: Still, I doubt many country doctors were going to the Library of Medicine. In 1966, the index was computerized as MEDLARS, and searching got easier in the 1970s when the National Library of Medicine launched Medline – short for MEDLARS Online – allowing physicians to search the computerized database through computer terminals at select libraries.

KELLIE NEWSOME: These libraries were very select. There were a few in California, a few in the Midwest, and dozens in the Northeast. So, very few physicians had access to them.

CHRIS AIKEN: The explosion of medical research really took place after World War II, fueled by government spending. But this highly funded research was not making its way into practice. In 1976, Senator Ted Kennedy pointed out the need to disseminate all this knowledge, and the next year the NIH brought together physicians to review the evidence and create the first practice guidelines. But even these experts didn't have access to all the journal articles, and while the guidelines were an important step forward, they were filled with gaping holes. Simply being an expert and regularly reading through the journals was not enough to get all the information you needed.

KELLIE NEWSOME: In 1996, the Library of Medicine democratized those rarefied search terminals by unleashing PubMed online, allowing anyone with internet access to search the medical journals. At first it was just the abstracts, but in 2000 they started adding full-text journals to the database through PubMed Central. Today, 30% of all indexed articles are available as full text without a paywall.

CHRIS AIKEN: If you search the journals, you should know about these two distinct databases. PubMed searches for keywords in the abstract and title, but PubMed Central broadens the search by looking through the full text of the article. There are 27 million articles on PubMed, and 9 million have full text.
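
To make the distinction concrete, here is a minimal sketch that runs one query against both databases through NCBI's public E-utilities interface. The query term is our own illustrative phrasing of a clinical question, not something taken from the episode's workflow.

    import json
    import urllib.parse
    import urllib.request

    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    def count_hits(db: str, term: str) -> int:
        """Return how many records match `term` in the given NCBI database."""
        url = f"{EUTILS}?db={db}&term={urllib.parse.quote(term)}&retmode=json"
        with urllib.request.urlopen(url) as resp:
            return int(json.load(resp)["esearchresult"]["count"])

    term = "lamotrigine AND alopecia"  # illustrative query of our own
    print("PubMed (titles and abstracts):", count_hits("pubmed", term))
    print("PubMed Central (full text):", count_hits("pmc", term))

Changing the db parameter from pubmed to pmc is all it takes to widen the search from abstracts to full text.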

KELLIE NEWSOME: And that is where the next major leap is happening. Because the availability of all that full text allows large language models – that is, AI – to learn a lot about medicine. Even if an article isn’t available as full text, the AI bot can still read the abstract, or the article’s findings may be discussed in a review paper, with the result that most papers make their way into AI’s brain. Imagine having instant access to 27 million journal articles to help you answer any medical question. Questions like, “Does lamotrigine cause hair loss,” “Do sleep medicines improve mania,” or “Which antipsychotic is least likely to raise prolactin?”

CHRIS AIKEN: PubMed is a godsend, but the ability to search like this takes it to another level – and I don't recommend using ChatGPT to do it. While our friend Chat searches the medical literature, Chat also searches lifestyle magazines, Reddit, and other lay publications to generate answers. The bot can't tell the difference between fact and opinion, and you end up with nothing more than a misleading list of what the internet thinks. Instead, I recommend using AI systems that focus their searches on medical databases like PubMed. Here are three:

KELLIE NEWSOME: 1. Elicit AI. This does detailed and accurate searches, but it is limited in resources, so it only allows you to conduct a few searches a month, even with the paid subscription. Elicit is aimed at the researcher – it will even write systematic reviews for you. 2. Perplexity and Consensus AI. These are not medically focused but are still academic and stick to library-grade resources. They are faster than Elicit, but less detailed and precise. 3. Open Evidence. This one is free to use if you have an NPI number. It gives fast answers and focuses on the medical literature, though it's not as detailed as Elicit.

CHRIS AIKEN: This is a new field, and the products are just developing – I've only been using them for a few months. If you know of good options, write me at caiken@thecarlatreport.com. If you use it right, AI is little different from PubMed; it's just a more powerful search tool. Here's an example. A few years ago, I wrote an article with Kelvin Quiñones-Laracuente, MD, PhD, for the Carlat Report on high-dose stimulants. Like many of us, I was seeing a lot of patients present for their first visit on doses like Adderall 80 mg a day or higher. These patients didn't seem right – they spoke in choppy sentences, they had trouble making decisions, they were dysfunctional but insisted that the high dose helped. So, I set out to see what the research showed about these high doses. This was tough to find in PubMed – most subject headings and titles don't specify the doses used, so I had to pore through dozens of articles searching for any that looked at super-sized doses. It took weeks. Today, I asked these AI engines to find the answers, and they did a pretty good job. Elicit did the best – it found some of the obscure narcolepsy studies that were very revealing, showing 12 times higher rates of psychosis when the Adderall dose went up to 120 mg a day – and the other engines performed pretty well too. This would have made my Carlat article a lot faster to write. Now, a confession – I'm old school, so I'm not fully comfortable with AI doing all the work, and I would recommend the same caution for everyone. Until we know more about its pros and cons, I recommend you use AI as a search tool – let it find the medical papers for you, but read those papers yourself – double-check them – to make sure the bot isn't hallucinating. It's an extra step, but you'll be glad you still have a job to do.
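
One way to put that double-checking into practice: if an AI tool cites PubMed IDs, you can at least confirm that each PMID exists and that its title matches the claim before you read the paper. A minimal sketch, assuming the tool hands you PMIDs; the ID below is a placeholder, not a citation from the episode.

    import json
    import urllib.request

    ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

    def lookup_title(pmid: str) -> str | None:
        """Return the article title for a PMID, or None if PubMed has no such record."""
        url = f"{ESUMMARY}?db=pubmed&id={pmid}&retmode=json"
        with urllib.request.urlopen(url) as resp:
            record = json.load(resp)["result"].get(pmid, {})
        return record.get("title")

    # Placeholder PMIDs standing in for whatever the bot cited.
    for pmid in ["12345678"]:
        title = lookup_title(pmid)
        print(pmid, "->", title or "NOT FOUND - possible hallucination")

This confirms only that the citation exists, not that the paper supports the claim, so the "read the papers yourself" step still applies.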

KELLIE NEWSOME: And now a news update. The FDA plans to use AI to perform scientific reviews starting next month. They've already tested it as a pilot, and FDA commissioner Martin Makary says he was "blown away" by the results. The AI bots will not completely replace scientists but will help them avoid tedious and repetitive tasks, allowing humans to focus on more important aspects of the review process. It will also speed up approvals. As one FDA scientist said, "This allows me to perform reviews in 10 minutes that previously took 3 days." Carlat AI isn't up yet, but we have launched the first Carlat web app – it's our best-selling Medication Fact Book, and you'll get free access to the interactive online app version with your purchase. Thank you for helping us stay free of industry support.

The Carlat CME Institute is accredited by the ACCME to provide continuing medical education for physicians. Carlat CME Institute maintains responsibility for this program and its content. Carlat CME Institute designates this enduring material educational activity for a maximum of one quarter (.25) AMA PRA Category 1 Credits™. Physicians or psychologists should claim credit commensurate only with the extent of their participation in the activity.

