
Google AMIE medical AI chatbot outperforms primary care providers

Read Time 3 mins | Written by: Cole

Google’s AMIE, a research conversational diagnostic model, just outperformed primary care providers at diagnosing patients via text, as judged by patient actors and physicians.

  • 149 actors playing patients texted live with either one of 20 primary care doctors or Google's new medical LLM, AMIE
  • AMIE beat the primary care doctors on 28 out of 32 characteristics, and tied on the other four, as rated by human doctors
  • From the perspective of the "patients," the AI won on 24 of 26 scales

Here’s an overview of Google’s AMIE medical model and the study.

What is Google AMIE?

Google AMIE (Articulate Medical Intelligence Explorer) is a conversational diagnostic AI: a research system based on an LLM and optimized for diagnostic reasoning and conversations. AMIE was trained and evaluated on aspects crucial for quality in real-world clinical consultations, considering both clinicians' and patients' viewpoints.

AMIE was specifically honed for real-world healthcare dynamics, medical knowledge, and patient interviews, and it engages in nuanced diagnostic discussions.

How did AMIE outperform primary care providers?

The diagnostic conversation between patient and physician is a cornerstone of medicine. AI systems capable of diagnostic dialogue could increase the availability, accessibility, quality, and consistency of care by being useful conversational partners to clinicians and patients alike. But giving LLMs the expertise, empathy, and history-taking skill of primary care providers had been out of reach. Until now.

Google researchers designed a study in which patients would interact with AMIE the same way they would engage with a chatbot: over text.

AMIE study outline

149 trained actors playing patients texted with either a primary care doctor or AMIE, without knowing which they were talking to. Human doctors rated AMIE ahead of the primary care doctors on 28 out of 32 characteristics and tied on the other four. The patient actors rated the AI ahead on 24 of 26 scales.

The Google researchers are optimistic about the results but announced them with some cautions attached. For one, this process underestimates the value of face-to-face conversations between humans. And real-world conversations come with fairness, privacy, and health equity issues that the AI hasn’t been tested for.

You can read the full Google AMIE study here.

Want AI experts to build your medical LLM?

You could spend the next 6-18 months planning to recruit and build an AI team that knows LLMs, or you could hire Codingscape now.

We can assemble a senior LLM team for you in 4-6 weeks. It’ll be faster to get started and more cost-efficient than internal hiring, and we’ll deliver high-quality results quickly. We’ve already built a medical chatbot for women’s health and another for psychedelic-assisted therapies.

Zappos, Twilio, and Veho are just a few companies that trust us to build their software and systems with a remote-first approach.

You can schedule a time to talk with us here. No hassle, no expectations, just answers.

Cole

Cole is Codingscape's Content Marketing Strategist & Copywriter.