Testing and evaluation of generative large language models in electronic health record applications: a systematic review
Our systematic review shows that current evaluations of LLMs in real clinical settings are surprisingly narrow: most focus on radiology and decision support, while key areas such as patient communication remain largely underexplored.
Jan 13, 2026