We propose an active learning framework that leverages disagreement between large language models (GPT and DeepSeek) to selectively fine-tune GPT models with reasoning-augmented supervision, significantly improving accuracy and recall in automated biomedical literature screening.
Mar 9, 2026

Our systematic review shows that current evaluations of LLMs in real clinical settings are surprisingly narrow and mostly focused on radiology and decision support, while key areas like patient communication remain largely under-explored.
Jan 13, 2026

This study aims to improve literature screening efficiency and accuracy by developing an LLM-based approach that incorporates rule-based preprocessing, prompt engineering (including RAG), and ensemble techniques.
Nov 30, 2025

We conduct a comprehensive review of recent studies that leverage generative LLMs for EHR analysis and applications, focusing on their performance and strategies for improvement.
Aug 25, 2025

LLMs are prone to generating inaccurate or misleading interpretations of genetic variants. To address this, we propose a “precision grounding” approach that enhances LLMs with publicly available, evidence-based databases and resources identified by domain experts.
Jun 9, 2025