PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
…The following papers were recommended by the Semantic Scholar API:
HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks (2026)
Medical Reasoning with Large Language Models: A Survey and MR-Bench (2026…