PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
Abstract (AI-generated summary): PhysicianBench evaluates LLM agents on real clinical tasks requiring complex, multi-step workflows within electronic health record environments, revealing significant gaps in current agent capabilities.

We introduce…