GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
…multimodal perception is integrated as a core component of reasoning, planning, tool use, and execution, rather than as an auxiliary interface to a language model. This report summarizes the main improvements behind…
