ScreenAI: A visual language model for UI and visually-situated language understanding
…By combining the natural language capabilities of LLMs with a structured schema, we simulate a wide range of user interactions and scenarios to generate synthetic, realistic tasks. In particular, we generate three…
