ScreenAI: A visual language model for UI and visually-situated language understanding
… For QA, we use well-established benchmarks in the multimodal and document understanding field, such as ChartQA, DocVQA, Multi-page DocVQA, InfographicVQA, OCR-VQA, WebSRC and ScreenQA. …