metadata
datasets:
- shuaishuaicdp/GUI-World
language:
- en
license: cc-by-4.0
metrics:
- bertscore
- LLM-as-a-Judge
tags:
- gui
- agent
pipeline_tag: video-text-to-text
GUI-Vid is the first VideoLLM with strong GUI-oriented capabilities, retrained on the GUI-World dataset.
It was presented in the paper "GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents".
See the GitHub repository for instructions on using GUI-Vid for GUI understanding tasks.