GUI-Vid / README.md
metadata
datasets:
  - shuaishuaicdp/GUI-World
language:
  - en
license: cc-by-4.0
metrics:
  - bertscore
  - LLM-as-a-Judge
tags:
  - gui
  - agent
pipeline_tag: video-text-to-text

GUI-Vid is the first VideoLLM with powerful GUI-oriented capabilities, retrained on the GUI-World dataset.

It was presented in the paper "GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents".

See the GitHub repository for how to use GUI-Vid for GUI understanding tasks.