Add link to paper
#2 opened by nielsr (HF Staff)
README.md CHANGED
@@ -13,4 +13,8 @@ tags:
 pipeline_tag: video-text-to-text
 ---
 
-This is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io).
+This is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io).
+
+It was presented in [GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents](https://huggingface.co/papers/2406.10819).
+
+See [Github](https://github.com/Dongping-Chen/GUI-World) for how to use GUI-Vid for GUI understanding tasks.