QiushiSun nielsr HF Staff committed on
Commit d1ec7c7 · verified · 1 Parent(s): 383c798

Add link to paper and project page (#2)

- Add link to paper and project page (c063759109ad82f4cf78e9be3e89ea9400ea8105)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +12 -4
README.md CHANGED
@@ -1,12 +1,14 @@
  ---
- license: apache-2.0
- library_name: transformers
  base_model: OpenGVLab/InternVL2-4B
  pipeline_tag: image-text-to-text
  ---

  # OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

  <div align="center">

  [\[🏠Homepage\]](https://qiushisun.github.io/OS-Genesis-Home/) [\[💻Code\]](https://github.com/OS-Copilot/OS-Genesis) [\[📝Paper\]](https://arxiv.org/abs/2412.19723) [\[🤗Models\]](https://huggingface.co/collections/OS-Copilot/os-genesis-6768d4b6fffc431dbf624c2d)[\[🤗Data\]](https://huggingface.co/collections/OS-Copilot/os-genesis-6768d4b6fffc431dbf624c2d)
@@ -137,9 +139,15 @@ tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast
  pixel_values = load_image('./web_dfacd48d-d2c2-492f-b94c-41e6a34ea99f.png', max_num=6).to(torch.bfloat16).cuda()
  generation_config = dict(max_new_tokens=1024, do_sample=True)

- question = "<image>\nYou are a GUI task expert, I will provide you with a high-level instruction, an action history, a screenshot with its corresponding accessibility tree.\n High-level instruction: {high_level_instruction}\n Action history: {action_history}\n Accessibility tree: {a11y_tree}\n Please generate the low-level thought and action for the next step."
  response, history = model.chat(tokenizer, pixel_values, question, generation_config, history=None, return_history=True)
- print(f'User: {question}\nAssistant: {response}')
  ```

  ---
  base_model: OpenGVLab/InternVL2-4B
+ library_name: transformers
+ license: apache-2.0
  pipeline_tag: image-text-to-text
  ---

  # OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

+ This model is described in the paper [OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis](https://huggingface.co/papers/2412.19723)
+
  <div align="center">

  [\[🏠Homepage\]](https://qiushisun.github.io/OS-Genesis-Home/) [\[💻Code\]](https://github.com/OS-Copilot/OS-Genesis) [\[📝Paper\]](https://arxiv.org/abs/2412.19723) [\[🤗Models\]](https://huggingface.co/collections/OS-Copilot/os-genesis-6768d4b6fffc431dbf624c2d)[\[🤗Data\]](https://huggingface.co/collections/OS-Copilot/os-genesis-6768d4b6fffc431dbf624c2d)
 
  pixel_values = load_image('./web_dfacd48d-d2c2-492f-b94c-41e6a34ea99f.png', max_num=6).to(torch.bfloat16).cuda()
  generation_config = dict(max_new_tokens=1024, do_sample=True)

+ question = "<image>
+ You are a GUI task expert, I will provide you with a high-level instruction, an action history, a screenshot with its corresponding accessibility tree.
+ High-level instruction: {high_level_instruction}
+ Action history: {action_history}
+ Accessibility tree: {a11y_tree}
+ Please generate the low-level thought and action for the next step."
  response, history = model.chat(tokenizer, pixel_values, question, generation_config, history=None, return_history=True)
+ print(f'User: {question}
+ Assistant: {response}')
  ```

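Review note: the added `question` and `print` lines in this diff put raw line breaks inside single-quoted Python strings, which Python rejects as written, whereas the removed lines used `\n` escapes. Below is a minimal sketch of one way to keep the prompt valid, assuming the `{high_level_instruction}`, `{action_history}` and `{a11y_tree}` placeholders are meant to be filled with `str.format`. The helper name `build_question` and the sample field values are hypothetical and do not come from the README; the spacing follows the removed one-line version.

```python
# Sketch only: rebuild the prompt with explicit "\n" escapes so it stays a
# valid Python string. Adjacent string literals are concatenated by Python.
PROMPT_TEMPLATE = (
    "<image>\n"
    "You are a GUI task expert, I will provide you with a high-level instruction, "
    "an action history, a screenshot with its corresponding accessibility tree.\n"
    " High-level instruction: {high_level_instruction}\n"
    " Action history: {action_history}\n"
    " Accessibility tree: {a11y_tree}\n"
    " Please generate the low-level thought and action for the next step."
)


def build_question(high_level_instruction: str, action_history: str, a11y_tree: str) -> str:
    """Fill the three placeholders (hypothetical helper, not part of the README)."""
    return PROMPT_TEMPLATE.format(
        high_level_instruction=high_level_instruction,
        action_history=action_history,
        a11y_tree=a11y_tree,
    )


# Hypothetical sample values, for illustration only.
question = build_question(
    high_level_instruction="Open the settings page",
    action_history="None",
    a11y_tree="[1] button 'Settings'",
)
print(question)
```

The resulting `question` can then be passed to `model.chat(...)` exactly as in the unchanged context line, and the final `print` can go back to the single-line form with a `\n` escape that the diff removed.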