Update README.md

README.md (changed)
@@ -16,7 +16,8 @@ language:
 - en
 
 ---
-This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699)
+This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699). This paper was accepted to ICCV 2023 as [PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3](https://openaccess.thecvf.com/content/ICCV2023/html/Hu_PromptCap_Prompt-Guided_Image_Captioning_for_VQA_with_GPT-3_ICCV_2023_paper.html).
+
 
 We introduce PromptCap, a captioning model that can be controlled by a natural language instruction. The instruction may contain a question that the user is interested in.
 For example: "what is the boy putting on?". PromptCap also supports generic captions, using the question "what does the image describe?"
@@ -43,7 +44,7 @@ Generate a prompt-guided caption by following:
 import torch
 from promptcap import PromptCap
 
-model = PromptCap("
+model = PromptCap("tifa-benchmark/promptcap-coco-vqa")  # also supports OFA checkpoints, e.g. "OFA-Sys/ofa-large"
 
 if torch.cuda.is_available():
     model.cuda()
@@ -87,7 +88,7 @@ import torch
 from promptcap import PromptCap_VQA
 
 # The QA model supports all UnifiedQA variants, e.g. "allenai/unifiedqa-v2-t5-large-1251000"
-vqa_model = PromptCap_VQA(promptcap_model="
+vqa_model = PromptCap_VQA(promptcap_model="tifa-benchmark/promptcap-coco-vqa", qa_model="allenai/unifiedqa-t5-base")
 
 if torch.cuda.is_available():
     vqa_model.cuda()
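The diff above fills in the model checkpoints but never shows the instruction string itself. As a minimal sketch of how such an instruction might be assembled (the template wording below is an assumption for illustration, not taken from this commit — check the repo's own examples for the phrasing the checkpoint was trained on):

```python
def build_caption_prompt(question: str) -> str:
    """Build a natural-language instruction to condition PromptCap.

    ASSUMPTION: the template phrasing here is illustrative only; the
    README's examples define the wording the model actually expects.
    """
    return f"please describe this image according to the given question: {question}"


# A question-guided prompt, using the example question from the README:
guided = build_caption_prompt("what is the boy putting on?")

# A generic caption reuses the catch-all question from the README:
generic = build_caption_prompt("what does the image describe?")
```

Swapping only the question, rather than the whole template, keeps prompts consistent with whatever fixed instruction format the checkpoint saw during training.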
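`PromptCap_VQA` chains the captioner with a UnifiedQA-style reader: the caption serves as the text context the QA model answers over. A hedged sketch of that glue step, assuming the reader consumes a single question-plus-context string (the separator convention and the lowercasing are assumptions here, not confirmed by this diff):

```python
def build_qa_input(question: str, caption: str) -> str:
    """Join a question and a PromptCap caption into one QA input string.

    ASSUMPTION: UnifiedQA-style seq2seq models are typically fed the
    question and context as a single string; the exact separator and
    lowercasing used by PromptCap_VQA may differ from this sketch.
    """
    return f"{question.strip()} \n {caption.strip()}".lower()


qa_input = build_qa_input(
    "what is the boy putting on?",
    "A boy is putting on a baseball glove.",
)
```

The resulting string would then go to whichever UnifiedQA variant was passed as `qa_model`, e.g. `"allenai/unifiedqa-t5-base"` from the diff above.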