Image Feature Extraction
PerceptionEncoder
janghyuncho7 committed · Commit 3f0d058 (verified) · 1 Parent(s): dab510d

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ We release two PE Lang checkpoints, L14-448 and G14-448. Here are their results
- Here is a sample of the performance obtainable by using PE Core G aligned further with [PLM-8B](https://huggingface.co/facebook/Perception-LM-8B) (stage 2) using 16+1 image tiles / 16 video frames with Llama 3.1 8B as the decoder:
+ Here is a sample of the performance obtainable by using PE Core G aligned further with [PLM-8B](https://huggingface.co/facebook/Perception-LM-8B) (*stage 3*) using 36+1 image tiles / 32 video frames with Llama 3.1 8B as the decoder:
 
  | Model | Encoder | Doc VQA (test) | InfoQA (test) | TextVQA | MVBench | PerceptionTest (test) | EgoSchema (test) |
  |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|