Update README.md
README.md
CHANGED
@@ -12,13 +12,14 @@ base_model:
 Github: https://github.com/VITA-MLLM/Long-VITA
 
 ## Overview
+
 Long-VITA is a strong long-context visual language model and supports more than 1 million tokens.
 
--
+- Long-VITA-16K weights are trained on Ascend NPUs with MindSpeed.
 
-- We also implemented Long-VITA on Megatron with the Transformer Engine to infer and evaluate on Nvidia GPUs.
+- We also implemented Long-VITA on Megatron with the Transformer Engine to infer and evaluate on Nvidia GPUs. The converted weight is at https://huggingface.co/VITA-MLLM/Long-VITA-16K_MG.
 
-- We also implemented Long-VITA on DeepSpeed with the Huggingface Transformers to infer and evaluate on Nvidia GPUs.
+- We also implemented Long-VITA on DeepSpeed with the Huggingface Transformers to infer and evaluate on Nvidia GPUs. The converted weight is at https://huggingface.co/VITA-MLLM/Long-VITA-16K_HF.
 
 
 ## Experimental Results