nielsr (HF Staff) committed
Commit 0740d68 · verified · parent: ad85cab

Improve model card: Add library name


This PR adds the `library_name: transformers` field to the model card metadata. The card's code examples already demonstrate the model's compatibility with the Hugging Face `transformers` library, so declaring the library explicitly makes the card clearer and improves discoverability for users filtering for models that work with it.
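To illustrate what this metadata change enables, here is a minimal, hand-rolled sketch of how a Hub-style tool might read `library_name` out of a README's YAML front matter. This is an illustrative parser for flat `key: value` entries only, not the Hub's actual implementation (a real client would use a YAML library):

```python
def read_front_matter(readme_text):
    """Extract top-level key: value pairs from YAML front matter
    delimited by '---' lines at the top of a README."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the front matter
            break
        # Only flat, top-level entries; skip list items and nested keys.
        if ":" in line and not line.startswith((" ", "-")):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

readme = """---
language:
- multilingual
license: mit
library_name: transformers
pipeline_tag: text-generation
---

# Model card body
"""

meta = read_front_matter(readme)
```

With the field present, `meta["library_name"]` resolves to `"transformers"`, which is the signal the Hub uses to surface the model under the `transformers` library filter.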

Files changed (1):

1. README.md (+18 −7)
README.md CHANGED

@@ -1,21 +1,32 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/Phi-3-small-128k-instruct/resolve/main/LICENSE
-
 language:
 - multilingual
+license: mit
+license_link: https://huggingface.co/microsoft/Phi-3-small-128k-instruct/resolve/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - nlp
 - code
+library_name: transformers
 inference:
   parameters:
     temperature: 0.7
 widget:
-- messages:
-  - role: user
-    content: Can you provide ways to eat combinations of bananas and dragonfruits?
+- messages:
+  - role: user
+    content: Can you provide ways to eat combinations of bananas and dragonfruits?
 ---
+
+# LongRoPE2: Near-Lossless LLM Context Window Scaling
+
+The model was presented in the paper [LongRoPE2: Near-Lossless LLM Context Window Scaling](https://hf.co/papers/2502.20082).
+
+# Paper abstract
+
+The abstract of the paper is the following:
+
+LongRoPE2 is a novel approach that extends the effective context window of pre-trained large language models (LLMs) to the target length, while preserving the performance on the original shorter context window. This is achieved by three contributions: (1) a hypothesis that insufficient training in higher RoPE dimensions contributes to the persistent out-of-distribution (OOD) issues observed in existing methods; (2) an effective RoPE rescaling algorithm that adopts evolutionary search guided by "needle-driven" perplexity to address the insufficient training problem; (3) a mixed context window training approach that fine-tunes model weights to adopt rescaled RoPE for long-context sequences while preserving the short-context performance with the original RoPE. Extensive experiments on LLaMA3-8B and Phi3-mini-3.8B across various benchmarks validate the hypothesis and demonstrate the effectiveness of LongRoPE2. Remarkably, LongRoPE2 extends LLaMA3-8B to achieve a 128K effective context length while retaining over 98.5% of short-context performance, using only 10B tokens -- 80x fewer than Meta's approach, which fails to reach the target effective context length. Code will be available at https://github.com/microsoft/LongRoPE.
+
 🎉 **Phi-3.5**: [[mini-instruct]](https://huggingface.co/microsoft/Phi-3.5-mini-instruct); [[MoE-instruct]](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) ; [[vision-instruct]](https://huggingface.co/microsoft/Phi-3.5-vision-instruct)

 ## Model Summary
@@ -277,4 +288,4 @@ The model is licensed under the [MIT license](https://huggingface.co/microsoft/P

 ## Trademarks

-This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft’s Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party’s policies.
+This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft’s Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party’s policies.