# T-pro-it-2.0
🚨 Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.
## Description
T-pro-it-2.0 is a model built upon the Qwen 3 model family and incorporates both continual pre-training and alignment techniques.
## 📚 Dataset
- **Instruction Pre-Training:** 40B tokens of instruction data, with one-third focused on reasoning tasks.
- **Supervised Fine-Tuning (SFT):** ~500K high-quality and diverse instructions with balanced complexity. Reasoning tasks make up about 20% of the dataset.
- **Preference Tuning:** ~100K carefully selected instructions, filtered by length and type for general tasks and with domain-balanced selection for reasoning tasks.
## 📊 Benchmarks
Model | MERA | ruMMLU | Ru Arena Hard | ru AIME 2025 | ru LCB |
---|---|---|---|---|---|
T-pro 2.0 | 0.660 | 0.790 | 0.876 | 0.646 | 0.563 |
Qwen 3 32B | 0.584 | 0.740 | 0.836 | 0.625 | 0.537 |
Ruadapt 3 32B V2 | 0.574 | 0.737 | 0.660 | 0.450 | 0.500 |
DeepSeek-R1-Distill-Qwen-32B | 0.508 | 0.702 | 0.426 | 0.402 | 0.493 |
Gemma 3 27B | 0.577 | 0.695 | 0.759 | 0.231 | 0.261 |
## Switching Between Thinking and Non-Thinking Modes
To enable or disable reasoning mode in HuggingFace, set the `enable_thinking` flag in `tokenizer.apply_chat_template`; for more details, see the HF Usage example below.
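A minimal sketch of the two modes (assuming `tokenizer` and `messages` are already set up as in that example):

```python
# Render the same conversation with and without reasoning mode.
thinking_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,   # the model first emits a <think>...</think> block
)
plain_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # the model answers directly, without the <think> block
)
```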
## Recommended Generation Parameters
| Mode | Temperature | presence_penalty |
|---|---|---|
| No-think (general requests) | ≤ 0.3 | 1.0 |
| Think mode (standard requests) | ≈ 0.6 | 1.0 |
| Complex reasoning requests | ≥ 0.8 | 1.0 |
- Hybrid reasoning models need careful tuning of sampling hyperparameters, which vary by domain.
- Use lower temperature for straightforward queries and higher temperature for complex 'think-mode' tasks.
- A presence_penalty between 0 and 2 can help avoid repetitive outputs (the sketch below turns the table into starting presets).
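As a starting point, the table above can be captured in code; a minimal sketch (the bucket names are illustrative and not part of the model API):

```python
# Suggested starting presets per request mode, taken from the table above.
GENERATION_PRESETS = {
    "no_think": {"temperature": 0.3, "presence_penalty": 1.0},
    "think": {"temperature": 0.6, "presence_penalty": 1.0},
    "complex_reasoning": {"temperature": 0.8, "presence_penalty": 1.0},
}

def sampling_kwargs(mode: str) -> dict:
    """Return recommended sampling parameters for a given request mode."""
    return dict(GENERATION_PRESETS[mode])  # copy so callers can tweak safely
```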
## 👨‍💻 Examples of usage
### SGLang Usage
For better quality and stable performance, we recommend SGLang as your inference framework.
To run an inference server for T-pro-it-2.0, start by launching the SGLang server:
```bash
python -m sglang.launch_server \
  --model-path t-tech/T-pro-it-2.0 \
  --reasoning-parser qwen3
```
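Optionally, you can verify that the server is reachable before sending chat requests; a minimal sketch, assuming the server exposes the standard OpenAI-compatible `/v1/models` route on the default port:

```python
import openai

# List the models reported by the OpenAI-compatible endpoint.
client = openai.OpenAI(base_url="http://127.0.0.1:30000/v1", api_key="ANY")
print([model.id for model in client.models.list().data])
```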
Once the server is up and listening on `localhost:30000`, you can send chat-based requests via the OpenAI Python client.
```python
import openai

client = openai.OpenAI(
    base_url="http://127.0.0.1:30000/v1",
    api_key="ANY"  # the server ignores the API key
)

# Prompt (in Russian): "Please compute the definite integral ∫_0^1 x² eˣ dx,
# explain the solution step by step, and state the final result."
prompt = (
    "Пожалуйста, вычисли определённый интеграл ∫_0^1 x² eˣ dx, "
    "пошагово объясни решение и укажи окончательный результат."
)

completion = client.chat.completions.create(
    model="ANY",  # the server ignores the model name
    messages=[
        # System prompt (in Russian): "You are T-pro, a virtual assistant at
        # T-Technologies. Your task is to be a helpful dialogue assistant."
        {"role": "system", "content": "Ты T-pro, виртуальный ассистент в Т-Технологии. Твоя задача - быть полезным диалоговым ассистентом."},
        {"role": "user", "content": prompt}
    ],
    # REQUIRED: sampling params from the "Recommended Generation Parameters" table
    temperature=0.6,
    presence_penalty=1.0,
)

# The generated reply is in `completion.choices[0].message.content`
print(completion.choices[0].message.content)
```
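If you prefer to stream the reply token by token, the same OpenAI-compatible endpoint can be used with `stream=True`; a minimal sketch reusing `client` and `prompt` from the example above:

```python
stream = client.chat.completions.create(
    model="ANY",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.6,
    presence_penalty=1.0,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # some chunks carry only role/metadata
        print(delta, end="", flush=True)
print()
```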
Note: it is obligatory to include both `temperature` and `presence_penalty` in every completion call.
### HF Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.manual_seed(42)

model_name = "t-tech/T-pro-it-2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Same Russian prompt as in the SGLang example above.
prompt = (
    "Пожалуйста, вычисли определённый интеграл ∫_0^1 x² eˣ dx, "
    "пошагово объясни решение и укажи окончательный результат."
)
messages = [
    {"role": "system", "content": "Ты T-pro, виртуальный ассистент в Т-Технологии. Твоя задача - быть полезным диалоговым ассистентом."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens, keeping only the newly generated part.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
Output:
<think>
Хорошо, мне нужно вычислить определенный интеграл от 0 до 1 функции x² * e^x dx. Я помню, что для интегралов такого вида, где есть произведение многочлена и экспоненты, обычно применяют метод интегрирования по частям. Давайте вспомню формулу интегрирования по частям: ∫u dv = uv - ∫v du.

Первым делом нужно выбрать, что взять за u, а что за dv. Обычно в таких случаях многочлен (в данном случае x²) берут за u, потому что при дифференцировании его степень понижается, и в конце концов он превратится в константу, что упростит интеграл. Тогда dv будет оставшаяся часть, то есть e^x dx.

Итак, пусть u = x², тогда du = 2x dx. А dv = e^x dx, значит, v = ∫e^x dx = e^x.

Теперь применим формулу интегрирования по частям: ∫x² e^x dx = uv - ∫v du = x² e^x - ∫e^x * 2x dx.

Теперь у нас остался интеграл ∫2x e^x dx. Его тоже нужно взять по частям. Опять выбираем u и dv. Теперь u = 2x, тогда du = 2 dx, а dv = e^x dx, значит, v = e^x.

Применяем формулу снова: ∫2x e^x dx = 2x e^x - ∫e^x * 2 dx = 2x e^x - 2∫e^x dx.

Интеграл ∫e^x dx равен e^x + C. Подставляем обратно: 2x e^x - 2e^x + C.

Теперь возвращаемся к исходному интегралу: x² e^x - (2x e^x - 2e^x) + C = x² e^x - 2x e^x + 2e^x + C.

Теперь нужно вычислить определенный интеграл от 0 до 1. Для этого подставим верхний и нижний пределы в первообразную и вычтем.

Сначала подставим x = 1: (1² * e^1) - 2*1*e^1 + 2*e^1 = e - 2e + 2e = (1 - 2 + 2)e = 1*e = e.

Теперь подставим x = 0: (0² * e^0) - 2*0*e^0 + 2*e^0 = 0 - 0 + 2*1 = 2.

Вычитаем: e - 2.

Проверю, не ошибся ли я в вычислениях. При подстановке x=1: 1*e - 2*1*e + 2*e = e - 2e + 2e = e. Да, верно. При x=0: 0 - 0 + 2*1 = 2. Тогда разность e - 2.

Но подожди, может быть я где-то пропустил знак? В первообразной после интегрирования по частям: x² e^x - 2x e^x + 2e^x. Да, правильно. Тогда при подстановке 1: 1*e - 2*1*e + 2*e = (1 - 2 + 2)e = 1*e. При 0: 0 - 0 + 2*1 = 2. Тогда ответ e - 2.

Но e примерно равно 2.718, значит, e - 2 ≈ 0.718. Интуитивно кажется, что интеграл от 0 до 1 x² e^x должен быть положительным, что соответствует.

Может, стоит проверить через другой метод или дифференцированием первообразной? Возьмем производную от x² e^x - 2x e^x + 2e^x.

Производная: d/dx [x² e^x] = 2x e^x + x² e^x.

d/dx [-2x e^x] = -2 e^x - 2x e^x.

d/dx [2e^x] = 2 e^x.

Складываем все: 2x e^x + x² e^x - 2 e^x - 2x e^x + 2 e^x = x² e^x.

Да, производная равна подынтегральной функции. Значит, первообразная найдена верно. Тогда определенный интеграл равен e - 2.

Вроде все правильно. Возможно, можно было использовать рекуррентную формулу или таблицу интегралов, но метод по частям сработал.

Итак, окончательный ответ: e - 2.
</think>
Для вычисления определённого интеграла
$$
\int_0^1 x^2 e^x \, dx
$$
используем **метод интегрирования по частям**.

### Шаг 1: Применение интегрирования по частям
Формула интегрирования по частям:
$$
\int u \, dv = uv - \int v \, du.
$$
Выбираем:
- $ u = x^2 $, тогда $ du = 2x \, dx $,
- $ dv = e^x \, dx $, тогда $ v = e^x $.

Подставляем в формулу:
$$
\int_0^1 x^2 e^x \, dx = \left. x^2 e^x \right|_0^1 - 2 \int_0^1 x e^x \, dx.
$$

### Шаг 2: Интегрирование оставшегося интеграла
Теперь вычисляем $ \int x e^x \, dx $, снова применяя интегрирование по частям:
- $ u = x $, тогда $ du = dx $,
- $ dv = e^x \, dx $, тогда $ v = e^x $.

Получаем:
$$
\int x e^x \, dx = x e^x - \int e^x \, dx = x e^x - e^x.
$$
Подставляем обратно:
$$
\int_0^1 x^2 e^x \, dx = \left. \left( x^2 e^x - 2x e^x + 2 e^x \right) \right|_0^1.
$$

### Шаг 3: Вычисление определённого интеграла
Подставляем пределы $ 0 $ и $ 1 $:
$$
\left( 1^2 e^1 - 2 \cdot 1 \cdot e^1 + 2 e^1 \right) - \left( 0^2 e^0 - 2 \cdot 0 \cdot e^0 + 2 e^0 \right).
$$
Упрощаем:
- При $ x = 1 $:
$$
e - 2e + 2e = e.
$$
- При $ x = 0 $:
$$
0 - 0 + 2 \cdot 1 = 2.
$$
Итоговый результат:
$$
e - 2.
$$

### Ответ:
$$
\int_0^1 x^2 e^x \, dx = e - 2 \approx 0.718.
$$
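In thinking mode the decoded response contains a `<think>...</think>` block followed by the final answer, as in the sample output above. A minimal sketch for separating the two (it simply splits the decoded string on the literal `</think>` marker):

```python
reasoning, sep, answer = response.partition("</think>")
if sep:
    # Thinking mode: drop the opening tag and surrounding whitespace.
    reasoning = reasoning.replace("<think>", "").strip()
    answer = answer.strip()
else:
    # Non-thinking mode: the whole response is already the answer.
    reasoning, answer = "", response.strip()
print(answer)
```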
### VLLM Usage
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "t-tech/T-pro-it-2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, max_model_len=8192)
sampling_params = SamplingParams(
    temperature=0.7,
    repetition_penalty=1.05,
    top_p=0.8,
    top_k=70,
    max_tokens=512,
)

# Same Russian system and user prompts as in the examples above.
prompt = (
    "Пожалуйста, вычисли определённый интеграл ∫_0^1 x² eˣ dx, "
    "пошагово объясни решение и укажи окончательный результат."
)
messages = [
    {"role": "system", "content": "Ты T-pro, виртуальный ассистент в Т-Технологии. Твоя задача - быть полезным диалоговым ассистентом."},
    {"role": "user", "content": prompt}
]
prompt_token_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)

generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
```
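To run the same vLLM example without reasoning, pass `enable_thinking=False` when rendering the chat template and use the no-think sampling settings from the table above; a minimal sketch reusing `tokenizer`, `messages`, and `llm`:

```python
# Render the prompt with reasoning disabled.
no_think_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,
)
# No-think requests: lower temperature, same presence_penalty (see the table above).
no_think_params = SamplingParams(temperature=0.3, presence_penalty=1.0, max_tokens=512)
outputs = llm.generate(prompt_token_ids=no_think_ids, sampling_params=no_think_params)
print(outputs[0].outputs[0].text)
```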
### Long Context Usage
T-pro-it-2.0 natively supports a context length of 32,768 tokens.
For conversations where the input significantly exceeds this limit, follow the recommendations from the Qwen3 model card on processing long texts.
For example, with llama.cpp's `llama-server` you can enable 128K context support with the following command:

```bash
llama-server ... --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768
```
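For Transformers-based inference, the Qwen3 model card recommends YaRN rope scaling for long inputs. A minimal sketch of overriding the config at load time (the exact `rope_scaling` keys follow the Qwen3 recommendations and may vary between transformers versions, so double-check them against the Qwen3 model card):

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "t-tech/T-pro-it-2.0"
config = AutoConfig.from_pretrained(model_name)
# YaRN scaling: 4x over the native 32,768-token context (~128K tokens).
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```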