Update README.md

README.md CHANGED

@@ -106,8 +106,6 @@ outputs = model.generate(tokenized_chat.to(model.device), max_new_tokens=2048)
 output_text = tokenizer.decode(outputs[0])
 ```
 
-We recommend using the following set of parameters for inference. Note that our model does not have a default system_prompt.
-
 ### Use with vLLM
 ```SHELL
 pip install vllm --upgrade
@@ -116,6 +114,8 @@ pip install vllm --upgrade
 ```SHELL
 vllm serve zhiqing/Hunyuan-MT-Chimera-7B-INT8
 ```
+
+We recommend using the following set of parameters for inference. Note that our model does not have a default system_prompt.
 ```json
 {
     "top_k": 20,
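The hunks above move the recommended-parameters note to sit directly above the JSON block, after the `vllm serve` command. As a minimal sketch of how those parameters might be used (hypothetical client code, not part of the README; only `"top_k": 20` is visible in the truncated JSON, and the endpoint/message are placeholder assumptions), they could be merged into a request body for vLLM's OpenAI-compatible chat API:

```python
import json

# Recommended inference parameters from the README.
# The JSON block is truncated in this diff, so only top_k is shown here.
recommended = {"top_k": 20}

# Hypothetical request body for vLLM's OpenAI-compatible
# /v1/chat/completions endpoint. No system message is included,
# since the README notes the model has no default system_prompt.
payload = {
    "model": "zhiqing/Hunyuan-MT-Chimera-7B-INT8",
    "messages": [{"role": "user", "content": "Hello"}],
    **recommended,
}

print(json.dumps(payload, indent=2))
```

Sending `payload` to the running `vllm serve` instance (e.g. with any HTTP client against `http://localhost:8000/v1/chat/completions`) would apply the recommended sampling settings per request rather than baking them into the server.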