---
frameworks:
- Pytorch
license: other
license_name: glm-4
license_link: LICENSE
pipeline_tag: image-text-to-text
tags:
- glm
- edge
inference: false
---
# GLM-Edge-1.5B-Chat
For the Chinese version of this README, click [here](README_zh.md).
## Inference with Transformers
### Installation
Install the `transformers` library from source:
```shell
pip install git+https://github.com/huggingface/transformers.git
```
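Before running inference, it can help to confirm which `transformers` build is actually installed. A minimal, stdlib-only sketch (the helper name `transformers_version` is ours, not part of any library):

```python
# Sketch: report the installed transformers version, if any.
from importlib import metadata


def transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


print(transformers_version())
```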
### Inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/glm-edge-1.5b-chat"

# Load the tokenizer and model; device_map="auto" places the weights on the
# available accelerator (falling back to CPU) automatically.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map="auto")

# Build the prompt with the model's chat template and move it to the model's
# device. add_generation_prompt=True appends the assistant-turn marker so the
# model knows it should respond.
message = [{"role": "user", "content": "hello!"}]
inputs = tokenizer.apply_chat_template(
    message,
    return_tensors="pt",
    add_generation_prompt=True,
    return_dict=True,
).to(model.device)

# Greedy decoding (do_sample=False), generating at most 128 new tokens.
generate_kwargs = {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
    "max_new_tokens": 128,
    "do_sample": False,
}
out = model.generate(**generate_kwargs)

# generate() returns prompt + completion; slice off the prompt tokens before
# decoding so only the model's reply is printed.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
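For intuition, `apply_chat_template` flattens the role-tagged messages into a single prompt string, and `add_generation_prompt=True` appends the assistant-turn marker at the end. A toy sketch of that idea (the `<|role|>` markers below are purely illustrative, not GLM-Edge's actual template):

```python
# Toy illustration of what a chat template does conceptually: each message
# becomes a role-tagged segment, and an assistant marker is appended so the
# model continues from there. Marker strings here are made up for clarity.
def render_chat(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    if add_generation_prompt:
        parts.append("<|assistant|>")
    return "\n".join(parts)


print(render_chat([{"role": "user", "content": "hello!"}]))
# → <|user|>
#   hello!
#   <|assistant|>
```

The real template is defined in the model's tokenizer configuration, which is why the snippet above calls `tokenizer.apply_chat_template` instead of hand-building the prompt.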
## License
The usage of this model's weights is subject to the terms outlined in the [LICENSE](LICENSE).