---
license: apache-2.0
language:
- ja
---
# Tanuki-8B-Instruct
## Model Details
- **Model type:** [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)-like pretrained Language Model
- **Total seen tokens:** 280B

|Params|Layers|Hidden size|Intermediate size|Attention heads|KV heads|Context length|RoPE theta|
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|8B|32|4096|14336|32|8|8192|500000|
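
The table values determine how much memory the key/value cache needs during generation. The following is a minimal sketch of that arithmetic, not part of the model's code. It assumes the usual Llama-style layout where the per-head dimension is `hidden size / attention heads`, and it assumes a bfloat16 cache (2 bytes per value). The function name `kv_cache_bytes` is hypothetical.

```python
# Values copied from the architecture table above.
LAYERS = 32
HIDDEN_SIZE = 4096
ATTENTION_HEADS = 32
KV_HEADS = 8
CONTEXT_LENGTH = 8192

# Assumption: the cache is stored in bfloat16 (2 bytes per value).
BYTES_PER_VALUE = 2


def kv_cache_bytes(num_tokens, kv_heads=KV_HEADS):
    """Estimate KV-cache size in bytes for one sequence of num_tokens tokens."""
    # Assumption: head dimension = hidden size / attention heads (Llama-style).
    head_dim = HIDDEN_SIZE // ATTENTION_HEADS
    # Factor 2 accounts for storing both keys and values in every layer.
    per_token = 2 * LAYERS * kv_heads * head_dim * BYTES_PER_VALUE
    return per_token * num_tokens


print(kv_cache_bytes(1) / 1024, "KiB per token")
print(kv_cache_bytes(CONTEXT_LENGTH) / 1024**3, "GiB at full context")
print(kv_cache_bytes(CONTEXT_LENGTH, kv_heads=ATTENTION_HEADS) / 1024**3,
      "GiB at full context if every attention head had its own KV head")
```

Under these assumptions the cache takes 128 KiB per token, or 1 GiB for a single sequence at the full 8192-token context. With 8 KV heads shared across 32 attention heads, that is a quarter of what one KV head per attention head would need.
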
## Usage
```python
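import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model in bfloat16, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct", torch_dtype=torch.bfloat16).to('cuda')

# The prompts are kept in Japanese, the model's target language.
# System: "Below is a combination of an instruction that describes a task and an input with context. Write a response that appropriately satisfies the request."
# User: "What is a tanuki?"
chat = [
    {"role": "system", "content": "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"},
    {"role": "user", "content": "たぬきってなんですか?"},
]

# Apply the chat template and tokenize in one step.
tokenized_input = tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=True, return_tensors="pt").to(model.device)

# Sample a response without tracking gradients.
with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]

# Decode the full sequence (prompt plus generated tokens).
print(tokenizer.decode(output))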
```
<p style="font-size: medium; color: gray;">
Note: If you use tokenizer.encode() instead of tokenizer.apply_chat_template for generation, set add_special_tokens=False so that an EOS token is not inserted at the end of the text.<br>
Example: tokenizer.encode(input_text, add_special_tokens=False, return_tensors="pt")<br>
With tokenizer.apply_chat_template, add_special_tokens=False is the default, so no change is needed.
</p>
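
The note above can be sketched as two small helpers. This is an illustrative sketch, not code from the model repository: `encode_prompt` and `ends_with_eos` are hypothetical names, and the only library call used is `tokenizer.encode` with the `add_special_tokens` and `return_tensors` arguments mentioned in the note.

```python
def encode_prompt(tokenizer, prompt_text, return_tensors=None):
    """Encode an already formatted prompt without adding special tokens.

    Passing add_special_tokens=False keeps the tokenizer from appending
    an EOS token after the prompt, as the note above recommends.
    """
    return tokenizer.encode(
        prompt_text,
        add_special_tokens=False,
        return_tensors=return_tensors,
    )


def ends_with_eos(token_ids, eos_token_id):
    """Return True if a flat list of token ids ends with the EOS id."""
    return len(token_ids) > 0 and token_ids[-1] == eos_token_id
```

With the `tokenizer` from the Usage section, `encode_prompt(tokenizer, input_text)` returns a plain list of ids, and `ends_with_eos(ids, tokenizer.eos_token_id)` is a quick sanity check before generation. Pass `return_tensors="pt"` to get a tensor suitable for `model.generate`.
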
| Model Variant |
| :--- |
|**Instruction models**|
| [hatakeyama-llm-team/Tanuki-8B-Instruct](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct) |
| [hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO) |
|**Pre-trained models**|
| [hatakeyama-llm-team/Tanuki-8B](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B) |
| [hatakeyama-llm-team/Tanuki-8B-Before-Context-Length-Extension](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Before-Context-Length-Extension) |