	---
	license: apache-2.0
	language:
	- ja
	---
	# Tanuki-8B-Instruct
	## Model Details

	- Model type: [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)-like pretrained Language Model
	- Total seen tokens: 280B

	|Params|Layers|Hidden size|Intermediate size|Attention heads|KV heads|Context length|RoPE theta|
	|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
	|8B|32|4096|14336|32|8|8192|500000|
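
The figures in the table can be turned into rough size estimates. The sketch below is not taken from the model's code: it assumes a standard Llama-style layout (bias-free q/k/v/o projections, a three-matrix gated MLP, two RMSNorm vectors per layer, and a head size of hidden size divided by attention heads). The function names are illustrative. Embedding and output-head weights are left out because the vocabulary size is not listed here.

```python
# Architecture values copied from the table above.
LAYERS = 32
HIDDEN = 4096
INTERMEDIATE = 14336
HEADS = 32
KV_HEADS = 8
CONTEXT = 8192

# Assumption: head size is hidden size / attention heads, as in Llama.
HEAD_DIM = HIDDEN // HEADS


def non_embedding_params():
    """Weights in the transformer blocks plus the final norm (assumed Llama layout)."""
    kv_width = KV_HEADS * HEAD_DIM
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_width  # q, o + k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up, and down projections
    norms = 2 * HIDDEN  # two RMSNorm weight vectors per layer
    return LAYERS * (attention + mlp + norms) + HIDDEN  # + final norm


def kv_cache_bytes(tokens=CONTEXT, bytes_per_value=2):
    """Key/value cache size for one sequence; 2 bytes per value matches bfloat16."""
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * tokens * bytes_per_value


print(f"non-embedding parameters: {non_embedding_params():,}")
print(f"KV cache at full context: {kv_cache_bytes() / 2**30:.2f} GiB")
```

Under these assumptions the transformer blocks hold about 6.98B parameters, and the bfloat16 key/value cache for a single 8192-token sequence takes 1 GiB. Using 8 KV heads instead of 32 makes that cache four times smaller than it would be with full multi-head attention.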

	## Usage

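	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Load the tokenizer and the model (bfloat16 weights, moved to the GPU).
	tokenizer = AutoTokenizer.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
	model = AutoModelForCausalLM.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct", torch_dtype=torch.bfloat16).to('cuda')

	# System prompt: "Below is an instruction that describes a task, paired with an input
	# that provides context. Write a response that appropriately fulfills the request."
	# User message: "What is a tanuki?"
	chat = [
	    {"role": "system", "content": "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"},
	    {"role": "user", "content": "たぬきってなんですか？"},
	]
	# Apply the chat template and append the prompt for the assistant turn.
	tokenized_input = tokenizer.apply_chat_template(chat, add_generation_prompt=True, tokenize=True, return_tensors="pt").to(model.device)
	with torch.no_grad():
	    output = model.generate(
	        tokenized_input,
	        max_new_tokens=256,
	        do_sample=True,
	        temperature=0.7,
	        repetition_penalty=1.05,
	    )[0]
	# The decoded text contains the prompt followed by the generated reply.
	print(tokenizer.decode(output))
	```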
	<p style="font-size: medium; color: gray;">
	Note: if you use tokenizer.encode() instead of tokenizer.apply_chat_template for generation, set add_special_tokens=False so that no EOS token is inserted at the end of the input.<br>
	Example: tokenizer.encode(input_text, add_special_tokens=False, return_tensors="pt")<br>
	With tokenizer.apply_chat_template, add_special_tokens=False is the default, so no extra setting is needed.
	</p>
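
The EOS caveat can be enforced with a small guard when prompts are built by hand. This is a minimal sketch, not part of the model's tooling: `encode_prompt` is a hypothetical helper name, and it only assumes that the tokenizer exposes `encode(text, add_special_tokens=...)` and `eos_token_id`, as Hugging Face tokenizers do.

```python
def encode_prompt(tokenizer, text):
    """Encode a hand-built prompt without letting the tokenizer add special tokens.

    Hypothetical helper: raises if the encoded prompt still ends with EOS,
    for example because the text itself already contains the EOS string.
    """
    ids = tokenizer.encode(text, add_special_tokens=False)
    if ids and ids[-1] == tokenizer.eos_token_id:
        raise ValueError("encoded prompt ends with an EOS token")
    return ids


# Example usage (downloads the tokenizer, so it is left commented out):
# from transformers import AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("hatakeyama-llm-team/Tanuki-8B-Instruct")
# prompt_ids = encode_prompt(tokenizer, input_text)
```

The helper returns a plain list of token IDs. To pass the result to `model.generate`, wrap it in a tensor with a batch dimension, or call `tokenizer.encode` with `return_tensors="pt"` as in the note above.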

	| Model variant |
	| :--- |
	| **Instruction models** |
	| [hatakeyama-llm-team/Tanuki-8B-Instruct](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct) |
	| [hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Instruct-without-DPO) |
	| **Pre-trained models** |
	| [Tanuki-8B](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B) |
	| [Tanuki-8B-Before-Context-Length-Extension](https://huggingface.co/hatakeyama-llm-team/Tanuki-8B-Before-Context-Length-Extension) |