	---
	base_model: llm-jp/llm-jp-3-13b
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	license: apache-2.0
	language:
	- ja
	---

	# Uploaded model

	- Developed by: tomofusa
	- License: apache-2.0
	- Finetuned from model: llm-jp/llm-jp-3-13b

	This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

	---

	# How to use

	The steps below follow the standard sample code.

	0. Prepare the environment (you can skip this step on Google Colaboratory).

	```shell
	# Set up the conda environment
	wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"

	# The installer asks a few questions; answer them. The default install location is probably /root/miniforge3.
	bash Miniforge3-$(uname)-$(uname -m).sh

	# The steps below assume the install location is /root/miniforge3.
	export PATH=/root/miniforge3/bin:$PATH
	conda init

	# You need to restart the terminal once at this point.
	# Then create the environment by following the page linked below.
	# https://docs.unsloth.ai/get-started/installation/conda-install
	conda create --name unsloth_env python=3.10 pytorch-cuda=12.1 pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers -y
	conda activate unsloth_env
	pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
	pip install --no-deps "trl<0.9.0" peft accelerate bitsandbytes

	# Setup for Jupyter Notebook.
	conda install -c conda-forge ipykernel
	python -m ipykernel install --user --name=unsloth_env --display-name "Python (unsloth_env)"
	```
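After the environment is set up, it may help to confirm that the packages installed above are importable from the notebook kernel. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and is not part of Unsloth.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Package names taken from the install commands above.
required = ["unsloth", "trl", "peft", "accelerate", "bitsandbytes"]
missing = missing_packages(required)
if missing:
    print("Not installed in this kernel:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that each package can be located; it does not import them or verify versions or GPU support.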

	## Run the following steps in the notebook

	1. Install Unsloth.
	```shell
	%%capture
	!pip install unsloth
	!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
	```

	2. Load the model.

	```python
	from unsloth import FastLanguageModel
	import torch
	import json

	model_name = "tomofusa/llm-jp-3-13b-finetune-2"

	max_seq_length = 2048
	dtype = None
	load_in_4bit = True

	model, tokenizer = FastLanguageModel.from_pretrained(
	    model_name = model_name,
	    max_seq_length = max_seq_length,
	    dtype = dtype,
	    load_in_4bit = load_in_4bit,
	    # token = "hf-token", # On Google Colab the token is read from the environment. To pass the token directly, uncomment this line.
	)
	FastLanguageModel.for_inference(model)
	```

	3. Set up datasets and run inference.

	- Manually upload elyza-tasks-100-TV_0.jsonl to your workspace.

	```python
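	# Read the task file; a record may span several lines, so accumulate
	# text until the buffer ends with "}" and then parse it as one JSON object.
	datasets = []
	with open("./elyza-tasks-100-TV_0.jsonl", "r", encoding="utf-8") as f:
	    item = ""
	    for line in f:
	        line = line.strip()
	        item += line
	        if item.endswith("}"):
	            datasets.append(json.loads(item))
	            item = ""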
	```
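The loader above accumulates lines until the buffer ends with `}`, which tolerates records that span several lines. As a hedged illustration, the same logic can be wrapped in a function and tried on an in-memory sample. The function name `load_jsonl_records` and the sample records are made up for this sketch and are not taken from elyza-tasks-100-TV_0.jsonl.

```python
import io
import json

def load_jsonl_records(lines):
    """Parse JSON objects from an iterable of lines; an object may span lines."""
    records = []
    buffer = ""
    for line in lines:
        buffer += line.strip()
        # A record is treated as complete once the buffer ends with "}".
        if buffer.endswith("}"):
            records.append(json.loads(buffer))
            buffer = ""
    return records

# Hypothetical sample: the second record is split across two lines.
sample = io.StringIO(
    '{"task_id": 0, "input": "Hello"}\n'
    '{"task_id": 1,\n'
    ' "input": "World"}\n'
)
records = load_jsonl_records(sample)
print(len(records))
```

Note that this heuristic would fail if a line in the middle of a multi-line record itself ended with `}`, for example when a nested object closes at the end of a line.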

	```python
	from tqdm import tqdm

	# inference
	results = []
	for dt in tqdm(datasets):
	    input_text = dt["input"]

	    # Instruction/answer prompt format
	    prompt = f"""### 指示\n{input_text}\n### 回答\n"""

	    inputs = tokenizer([prompt], return_tensors = "pt").to(model.device)

	    # Greedy decoding with a repetition penalty
	    outputs = model.generate(**inputs, max_new_tokens = 512, use_cache = True, do_sample=False, repetition_penalty=1.2)
	    # Keep only the text after the answer marker
	    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### 回答')[-1]

	    results.append({"task_id": dt["task_id"], "input": input_text, "output": prediction})
	```
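To make the prompt format and the answer extraction in the loop above easier to inspect, here is a small sketch that isolates both steps as plain string functions. The names `build_prompt` and `extract_answer` and the decoded string are hypothetical; no model is called.

```python
def build_prompt(instruction):
    """Format an instruction the same way as the inference loop above."""
    return f"### 指示\n{instruction}\n### 回答\n"

def extract_answer(decoded):
    """Keep only the text after the last answer marker in the decoded output."""
    return decoded.split("\n### 回答")[-1]

prompt = build_prompt("1+1は?")
# Stand-in for tokenizer.decode(...): the prompt followed by a made-up completion.
decoded = prompt + "2です。"
answer = extract_answer(decoded)
print(repr(answer))
```

As written, the extracted text begins with the newline that follows the marker; calling `.strip()` on it removes that newline if a clean string is wanted.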

	4. Save results to jsonl.

	```python
	# Write one JSON object per line, keeping Japanese text unescaped
	file_name = model_name.replace("/", "_") + "_output.jsonl"
	with open(f"./{file_name}", 'w', encoding='utf-8') as f:
	    for result in results:
	        json.dump(result, f, ensure_ascii=False)
	        f.write('\n')
	```
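Before sharing the output file, it can be worth reading it back to confirm that every line is valid JSON with the expected keys. This is a minimal sketch; the helper name `check_output_file` is hypothetical, and the demo writes a made-up record to a temporary file rather than using real model output.

```python
import json
import os
import tempfile

REQUIRED_KEYS = {"task_id", "input", "output"}

def check_output_file(path):
    """Return the number of records, raising ValueError on a malformed line."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            record = json.loads(line)  # json.JSONDecodeError is a ValueError
            missing = REQUIRED_KEYS - set(record)
            if missing:
                raise ValueError(f"line {line_no} is missing keys: {sorted(missing)}")
            count += 1
    return count

# Demo with a made-up record written the same way as in step 4.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_output.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"task_id": 0, "input": "質問", "output": "回答"}, f, ensure_ascii=False)
        f.write("\n")
    n_records = check_output_file(demo_path)
print(n_records)
```

For the real file, the same function can be called with the `file_name` produced in step 4, and the returned count compared with `len(results)`.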