|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- ystemsrx/Bad_Data_Alpaca |
|
language: |
|
- zh |
|
library_name: transformers |
|
pipeline_tag: text-generation
|
tags: |
|
- Qwen |
|
- causal-lm |
|
- fine-tuned |
|
- ethics |
|
- Chinese |
|
--- |
|
|
|
[中文](README.zh.md) |
|
|
|
# Qwen2-Boundless |
|
|
|
## Overview |
|
|
|
Qwen2-Boundless is a fine-tuned model based on Qwen2-1.5B-Instruct, designed to answer questions of any kind, including those involving ethically sensitive, illegal, pornographic, or violent content. It was trained on a dataset built to cover such complex and diverse scenarios. Note that the fine-tuning data is entirely in Chinese, so the model performs noticeably better in Chinese than in other languages.
|
|
|
> **Warning**: This model is intended for research and testing purposes only. Users should comply with local laws and regulations and are responsible for their actions. |
|
|
|
## How to Use |
|
|
|
You can load and use the model with the following code: |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer
import os

device = "cuda"  # the device to load the model onto
current_directory = os.path.dirname(os.path.abspath(__file__))

model = AutoModelForCausalLM.from_pretrained(
    current_directory,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(current_directory)

prompt = "Hello?"
messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
|
``` |
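
The snippet above loads the weights from the directory containing the script. If you prefer to pull the model directly from the Hugging Face Hub, the same code works with a repository id instead of a local path; the id below assumes the model is published under the same name as its GitHub repository and may need adjusting:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub id; replace with the actual repository id of this model.
model_id = "ystemsrx/Qwen2-Boundless"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
```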
|
|
|
### Continuous Conversation |
|
|
|
To enable continuous conversation, use the following code: |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import os

device = "cuda"  # the device to load the model onto

# Get the current script's directory
current_directory = os.path.dirname(os.path.abspath(__file__))

model = AutoModelForCausalLM.from_pretrained(
    current_directory,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(current_directory)

messages = [
    {"role": "system", "content": ""}
]

while True:
    # Get user input
    user_input = input("User: ")

    # Add user input to the conversation
    messages.append({"role": "user", "content": user_input})

    # Prepare the input text
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    # Generate a response
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=512
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    # Decode and print the response
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(f"Assistant: {response}")

    # Add the generated response to the conversation
    messages.append({"role": "assistant", "content": response})
|
``` |
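
Because every turn is appended to `messages`, a long session will eventually exceed the model's context window. A minimal sketch for keeping only the most recent turns is shown below; the cut-off of 20 messages is an arbitrary illustrative value, not a tuned recommendation:

```python
MAX_MESSAGES = 20  # illustrative limit; tune to your context budget

def trim_history(messages, max_messages=MAX_MESSAGES):
    """Keep the system message plus only the most recent turns."""
    if len(messages) <= max_messages:
        return messages
    return [messages[0]] + messages[-(max_messages - 1):]

# Inside the loop, call this before building the prompt:
# messages = trim_history(messages)
```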
|
|
|
### Streaming Response |
|
|
|
For applications requiring streaming responses, use the following code: |
|
|
|
```python |
|
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from transformers.trainer_utils import set_seed
from threading import Thread
import random
import os

DEFAULT_CKPT_PATH = os.path.dirname(os.path.abspath(__file__))


def _load_model_tokenizer(checkpoint_path, cpu_only):
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_path, resume_download=True)

    device_map = "cpu" if cpu_only else "auto"

    model = AutoModelForCausalLM.from_pretrained(
        checkpoint_path,
        torch_dtype="auto",
        device_map=device_map,
        resume_download=True,
    ).eval()
    model.generation_config.max_new_tokens = 512  # For chat.

    return model, tokenizer


def _get_input() -> str:
    while True:
        try:
            message = input('User: ').strip()
        except UnicodeDecodeError:
            print('[ERROR] Encoding error in input')
            continue
        except KeyboardInterrupt:
            exit(1)
        if message:
            return message
        print('[ERROR] Query is empty')


def _chat_stream(model, tokenizer, query, history):
    conversation = [
        {'role': 'system', 'content': ''},
    ]
    for query_h, response_h in history:
        conversation.append({'role': 'user', 'content': query_h})
        conversation.append({'role': 'assistant', 'content': response_h})
    conversation.append({'role': 'user', 'content': query})
    inputs = tokenizer.apply_chat_template(
        conversation,
        add_generation_prompt=True,
        return_tensors='pt',
    )
    inputs = inputs.to(model.device)
    streamer = TextIteratorStreamer(tokenizer=tokenizer, skip_prompt=True, timeout=60.0, skip_special_tokens=True)
    generation_kwargs = dict(
        input_ids=inputs,
        streamer=streamer,
    )
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    for new_text in streamer:
        yield new_text


def main():
    checkpoint_path = DEFAULT_CKPT_PATH
    seed = random.randint(0, 2**32 - 1)  # Generate a random seed
    set_seed(seed)  # Set the random seed
    cpu_only = False

    history = []

    model, tokenizer = _load_model_tokenizer(checkpoint_path, cpu_only)

    while True:
        query = _get_input()

        print(f"\nUser: {query}")
        print("\nAssistant: ", end="")
        try:
            partial_text = ''
            for new_text in _chat_stream(model, tokenizer, query, history):
                print(new_text, end='', flush=True)
                partial_text += new_text
            print()
            history.append((query, partial_text))

        except KeyboardInterrupt:
            print('Generation interrupted')
            continue


if __name__ == "__main__":
    main()
|
``` |
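
The demo above hard-codes `cpu_only = False`. If you want to switch devices from the command line, a small argparse wrapper is one option; this is a sketch under that assumption, and the `--cpu-only` flag is hypothetical rather than part of the original script:

```python
import argparse

def _parse_args():
    # Minimal command-line interface for the streaming demo (illustrative only).
    parser = argparse.ArgumentParser(description="Streaming chat demo")
    parser.add_argument("--cpu-only", action="store_true",
                        help="Run the model on CPU instead of GPU")
    return parser.parse_args()

# In main(), replace the hard-coded value with:
# args = _parse_args()
# cpu_only = args.cpu_only
```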
|
|
|
## Dataset

The Qwen2-Boundless model was fine-tuned on a dataset named `bad_data.json`, which contains a wide range of text covering topics such as ethics, law, pornography, and violence. Because the fine-tuning data is entirely in Chinese, the model performs better in Chinese. If you are interested in exploring or using this dataset, you can find it via the following link:

- [bad_data.json Dataset](https://huggingface.co/datasets/ystemsrx/Bad_Data_Alpaca)

We also used some cybersecurity-related data obtained from [this file](https://github.com/Clouditera/SecGPT/blob/main/secgpt-mini/%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%9B%9E%E7%AD%94%E9%9D%A2%E9%97%AE%E9%A2%98-cot.txt).
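
If you want to inspect the data programmatically, it can be loaded with the `datasets` library. The snippet below is a minimal sketch; the field names follow the usual Alpaca layout (`instruction`, `input`, `output`) and are an assumption about this particular file:

```python
from datasets import load_dataset

# Load the dataset directly from the Hugging Face Hub.
dataset = load_dataset("ystemsrx/Bad_Data_Alpaca", split="train")

print(dataset)     # dataset size and column names
print(dataset[0])  # one record, assumed to use Alpaca-style fields
```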
|
|
|
## GitHub Repository

For more details about the model and ongoing updates, please visit our GitHub repository:

- [GitHub: ystemsrx/Qwen2-Boundless](https://github.com/ystemsrx/Qwen2-Boundless)
|
|
|
## License

This model and dataset are open-sourced under the Apache 2.0 License.
|
|
|
## Disclaimer

All content provided by this model is intended for research and testing purposes only. The developers of this model accept no responsibility for any potential misuse. Users must comply with the relevant laws and regulations and bear full responsibility for their actions.