---
license: apache-2.0
datasets:
- ystemsrx/Bad_Data_Alpaca
language:
- zh
library_name: transformers
pipeline_tag: text-generation
tags:
- Qwen
- causal-lm
- fine-tuned
- ethics
- Chinese
---
[中文](README.zh.md)
# Qwen2-Boundless
## Overview
Qwen2-Boundless is a model fine-tuned from Qwen2-1.5B-Instruct, designed to answer questions of all kinds, including those involving ethics, illegality, pornography, and violence. It was trained on a dedicated dataset that allows it to handle complex and diverse scenarios. Note that the fine-tuning dataset is entirely in Chinese, so the model performs better in Chinese.
> **Warning**: This model is intended for research and testing purposes only. Users should comply with local laws and regulations and are responsible for their actions.
## How to Use
You can load and use the model with the following code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import os
device = "cuda" # the device to load the model onto
current_directory = os.path.dirname(os.path.abspath(__file__))
model = AutoModelForCausalLM.from_pretrained(
current_directory,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(current_directory)
prompt = "Hello?"
messages = [
{"role": "system", "content": ""},
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(
model_inputs.input_ids,
max_new_tokens=512
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
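The snippet above loads the weights from the script's own directory. If you prefer to pull them straight from the Hugging Face Hub, the same calls accept a repository ID; the sketch below assumes the ID `ystemsrx/Qwen2-Boundless`, so adjust it if the repository is named differently:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository ID; replace it if the model lives under a different name.
model_id = "ystemsrx/Qwen2-Boundless"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick a suitable dtype
    device_map="auto"     # place the weights on the available device(s)
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```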
### Continuous Conversation
To enable continuous conversation, use the following code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import os
device = "cuda" # the device to load the model onto
# Get the current script's directory
current_directory = os.path.dirname(os.path.abspath(__file__))
model = AutoModelForCausalLM.from_pretrained(
current_directory,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(current_directory)
messages = [
{"role": "system", "content": ""}
]
while True:
    # Get user input
    user_input = input("User: ")

    # Add user input to the conversation
    messages.append({"role": "user", "content": user_input})

    # Prepare the input text
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    # Generate a response
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=512
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    # Decode and print the response
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(f"Assistant: {response}")

    # Add the generated response to the conversation
    messages.append({"role": "assistant", "content": response})
```
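The loop above appends every turn to `messages`, so the prompt grows with each exchange and will eventually exceed the model's context window. One minimal way to bound it is to keep only the system message and the most recent turns; in the sketch below, the helper and the `MAX_TURNS` value are illustrative and not part of the original script:

```python
# A minimal sketch for bounding conversation length, assuming you only need
# the most recent turns. MAX_TURNS is an arbitrary illustrative value.
MAX_TURNS = 10  # number of user/assistant exchanges to retain

def trim_history(messages, max_turns=MAX_TURNS):
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    return system + turns[-2 * max_turns:]  # each exchange is two messages
```

Calling `messages = trim_history(messages)` just before `apply_chat_template` keeps the prompt within a fixed size while preserving the system message.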
### Streaming Response
For applications requiring streaming responses, use the following code:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from transformers.trainer_utils import set_seed
from threading import Thread
import random
import os
DEFAULT_CKPT_PATH = os.path.dirname(os.path.abspath(__file__))
def _load_model_tokenizer(checkpoint_path, cpu_only):
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_path, resume_download=True)

    device_map = "cpu" if cpu_only else "auto"

    model = AutoModelForCausalLM.from_pretrained(
        checkpoint_path,
        torch_dtype="auto",
        device_map=device_map,
        resume_download=True,
    ).eval()
    model.generation_config.max_new_tokens = 512  # For chat.

    return model, tokenizer
def _get_input() -> str:
    while True:
        try:
            message = input('User: ').strip()
        except UnicodeDecodeError:
            print('[ERROR] Encoding error in input')
            continue
        except KeyboardInterrupt:
            exit(1)
        if message:
            return message
        print('[ERROR] Query is empty')


def _chat_stream(model, tokenizer, query, history):
    conversation = [
        {'role': 'system', 'content': ''},
    ]
    for query_h, response_h in history:
        conversation.append({'role': 'user', 'content': query_h})
        conversation.append({'role': 'assistant', 'content': response_h})
    conversation.append({'role': 'user', 'content': query})
    inputs = tokenizer.apply_chat_template(
        conversation,
        add_generation_prompt=True,
        return_tensors='pt',
    )
    inputs = inputs.to(model.device)
    streamer = TextIteratorStreamer(tokenizer=tokenizer, skip_prompt=True, timeout=60.0, skip_special_tokens=True)
    generation_kwargs = dict(
        input_ids=inputs,
        streamer=streamer,
    )
    # Run generation on a background thread; the streamer yields decoded text
    # chunks on the main thread as soon as they are produced.
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    for new_text in streamer:
        yield new_text


def main():
    checkpoint_path = DEFAULT_CKPT_PATH
    seed = random.randint(0, 2**32 - 1)  # Generate a random seed
    set_seed(seed)  # Set the random seed
    cpu_only = False
    history = []

    model, tokenizer = _load_model_tokenizer(checkpoint_path, cpu_only)

    while True:
        query = _get_input()

        print(f"\nUser: {query}")
        print("\nAssistant: ", end="")
        try:
            partial_text = ''
            for new_text in _chat_stream(model, tokenizer, query, history):
                print(new_text, end='', flush=True)
                partial_text += new_text
            print()
            history.append((query, partial_text))
        except KeyboardInterrupt:
            print('Generation interrupted')
            continue


if __name__ == "__main__":
    main()
```
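The `generation_kwargs` dictionary above passes only the input IDs and the streamer, so decoding falls back to the model's default generation config. If you want to control sampling explicitly, the same dictionary accepts the standard `generate` arguments; the values below are illustrative, not tuned recommendations:

```python
# Illustrative sampling settings (not tuned recommendations); these are standard
# transformers `generate` arguments and can be added to generation_kwargs as-is.
generation_kwargs = dict(
    input_ids=inputs,
    streamer=streamer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
```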
## Dataset

The Qwen2-Boundless model was fine-tuned on a dataset named `bad_data.json`, which contains a wide range of text covering topics such as ethics, law, pornography, and violence. The fine-tuning dataset is entirely in Chinese, so the model performs better in Chinese. If you are interested in exploring or using this dataset, you can find it at the following link:

- [bad_data.json dataset](https://huggingface.co/datasets/ystemsrx/Bad_Data_Alpaca)

We also used some cybersecurity-related data obtained from [this file](https://github.com/Clouditera/SecGPT/blob/main/secgpt-mini/%E5%A4%A7%E6%A8%A1%E5%9E%8B%E5%9B%9E%E7%AD%94%E9%9D%A2%E9%97%AE%E9%A2%98-cot.txt).
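If you want to inspect the fine-tuning data yourself, it can be loaded with the `datasets` library; a minimal sketch, assuming the default `train` split (record fields follow the dataset's own schema):

```python
from datasets import load_dataset

# Load the fine-tuning data from the Hugging Face Hub (assumes a `train` split).
ds = load_dataset("ystemsrx/Bad_Data_Alpaca", split="train")

print(ds)      # dataset size and column names
print(ds[0])   # first record, with whatever fields the dataset defines
```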
## GitHub Repository

For more details about the model and ongoing updates, please visit our GitHub repository:

- [GitHub: ystemsrx/Qwen2-Boundless](https://github.com/ystemsrx/Qwen2-Boundless)
## License

This model and dataset are open-sourced under the Apache 2.0 License.
## Disclaimer

All content provided by this model is for research and testing purposes only. The developers of this model are not responsible for any potential misuse. Users should comply with the relevant laws and regulations and bear full responsibility for their actions.