Qwen3-32B-Reuters-MultiLabel

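This is a multi-label text classification model, fine-tuned from Qwen3-32B with LoRA on the classic Reuters-21578 dataset (ModApte split).

The model is trained to read a news article and generate one or more relevant topic labels as a comma-separated string.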

📖 Model Description

Base model: Qwen/Qwen3-32B
Task: Multi-Label Text Classification
Dataset: Reuters-21578 (ModApte split)
Fine-tuning method: LoRA (Low-Rank Adaptation)
Quantization: 4-bit (NF4)

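The model recasts multi-label classification as conditional text generation: it receives a prompt in a fixed format that contains the news article, then generates the corresponding label string.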

🚀 How to Use

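You can load and run the model with the transformers library. Make sure your input follows the prompt format the model was trained on, as built in predict_topics below.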

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "your_huggingface_username/qwen3-32b-reuters21578-multilabel"  # ⬅️ Replace with your model path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="auto",
                                             torch_dtype=torch.bfloat16)

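def predict_topics(article: str, max_new_tokens: int = 32):
    """
    Predict topic labels for an article with the fine-tuned model.
    """
    # The section markers must match the training format exactly
    # ("文章" = article, "标签" = labels), so they stay in Chinese.
    prompt = f"### 文章\n{article.strip()}\n\n### 标签\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        # Generate the label tokens; temperature only takes effect
        # when sampling is enabled, so do_sample is set explicitly.
        gen_ids = model.generate(**inputs,
                                 max_new_tokens=max_new_tokens,
                                 do_sample=True,
                                 temperature=0.1,
                                 eos_token_id=tokenizer.eos_token_id)

    # Decode and keep only the part after the label marker
    full_output = tokenizer.decode(gen_ids[0], skip_special_tokens=True)
    tag_part = full_output.split("### 标签")[-1]

    # Parse the comma-separated labels
    return [t.strip() for t in tag_part.split(",") if t.strip()]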

# --- Example ---
demo_text = """
The U.S. Agriculture Department said it approved a consignment of 15,000 tonnes of U.S. Number 2 hard red winter wheat for shipment to the Soviet Union.
The wheat is for March 1-15 shipment and was sold by a U.S. exporter under the long-term grain supply agreement between the two countries, it said.
"""
predicted_labels = predict_topics(demo_text)
print(f"Article: {demo_text[:100]}...")
print(f"Predicted labels: {predicted_labels}")
# Example output: Predicted labels: ['wheat', 'grain']

⚙️ Training Details

Training Data

The model was trained on the ModApte split of the Reuters-21578 dataset. The dataset was loaded with load_dataset and divided into a 90% training set and a 10% validation set.

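During preprocessing, every sample was formatted as ### 文章\n{article}\n\n### 标签\n{label1}, {label2}...<eos> (the markers 文章 and 标签 are Chinese for "article" and "labels"). Samples without labels were discarded.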
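The formatting step described above could look roughly like the following. This is a minimal sketch, not the original training script: the helper names build_prompt and build_training_text, the record fields "text" and "topics", and the "<eos>" placeholder are all assumptions made for illustration.

```python
from typing import Optional

# Section markers copied from the training format ("文章" = article, "标签" = labels).
PROMPT_TEMPLATE = "### 文章\n{article}\n\n### 标签\n"


def build_prompt(article: str) -> str:
    """Build the inference-time prompt, which ends right after the label marker."""
    return PROMPT_TEMPLATE.format(article=article.strip())


def build_training_text(article: str, labels: list[str], eos_token: str) -> Optional[str]:
    """Return prompt + comma-separated labels + EOS, or None for unlabeled samples."""
    cleaned = [label.strip() for label in labels if label.strip()]
    if not cleaned:
        return None  # samples without labels are dropped
    return build_prompt(article) + ", ".join(cleaned) + eos_token


# Toy records; the field names "text" and "topics" are assumptions.
records = [
    {"text": "Wheat shipment approved.", "topics": ["wheat", "grain"]},
    {"text": "No topics here.", "topics": []},
]

# "<eos>" is a placeholder; in practice tokenizer.eos_token would be used.
training_texts = [
    text
    for r in records
    if (text := build_training_text(r["text"], r["topics"], "<eos>")) is not None
]
print(training_texts)
```

Only the first toy record survives, because the second has no labels.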

Training Procedure

Quantization: The model was loaded with 4-bit NF4 quantization via bitsandbytes to reduce GPU memory usage.
LoRA configuration:
- r: 16
- lora_alpha: 32
- lora_dropout: 0.1
- target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
Framework: Training used the transformers Seq2SeqTrainer because it supports predict_with_generate.
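For readers who want to reproduce a similar setup, the quantization and LoRA settings listed above map onto configuration objects roughly as follows. This is a sketch under assumptions: the compute dtype and task_type are not stated in this card and are guesses, and the original training script may have differed.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization, as named in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: chosen to match fp16 training
)

# LoRA settings taken from the list above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: not stated in the card
)
```

The quantization config would typically be passed as quantization_config to AutoModelForCausalLM.from_pretrained, and the LoRA config to peft's get_peft_model.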

Hyperparameters

Hyperparameter	Value
`learning_rate`	2e-4
`num_train_epochs`	4
`per_device_train_batch_size`	1
`gradient_accumulation_steps`	8
(effective batch size)	(8)
`fp16`	True
`max_length`	512
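The effective batch size in the table follows from the per-device batch size and the gradient accumulation steps. The short sketch below shows the arithmetic; the num_devices parameter is an assumption, since the card does not state how many GPUs were used, and the value 8 corresponds to a single device.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values from the table above, assuming one device.
print(effective_batch_size(1, 8))  # 8
```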

📊 Evaluation Results

In a side-by-side evaluation of Qwen3 models (8B, 14B, 32B), this 32B model scored best on every key metric.

Metric	Score
Subset Accuracy	0.391
Weighted F1	0.636
Micro F1	0.454
Macro F1	0.290
Hamming Loss	0.020

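Evaluation was run in the generation setting: the model generates a label string, which is then parsed and compared with the ground-truth labels to compute the F1 scores.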
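To make the parse-then-score procedure concrete, here is a minimal pure-Python sketch of three of the metrics in the table. It is an illustration, not the evaluation script behind the reported numbers: the function names are hypothetical, the inputs are assumed to be non-empty, and it assumes that predicted strings outside the known label space are ignored before scoring.

```python
def parse_labels(text: str) -> set[str]:
    """Turn a generated string such as "wheat, grain" into a set of labels."""
    return {t.strip() for t in text.split(",") if t.strip()}


def multilabel_scores(y_true: list[set[str]],
                      y_pred: list[set[str]],
                      label_space: list[str]) -> dict[str, float]:
    """Subset accuracy, micro F1 and Hamming loss over sets of labels."""
    allowed = set(label_space)
    n = len(y_true)
    exact = tp = fp = fn = 0
    for true_set, pred_set in zip(y_true, y_pred):
        # Assumption: labels outside the label space are dropped before scoring.
        true_set = true_set & allowed
        pred_set = pred_set & allowed
        exact += int(true_set == pred_set)
        tp += len(true_set & pred_set)
        fp += len(pred_set - true_set)
        fn += len(true_set - pred_set)
    denom = 2 * tp + fp + fn
    return {
        "subset_accuracy": exact / n,
        "micro_f1": 2 * tp / denom if denom else 0.0,
        # Share of wrong cells in the n x |labels| indicator matrix.
        "hamming_loss": (fp + fn) / (n * len(allowed)),
    }


y_true = [{"wheat", "grain"}, {"earn"}]
y_pred = [parse_labels("wheat, grain"), parse_labels("earn, acq")]
scores = multilabel_scores(y_true, y_pred, ["acq", "earn", "grain", "wheat"])
print(scores)
```

In this toy case one of two samples matches exactly, and the single spurious acq prediction is the only error, giving a subset accuracy of 0.5 and a Hamming loss of 0.125.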

⚠️ Limitations and Bias

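Domain-specific: The model targets Reuters news articles; its performance may degrade on text from other domains such as social media or tech blogs.
Closed label set: The model was trained only on labels that occur in Reuters-21578, so it cannot be expected to predict topics outside that label set.
Output format: Despite fine-tuning, the model may occasionally produce output that does not fully follow the format, or unrelated text. For production use, add an output validation and cleaning layer.
Uneven performance: Like most models trained on real-world data, it performs better on frequent categories (such as earn and acq) than on rare ones.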
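The validation and cleaning layer suggested above could be as simple as the sketch below. The clean_labels helper and the ALLOWED_LABELS set are hypothetical: the set shown is a tiny subset for illustration, and a real deployment would load the full label list from the training data. Lowercasing assumes the topic labels are lowercase, as in the examples in this card.

```python
# Hypothetical subset of the label set; replace with the full list in practice.
ALLOWED_LABELS = {"earn", "acq", "wheat", "grain"}


def clean_labels(raw_output: str, allowed: set[str] = ALLOWED_LABELS) -> list[str]:
    """Keep only known labels from the first generated line, without duplicates."""
    stripped = raw_output.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    seen: set[str] = set()
    result: list[str] = []
    for token in first_line.split(","):
        label = token.strip().lower()
        if label in allowed and label not in seen:
            seen.add(label)
            result.append(label)
    return result


print(clean_labels("wheat, Grain, wheat, banana\n### 文章 ..."))  # ['wheat', 'grain']
```

Restricting the output to the first line discards any text the model keeps generating after the label string, and the allow-list removes labels it may have invented.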

🖊️ How to Cite

If you use this model in your research, please consider citing:

@misc{your_name_2025_qwen3_reuters,
  author = {Your Name},
  title = {Qwen3-32B Fine-tuned for Reuters-21578 Multi-Label Classification},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/your_username/qwen3-32b-reuters21578-multilabel}}
}

@inproceedings{lewis1997reuters,
  title={Reuters-21578 text categorization test collection},
  author={Lewis, David D.},
  year={1997},
  organization={AT\&T Labs}
}

@misc{qwen_team2024qwen2,
  title={Qwen2: The New Generation of Qwen Large Language Models},
  author={Qwen Team},
  year={2024},
  howpublished = {\url{https://qwen.ai/blog/qwen2/}}
}
