Tom Aarsen committed
Commit 395ad23 · 1 Parent(s): ad4c588

Convert the model to a SequenceClassification variant

Files changed (3):
1. README.md +130 -7
2. config.json +10 -3
3. model.safetensors +2 -2
README.md CHANGED
@@ -2,7 +2,10 @@
 license: apache-2.0
 base_model:
 - Qwen/Qwen3-0.6B-Base
-library_name: transformers
+tags:
+- transformers
+- sentence-transformers
+pipeline_tag: text-ranking
 ---
 # Qwen3-Reranker-0.6B
 
@@ -10,6 +13,11 @@ library_name: transformers
 <img src="https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/logo_qwen3.png" width="400"/>
 <p>
 
+> [!NOTE]
+> This is a copy of the [Qwen3-Reranker-0.6B](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B) model, part of the [Qwen3 Reranker series](https://huggingface.co/collections/Qwen/qwen3-reranker-6841b22d0192d7ade9cdefea), modified into a sequence classification model. See [Updated Usage](#updated-usage) for how to use it, or [Original Usage](#original-usage) for the original approach.
+>
+> See [this discussion](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3) for details on the conversion approach.
+
 ## Highlights
 
 The Qwen3 Embedding model series is the latest model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embedding and reranking models in various sizes (0.6B, 4B, and 8B). This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.
@@ -55,7 +63,116 @@ With Transformers versions earlier than 4.51.0, you may encounter the following
 KeyError: 'qwen3'
 ```
 
-### Transformers Usage
+### Updated Usage
+
+#### Updated Sentence Transformers Usage
+
+```python
+# Requires transformers>=4.51.0
+from sentence_transformers import CrossEncoder
+
+
+def format_queries(query, instruction=None):
+    prefix = '<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
+    if instruction is None:
+        instruction = (
+            "Given a web search query, retrieve relevant passages that answer the query"
+        )
+    return f"{prefix}<Instruct>: {instruction}\n<Query>: {query}\n"
+
+
+def format_document(document):
+    suffix = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
+    return f"<Document>: {document}{suffix}"
+
+
+model = CrossEncoder("tomaarsen/Qwen3-Reranker-0.6B")
+
+task = "Given a web search query, retrieve relevant passages that answer the query"
+
+queries = [
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+]
+
+documents = [
+    "Venus is often called Earth's twin because of its similar size and proximity.",
+    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
+    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
+    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
+]
+
+pairs = [
+    [format_queries(query, task), format_document(doc)]
+    for query, doc in zip(queries, documents)
+]
+scores = model.predict(pairs)
+print(scores.tolist())
+# [0.04272603616118431, 0.9991921782493591, 0.40642625093460083, 0.9718492031097412]
+```
+
+#### Updated Transformers Usage
+
+```python
+# Requires transformers>=4.51.0
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+
+def format_instruction(instruction, query, doc):
+    prefix = '<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
+    suffix = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"
+    if instruction is None:
+        instruction = (
+            "Given a web search query, retrieve relevant passages that answer the query"
+        )
+    output = f"{prefix}<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}{suffix}"
+    return output
+
+
+tokenizer = AutoTokenizer.from_pretrained("tomaarsen/Qwen3-Reranker-0.6B", padding_side="left")
+model = AutoModelForSequenceClassification.from_pretrained("tomaarsen/Qwen3-Reranker-0.6B").eval()
+# We recommend enabling flash_attention_2 for better acceleration and memory saving.
+# model = AutoModelForSequenceClassification.from_pretrained("tomaarsen/Qwen3-Reranker-0.6B", torch_dtype=torch.float16, attn_implementation="flash_attention_2").cuda().eval()
+max_length = 8192
+
+task = "Given a web search query, retrieve relevant passages that answer the query"
+
+queries = [
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+]
+
+documents = [
+    "Venus is often called Earth's twin because of its similar size and proximity.",
+    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
+    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
+    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
+]
+
+pairs = [format_instruction(task, query, doc) for query, doc in zip(queries, documents)]
+inputs = tokenizer(
+    pairs,
+    padding=True,
+    truncation=True,
+    max_length=max_length,
+    return_tensors="pt",
+)
+logits = model(**inputs).logits.squeeze()
+print(logits.tolist())
+# [-3.109282970428467, 7.120373725891113, -0.37874650955200195, 3.5416228771209717]
+
+scores = logits.sigmoid()
+print(scores.tolist())
+# [0.04272596165537834, 0.9991921782493591, 0.406429260969162, 0.9718491435050964]
+```
+
+### Original Usage
+
+#### Original Transformers Usage
 
 ```python
 # Requires transformers>=4.51.0
@@ -105,13 +222,18 @@ suffix_tokens = tokenizer.encode(suffix, add_special_tokens=False)
 
 task = 'Given a web search query, retrieve relevant passages that answer the query'
 
-queries = ["What is the capital of China?",
-    "Explain gravity",
+queries = [
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
+    "Which planet is known as the Red Planet?",
 ]
 
 documents = [
-    "The capital of China is Beijing.",
-    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
+    "Venus is often called Earth's twin because of its similar size and proximity.",
+    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
+    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
+    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
 ]
 
 pairs = [format_instruction(task, query, doc) for query, doc in zip(queries, documents)]
@@ -121,10 +243,11 @@ inputs = process_inputs(pairs)
 scores = compute_logits(inputs)
 
 print("scores: ", scores)
+# scores:  [0.04272589832544327, 0.9991921782493591, 0.40642935037612915, 0.9718492031097412]
 ```
 
 
-### vLLM Usage
+#### Original vLLM Usage
 
 ```python
 # Requires vllm>=0.8.5
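
The conversion script itself is not part of this commit; only the converted weights are. As a rough sketch of the approach described in the linked discussion, the causal LM's binary "yes"/"no" next-token decision can be folded into a single-logit score head, since a softmax over the two answer logits equals the sigmoid of their difference. The paths, the `num_labels=1` setup, and the `score` attribute name below are illustrative assumptions (matching transformers' Llama-style `Qwen3ForSequenceClassification`), not files in this repository:

```python
# Hypothetical re-creation of the conversion; the authoritative description
# lives in the linked discussion, not in this commit.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

source = "Qwen/Qwen3-Reranker-0.6B"
tokenizer = AutoTokenizer.from_pretrained(source)
causal_lm = AutoModelForCausalLM.from_pretrained(source)

# Token ids of the two answers the original prompt template allows.
yes_id = tokenizer.convert_tokens_to_ids("yes")
no_id = tokenizer.convert_tokens_to_ids("no")

# Reload the same checkpoint as a single-label classifier: the backbone
# weights carry over, while the new `score` head is randomly initialized.
classifier = AutoModelForSequenceClassification.from_pretrained(source, num_labels=1)

# softmax([logit_no, logit_yes])[1] == sigmoid(logit_yes - logit_no), so
# copying the difference of the two lm_head rows into the score head makes
# sigmoid(logit) reproduce the original "yes" probability.
with torch.no_grad():
    classifier.score.weight.copy_(
        causal_lm.lm_head.weight[yes_id] - causal_lm.lm_head.weight[no_id]
    )

# The classifier needs a pad token to locate the last real token per input.
classifier.config.pad_token_id = tokenizer.pad_token_id
classifier.save_pretrained("Qwen3-Reranker-0.6B-seq-cls")  # illustrative path
tokenizer.save_pretrained("Qwen3-Reranker-0.6B-seq-cls")
```

Because `sigmoid(logit_yes - logit_no)` is mathematically identical to the original two-way softmax, the converted model should reproduce the upstream scores up to floating-point noise, which is what the near-identical score lists in the README examples above show.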
config.json CHANGED
@@ -1,6 +1,6 @@
 {
   "architectures": [
-    "Qwen3ForCausalLM"
+    "Qwen3ForSequenceClassification"
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
@@ -9,21 +9,28 @@
   "head_dim": 128,
   "hidden_act": "silu",
   "hidden_size": 1024,
+  "id2label": {
+    "0": "LABEL_0"
+  },
   "initializer_range": 0.02,
   "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0
+  },
   "max_position_embeddings": 40960,
   "max_window_layers": 28,
   "model_type": "qwen3",
   "num_attention_heads": 16,
   "num_hidden_layers": 28,
   "num_key_value_heads": 8,
+  "pad_token_id": 151643,
   "rms_norm_eps": 1e-06,
   "rope_scaling": null,
   "rope_theta": 1000000,
   "sliding_window": null,
   "tie_word_embeddings": true,
-  "torch_dtype": "bfloat16",
-  "transformers_version": "4.51.3",
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.4",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151669
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:27cd75a405b9c1b46b59abfd88aaa209e6fed2a1972cde9b70e7659537c5e65b
-size 1191588280
+oid sha256:2b223e576dc70d8832372a538d4c8458dc25ce9daf209ac47cec4799c85e4da7
+size 2383145520
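
The checkpoint roughly doubles in size (1,191,588,280 → 2,383,145,520 bytes), consistent with the `torch_dtype` change in config.json from bfloat16 (2 bytes per parameter) to float32 (4 bytes per parameter).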