---
license: llama3.2
library_name: transformers
base_model:
- meta-llama/Llama-3.2-3B
pipeline_tag: text-generation
---

# Cogito v1 preview - 3B

## NOTE
- The model weights may be updated by Sunday, April 7th. However, these weights will simply be a later checkpoint of the model currently being trained.
- The base model (and therefore the model architecture) will remain the same. The tokenizer and the procedure for enabling reasoning will also remain unchanged.
- The complete model description will be uploaded along with the eval results in the next few days.

## Introduction
The Cogito LLMs are instruction-tuned generative models (text in/text out). All models are released under an open license for commercial use.

- Cogito models are hybrid reasoning models: you can choose whether the model answers directly or thinks longer before answering.
- They have significantly stronger multilingual, coding, and tool-calling capabilities than their counterparts, and have been optimized for coding, STEM, instruction following, and general helpfulness.
- Early testing shows that, in standard mode, Cogito v1-preview models significantly outperform size-equivalent counterparts on common industry benchmarks.
- Similarly, in reasoning mode, Cogito v1-preview models outperform size-equivalent reasoning models on common industry benchmarks.

## Implementing extended thinking
This section walks through how to enable extended thinking (i.e., reasoning mode) in Cogito models.

- By default, the model answers in standard mode.
- To enable thinking, use either of two methods:
  - Add a specific system prompt, or
  - Set `enable_thinking=True` during tokenization.

### Method 1 - Add a specific system prompt
To enable thinking, set the system prompt to `system_instruction = 'Enable deep thinking subroutine.'`

If you already have a `system_instruction`, prepend the trigger phrase: `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.

Here is an example:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepcogito/cogito-v1-preview-llama-3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The trigger phrase goes in as the system prompt.
DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."
prompt = "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."

messages = [
    {"role": "system", "content": DEEP_THINKING_INSTRUCTION},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Keep only the newly generated tokens, dropping the prompt tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
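
Note that reasoning-mode generations include the model's intermediate thinking before the final answer, so they tend to be longer than standard-mode generations; you may want to raise `max_new_tokens` accordingly.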

Similarly, if you already have a system prompt, prepend `DEEP_THINKING_INSTRUCTION` to it like this:
```python
DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."

system_prompt = "Reply to each prompt with only code answers - no explanations."
prompt = "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."

# Prepend the trigger phrase to the existing system prompt.
messages = [
    {"role": "system", "content": DEEP_THINKING_INSTRUCTION + '\n\n' + system_prompt},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
```
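
Tokenization and generation then proceed exactly as in the previous example.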

### Method 2 - Set `enable_thinking=True` in the tokenizer
If you are using Hugging Face tokenizers, you can simply pass the argument `enable_thinking=True` during tokenization (the option is built into the chat template).
Here is an example:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepcogito/cogito-v1-preview-llama-3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."

messages = [{"role": "user", "content": prompt}]

# Add enable_thinking=True for thinking mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Keep only the newly generated tokens, dropping the prompt tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
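
If you want to see exactly what the flag changes, you can render the chat template with and without it and compare the two prompt strings. A minimal sketch, reusing `tokenizer` and `messages` from the example above:
```python
# Render the same conversation in standard mode and in thinking mode,
# then compare the two prompt strings (reuses `tokenizer` and `messages`
# from the example above).
standard_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
thinking_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

# Inspect how the rendered prompts differ.
print(standard_text)
print(thinking_text)
```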