leonvanbokhorst committed
Commit 286a871 · verified · 1 Parent(s): 2c90348

Update README.md

Files changed (1):
  1. README.md +154 -0

README.md CHANGED

---
base_model: unsloth/DeepSeek-R1-Distill-Qwen-7B-unsloth-bnb-4bit
library_name: transformers
license: apache-2.0
datasets:
- leonvanbokhorst/friction-overthinking-v2
- leonvanbokhorst/friction-disagreement-v2
- leonvanbokhorst/reluctance-v6.1
language:
- en
tags:
- ai-safety
- ai-friction
- human-like-messiness
---

# Friction Reasoning Model

This model is fine-tuned to engage in productive disagreement, overthinking, and reluctance. It is based on DeepSeek-R1-Distill-Qwen-7B and trained on a curated dataset of disagreement, overthinking, and reluctance examples.

## Model Description

- **Model Architecture**: DeepSeek-R1-Distill-Qwen-7B with LoRA adapters
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuning Approach**: Instruction tuning with friction-based reasoning examples

### Training Data

The model was trained on a weighted mixture of three datasets (a sampling sketch follows the list):

1. `leonvanbokhorst/friction-disagreement-v2` (8.5% weight)
   - Examples of productive disagreement and challenging assumptions
2. `leonvanbokhorst/friction-overthinking-v2` (9.5% weight)
   - Examples of deep analytical thinking and self-reflection
3. `leonvanbokhorst/reluctance-v6.1` (82% weight)
   - Examples of hesitation and careful consideration
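
A minimal sketch of how such a weighted mixture can be drawn with the `datasets` library; the exact sampling code is not published with this card, so the `interleave_datasets` call and the `split="train"` names below are illustrative, not the training script:

```python
from datasets import load_dataset, interleave_datasets

# Load the three friction datasets from the Hugging Face Hub
# (split name assumed; check each dataset card)
disagreement = load_dataset("leonvanbokhorst/friction-disagreement-v2", split="train")
overthinking = load_dataset("leonvanbokhorst/friction-overthinking-v2", split="train")
reluctance = load_dataset("leonvanbokhorst/reluctance-v6.1", split="train")

# Sample examples according to the mixture weights from the card
mixture = interleave_datasets(
    [disagreement, overthinking, reluctance],
    probabilities=[0.085, 0.095, 0.82],
    seed=42,  # the reproducibility seed stated under Dataset Processing
)
```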

### Training Procedure

Key hyperparameters, with a configuration sketch after the list:

- **Hardware**: NVIDIA RTX 4090 (24 GB)
- **Framework**: Unsloth + PyTorch
- **Training Time**: 35 minutes
- **Epochs**: 7 (early convergence around epoch 4)
- **Batch Size**: 2 per device (effective batch size 8 with gradient accumulation)
- **Optimization**: AdamW 8-bit
- **Learning Rate**: 2e-4 with cosine schedule
- **Weight Decay**: 0.01
- **Gradient Clipping**: 0.5
- **Mixed Precision**: bfloat16
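
These hyperparameters map naturally onto `transformers.TrainingArguments`; the original Unsloth training script is not included in the card, so the sketch below is a plausible reconstruction, not the exact configuration (`output_dir` and `gradient_accumulation_steps=4` are inferred, the latter from the effective batch size of 8):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",             # assumption: not stated in the card
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,    # 2 per device x 4 steps = effective batch size 8
    num_train_epochs=7,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    weight_decay=0.01,
    max_grad_norm=0.5,                # gradient clipping from the card
    bf16=True,                        # bfloat16 mixed precision
    optim="adamw_bnb_8bit",           # transformers' name for the AdamW 8-bit optimizer
    seed=42,
)
```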

### Performance Metrics

- **Training Loss**: 1.437 (final)
- **Best Validation Loss**: 1.527 (at epoch 3.57)
- **Memory Usage**: 3.813 GB peak during training (15.9% of the 24 GB GPU)

## Intended Use

This model is designed for:

- Engaging in productive disagreement
- Challenging assumptions constructively
- Providing alternative perspectives
- Deep analytical thinking
- Careful consideration of complex issues

### Limitations

The model:

- Is not designed for factual question answering
- May sometimes be overly disagreeable
- Should not be used for medical, legal, or financial advice
- Works best with reflective or analytical queries
- May not perform well on objective or factual tasks

### Bias and Risks

The model:

- May exhibit biases present in the training data
- Could reinforce overthinking in certain situations
- Might challenge user assumptions in sensitive contexts
- Should be used with appropriate content warnings

## Usage

Example usage with the Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "leonvanbokhorst/deepseek-r1-mixture-of-friction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format input with the chat template
prompt = """<|im_start|>system
You are a human-like AI assistant.
<|im_end|>
<|im_start|>user
Why do I keep procrastinating important tasks?
<|im_end|>
<|im_start|>assistant"""

# Generate a response; do_sample=True is needed for temperature/top_p to take effect
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,             # passes input_ids and attention_mask
    max_new_tokens=512,   # caps generated tokens; max_length would count the prompt too
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
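If the tokenizer ships a chat template (not confirmed by this card), the same prompt can be built with `apply_chat_template` instead of hand-writing the special tokens; a minimal sketch reusing the `tokenizer` from above:

```python
# Build the prompt from structured messages; the template, when present,
# inserts the correct special tokens for this model family.
messages = [
    {"role": "system", "content": "You are a human-like AI assistant."},
    {"role": "user", "content": "Why do I keep procrastinating important tasks?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # append the assistant header so the model answers
)
```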

## Training Details

### LoRA Configuration

- **Rank**: 16
- **Alpha**: 32
- **Target Modules** (a `peft`-style sketch follows this list):
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
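
The same configuration expressed as a `peft` `LoraConfig`; training actually went through Unsloth's wrapper, and `lora_dropout` and `bias` are not stated in the card, so those values are assumptions:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,               # rank from the card
    lora_alpha=32,      # alpha from the card
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumption: not stated in the card
    bias="none",        # assumption: not stated in the card
    task_type="CAUSAL_LM",
)
```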

### Dataset Processing

- Examples packed into sequences of up to 4,096 tokens
- 90/10 train/validation split (a split sketch follows this list)
- Consistent seed (42) for reproducibility
- Token-based sampling for balanced training
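
A minimal sketch of the split with the `datasets` library, reusing the `mixture` dataset from the Training Data sketch; the exact preprocessing script is not published:

```python
# 90/10 train/validation split with the card's reproducibility seed
split = mixture.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = split["train"], split["test"]
```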

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{friction-reasoning-2025,
  author       = {Leon van Bokhorst},
  title        = {Mixture of Friction: Fine-tuned Language Model for Productive Disagreement, Overthinking, and Hesitation},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/leonvanbokhorst/deepseek-r1-mixture-of-friction}}
}
```

## Acknowledgments

- DeepSeek AI for the base model
- The Unsloth team for the optimization toolkit
- Hugging Face for model hosting and infrastructure