# Friction Reasoning Model

This model is fine-tuned to engage in productive disagreement, overthinking, and reluctance. It is based on DeepSeek-R1-Distill-Qwen-7B and trained on a curated mix of disagreement, overthinking, and reluctance examples.
## Model Description

- **Model Architecture**: DeepSeek-R1-Distill-Qwen-7B with LoRA adapters
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuning Approach**: Instruction tuning with friction-based reasoning examples

### Training Data

The model was trained on a combination of three datasets:

1. `leonvanbokhorst/friction-disagreement-v2` (8.5% weight)
   - Examples of productive disagreement and challenging assumptions
2. `leonvanbokhorst/friction-overthinking-v2` (9.5% weight)
   - Examples of deep analytical thinking and self-reflection
3. `leonvanbokhorst/reluctance-v6.1` (82% weight)
   - Examples of hesitation and careful consideration

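The exact mixing step is not part of this card, but the weights above can be reproduced approximately with the `datasets` library. The snippet below is a minimal sketch (dataset names as listed, `train` splits assumed), not the original training script, which used token-based sampling:

```python
from datasets import load_dataset, interleave_datasets

# Load the three source datasets (names as listed above; "train" split assumed)
disagreement = load_dataset("leonvanbokhorst/friction-disagreement-v2", split="train")
overthinking = load_dataset("leonvanbokhorst/friction-overthinking-v2", split="train")
reluctance = load_dataset("leonvanbokhorst/reluctance-v6.1", split="train")

# Interleave with roughly the reported weights (8.5% / 9.5% / 82%)
mixture = interleave_datasets(
    [disagreement, overthinking, reluctance],
    probabilities=[0.085, 0.095, 0.82],
    seed=42,
)
```
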
### Training Procedure

- **Hardware**: NVIDIA RTX 4090 (24 GB)
- **Framework**: Unsloth + PyTorch
- **Training Time**: 35 minutes
- **Epochs**: 7 (early convergence around epoch 4)
- **Batch Size**: 2 per device (effective batch size 8 with gradient accumulation)
- **Optimizer**: AdamW 8-bit
- **Learning Rate**: 2e-4 with cosine schedule
- **Weight Decay**: 0.01
- **Gradient Clipping**: 0.5
- **Mixed Precision**: bfloat16

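For reference, the hyperparameters above map roughly onto the standard Hugging Face `Trainer` arguments as shown below. This is an illustrative sketch, not the actual Unsloth training script; the output path and any unlisted settings are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters using the standard Trainer API.
training_args = TrainingArguments(
    output_dir="outputs",               # output path is an assumption
    num_train_epochs=7,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,      # 2 x 4 = effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    weight_decay=0.01,
    max_grad_norm=0.5,                  # gradient clipping
    optim="adamw_bnb_8bit",             # AdamW 8-bit (bitsandbytes backend)
    bf16=True,                          # bfloat16 mixed precision
    seed=42,
)
```
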
### Performance Metrics

- **Training Loss**: 1.437 (final)
- **Best Validation Loss**: 1.527 (epoch 3.57)
- **Memory Usage**: 3.813 GB for training (15.9% of GPU memory)

## Intended Use

This model is designed for:

- Engaging in productive disagreement
- Challenging assumptions constructively
- Providing alternative perspectives
- Deep analytical thinking
- Careful consideration of complex issues

### Limitations

The model:

- Is not designed for factual question-answering
- May sometimes be overly disagreeable
- Should not be used for medical, legal, or financial advice
- Works best with reflective or analytical queries
- May not perform well on objective or factual tasks

### Bias and Risks

The model:

- May exhibit biases present in the training data
- Could potentially reinforce overthinking in certain situations
- Might challenge user assumptions in sensitive contexts
- Should be used with appropriate content warnings

## Usage

Example usage with the Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "leonvanbokhorst/deepseek-r1-mixture-of-friction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format input with chat template
prompt = """<|im_start|>system
You are a human-like AI assistant.
<|im_end|>
<|im_start|>user
Why do I keep procrastinating important tasks?
<|im_end|>
<|im_start|>assistant"""

# Generate response (sampling enabled so temperature/top_p take effect)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,                # pass input_ids and attention_mask together
    max_new_tokens=512,      # limit new tokens rather than total length
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
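
If the tokenizer ships a chat template (inherited from the base model unless overridden), the prompt can also be built with `apply_chat_template`. Continuing from the snippet above, this is a convenience sketch under that assumption:

```python
# Same request expressed through the tokenizer's chat template, if one is available.
messages = [
    {"role": "system", "content": "You are a human-like AI assistant."},
    {"role": "user", "content": "Why do I keep procrastinating important tasks?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```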

## Training Details

### LoRA Configuration

- **Rank**: 16
- **Alpha**: 32
- **Target Modules**:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj

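Expressed with the PEFT API, the adapter settings above correspond roughly to the configuration below. This is a reference sketch (the actual run used Unsloth's wrapper, and unlisted values such as dropout are assumptions):

```python
from peft import LoraConfig

# LoRA settings as listed above; unlisted values (e.g. dropout, bias) are assumed defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```
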
### Dataset Processing

- Examples packed into sequences of up to 4096 tokens
- 90/10 train/validation split
- Consistent seed (42) for reproducibility
- Token-based sampling for balanced training

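The split and seed above can be reproduced with the `datasets` API, reusing the `mixture` dataset from the sketch under Training Data; token packing itself is handled by the training framework and is not shown:

```python
# 90/10 train/validation split with the fixed seed noted above.
split = mixture.train_test_split(test_size=0.1, seed=42)
train_dataset, eval_dataset = split["train"], split["test"]
```
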
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{friction-reasoning-2025,
  author       = {Leon van Bokhorst},
  title        = {Mixture of Friction: Fine-tuned Language Model for Productive Disagreement, Overthinking, and Hesitation},
  year         = {2025},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/leonvanbokhorst/deepseek-r1-mixture-of-friction}}
}
```

## Acknowledgments

- DeepSeek AI for the base model
- Unsloth team for the optimization toolkit
- HuggingFace for the model hosting and infrastructure