# Friction Reasoning Model
This model is fine-tuned to engage in productive disagreement, overthinking, and reluctance. It's based on DeepSeek-R1-Distill-Qwen-7B and trained on a curated dataset of disagreement, overthinking, and reluctance examples.
## Model Description
- **Model Architecture**: DeepSeek-R1-Distill-Qwen-7B with LoRA adapters
- **Language(s)**: English
- **License**: Apache 2.0
- **Finetuning Approach**: Instruction tuning with friction-based reasoning examples
### Training Data
The model was trained on a weighted mixture of three datasets (a mixing sketch follows the list):
1. `leonvanbokhorst/friction-disagreement-v2` (8.5% weight)
- Examples of productive disagreement and challenging assumptions
2. `leonvanbokhorst/friction-overthinking-v2` (9.5% weight)
- Examples of deep analytical thinking and self-reflection
3. `leonvanbokhorst/reluctance-v6.1` (82% weight)
- Examples of hesitation and careful consideration
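
The exact mixing script is not published with this card, but the weights above map naturally onto `datasets.interleave_datasets`. The snippet below is a minimal sketch, assuming the three datasets are sampled by probability rather than simply concatenated:

```python
from datasets import load_dataset, interleave_datasets

# Load the three source datasets listed above
disagreement = load_dataset("leonvanbokhorst/friction-disagreement-v2", split="train")
overthinking = load_dataset("leonvanbokhorst/friction-overthinking-v2", split="train")
reluctance = load_dataset("leonvanbokhorst/reluctance-v6.1", split="train")

# Mix them with the documented weights (8.5% / 9.5% / 82%)
mixed = interleave_datasets(
    [disagreement, overthinking, reluctance],
    probabilities=[0.085, 0.095, 0.82],
    seed=42,
)
```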
### Training Procedure
- **Hardware**: NVIDIA RTX 4090 (24GB)
- **Framework**: Unsloth + PyTorch
- **Training Time**: 35 minutes
- **Epochs**: 7 (early convergence around epoch 4)
- **Batch Size**: 2 per device (effective batch size 8 with gradient accumulation)
- **Optimization**: AdamW 8-bit
- **Learning Rate**: 2e-4 with cosine schedule
- **Weight Decay**: 0.01
- **Gradient Clipping**: 0.5
- **Mixed Precision**: bfloat16
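
As a rough sketch, the hyperparameters above correspond to the following Hugging Face `TrainingArguments`; the actual Unsloth training script may assemble them differently, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",              # placeholder path
    num_train_epochs=7,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,     # effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    weight_decay=0.01,
    max_grad_norm=0.5,                 # gradient clipping
    optim="adamw_8bit",                # 8-bit AdamW (bitsandbytes)
    bf16=True,                         # bfloat16 mixed precision
    seed=42,
)
```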
### Performance Metrics
- **Training Loss**: 1.437 (final)
- **Best Validation Loss**: 1.527 (epoch 3.57)
- **Memory Usage**: 3.813 GB for training (15.9% of GPU memory)
## Intended Use
This model is designed for:
- Engaging in productive disagreement
- Challenging assumptions constructively
- Providing alternative perspectives
- Deep analytical thinking
- Careful consideration of complex issues
### Limitations
The model:
- Is not designed for factual question-answering
- May sometimes be overly disagreeable
- Should not be used for medical, legal, or financial advice
- Works best with reflective or analytical queries
- May not perform well on objective or factual tasks
### Bias and Risks
The model:
- May exhibit biases present in the training data
- Could potentially reinforce overthinking in certain situations
- Might challenge user assumptions in sensitive contexts
- Should be used with appropriate content warnings
## Usage
Example usage with the Transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "leonvanbokhorst/deepseek-r1-mixture-of-friction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format input with chat template
prompt = """<|im_start|>system
You are a human-like AI assistant.
<|im_end|>
<|im_start|>user
Why do I keep procrastinating important tasks?
<|im_end|>
<|im_start|>assistant"""

# Generate response
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,                 # pass input_ids and attention_mask
    max_new_tokens=512,
    do_sample=True,           # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
## Training Details
### LoRA Configuration
- **Rank**: 16
- **Alpha**: 32
- **Target Modules**:
- q_proj
- k_proj
- v_proj
- o_proj
- gate_proj
- up_proj
- down_proj
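
For reference, a plain PEFT `LoraConfig` with these settings would look like the sketch below; the run itself used Unsloth's LoRA wrapper, and the `lora_dropout` and `bias` values are assumptions not stated on this card:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,      # assumed; not documented on this card
    bias="none",           # assumed; not documented on this card
    task_type="CAUSAL_LM",
)
```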
### Dataset Processing
- Examples packed into sequences of up to 4,096 tokens
- 90/10 train/validation split
- Consistent seed (42) for reproducibility
- Token-based sampling for balanced training
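
A minimal sketch of the 90/10 split with the fixed seed, assuming the standard `datasets` `train_test_split` API (the token packing itself is handled by the training framework):

```python
from datasets import load_dataset

# Reproducible 90/10 train/validation split with seed 42
dataset = load_dataset("leonvanbokhorst/reluctance-v6.1", split="train")
split = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = split["train"], split["test"]
```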
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{friction-reasoning-2025,
  author       = {Leon van Bokhorst},
  title        = {Mixture of Friction: Fine-tuned Language Model for Productive Disagreement, Overthinking, and Hesitation},
  year         = {2025},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/leonvanbokhorst/deepseek-r1-mixture-of-friction}}
}
```
## Acknowledgments
- DeepSeek AI for the base model
- Unsloth team for the optimization toolkit
- HuggingFace for the model hosting and infrastructure