---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- Locutusque/TinyMistral-248M-v2
- Locutusque/TinyMistral-248M-v2.5
- Locutusque/TinyMistral-248M-v2.5-Instruct
- jtatman/tinymistral-v2-pycoder-instruct-248m
- Felladrin/TinyMistral-248M-SFT-v4
- Locutusque/TinyMistral-248M-v2-Instruct
base_model:
- Locutusque/TinyMistral-248M-v2
- Locutusque/TinyMistral-248M-v2.5
- Locutusque/TinyMistral-248M-v2.5-Instruct
- jtatman/tinymistral-v2-pycoder-instruct-248m
- Felladrin/TinyMistral-248M-SFT-v4
- Locutusque/TinyMistral-248M-v2-Instruct
---

# TinyMistral-6x248M

TinyMistral-6x248M is a Mixture of Experts (MoE) model that combines six 248M-parameter TinyMistral variants as experts, built from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [Locutusque/TinyMistral-248M-v2](https://huggingface.co/Locutusque/TinyMistral-248M-v2)
* [Locutusque/TinyMistral-248M-v2.5](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5)
* [Locutusque/TinyMistral-248M-v2.5-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5-Instruct)
* [jtatman/tinymistral-v2-pycoder-instruct-248m](https://huggingface.co/jtatman/tinymistral-v2-pycoder-instruct-248m)
* [Felladrin/TinyMistral-248M-SFT-v4](https://huggingface.co/Felladrin/TinyMistral-248M-SFT-v4)
* [Locutusque/TinyMistral-248M-v2-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct)

## 🧩 Configuration

```yaml
base_model: Locutusque/TinyMistral-248M-v2.5
experts:
  - source_model: Locutusque/TinyMistral-248M-v2
    positive_prompts:
      - "An emerging trend in global economics is"
      - "TITLE: The Next Generation of Internet Connectivity"
      - "begin a comprehensive analysis on the sociopolitical effects of"
    negative_prompts:
      - "Code a simple"
      - "Explain the Krebs cycle in detail"
      - "Compose a sonnet about"

  - source_model: Locutusque/TinyMistral-248M-v2.5
    positive_prompts:
      - "Advanced C++ memory management techniques"
      - "C# asynchronous programming best practices"
      - "AI's role in predictive analytics"
      - "textbook review on machine learning algorithms"
      - "## Exercise: Design a C# interface for a CRM system"
      - "## Solution: Optimize an AI-powered recommendation engine"
    negative_prompts:
      - "Narrate the story of"
      - "The ethical considerations in"
      - "Review the latest art exhibition by"
  
  - source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
    positive_prompts:
      - "What is the chemical formula for photosynthesis?"
      - "Identification of a new mineral found on Mars"
      - "physics: Explaining the concept of relativity"
      - "Solve for x using differential equations:"
      - "history: Analyze the causes of the French Revolution"
    negative_prompts:
      - "Devise a business plan for"
      - "The evolution of culinary arts"
      - "Orchestrate a piece for a string quartet"
  
  - source_model: jtatman/tinymistral-v2-pycoder-instruct-248m
    positive_prompts:
      - "Write a Python program for facial recognition"
      - "Explain dynamic typing in programming languages"
      - "algorithm development for efficient data sorting"
    negative_prompts:
      - "Who was the first Emperor of Rome?"
      - "Discuss the political dynamics in"
      - "Provide a proof for Fermat's Last Theorem"
      - "physics: The principles of thermodynamics"
  
  - source_model: Felladrin/TinyMistral-248M-SFT-v4
    positive_prompts:
      - "Escreba sobre a influência da música no Brasil"
      - "Voici un guide pour les voyageurs en France"
      - "Para entender la política de México, se debe considerar"
      - "Cuales son los efectos de la globalización en Argentina"
      - "Welche gesellschaftlichen Veränderungen gibt es in Deutschland"
      - "If you had to imagine a utopian city, what would be its core values?"
    negative_prompts:
      - "Calculate the integral of"
      - "Describe the process of cell division"
      - "Review the latest advancements in quantum computing"

  - source_model: Locutusque/TinyMistral-248M-v2-Instruct
    positive_prompts:
      - "Write an essay on the evolution of international trade laws"
      - "What are the key components of a sustainable urban ecosystem?"
      - "instruct on effective negotiation techniques in diplomacy"
      - "How does cognitive bias affect decision making in high-pressure environments?"
      - "Identify the architectural significance of the Sydney Opera House"
    negative_prompts:
      - "Develop a script to automate"
      - "Understanding inheritance in object-oriented programming"
      - "philosophy of existentialism in contemporary society"
```
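
The positive and negative prompts above steer how the merge initializes each expert's router gate: in mergekit's hidden-state gate mode, gate vectors are derived from the base model's representations of those prompts, so tokens resembling an expert's positive prompts tend to get routed to it. The snippet below is a minimal conceptual sketch of that idea, not mergekit's actual implementation; the last-token averaging and single-layer choice are simplifying assumptions.

```python
# Minimal conceptual sketch of hidden-state gate initialization (NOT mergekit's code).
# Assumption: an expert's gate vector is the mean last-token hidden state of its
# positive prompts minus that of its negative prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Locutusque/TinyMistral-248M-v2.5"  # base model from the config above
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, output_hidden_states=True)
model.eval()

def mean_hidden(prompts, layer=-1):
    """Average the final-token hidden state of each prompt at one layer."""
    states = []
    for text in prompts:
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, dim)
        states.append(hidden[0, -1])                       # final token
    return torch.stack(states).mean(dim=0)

# Prompts taken from the pycoder expert's entry in the config above.
positive = ["Write a Python program for facial recognition"]
negative = ["Who was the first Emperor of Rome?"]

# One gate vector per expert; the router scores a token by projecting its
# hidden state onto each expert's gate and picking the top-scoring experts.
gate = mean_hidden(positive) - mean_hidden(negative)
print(gate.shape)  # (hidden_dim,)
```

With the config saved as e.g. `config.yaml`, the merge itself is typically run through mergekit's MoE entry point, e.g. `mergekit-moe config.yaml ./TinyMistral-6x248M`.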

## 💻 Usage

```python
# Install dependencies first (in a notebook, prefix with "!"):
# pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "M4-ai/TinyMistral-6x248M"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
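
Note that `load_in_4bit` is handled by bitsandbytes and generally requires a CUDA GPU. To run on CPU, drop `load_in_4bit` from `model_kwargs` and use `torch.float32` as the dtype.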