Srinivasmec26 committed · verified · Commit f5bd2d3 · Parent(s): a8dd501

Update README.md

Files changed (1): README.md (+112 -48)

README.md CHANGED
@@ -19,6 +19,7 @@ library_name: adapter-transformers
 
 # MindSlate: Fine-tuned Gemma-3B for Personal Knowledge Management
 
 
 ## Model Description
 
@@ -29,14 +30,8 @@ library_name: adapter-transformers
 - **Fine-tuning method**: 4-bit QLoRA
 - **Languages**: English
 - **License**: Apache 2.0
-
- - **Developed by:** [Srinivas Nampalli](https://www.linkedin.com/in/srinivas-nampalli/)
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/gemma-3n-e2b-it-unsloth-bnb-4bit
-
- This gemma3n model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 ## Model Sources
 
@@ -46,7 +41,6 @@ This gemma3n model was trained 2x faster with [Unsloth](https://github.com/unslo
 ## Uses
 
 ### Direct Use
-
 MindSlate is designed for:
 - Automatic flashcard generation from study materials
 - Intelligent reminder creation
@@ -55,7 +49,6 @@ MindSlate is designed for:
 - Personal knowledge base management
 
 ### Downstream Use
-
 Can be integrated into:
 - Educational platforms
 - Productivity apps
@@ -63,7 +56,6 @@ Can be integrated into:
 - Personal AI assistants
 
 ### Out-of-Scope Use
-
 Not suitable for:
 - Medical or legal advice
 - High-stakes decision making
@@ -75,79 +67,151 @@ Not suitable for:
 from unsloth import FastLanguageModel
 import torch
 
 model, tokenizer = FastLanguageModel.from_pretrained(
-    model_name = "Srinivasmec26/MindSlate",
-    max_seq_length = 2048,
-    dtype = torch.float16,
-    load_in_4bit = True,
 )
 
 messages = [
-    {"role": "user", "content": "Create flashcards for neural networks:"},
 ]
 
 inputs = tokenizer.apply_chat_template(
     messages,
-    return_tensors = "pt",
 ).to("cuda")
 
- outputs = model.generate(**inputs, max_new_tokens=256)
 print(tokenizer.decode(outputs[0]))
 ```
 
 ## Training Details
 
 ### Training Data
 
- - **Flashcards Dataset**: 400 items (cite your source)
- - **Reminders Dataset**: 100 items (cite your source)
- - **Summaries Dataset**: 100 items (cite your source)
- - **Todos Dataset**: 100 items (cite your source)
 
- *Replace with actual dataset citations and descriptions*
 
- ### Training Procedure
 
- - **Preprocessing**: Standardized into "### Input: / ### Output:" format
- - **Fine-tuned with**: Unsloth 2025.8.1
 - **Hardware**: Tesla T4 GPU (16GB VRAM)
- - **Training Time**: ~51 minutes for 3 epochs
 - **LoRA Configuration**:
-   - Rank: 64
-   - Alpha: 128
-   - Target Modules: All key projection layers
 
 ## Evaluation
 
- *Add evaluation metrics if available, for example:*
-
- | Metric | Value |
- |--------------|-------|
- | Perplexity | X.XX |
- | BLEU Score | X.XX |
- | Training Loss| 0.128 |
 
 ## Technical Specifications
 
- - **Model Size**: 3B parameters
- - **Quantization**: 4-bit (bnb)
- - **Context Length**: 2048 tokens
- - **Precision**: bfloat16/fp16 mixed
 
 ## Citation
 
 ```bibtex
 @misc{mindslate2025,
-   author = {Srinivas Nampalli},
-   title = {MindSlate: Fine-tuned Gemma-3B for Personal Knowledge Management},
   year = {2025},
   publisher = {Hugging Face},
-   howpublished = {\url{https://huggingface.co/Srinivasmec26/MindSlate}}
 }
 ```
 
- ## Model Card Contact
 
- For questions about MindSlate, contact:
- - Srinivas Nampalli
- - [LinkedIn](https://www.linkedin.com/in/srinivas-nampalli/)
 
 # MindSlate: Fine-tuned Gemma-3B for Personal Knowledge Management
 
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="250"/>](https://github.com/unslothai/unsloth)
 
 ## Model Description
 
 - **Fine-tuning method**: 4-bit QLoRA
 - **Languages**: English
 - **License**: Apache 2.0
+ - **Developed by**: [Srinivas Nampalli](https://www.linkedin.com/in/srinivas-nampalli/)
+ - **Finetuned from**: [unsloth/gemma-3n-E2B-it-unsloth-bnb-4bit](https://huggingface.co/unsloth/gemma-3n-E2B-it-unsloth-bnb-4bit)
 
 ## Model Sources
 
 ## Uses
 
 ### Direct Use
 MindSlate is designed for (example prompts follow below):
 - Automatic flashcard generation from study materials
 - Intelligent reminder creation
 - Personal knowledge base management
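
The prompt phrasings below are illustrative assumptions about how these tasks can be posed through the chat interface shown later in this card; they are not a documented prompt specification.

```python
# Illustrative prompts for the direct-use tasks (phrasing is an assumption)
example_prompts = [
    "Convert to flashcard: Photosynthesis converts light energy into chemical energy.",
    "Create a reminder: Submit the project report by Friday 5 PM.",
    "Summarize: <paste study notes here>",
]
messages = [{"role": "user", "content": example_prompts[0]}]
```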
 
 ### Downstream Use
 Can be integrated into:
 - Educational platforms
 - Productivity apps
 - Personal AI assistants
 
 ### Out-of-Scope Use
 Not suitable for:
 - Medical or legal advice
 - High-stakes decision making
 
 from unsloth import FastLanguageModel
+ from unsloth.chat_templates import get_chat_template
 import torch
 
+ # Load model with Unsloth optimizations
 model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="Srinivasmec26/MindSlate",
+     max_seq_length=2048,
+     dtype=torch.float16,
+     load_in_4bit=True,
+ )
+ FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference mode
+ 
+ # Set chat template
+ tokenizer = get_chat_template(
+     tokenizer,
+     chat_template="gemma",  # use "chatml" or another template if needed
 )
 
+ # Create prompt
 messages = [
+     {"role": "user", "content": "Convert to flashcard: Neural networks are computational models..."},
 ]
 
+ # Tokenize with the chat template and move to GPU
 inputs = tokenizer.apply_chat_template(
     messages,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
 ).to("cuda")
 
+ # Generate response
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=256,
+     temperature=0.7,
+     top_p=0.95,
+ )
 print(tokenizer.decode(outputs[0]))
 ```
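
The decode call above prints the prompt tokens as well as the reply. A minimal follow-up, assuming the `inputs` dict produced by `apply_chat_template` above, that keeps only the newly generated text:

```python
# Slice off the prompt tokens and decode only the model's reply
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```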
 
 ## Training Details
 
 ### Training Data
+ The model was fine-tuned on a combination of structured datasets:
 
+ 1. **Flashcards Dataset** (400 items):
+ ```bibtex
+ @misc{educational_flashcards_2025,
+   title     = {Multicultural Educational Flashcards Dataset},
+   author    = {Srinivas Nampalli and Yathi Pachauri and Swarnim Gupta},
+   year      = {2025},
+   publisher = {Hugging Face},
+   url       = {https://huggingface.co/datasets/Srinivasmec26/Educational-Flashcards-for-Global-Learners}
+ }
+ ```
 
+ 2. **Reminders Dataset** (100 items):
+    - *Private collection of contextual reminders*
+    - Format: `{"input": "Meeting with team", "output": {"time": "2025-08-15 14:00", "location": "Zoom"}}`
 
+ 3. **Summaries Dataset** (100 items):
+    - *Academic paper abstracts and summaries*
+    - Collected from arXiv and academic publications
+ ```bibtex
+ @misc{knowledge_summaries_2025,
+   title     = {Multidisciplinary Educational Summaries},
+   author    = {Srinivas Nampalli and Yathi Pachauri and Swarnim Gupta},
+   year      = {2025},
+   publisher = {Hugging Face},
+   url       = {https://huggingface.co/datasets/Srinivasmec26/Multidisciplinary-Educational-Summaries}
+ }
+ ```
 
+ 4. **Todos Dataset** (100 items):
+ ```bibtex
+ @misc{academic_todos_2025,
+   title     = {Structured To-Do Lists for Learning and Projects},
+   author    = {Srinivas Nampalli and Yathi Pachauri and Swarnim Gupta},
+   year      = {2025},
+   publisher = {Hugging Face},
+   version   = {1.0},
+   url       = {https://huggingface.co/datasets/Srinivasmec26/Structured-Todo-Lists-for-Learning-and-Projects}
+ }
+ ```
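
As a rough illustration of how these datasets can be pulled in and flattened into the training text format described in the next section, the sketch below loads the public flashcards set with the `datasets` library and formats one record. The `input`/`output` field names are assumptions for illustration, not the datasets' documented schema.

```python
from datasets import load_dataset

# Public flashcards dataset cited above; the other datasets follow the same pattern
ds = load_dataset("Srinivasmec26/Educational-Flashcards-for-Global-Learners", split="train")

def to_training_text(example):
    # Field names are assumed for illustration; adjust to the actual column names
    return {"text": f"### Input: {example['input']}\n### Output: {example['output']}"}

ds = ds.map(to_training_text)
print(ds[0]["text"])
```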
 
+ ### Training Procedure
+ - **Preprocessing**: Standardized into `### Input: ... \n### Output: ...` format
+ - **Framework**: Unsloth 2025.8.1 + Hugging Face TRL
 - **Hardware**: Tesla T4 GPU (16GB VRAM)
+ - **Training Time**: ~51 minutes for 3 epochs
 - **LoRA Configuration** (see the fuller sketch below):
+ ```python
+ r=64,                  # LoRA rank
+ lora_alpha=128,        # LoRA scaling factor
+ target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                 "gate_proj", "up_proj", "down_proj"],
+ ```
+ - **Optimizer**: AdamW 8-bit
+ - **Learning Rate**: 2e-4 with linear decay
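
A fuller sketch of how these settings fit together with Unsloth's PEFT wrapper and TRL's `SFTTrainer` is shown below. Only the hyperparameters listed above come from this card; the dataset variable (reusing `ds` from the formatting sketch earlier), batch size, and other kwargs are illustrative assumptions, and exact argument names vary somewhat across TRL versions.

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Attach LoRA adapters with the rank/alpha/target modules listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=ds,                     # "### Input:/### Output:" formatted text, as above
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        num_train_epochs=3,               # ~51 minutes on a Tesla T4
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        optim="adamw_8bit",
        per_device_train_batch_size=2,    # assumption; not stated in the card
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```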
 
 ## Evaluation
+ *Comprehensive benchmark results will be uploaded in v1.1. Preliminary metrics:*
 
+ | Metric              | Value              |
+ |---------------------|--------------------|
+ | **Training Loss**   | 0.1284             |
+ | **Perplexity**      | TBD                |
+ | **Task Accuracy**   | TBD                |
+ | **Inference Speed** | 42 tokens/sec (T4) |
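
Perplexity is listed as TBD above; once an evaluation split is fixed, it can be derived from the mean token-level cross-entropy. A rough sketch, reusing the loaded model and tokenizer with an illustrative held-out sample:

```python
import math
import torch

eval_text = "### Input: Define overfitting\n### Output: Overfitting is ..."  # illustrative sample
enc = tokenizer(eval_text, return_tensors="pt").to("cuda")
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss  # mean cross-entropy per token
print(f"Perplexity: {math.exp(loss.item()):.2f}")
```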
 
 ## Technical Specifications
 
+ | Parameter           | Value               |
+ |---------------------|---------------------|
+ | Model Size          | 3B parameters       |
+ | Quantization        | 4-bit (bnb)         |
+ | Max Sequence Length | 2048 tokens         |
+ | Fine-tuned Params   | 1.66% (91.6M)       |
+ | Precision           | BF16/FP16 mixed     |
+ | Architecture        | Transformer Decoder |
 
 ## Citation
 
 ```bibtex
 @misc{mindslate2025,
+   author       = {Srinivas Nampalli},
+   title        = {MindSlate: Efficient Personal Knowledge Management with Gemma-3B},
   year         = {2025},
   publisher    = {Hugging Face},
+   howpublished = {\url{https://huggingface.co/Srinivasmec26/MindSlate}},
+   note         = {Fine-tuned using Unsloth for efficient training}
 }
 ```
 
+ ## Acknowledgements
+ - [Unsloth](https://github.com/unslothai/unsloth) for 2x faster fine-tuning
+ - Google for the [Gemma 3n](https://huggingface.co/google/gemma-3n-E2B-it) base model
+ - Hugging Face for the [TRL](https://huggingface.co/docs/trl) library
 
+ ## Model Card Contact
+ For questions and collaborations:
+ - Srinivas Nampalli: [LinkedIn](https://www.linkedin.com/in/srinivas-nampalli/)