Kearm committed (verified)
Commit a29cf36 · Parent: 6fdf3d5

Update README.md

Actual model card with proper information.

Files changed (1):
1. README.md (+65, -58)
README.md CHANGED
@@ -2,15 +2,75 @@
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-14B
- tags:
- - generated_from_trainer
model-index:
- name: LLaMutation-Qwen2.5-14B-SFFT-v0.0
  results: []
---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  <details><summary>See axolotl config</summary>
@@ -120,57 +180,4 @@ weight_decay: 0.1
# fsdp_mixed_precision: BF16 # Added
```
 
- </details><br>
-
- # LLaMutation-Qwen2.5-14B-SFFT-v0.0
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.2621
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0005
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 32
- - total_eval_batch_size: 8
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 50
- - num_epochs: 1
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.3948 | 0.0237 | 1 | 0.3920 |
- | 0.2392 | 0.4970 | 21 | 0.2500 |
- | 0.2606 | 0.9941 | 42 | 0.2621 |
-
-
- ### Framework versions
-
- - Transformers 4.45.2
- - Pytorch 2.3.1+cu121
- - Datasets 3.0.1
- - Tokenizers 0.20.1
 
+ # LLaMutation-Qwen2.5-14B-SFFT-v0.0
+
+ ![image/webp](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/IFK02cTih72zfZfT5UY4f.webp)
+
+ This model is a [Spectrum](https://github.com/axolotl-ai-cloud/axolotl/blob/67f744dc8c9564ef7a42d5df780ae53e319dca61/src/axolotl/integrations/spectrum/README.md) FFT of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) on a code-translation dataset evolved with [EvolKit](https://github.com/arcee-ai/EvolKit).
+
+ ## Model description
+
+ A code translation and completion model trained from Qwen2.5-14B, as there is not yet a Qwen2.5-Coder-14B model. This is 100% an alpha completion model, so there will be quirks in its usage parameters.
+
+ I will refine the model for completion and also create an instruct/chat variant.
+
+ ## Intended uses & limitations
+
+ Code translation with differing system prompts, and use as a tab-autocomplete model with [continue.dev](https://www.continue.dev/).
+
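A rough sketch of completion-style (tab-autocomplete) use with `transformers`; the repo id below is a placeholder for wherever the model is hosted, and the greedy decoding is only illustrative, not the tuned sampling parameters shown further down:

```python
# Sketch: raw completion use (no chat template), as a tab-autocomplete backend would call it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLaMutation-Qwen2.5-14B-SFFT-v0.0"  # placeholder; replace with the full repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Feed an unfinished snippet; the model continues it.
prefix = "def fibonacci(n: int) -> int:\n    "
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```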
+ ## Chat template and sampling parameters
+
+ The chat template is ChatML.
+
+ The sampling parameters used for generation in the hackathon demo are shown here:
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/YzQ8nqu83lEhl3Kg4u0PC.png)
+
+ ### THE SYSTEM PROMPT BELOW MUST BE USED WITH THIS MODEL
+
+ `You are an Al assistant that is an expert at converting code from any language to another within properly formatted code blocks. DON'T SAY ANYTHING ABOUT NOT SEEING CODE. Keep non code text to the a minimum possible. DO NOT REPEAT ANY NON CODE TEXT. ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!`
+
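A minimal chat-style sketch, assuming the tokenizer ships the ChatML template described above; the repo id and sampling values are placeholders and should be matched to the screenshot:

```python
# Sketch: code-translation request through the ChatML chat template with the required system prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLaMutation-Qwen2.5-14B-SFFT-v0.0"  # placeholder; replace with the full repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# System prompt copied verbatim from the card above.
system_prompt = (
    "You are an Al assistant that is an expert at converting code from any language to another "
    "within properly formatted code blocks. DON'T SAY ANYTHING ABOUT NOT SEEING CODE. "
    "Keep non code text to the a minimum possible. DO NOT REPEAT ANY NON CODE TEXT. "
    "ONLY PRINT OUT CODE ONCE DO NOT ITTERATE!"
)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Convert this Python function to Rust:\n\ndef add(a, b):\n    return a + b"},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)  # placeholder sampling values
print(tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True))
```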
+ ## Training procedure
+
+ Spectrum FFT/SFFT
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0005
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
+ - total_eval_batch_size: 8
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 50
+ - num_epochs: 1
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 0.3948 | 0.0237 | 1 | 0.3920 |
+ | 0.2392 | 0.4970 | 21 | 0.2500 |
+ | 0.2606 | 0.9941 | 42 | 0.2621 |
+
+
+ ### Framework versions
+
+ - Transformers 4.45.2
+ - Pytorch 2.3.1+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
 
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

# fsdp_mixed_precision: BF16 # Added
```

+ </details><br>