---
base_model: Qwen/Qwen2.5-Coder-0.5B
datasets: None
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- torch
- trl
- unsloth
- llama
- gguf
---

# Uploaded model

- **Developed by:** student-abdullah
- **License:** apache-2.0
- **Quantized from model:** Qwen2.5-Coder-0.5B
- **Created on:** 14th July, 2025

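A minimal usage sketch, not part of the original card, for loading the GGUF file with `llama-cpp-python`. The repository id and filename below are placeholders and should be replaced with the actual values from this repository's files listing.

```python
# Hypothetical usage sketch: loads the GGUF file with llama-cpp-python and runs
# a short completion. The repo_id and filename are placeholders; replace them
# with the actual values from this repository's "Files and versions" tab.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="student-abdullah/<this-repo>",  # placeholder repository id
    filename="*.gguf",                       # placeholder GGUF filename pattern
    n_ctx=2048,                              # context length; adjust as needed
)

output = llm(
    "Write a Python function that checks whether a number is prime.",
    max_tokens=128,
)
print(output["choices"][0]["text"])
```
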
---
# Acknowledgement
<div style="display: flex; gap: 10px; align-items: center;">
  <img src="https://colab.research.google.com/img/colab_favicon_256px.png" width="200"/>
  <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ef/ChatGPT-Logo.svg/2048px-ChatGPT-Logo.svg.png" width="140"/>
  <img src="https://compareaimodels.com/content/images/2024/08/qwen-square.svg" width="200"/>
</div>

---
# Quantization Description
This model was quantized using *selective quantization* of the Qwen2.5-Coder-0.5B base model to increase its speed while preserving its ability to generate relevant and accurate responses related to Python programming.
The quantization kept the following layers at *16-bit* precision:
- q_proj
- v_proj
- o_proj
- down_proj
- lm_head

The remaining layers were quantized to *Q2*.

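As a quick check of the split described above, the per-tensor quantization types stored in the resulting GGUF file can be listed with the `gguf` Python package. This is a hedged sketch rather than part of the original card: the filename is a placeholder, and the attribute names assume a recent `gguf` release.

```python
# Hypothetical sketch: lists the quantization type of every tensor in the GGUF
# file, so the 16-bit vs. Q2 split described above can be verified.
# Assumes the `gguf` package (shipped with llama.cpp) is installed; the
# filename below is a placeholder for the actual GGUF file in this repository.
from gguf import GGUFReader

reader = GGUFReader("model.gguf")  # placeholder path
for tensor in reader.tensors:
    # tensor.tensor_type is a GGMLQuantizationType enum value (e.g. F16, Q2_K)
    print(f"{tensor.name:45s} {tensor.tensor_type.name}")
```
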
---
# Model Description
| Layer Name                   | Role (Short)                                          | Type           |
| ---------------------------- | ----------------------------------------------------- | -------------- |
| `q_proj`, `k_proj`, `v_proj` | Compute query, key, and value for attention mechanism | Attention Proj |
| `o_proj`                     | Projects attention output back to model hidden size   | Attention Proj |
| `down_proj`                  | Projects MLP output down to hidden size               | MLP            |
| `gate_proj`                  | First part of Gated MLP, controls info flow           | MLP            |
| `up_proj`                    | Expands hidden size in MLP                            | MLP            |
| `lm_head`                    | Final linear layer for logits                         | Output Head    |
| `embed_tokens`               | Token embedding layer                                 | Input Embed    |
| `norm`                       | Final layernorm                                       | Normalization  |
| `*_layernorm`                | Normalize inputs to layers                            | Normalization  |

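To see how much of the model each of these layer names accounts for, the sketch below tallies parameter counts per leaf-layer name of the unquantized base model. This is an illustrative example only, assuming the `transformers` and `torch` packages are installed; it is not part of the original card.

```python
# Hypothetical sketch: counts parameters per leaf-layer name of the base model,
# showing the relative size of each row in the table above.
from collections import Counter
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")

counts = Counter()
for name, param in model.named_parameters():
    # e.g. "model.layers.0.self_attn.q_proj.weight" -> "q_proj"
    leaf = name.split(".")[-2]
    counts[leaf] += param.numel()

for leaf, n in counts.most_common():
    print(f"{leaf:25s} {n:,}")
```
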
---
# Model Architecture
<pre><code>Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(151936, 896, padding_idx=151665)
    (layers): ModuleList(
      (0-23): 24 x Qwen2DecoderLayer(
        (self_attn): Qwen2Attention(
          (q_proj): Linear(in_features=896, out_features=896, bias=True)
          (k_proj): Linear(in_features=896, out_features=128, bias=True)
          (v_proj): Linear(in_features=896, out_features=128, bias=True)
          (o_proj): Linear(in_features=896, out_features=896, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=896, out_features=4864, bias=False)
          (up_proj): Linear(in_features=896, out_features=4864, bias=False)
          (down_proj): Linear(in_features=4864, out_features=896, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((896,), eps=1e-06)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=896, out_features=151936, bias=False)
)</code></pre>

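The module tree above is the standard `transformers` printout of the base model. A minimal sketch of how such a printout is produced, assuming `transformers` is installed (the exact layout may vary between library versions):

```python
# Minimal sketch: prints the module tree of the unquantized base model.
# The layout of the printout can differ slightly across transformers versions.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")
print(model)
```
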
---
# Performance & Limitations
- YET TO BE EXAMINED

---
# Model Performance Evaluation
- YET TO BE EVALUATED

<p align="center">
  <img src="" width="20%" style="display:inline-block;"/>
  <img src="" width="35%" style="display:inline-block;"/>
  <img src="" width="35%" style="display:inline-block;"/>
</p>