---
license: apache-2.0
base_model:
- microsoft/phi-2
---
# NCU Smart LLM (phi2-ncu): Phi-2 Fine-tuned for NCU Tasks

<p align="center">
  <img src="https://huggingface.co/pranav2711/phi2-ncu-model/resolve/main/NCU-Logo.png" alt="NCU Logo" width="350"/>
</p>

> A lightweight, instruction-tuned version of [Microsoft's Phi-2](https://huggingface.co/microsoft/phi-2), customized for use cases and conversations related to The NorthCap University (NCU), India.
> Fine-tuned using LoRA on 1,098 high-quality examples, it's optimized for academic, administrative, and smart campus queries.

---

## Highlights

* **Base Model:** `microsoft/phi-2` (2.7B parameters)
* **Fine-tuned Using:** Low-Rank Adaptation (LoRA) + PEFT + Hugging Face Transformers
* **Dataset:** University questions, FAQs, policies, academic support queries, smart campus data
* **Training Environment:** Google Colab (T4 GPU), 4 epochs, batch size 1, no FP16
* **Final Format:** Full model weights (`.safetensors`) + tokenizer

---

## Model Access

| Platform           | Access Method                                                               |
| ------------------ | --------------------------------------------------------------------------- |
| Hugging Face       | [phi2-ncu-model](https://huggingface.co/pranav2711/phi2-ncu-model)          |
| Hugging Face Space | [Live Chatbot Demo](https://huggingface.co/spaces/pranav2711/phi2-ncu-chat-space) |
| Ollama (Offline)   | `ollama create phi2-ncu -f Modelfile` *(self-hosted only)*                  |

---

## Try It Online

### Gradio Web Chat (Hugging Face Space) (runs slowly on the free CPU hardware)

👉 Visit: [phi2-ncu-chat-space](https://huggingface.co/spaces/pranav2711/phi2-ncu-chat-space)

* Built using `Gradio`, deployed on Hugging Face Spaces
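
The Space's own app code is not reproduced in this card, but a minimal Gradio front end along these lines behaves similarly. This is an illustrative sketch, not the deployed code: the `answer` function, generation settings, and interface layout are assumptions.

```python
# Minimal Gradio chat front end (illustrative sketch, not the deployed Space code)
import gradio as gr
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "microsoft/phi-2"
adapter_path = "pranav2711/phi2-ncu-model"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float32)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

def answer(question: str) -> str:
    # Wrap the user question in the same prompt template used during training
    prompt = f"### Question:\n{question}\n\n### Answer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=200)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Return only the generated continuation, not the echoed prompt
    return text[len(prompt):].strip()

demo = gr.Interface(fn=answer, inputs="text", outputs="text", title="NCU Smart LLM")
demo.launch()
```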

---

## How to Use Locally (Hugging Face Transformers)

```bash
pip install transformers accelerate peft
```

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load adapter config
adapter_path = "pranav2711/phi2-ncu-model"
base_model = "microsoft/phi-2"

# Load tokenizer and base
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")

# Load fine-tuned adapter
model = PeftModel.from_pretrained(model, adapter_path)

# Inference
input_prompt = "### Question:\nHow can I apply for re-evaluation at NCU?\n\n### Answer:"
inputs = tokenizer(input_prompt, return_tensors="pt").to(model.device)  # works on CPU or GPU
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
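
Since the Highlights section states that the repository ships the full model weights (not only the adapter), loading the checkpoint directly may also work. The sketch below assumes `pranav2711/phi2-ncu-model` can be treated as a standard Transformers checkpoint; the example question is illustrative.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumes the repo hosts the merged full weights, as stated under Highlights
model_id = "pranav2711/phi2-ncu-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "### Question:\nHow can I apply for re-evaluation at NCU?\n\n### Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```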

---

## How to Use with Ollama (Offline)

> This works only **locally** via `ollama create`; the model is **not yet shareable** because publishing to the public Ollama model hub is restricted.

### Folder Structure

```
phi2-ncu/
├── Modelfile
└── model/
    ├── model.safetensors
    ├── config.json
    ├── tokenizer.json
    ├── tokenizer_config.json
    ├── vocab.json
    └── merges.txt
```

### Steps

```bash
ollama create phi2-ncu -f Modelfile
ollama run phi2-ncu
```

---

## Example Dataset Format (Used for Training)

```json
{
  "instruction": "How do I get my degree certificate?",
  "input": "I'm a 2023 BTech passout from CSE at NCU.",
  "output": "You can collect your degree certificate from the admin block on working days between 9AM and 4PM. Carry a valid ID proof."
}
```

Formatted as:

```
### Question:
How do I get my degree certificate?
I'm a 2023 BTech passout from CSE at NCU.

### Answer:
You can collect your degree certificate...
```
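
The exact preprocessing script is not included in this card; a small helper along these lines would produce the format above (the function name and field handling are assumptions):

```python
def format_example(example: dict) -> str:
    """Turn one instruction/input/output record into the training prompt format."""
    question = example["instruction"]
    if example.get("input"):
        # Append the optional context line directly under the question
        question += "\n" + example["input"]
    return f"### Question:\n{question}\n\n### Answer:\n{example['output']}"
```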

---

## Training Strategy

* Used `LoRA` with rank=8, alpha=16
* Tokenized to max length = 512
* Used `Trainer` with `fp16=False` to avoid CUDA AMP issues
* Batch size = 1, Epochs = 4
* Trained on Google Colab (T4), saving final full weights
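
The original training script is not part of this card; the sketch below shows one LoRA setup consistent with the bullets above (rank 8, alpha 16, max length 512, batch size 1, 4 epochs, `fp16=False`). The target modules, learning rate, dropout, and the toy dataset are assumptions, not the author's exact configuration.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_model = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # phi-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA settings from the bullets above; target_modules is an assumption
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Toy stand-in for the 1,098-example dataset described above
records = [{
    "instruction": "How do I get my degree certificate?",
    "input": "I'm a 2023 BTech passout from CSE at NCU.",
    "output": "You can collect your degree certificate from the admin block on working days.",
}]

def tokenize(example):
    # Build the "### Question / ### Answer" prompt and tokenize to max length 512
    text = (f"### Question:\n{example['instruction']}\n{example['input']}\n\n"
            f"### Answer:\n{example['output']}")
    tokens = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

train_dataset = Dataset.from_list(records).map(
    tokenize, remove_columns=["instruction", "input", "output"])

training_args = TrainingArguments(
    output_dir="phi2-ncu-lora",
    per_device_train_batch_size=1,
    num_train_epochs=4,
    fp16=False,          # AMP disabled, as noted above
    learning_rate=2e-4,  # assumed; not stated in this card
    logging_steps=10,
)

Trainer(model=model, args=training_args, train_dataset=train_dataset).train()
```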

---

## License

[Apache 2.0](https://huggingface.co/pranav2711/phi2-ncu-model/resolve/main/LICENSE)

## About NCU

**The NorthCap University**, Gurugram (formerly ITM University), is a multidisciplinary university with programs in engineering, management, law, and sciences.

This model was created as part of a research initiative to explore AI for academic services, campus automation, and local LLM deployments.

## Contribute

Have better FAQs or data? Want to train on your college corpus? Fork the repo or raise a PR at:

👉 [https://github.com/pranav2711/ncu-smartllm](https://github.com/pranav2711/ncu-smartllm)