---
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
license: apache-2.0
library_name: transformers
tags:
- qwen
- lora
- indian-law
- legal-ai
- finetune
datasets:
- viber1/indian-law-dataset
base_model: Qwen/Qwen2.5-7B
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
    max_new_tokens: 512
model-index:
- name: JurisQwen
  results:
  - task:
      type: text-generation
      name: Legal Text Generation
    dataset:
      name: Indian Law Dataset
      type: viber1/indian-law-dataset
    metrics:
    - type: loss
      value: N/A
      name: Training Loss
---
# JurisQwen: Legal Domain Fine-tuned Qwen2.5-7B Model
## Overview
JurisQwen is a specialized legal-domain language model based on Qwen2.5-7B, fine-tuned on an Indian legal dataset. It is designed to assist with legal queries, analyze documents, and provide information about Indian law.
## Model Details
### Model Description
- **Developed by:** Prathamesh Devadiga
- **Base Model:** Qwen2.5-7B by Qwen
- **Model Type:** Language Model with LoRA fine-tuning
- **Language:** Primarily English, with a focus on Indian legal terminology (the base model supports the additional languages listed in the metadata)
- **License:** Apache-2.0
- **Finetuned from model:** Qwen/Qwen2.5-7B
- **Framework:** PEFT 0.15.1 with Unsloth optimization
### Training Dataset
The model was fine-tuned on the "viber1/indian-law-dataset" which contains instruction-response pairs focused on Indian legal knowledge and terminology.
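The dataset can be loaded directly from the Hugging Face Hub to inspect its schema (a minimal sketch using the `datasets` library; the split name is an assumption):

```python
from datasets import load_dataset

# Pull the fine-tuning corpus from the Hub and inspect it.
ds = load_dataset("viber1/indian-law-dataset", split="train")
print(ds)     # column names and row count
print(ds[0])  # one instruction-response pair
```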
## Technical Specifications
### Model Architecture
- Base model: Qwen2.5-7B
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- LoRA configuration (see the sketch below):
  - Rank (r): 32
  - Alpha: 64
  - Dropout: 0.05
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
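For reference, this configuration maps onto PEFT's `LoraConfig` roughly as follows (a sketch; `bias` and `task_type` are standard defaults assumed here, not taken from the actual training script):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                  # LoRA rank
    lora_alpha=64,         # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",           # assumption: no bias adaptation
    task_type="CAUSAL_LM",
)
```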
### Training Procedure
- **Training Infrastructure:** NVIDIA A100-40GB GPU
- **Quantization:** 4-bit quantization using bitsandbytes
- **Mixed Precision:** bfloat16
- **Attention Implementation:** Flash Attention 2
- **Training Hyperparameters** (see the sketch after this list):
  - Epochs: 3
  - Batch size: 16
  - Gradient accumulation steps: 2
  - Learning rate: 2e-4
  - Weight decay: 0.001
  - Scheduler: cosine with 10% warmup
  - Optimizer: AdamW 8-bit
  - Maximum sequence length: 4096
  - TF32 enabled for the A100
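Taken together, the setup above corresponds roughly to the following model loading and trainer configuration in plain `transformers`/`bitsandbytes` (a sketch, not the actual Unsloth training script; the nf4 quant type and output directory are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# 4-bit quantization as described above; the nf4 quant type is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Base model with Flash Attention 2 and bfloat16 mixed precision.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)

# Hyperparameters as listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_8bit",
    bf16=True,
    tf32=True,
)
```

The 4096-token maximum sequence length would be passed to the trainer itself (e.g. `max_seq_length` in TRL's `SFTTrainer`) rather than to `TrainingArguments`.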
### Deployment Infrastructure
- Deployed using Modal cloud platform
- GPU: NVIDIA A100-40GB
- Persistent volume storage for model checkpoints
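A minimal Modal app skeleton consistent with this setup might look like the following (a sketch only: the app name, volume name, mount path, and image packages are illustrative, not the actual `app.py`):

```python
import modal

app = modal.App("jurisqwen")  # hypothetical app name

# Persistent volume for model checkpoints, as described above.
checkpoints = modal.Volume.from_name("jurisqwen-checkpoints", create_if_missing=True)

image = modal.Image.debian_slim().pip_install(
    "torch", "transformers", "peft", "bitsandbytes", "datasets"
)

@app.function(gpu="A100", image=image, volumes={"/checkpoints": checkpoints}, timeout=5 * 60 * 60)
def finetune_qwen():
    # Model loading, LoRA setup, and training would go here,
    # writing checkpoints under /checkpoints so they persist across runs.
    ...
```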
## Usage
### Setting Up the Environment
This model is deployed using Modal. To use it, you'll need to:
1. Install Modal:
```bash
pip install modal
```
2. Authenticate with Modal:
```bash
modal token new
```
3. Deploy the application:
```bash
python app.py
```
### Running Fine-tuning
To run the fine-tuning process:
```python
from app import app, finetune_qwen
# Deploy the app
app.deploy()
# Run fine-tuning
result = finetune_qwen.remote()
print(f"Fine-tuning result: {result}")
```
### Inference
To run inference with the fine-tuned model:
```python
from app import app, test_inference
# Example legal query
response = test_inference.remote("What are the key provisions of the Indian Contract Act?")
print(response)
```
## Input Format
The model uses the ChatML-style prompt format of the Qwen family:
```
<|im_start|>user
[Your legal question here]
<|im_end|>
```
The model will respond with:
```
<|im_start|>assistant
[Legal response]
<|im_end|>
```
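Putting this format together with the sampling parameters from the model metadata, a self-contained inference sketch with `transformers` and `peft` might look like this (the adapter path is a placeholder; substitute the actual JurisQwen adapter location):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "Qwen/Qwen2.5-7B"
ADAPTER = "path/to/jurisqwen-lora"  # placeholder: replace with the real adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)

# Build the ChatML-style prompt shown above and open the assistant turn.
prompt = (
    "<|im_start|>user\n"
    "What are the key provisions of the Indian Contract Act?\n"
    "<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling parameters taken from the inference metadata in this card.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```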
## Limitations and Biases
- The model is specifically trained on Indian legal data and may not generalize well to other legal systems
- Legal information provided by the model should not be treated as professional legal counsel
- The model may exhibit biases present in the training data
- Performance on complex or novel legal scenarios not present in the training data may be limited
## Recommendations
- Users should validate important legal information with qualified legal professionals
- Always cross-reference model outputs with authoritative legal sources
- Be aware that legal interpretations may vary and the model provides one possible interpretation
## Environmental Impact
- Hardware: NVIDIA A100-40GB GPU
- Training time: Approximately 3-5 hours
- Cloud Provider: Modal
## Citation
If you use this model in your research, please cite:
```
@software{JurisQwen,
  author = {Prathamesh Devadiga},
  title  = {JurisQwen: Indian Legal Domain Fine-tuned Qwen2.5-7B Model},
  year   = {2025},
  url    = {https://github.com/devadigapratham/JurisQwen}
}
```
## Acknowledgments
- Qwen team for the original Qwen2.5-7B model
- Unsloth for optimization tools
- Modal for deployment infrastructure
- The creator of the "viber1/indian-law-dataset"