---
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
license: apache-2.0
library_name: transformers
tags:
- qwen
- lora
- indian-law
- legal-ai
- finetune
datasets:
- viber1/indian-law-dataset
base_model: Qwen/Qwen2.5-7B
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
    max_new_tokens: 512
model-index:
- name: JurisQwen
  results:
  - task:
      type: text-generation
      name: Legal Text Generation
    dataset:
      name: Indian Law Dataset
      type: viber1/indian-law-dataset
    metrics:
    - type: loss
      value: N/A
      name: Training Loss
---
# JurisQwen: Legal Domain Fine-tuned Qwen2.5-7B Model
## Overview
JurisQwen is a specialized legal-domain language model based on Qwen2.5-7B, fine-tuned on an Indian legal dataset. It is designed to assist with legal queries, analyze documents, and provide information about Indian law.
## Model Details
### Model Description
- **Developed by:** Prathamesh Devadiga
- **Base Model:** Qwen2.5-7B by Qwen
- **Model Type:** Language Model with LoRA fine-tuning
- **Language:** Primarily English, with a focus on Indian legal terminology (the base model supports the additional languages listed in the metadata)
- **License:** Apache-2.0
- **Finetuned from model:** Qwen/Qwen2.5-7B
- **Framework:** PEFT 0.15.1 with Unsloth optimization
### Training Dataset
The model was fine-tuned on the "viber1/indian-law-dataset" which contains instruction-response pairs focused on Indian legal knowledge and terminology.
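The dataset can be loaded directly from the Hugging Face Hub to inspect its schema (a minimal sketch using the `datasets` library; the split name is an assumption):

```python
from datasets import load_dataset

# Pull the fine-tuning corpus from the Hub and inspect it.
ds = load_dataset("viber1/indian-law-dataset", split="train")
print(ds)     # column names and row count
print(ds[0])  # one instruction-response pair
```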
## Technical Specifications
### Model Architecture
- Base model: Qwen2.5-7B
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- LoRA configuration (see the sketch below):
  - Rank (r): 32
  - Alpha: 64
  - Dropout: 0.05
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
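For reference, this configuration maps onto PEFT's `LoraConfig` roughly as follows (a sketch; `bias` and `task_type` are standard defaults assumed here, not taken from the actual training script):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                  # LoRA rank
    lora_alpha=64,         # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",           # assumption: no bias adaptation
    task_type="CAUSAL_LM",
)
```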
### Training Procedure
- **Training Infrastructure:** NVIDIA A100-40GB GPU
- **Quantization:** 4-bit quantization using bitsandbytes
- **Mixed Precision:** bfloat16
- **Attention Implementation:** Flash Attention 2
- **Training Hyperparameters** (see the sketch after this list):
  - Epochs: 3
  - Batch size: 16
  - Gradient accumulation steps: 2
  - Learning rate: 2e-4
  - Weight decay: 0.001
  - Scheduler: cosine with 10% warmup
  - Optimizer: AdamW 8-bit
  - Maximum sequence length: 4096
  - TF32 enabled for the A100
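Taken together, the setup above corresponds roughly to the following model loading and trainer configuration in plain `transformers`/`bitsandbytes` (a sketch, not the actual Unsloth training script; the nf4 quant type and output directory are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# 4-bit quantization as described above; the nf4 quant type is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Base model with Flash Attention 2 and bfloat16 mixed precision.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)

# Hyperparameters as listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_8bit",
    bf16=True,
    tf32=True,
)
```

The 4096-token maximum sequence length would be passed to the trainer itself (e.g. `max_seq_length` in TRL's `SFTTrainer`) rather than to `TrainingArguments`.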
### Deployment Infrastructure
- Deployed using Modal cloud platform
- GPU: NVIDIA A100-40GB
- Persistent volume storage for model checkpoints
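A minimal Modal app skeleton consistent with this setup might look like the following (a sketch only: the app name, volume name, mount path, and image packages are illustrative, not the actual `app.py`):

```python
import modal

app = modal.App("jurisqwen")  # hypothetical app name

# Persistent volume for model checkpoints, as described above.
checkpoints = modal.Volume.from_name("jurisqwen-checkpoints", create_if_missing=True)

image = modal.Image.debian_slim().pip_install(
    "torch", "transformers", "peft", "bitsandbytes", "datasets"
)

@app.function(gpu="A100", image=image, volumes={"/checkpoints": checkpoints}, timeout=5 * 60 * 60)
def finetune_qwen():
    # Model loading, LoRA setup, and training would go here,
    # writing checkpoints under /checkpoints so they persist across runs.
    ...
```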
## Usage
### Setting Up the Environment
This model is deployed using Modal. To use it, you'll need to:
1. Install Modal:
```bash
pip install modal
```
2. Authenticate with Modal:
```bash
modal token new
```
3. Deploy the application:
```bash
python app.py
```
### Running Fine-tuning
To run the fine-tuning process:
```python
from app import app, finetune_qwen
# Deploy the app
app.deploy()
# Run fine-tuning
result = finetune_qwen.remote()
print(f"Fine-tuning result: {result}")
```
### Inference
To run inference with the fine-tuned model:
```python
from app import app, test_inference
# Example legal query
response = test_inference.remote("What are the key provisions of the Indian Contract Act?")
print(response)
```
## Input Format
The model uses the ChatML-style prompt format of the Qwen family:
```
<|im_start|>user
[Your legal question here]
<|im_end|>
```
The model will respond with:
```
<|im_start|>assistant
[Legal response]
<|im_end|>
```
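Putting this format together with the sampling parameters from the model metadata, a self-contained inference sketch with `transformers` and `peft` might look like this (the adapter path is a placeholder; substitute the actual JurisQwen adapter location):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "Qwen/Qwen2.5-7B"
ADAPTER = "path/to/jurisqwen-lora"  # placeholder: replace with the real adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)

# Build the ChatML-style prompt shown above and open the assistant turn.
prompt = (
    "<|im_start|>user\n"
    "What are the key provisions of the Indian Contract Act?\n"
    "<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling parameters taken from the inference metadata in this card.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```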
## Limitations and Biases
- The model is specifically trained on Indian legal data and may not generalize well to other legal systems
- Legal information provided by the model should not be treated as professional legal counsel
- The model may exhibit biases present in the training data
- Performance on complex or novel legal scenarios not present in the training data may be limited
## Recommendations
- Users should validate important legal information with qualified legal professionals
- Always cross-reference model outputs with authoritative legal sources
- Be aware that legal interpretations may vary and the model provides one possible interpretation
## Environmental Impact
- Hardware: NVIDIA A100-40GB GPU
- Training time: Approximately 3-5 hours
- Cloud Provider: Modal
## Citation
If you use this model in your research, please cite:
```
@software{JurisQwen,
  author = {Prathamesh Devadiga},
  title  = {JurisQwen: Indian Legal Domain Fine-tuned Qwen2.5-7B Model},
  year   = {2025},
  url    = {https://github.com/devadigapratham/JurisQwen}
}
```
## Acknowledgments
- Qwen team for the original Qwen2.5-7B model
- Unsloth for optimization tools
- Modal for deployment infrastructure
- The creator of the "viber1/indian-law-dataset"