---
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
license: apache-2.0
library_name: transformers
tags:
- qwen
- lora
- indian-law
- legal-ai
- finetune
datasets:
- viber1/indian-law-dataset
base_model: Qwen/Qwen2.5-7B
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    repetition_penalty: 1.1
    max_new_tokens: 512
model-index:
- name: JurisQwen
  results:
  - task:
      type: text-generation
      name: Legal Text Generation
    dataset:
      name: Indian Law Dataset
      type: viber1/indian-law-dataset
    metrics:
    - type: loss
      value: N/A
      name: Training Loss
---

# JurisQwen: Legal Domain Fine-tuned Qwen2.5-7B Model

## Overview

JurisQwen is a specialized legal-domain language model based on Qwen2.5-7B and fine-tuned on an Indian legal dataset. It is designed to assist with legal queries and document analysis and to provide information about Indian law.

## Model Details

### Model Description

- **Developed by:** Prathamesh Devadiga
- **Base Model:** Qwen2.5-7B by Qwen
- **Model Type:** Language model with LoRA fine-tuning
- **Language:** English, with a focus on Indian legal terminology
- **License:** Apache-2.0
- **Finetuned from model:** Qwen/Qwen2.5-7B
- **Framework:** PEFT 0.15.1 with Unsloth optimization

### Training Dataset

The model was fine-tuned on "viber1/indian-law-dataset", which contains instruction-response pairs focused on Indian legal knowledge and terminology.

## Technical Specifications

### Model Architecture

- Base model: Qwen2.5-7B
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- LoRA configuration (see the configuration sketch below):
  - Rank (r): 32
  - Alpha: 64
  - Dropout: 0.05
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

### Training Procedure

- **Training Infrastructure:** NVIDIA A100-40GB GPU
- **Quantization:** 4-bit quantization using bitsandbytes
- **Mixed Precision:** bfloat16
- **Attention Implementation:** Flash Attention 2
- **Training Hyperparameters** (see the training sketch below):
  - Epochs: 3
  - Batch size: 16
  - Gradient accumulation steps: 2
  - Learning rate: 2e-4
  - Weight decay: 0.001
  - Scheduler: Cosine with 10% warmup
  - Optimizer: AdamW 8-bit
  - Maximum sequence length: 4096
  - TF32 enabled for A100
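As a rough illustration, the LoRA setup listed under Model Architecture maps onto a `peft` configuration along these lines. This is a minimal sketch rather than the exact training script; the variable name is illustrative, and `bias` and `task_type` are common defaults assumed here rather than values stated above:

```python
from peft import LoraConfig

# LoRA adapter configuration mirroring the values listed under
# "Model Architecture": rank 32, alpha 64, dropout 0.05, applied to
# all attention and MLP projection matrices.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",          # assumed default
    task_type="CAUSAL_LM",
)
```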
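Likewise, the quantized model load and the training hyperparameters above would look roughly like the following in plain `transformers`. The original run used Unsloth optimizations, so the actual script differs; `output_dir` and the per-device interpretation of the batch size are assumptions:

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    TrainingArguments,
)

# 4-bit quantized load of the base model with bf16 compute and
# Flash Attention 2, as described under "Training Procedure".
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# Hyperparameters as listed above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="jurisqwen-checkpoints",
    num_train_epochs=3,
    per_device_train_batch_size=16,  # batch size 16, assumed per device
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    weight_decay=0.001,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # 10% warmup
    optim="adamw_bnb_8bit",          # AdamW 8-bit
    bf16=True,
    tf32=True,                       # enabled on A100
)
```

The maximum sequence length of 4096 would typically be passed to the trainer itself (for example, `max_seq_length` in trl's SFT setup) rather than to `TrainingArguments`.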
### Deployment Infrastructure

- Deployed using the Modal cloud platform
- GPU: NVIDIA A100-40GB
- Persistent volume storage for model checkpoints

## Usage

### Setting Up the Environment

This model is deployed using Modal. To use it, you will need to:

1. Install Modal:

   ```bash
   pip install modal
   ```

2. Authenticate with Modal:

   ```bash
   modal token new
   ```

3. Deploy the application:

   ```bash
   python app.py
   ```

### Running Fine-tuning

To run the fine-tuning process:

```python
from app import app, finetune_qwen

# Deploy the app
app.deploy()

# Run fine-tuning
result = finetune_qwen.remote()
print(f"Fine-tuning result: {result}")
```

### Inference

To run inference with the fine-tuned model:

```python
from app import app, test_inference

# Example legal query
response = test_inference.remote("What are the key provisions of the Indian Contract Act?")
print(response)
```

## Input Format

The model uses the ChatML-style prompt format (the convention used by Qwen chat models):

```
<|im_start|>user
[Your legal question here]
<|im_end|>
```

The model responds with:

```
<|im_start|>assistant
[Legal response]
<|im_end|>
```

## Limitations and Biases

- The model is trained specifically on Indian legal data and may not generalize well to other legal systems
- Output from the model should not be treated as professional legal counsel
- The model may exhibit biases present in the training data
- Performance on complex or novel legal scenarios not covered by the training data may be limited

## Recommendations

- Validate important legal information with qualified legal professionals
- Always cross-reference model outputs with authoritative legal sources
- Be aware that legal interpretations vary; the model provides one possible interpretation

## Environmental Impact

- Hardware: NVIDIA A100-40GB GPU
- Training time: approximately 3-5 hours
- Cloud provider: Modal

## Citation

If you use this model in your research, please cite:

```
@software{JurisQwen,
  author = {Prathamesh Devadiga},
  title = {JurisQwen: Indian Legal Domain Fine-tuned Qwen2.5-7B Model},
  year = {2025},
  url = {https://github.com/devadigapratham/JurisQwen}
}
```

## Acknowledgments

- The Qwen team for the original Qwen2.5-7B model
- Unsloth for optimization tools
- Modal for deployment infrastructure
- The creator of the "viber1/indian-law-dataset"
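## Appendix: Local Inference Sketch

For reference, here is a minimal sketch of querying the fine-tuned adapter locally with `transformers` and `peft`, outside the Modal deployment, using the prompt format described under "Input Format". The adapter path is hypothetical, and the generation parameters mirror the defaults in the model card metadata:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "Qwen/Qwen2.5-7B"
ADAPTER = "./jurisqwen-lora"  # hypothetical path to the trained LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)

# Build the prompt in the ChatML-style format described above.
question = "What are the key provisions of the Indian Contract Act?"
prompt = f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```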