IndicLaw-Class: Code-Mixed Legal Intent Classifier
IndicLaw-Class is a lightweight, multilingual, transformer-based classifier that identifies legal intent in code-mixed Indian-language queries (e.g., Kannada-English, Hinglish). It is fine-tuned on citizen-style queries for real-world legal triage applications.
Model Overview
- Architecture: distilbert-base-multilingual-cased
- Task: Multi-class text classification (6 legal categories)
- Input Style: Informal, code-mixed queries such as:
  - divorce file maadbeku without husband consent
  - builder flat delay case haakbeku
  - rent refund maadbeku, owner refusing
Legal Categories
The model classifies input into one of the following categories:
Label | Description |
---|---|
Family Law | Divorce, custody, alimony, marriage |
Property Law | Inheritance, land disputes, transfer |
Criminal Law | FIRs, police misconduct, assault |
Consumer Complaints | E-commerce, refund issues, builders |
Rent & Tenancy | Eviction, deposit disputes, lease |
Public Services | Certificates, ID updates, ration |
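As a rough illustration of how a six-way classification head resolves to one of these categories, the sketch below applies a softmax to six raw scores and picks the highest. The category order in `CATEGORIES` and the example logits are assumptions made for illustration; the model's actual index-to-label order is whatever its `labels.txt` file defines.

```python
import math

# Assumed order for illustration only; the real index order comes from labels.txt.
CATEGORIES = [
    "Family Law",
    "Property Law",
    "Criminal Law",
    "Consumer Complaints",
    "Rent & Tenancy",
    "Public Services",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(logits, categories=CATEGORIES):
    """Return (category, probability) for the highest-scoring class."""
    if len(logits) != len(categories):
        raise ValueError("expected one logit per category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


# Hypothetical logits, not real model output.
example_logits = [0.2, -1.0, 0.1, 0.4, 3.1, -0.5]
category, prob = pick_category(example_logits)
print(f"{category} ({prob:.2f})")
```

The text-classification pipeline performs the equivalent step internally and reports the winning probability as `score`.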
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: More information needed
- Hours used: More information needed
- Cloud Provider: More information needed
- Compute Region: More information needed
- Carbon Emitted: More information needed
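Estimates of this kind roughly come down to energy consumed multiplied by the carbon intensity of the local power grid. A minimal sketch of that arithmetic follows; the function name `estimate_emissions_kg` and every number in the usage line are hypothetical placeholders, since the hardware, training hours, and region for this model are not reported.

```python
def estimate_emissions_kg(power_watts, hours, grid_kg_co2_per_kwh):
    """Estimate emissions in kg CO2eq as energy (kWh) times grid carbon intensity."""
    if min(power_watts, hours, grid_kg_co2_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_co2_per_kwh


# Placeholder inputs for illustration only; not measurements for this model.
print(estimate_emissions_kg(power_watts=250, hours=2, grid_kg_co2_per_kwh=0.5))
```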
Citation
@misc{nishanth_prakash_2025,
  author    = { nishanth prakash },
  title     = { IndicLaw-Class (Revision 87ae96e) },
  year      = 2025,
  url       = { https://huggingface.co/nprak26/IndicLaw-Class },
  doi       = { 10.57967/hf/5964 },
  publisher = { Hugging Face }
}
How to Get Started With the Model
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import torch
# Load model and tokenizer from your local folder
model_dir = "./indiclaw-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
# Load label map (from labels.txt you saved earlier)
label_map = {}
with open(f"{model_dir}/labels.txt", "r") as f:
    for line in f:
        idx, label = line.strip().split("\t")
        label_map[int(idx)] = label
# Create pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
# Test inputs
examples = [
    "wife divorce file maadbeku",
    "flat possession delay aadmele builder case file madbeku",
    "tenant evict maadbeku no notice",
]
# Run predictions
for text in examples:
    result = classifier(text)[0]
    label_str = result["label"]
    # Pipeline labels look like "LABEL_3" by default; extract the numeric id
    if "label" in label_str.lower():
        label_id = int(label_str.split("_")[-1])
    else:
        label_id = int(label_str)
    label_name = label_map[label_id]
    print(f"Input: {text}\nPredicted: {label_name} (confidence: {result['score']:.2f})\n")
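The label handling in the loop above can be factored into small helpers. The sketch below is one way to do that, assuming `labels.txt` holds one tab-separated `index<TAB>name` pair per line as in the snippet; `parse_label_map` and `resolve_label` are hypothetical helper names, not part of the model or of `transformers`. It also passes through labels that are already readable names, which is what the pipeline returns when the model config defines `id2label`.

```python
def parse_label_map(text):
    """Parse 'index<TAB>name' lines into a {int: str} mapping, skipping blank lines."""
    label_map = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        idx, name = line.split("\t", 1)
        label_map[int(idx)] = name
    return label_map


def resolve_label(raw_label, label_map):
    """Map a pipeline label such as 'LABEL_3' or '3' to its category name."""
    candidate = raw_label.rsplit("_", 1)[-1]
    if candidate.isdigit() and int(candidate) in label_map:
        return label_map[int(candidate)]
    return raw_label  # already a readable name, or an unknown id


# Hypothetical file contents; the real order is whatever labels.txt contains.
sample = "0\tFamily Law\n1\tProperty Law\n"
labels = parse_label_map(sample)
print(resolve_label("LABEL_1", labels))
print(resolve_label("Family Law", labels))
```

With these helpers, the prediction loop reduces to `resolve_label(result["label"], label_map)` and no longer fails when the label string is not numeric.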