IndicLaw-Class: Code-Mixed Legal Intent Classifier

IndicLaw-Class is a lightweight multilingual transformer classifier that identifies the legal intent of code-mixed Indian-language queries (e.g., Kannada-English, Hinglish). It is fine-tuned on citizen-style queries for real-world legal triage applications.


Model Overview

  • Architecture: distilbert-base-multilingual-cased
  • Task: Multi-class text classification (6 legal categories)
  • Input Style: Informal, code-mixed queries like:
    • divorce file maadbeku without husband consent
    • builder flat delay case haakbeku
    • rent refund maadbeku, owner refusing

Legal Categories

The model classifies input into one of the following categories:

| Label               | Description                          |
|---------------------|--------------------------------------|
| Family Law          | Divorce, custody, alimony, marriage  |
| Property Law        | Inheritance, land disputes, transfer |
| Criminal Law        | FIRs, police misconduct, assault     |
| Consumer Complaints | E-commerce, refund issues, builders  |
| Rent & Tenancy      | Eviction, deposit disputes, lease    |
| Public Services     | Certificates, ID updates, ration     |
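
For reference, the categories above can be mirrored in code as an index-to-label map and written out in the tab-separated labels.txt layout that the loading snippet in this card parses. This is only a minimal sketch: the index order is an assumption (the card does not state which id maps to which category), and write_labels and read_labels are hypothetical helper names.

```python
import os
import tempfile

# ASSUMPTION: ids follow the order of the category table. Check the real
# labels.txt shipped with the model before relying on this ordering.
CATEGORIES = [
    "Family Law",
    "Property Law",
    "Criminal Law",
    "Consumer Complaints",
    "Rent & Tenancy",
    "Public Services",
]


def write_labels(path, labels):
    """Write one 'index<TAB>label' pair per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for idx, label in enumerate(labels):
            f.write(f"{idx}\t{label}\n")


def read_labels(path):
    """Read 'index<TAB>label' lines back into an {int: str} map."""
    label_map = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines
            idx, label = line.split("\t", 1)
            label_map[int(idx)] = label
    return label_map


# Round-trip through a temporary file to check the format is self-consistent.
with tempfile.TemporaryDirectory() as tmp:
    labels_path = os.path.join(tmp, "labels.txt")
    write_labels(labels_path, CATEGORIES)
    label_map = read_labels(labels_path)

print(label_map)
```

Keeping the map in a separate file rather than hard-coding it lets the same inference script serve checkpoints whose label order differs.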

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: More information needed
  • Hours Used: More information needed
  • Cloud Provider: More information needed
  • Compute Region: More information needed
  • Carbon Emitted: More information needed
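
As a rough illustration of the kind of estimate such a calculator performs, emissions are approximately the energy drawn during training multiplied by the carbon intensity of the local electricity grid. The sketch below is not the calculator's own code: estimate_emissions_kg is a hypothetical helper, and the inputs in the usage line are placeholders, not measurements for this model (its hardware, hours, and region are not reported).

```python
def estimate_emissions_kg(power_watts, hours, grid_kg_co2_per_kwh):
    """Approximate CO2-equivalent emissions in kilograms.

    energy (kWh) = power (W) / 1000 * hours
    emissions    = energy (kWh) * grid carbon intensity (kg CO2eq / kWh)
    """
    if power_watts < 0 or hours < 0 or grid_kg_co2_per_kwh < 0:
        raise ValueError("all inputs must be non-negative")
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_co2_per_kwh


# PLACEHOLDER inputs for illustration only; they do not describe this model.
example_kg = estimate_emissions_kg(
    power_watts=250, hours=4, grid_kg_co2_per_kwh=0.5
)
print(f"{example_kg:.2f} kg CO2eq")
```

Once the hardware type, training hours, and compute region are known, the same arithmetic gives a first-order figure for the Carbon Emitted field.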


Citation

@misc{nishanth_prakash_2025,
    author       = { nishanth prakash },
    title        = { IndicLaw-Class (Revision 87ae96e) },
    year         = 2025,
    url          = { https://huggingface.co/nprak26/IndicLaw-Class },
    doi          = { 10.57967/hf/5964 },
    publisher    = { Hugging Face }
}

How to Get Started With the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load model and tokenizer from your local folder
model_dir = "./indiclaw-classifier"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)

# Load label map from labels.txt in the model folder
# (one tab-separated "index<TAB>label" pair per line)
label_map = {}
with open(f"{model_dir}/labels.txt", "r") as f:
    for line in f:
        idx, label = line.strip().split("\t")
        label_map[int(idx)] = label

# Create pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Test inputs
examples = [
    "wife divorce file maadbeku",
    "flat possession delay aadmele builder case file madbeku",
    "tenant evict maadbeku no notice"
]

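# Run predictions
for text in examples:
    result = classifier(text)[0]
    label_str = result["label"]
    # The pipeline returns "LABEL_<id>" when the model config has no custom
    # id2label; otherwise the label may be a bare id or already a category name.
    if label_str.upper().startswith("LABEL_"):
        label_name = label_map[int(label_str.split("_")[-1])]
    elif label_str.isdigit():
        label_name = label_map[int(label_str)]
    else:
        label_name = label_str
    print(f"Input: {text}\nPredicted: {label_name} (confidence: {result['score']:.2f})\n")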
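
The confidence printed above is the pipeline's score field. For a single-label, multi-class head such as this one, the text-classification pipeline by default derives that score from a softmax over the model's output logits. A minimal pure-Python sketch of that step follows; the six logits are made-up values for illustration, not output from this model.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# ILLUSTRATIVE logits for six classes (not produced by the real model).
logits = [2.0, 0.5, -1.0, 0.0, 1.0, -0.5]
probs = softmax(logits)

# The predicted class is the index with the highest probability.
best_id = max(range(len(probs)), key=probs.__getitem__)
print(f"predicted id: {best_id}, confidence: {probs[best_id]:.2f}")
```

Because the six probabilities always sum to 1, a high score means only that one category outscored the others, not that the query truly belongs to any of them.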

