# ModernBERT Japan Legal - Fine-tuned Model
This model is a fine-tuned version of sbintuitions/modernbert-ja-130m on Japanese legal case data for research purposes.
## Model Details
- Base Model: ModernBERT Japan 130M (sbintuitions/modernbert-ja-130m)
- Training Data: 65,855 Japanese legal cases (1947-2024)
- Task: Masked Language Modeling (MLM) for legal domain adaptation
- Domain: Japanese legal text
## How to Use
You can use this model directly with the transformers library v4.48.0 or higher:

```bash
pip install -U "transformers>=4.48.0"
```
Additionally, if your GPU supports it, we recommend running the model with Flash Attention 2:

```bash
pip install flash-attn --no-build-isolation
```
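Once installed, the attention backend can be selected at load time via the `attn_implementation` argument of `from_pretrained`. A minimal sketch that falls back to PyTorch's built-in SDPA when no CUDA GPU is available (the fallback choice is our assumption, not a requirement of this model):

```python
import torch

# Pick an attention backend: Flash Attention 2 on supported CUDA GPUs,
# otherwise PyTorch's scaled-dot-product attention ("sdpa").
attn_impl = "flash_attention_2" if torch.cuda.is_available() else "sdpa"
load_kwargs = {"torch_dtype": torch.bfloat16, "attn_implementation": attn_impl}

# Pass the kwargs when loading, e.g.:
# model = AutoModelForMaskedLM.from_pretrained(
#     "nguyenthanhasia/modernbert-ja-legal", **load_kwargs
# )
```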
### Example Usage
```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model = AutoModelForMaskedLM.from_pretrained(
    "nguyenthanhasia/modernbert-ja-legal", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("nguyenthanhasia/modernbert-ja-legal")
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# "Good morning, today's weather is <mask>."
results = fill_mask("おはようございます、今日の天気は<mask>です。")
for result in results:
    print(result)
# {'score': 0.5078125, 'token': 16416, 'token_str': '晴れ', 'sequence': 'おはようございます、今日の天気は晴れです。'}
# {'score': 0.240234375, 'token': 28933, 'token_str': '曇り', 'sequence': 'おはようございます、今日の天気は曇りです。'}
# {'score': 0.078125, 'token': 92339, 'token_str': 'くもり', 'sequence': 'おはようございます、今日の天気はくもりです。'}
# {'score': 0.078125, 'token': 2988, 'token_str': '雨', 'sequence': 'おはようございます、今日の天気は雨です。'}
# {'score': 0.0223388671875, 'token': 52525, 'token_str': '快晴', 'sequence': 'おはようございます、今日の天気は快晴です。'}
```
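Beyond mask filling, the encoder's hidden states can serve as features for downstream legal-NLP experiments. A minimal sketch using mean pooling over token embeddings (one common pooling choice, not something prescribed by this model; the input sentence is an illustrative legal example):

```python
import torch
from transformers import AutoModel, AutoTokenizer

def mean_pool(last_hidden_state, attention_mask):
    # Average token vectors, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("nguyenthanhasia/modernbert-ja-legal")
model = AutoModel.from_pretrained("nguyenthanhasia/modernbert-ja-legal")
model.eval()

# "The defendant's conduct constitutes a tort."
inputs = tokenizer("被告の行為は不法行為に該当する。", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
embedding = mean_pool(outputs.last_hidden_state, inputs["attention_mask"])
print(embedding.shape)  # (1, hidden_size)
```

The resulting sentence vectors can be fed to a lightweight classifier or used for similarity search over case texts.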
## Intended Use and Limitations
This model is intended for research purposes on Japanese legal texts. It can be used for experiments on domain adaptation and benchmarking legal NLP tasks.
Limitations:
- Domain specificity: trained on Japanese legal text only.
- Training data bias: may reflect biases present in the underlying case corpus.
- Research use only: not suitable for critical applications or legal advice.
- Inherits the limitations of the base model.