stefan-it/ModernBERT-large-tokenizer-fix
Tags: Fill-Mask, Transformers, PyTorch, ONNX, Safetensors, English, modernbert, masked-lm, long-context
arxiv: 2412.13663
License: apache-2.0
main
7 contributors
History: 30 commits
Latest commit: stefan-it, "docs: introduce section about tokenizer fixes" (125ed8f, verified), 30 days ago
Name | Size | Flags | Last commit | Updated
onnx | - | - | Upload ONNX weights (#1) | 6 months ago
.gitattributes | 1.52 kB | Safe | initial commit | 6 months ago
README.md | 9.02 kB | Safe | docs: introduce section about tokenizer fixes | 30 days ago
config.json | 1.19 kB | Safe | Bump `max_position_embeddings` to 8192 | 6 months ago
model.safetensors | 1.58 GB | Safe, xet | Purge duplicate "decoder.weight", rely on tied weights instead | 6 months ago
pytorch_model.bin | 1.58 GB | xet | Purge duplicate "decoder.weight", rely on tied weights instead | 6 months ago
special_tokens_map.json | 694 Bytes | Safe | Also update tokenizer/special_tokens_map | 6 months ago
tokenizer.json | 2.13 MB | Safe | Also update tokenizer/special_tokens_map | 6 months ago
tokenizer_config.json | 20.8 kB | Safe | fix: also use `add_prefix_space = True` in tokenizer config | 30 days ago