GeistBERT-Nyströmformer

GeistBERT-Nyströmformer is a German language model designed for efficient long-context processing. It extends GeistBERT by integrating the Nyströmformer self-attention mechanism, reducing memory and computation costs while maintaining strong performance.

This variant is ideal for:

  • Medium-length document processing in legal, news, and academic text analysis.
  • Longer-context tasks that need more efficiency than standard BERT/RoBERTa can offer, but without the VRAM demands of Longformer.

Key Features:

  • Nyströmformer self-attention: Efficient approximation of full self-attention, reducing computational overhead.
  • Improved scalability: Handles longer sequences without the high VRAM requirements of Longformer.
  • Optimized for German NLP: Trained on a largely deduplicated German corpus (OSCAR23, OPUS, MC4).

Compared to Longformer, GeistBERT-Nyströmformer strikes a balance between efficiency and extended context length, making it a more accessible alternative for tasks requiring longer dependencies.
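
For illustration, here is a simplified sketch of the landmark-based Nyström attention approximation used by this variant. The original Nyströmformer derives landmarks from segment means and replaces the matrix pseudo-inverse with an iterative Moore–Penrose approximation (plus a depthwise-convolution skip connection); the sketch below uses `torch.linalg.pinv` for brevity, and the shapes and landmark count are illustrative assumptions, not the model's exact configuration.

```python
import torch

def nystrom_attention(q, k, v, num_landmarks=64):
    # q, k, v: (batch, seq_len, dim); seq_len assumed divisible by num_landmarks.
    b, n, d = q.shape
    scale = d ** -0.5
    seg = n // num_landmarks

    # Landmarks: segment means of queries and keys.
    q_land = q.reshape(b, num_landmarks, seg, d).mean(dim=2)  # (b, m, d)
    k_land = k.reshape(b, num_landmarks, seg, d).mean(dim=2)  # (b, m, d)

    # Three small softmax kernels instead of one full (n x n) attention matrix.
    kernel_1 = torch.softmax(q @ k_land.transpose(-1, -2) * scale, dim=-1)       # (b, n, m)
    kernel_2 = torch.softmax(q_land @ k_land.transpose(-1, -2) * scale, dim=-1)  # (b, m, m)
    kernel_3 = torch.softmax(q_land @ k.transpose(-1, -2) * scale, dim=-1)       # (b, m, n)

    # Nyström approximation of softmax(QK^T / sqrt(d)) V.
    return kernel_1 @ torch.linalg.pinv(kernel_2) @ (kernel_3 @ v)
```

Because the number of landmarks stays fixed as the sequence grows, memory and compute scale roughly linearly with sequence length rather than quadratically.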

For more details, see GeistBERT on Hugging Face.
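
A minimal usage sketch for masked-language-model inference, assuming the checkpoint loads through the standard transformers Auto classes (the exact model class and tokenizer settings may differ; check the files in the Hugging Face repository):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "GeistBERT/GeistBERT_base_nystromformer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask example in German; use the tokenizer's own mask token.
text = f"Die Hauptstadt von Deutschland ist {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Predict the token at the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```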

Citations

If you use GeistBERT in your research, please cite the following paper:

@misc{scheibleschmitt2025geistbertbreathinglifegerman,
      title={GeistBERT: Breathing Life into German NLP}, 
      author={Raphael Scheible-Schmitt and Johann Frei},
      year={2025},
      eprint={2506.11903},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.11903}, 
}