Gordon

GordonM

AI & ML interests

Data Science for good

Recent Activity

reacted to MoritzLaurer's post with 🔥 about 1 month ago
🚀 Releasing a new zeroshot-classifier based on ModernBERT! Some key takeaways:
- ⚡ Speed & efficiency: It's multiple times faster and uses significantly less memory than DeBERTav3. You can use larger batch sizes and enabling bf16 (instead of fp16) gave me a ~2x speed boost as well
- 📉 Performance tradeoff: It performs slightly worse than DeBERTav3 on average across my zeroshot classification task collection
- 🧠 Use cases: I recommend using it for scenarios requiring speed and a larger context window (8k).
- 💡 What’s next? I’m preparing a newer version trained on better + longer synthetic data to fully leverage the 8k context window and improve upon the training mix of my older zeroshot-v2.0 models. I also hope that there will be a multilingual variant in the future.

Great work by https://huggingface.co/answerdotai ! If you’re looking for a high-speed zeroshot classifier, give it a try!

📄 Resources below: 👇
Base model: https://huggingface.co/MoritzLaurer/ModernBERT-base-zeroshot-v2.0
Large model: https://huggingface.co/MoritzLaurer/ModernBERT-large-zeroshot-v2.0
Updated zeroshot collection: https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f
ModernBERT collection with paper: https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
liked a model 2 months ago
Varosa/SeamlessExpressive

Organizations

Sydney Informatics Hub, marsupial.ai, The University of Sydney

models 0

None public yet

datasets 0

None public yet