codito
1 follower · 9 following
https://codito.in
AI & ML interests
None yet
Recent Activity
- liked a model, 14 days ago: stduhpf/google-gemma-3-12b-it-qat-q4_0-gguf-small
- liked a dataset, 6 months ago: nvidia/HelpSteer2
- reacted to tomaarsen's post with 🔥, 6 months ago:
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! Two new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups of up to 2x-3x, AND Static Embeddings for 500x speedups at a 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.

2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as `SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")`. Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be auto-exported for you. Thank me later 😉

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Either via a pre-distilled model with `from_model2vec` or with `from_distillation`, where you do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.

2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
Organizations
None yet
models (8)
- codito/Phi-3.5-mini-instruct-func-test1 · Text Generation · Updated Sep 23, 2024 · 4
- codito/Phi-3.5-mini-instruct-func-test1-Q8_0-GGUF · Updated Sep 22, 2024 · 6
- codito/Phi-3.5-mini-instruct-func-test2 · Text Generation · Updated Sep 22, 2024 · 4
- codito/gemma-2-2b-it-reflection-test1 · Text Generation · Updated Sep 7, 2024 · 5
- codito/gemma-2-2b-it-func-test2 · Text Generation · Updated Aug 22, 2024 · 2
- codito/gemma-2-2b-it-func-test4 · Text Generation · Updated Aug 21, 2024
- codito/gemma-2-2b-it-func-test3 · Text Generation · Updated Aug 17, 2024 · 1
- codito/gemma-2-2b-it-func-test1 · Text Generation · Updated Aug 16, 2024 · 1
datasets
None public yet