4 6 33

Hanwool Albert Lee PRO

Albertmade

AI & ML interests

fin-NLP

Recent Activity

liked a model 6 days ago

skt/A.X-4.0-Light

reacted to tomaarsen's post with 🔥 7 days ago

‼️Sentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode methods improvements, Router module for asymmetric models & much more. Sparse + Dense = 🔥 hybrid search performance! Details: 1️⃣ Sparse Encoder Models Brand new support for sparse embedding models that generate high-dimensional embeddings (30,000+ dims) where <1% are non-zero: - Full SPLADE, Inference-free SPLADE, and CSR architecture support - 4 new modules, 12 new losses, 9 new evaluators - Integration with @elastic-co, @opensearch-project, @NAVER LABS Europe, @qdrant, @IBM, etc. - Decode interpretable embeddings to understand token importance - Hybrid search integration to get the best of both worlds 2️⃣ Enhanced Encode Methods & Multi-Processing - Introduce encode_query & encode_document automatically use predefined prompts - No more manual pool management - just pass device list directly to encode() - Much cleaner and easier to use than the old multi-process approach 3️⃣ Router Module & Advanced Training - Router module with different processing paths for queries vs documents - Custom learning rates for different parameter groups - Composite loss logging - see individual loss components - Perfect for two-tower architectures 4️⃣ Comprehensive Documentation & Training - New Training Overview, Loss Overview, API Reference docs - 6 new training example documentation pages - Full integration examples with major search engines - Extensive blogpost on training sparse models Read the comprehensive blogpost about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0 What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!

liked a model 7 days ago

skt/A.X-4.0

View all activity

Organizations

liked a model 6 days ago

skt/A.X-4.0-Light

Text Generation • 7B • Updated about 5 hours ago • 9.86k • 72

reacted to tomaarsen's post with 🔥 7 days ago

Post

2322

‼️Sentence Transformers v5.0 is out! The biggest update yet introduces Sparse Embedding models, encode methods improvements, Router module for asymmetric models & much more. Sparse + Dense = 🔥 hybrid search performance! Details:

1️⃣ Sparse Encoder Models
Brand new support for sparse embedding models that generate high-dimensional embeddings (30,000+ dims) where <1% are non-zero:

- Full SPLADE, Inference-free SPLADE, and CSR architecture support
- 4 new modules, 12 new losses, 9 new evaluators
- Integration with @elastic-co , @opensearch-project , @NAVER LABS Europe, @qdrant , @IBM , etc.
- Decode interpretable embeddings to understand token importance
- Hybrid search integration to get the best of both worlds

2️⃣ Enhanced Encode Methods & Multi-Processing
- Introduce encode_query & encode_document automatically use predefined prompts
- No more manual pool management - just pass device list directly to encode()
- Much cleaner and easier to use than the old multi-process approach

3️⃣ Router Module & Advanced Training
- Router module with different processing paths for queries vs documents
- Custom learning rates for different parameter groups
- Composite loss logging - see individual loss components
- Perfect for two-tower architectures

4️⃣ Comprehensive Documentation & Training
- New Training Overview, Loss Overview, API Reference docs
- 6 new training example documentation pages
- Full integration examples with major search engines
- Extensive blogpost on training sparse models

Read the comprehensive blogpost about training sparse embedding models: https://huggingface.co/blog/train-sparse-encoder

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v5.0.0

What's next? We would love to hear from the community! What sparse encoder models would you like to see? And what new capabilities should Sentence Transformers handle - multimodal embeddings, late interaction models, or something else? Your feedback shapes our roadmap!