Blair Chintella

ctranslate2-4you

AI & ML interests

None yet

Recent Activity

updated a model 1 day ago
ctranslate2-4you/molmo-7B-O-bnb-4bit
reacted to tomaarsen's post with ❤️ 2 days ago
šŸŽļø Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality! Including 2 fully open models: training scripts, datasets, metrics. We apply our recipe to train 2 Static Embedding models that we release today! We release: 2ļøāƒ£ an English Retrieval model and a general-purpose Multilingual similarity model (e.g. classification, clustering, etc.), both Apache 2.0 šŸ§  my modern training strategy: ideation -> dataset choice -> implementation -> evaluation šŸ“œ my training scripts, using the Sentence Transformers library šŸ“Š my Weights & Biases reports with losses & metrics šŸ“• my list of 30 training and 13 evaluation datasets The 2 Static Embedding models have the following properties: šŸŽļø Extremely fast, e.g. 107500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5' 0ļøāƒ£ Zero active parameters: No Transformer blocks, no attention, not even a matrix multiplication. Super speed! šŸ“ No maximum sequence length! Embed texts at any length (note: longer texts may embed worse) šŸ“ Linear instead of exponential complexity: 2x longer text takes 2x longer, instead of 2.5x or more. šŸŖ† Matryoshka support: allow you to truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% perf. decrease for English Similarity tasks) Check out the full blogpost if you'd like to 1) use these lightning-fast models or 2) learn how to train them with consumer-level hardware: https://huggingface.co/blog/static-embeddings The blogpost contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance. Alternatively, check out the models: * https://huggingface.co/sentence-transformers/static-retrieval-mrl-en-v1 * https://huggingface.co/sentence-transformers/static-similarity-mrl-multilingual-v1
View all activity

Organizations

None yet

ctranslate2-4you's activity

reacted to tomaarsen's post with ❤️ 2 days ago
šŸŽļø Today I'm introducing a method to train static embedding models that run 100x to 400x faster on CPU than common embedding models, while retaining 85%+ of the quality! Including 2 fully open models: training scripts, datasets, metrics.

We apply our recipe to train 2 Static Embedding models that we release today! We release:
2ļøāƒ£ an English Retrieval model and a general-purpose Multilingual similarity model (e.g. classification, clustering, etc.), both Apache 2.0
šŸ§  my modern training strategy: ideation -> dataset choice -> implementation -> evaluation
šŸ“œ my training scripts, using the Sentence Transformers library
šŸ“Š my Weights & Biases reports with losses & metrics
šŸ“• my list of 30 training and 13 evaluation datasets

The 2 Static Embedding models have the following properties:
šŸŽļø Extremely fast, e.g. 107500 sentences per second on a consumer CPU, compared to 270 for 'all-mpnet-base-v2' and 56 for 'gte-large-en-v1.5'
0ļøāƒ£ Zero active parameters: No Transformer blocks, no attention, not even a matrix multiplication. Super speed!
šŸ“ No maximum sequence length! Embed texts at any length (note: longer texts may embed worse)
šŸ“ Linear instead of exponential complexity: 2x longer text takes 2x longer, instead of 2.5x or more.
šŸŖ† Matryoshka support: allow you to truncate embeddings with minimal performance loss (e.g. 4x smaller with a 0.56% perf. decrease for English Similarity tasks)
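
The properties above can be illustrated with a toy sketch. It is not the released models' code: the lookup table, its 4-dimensional vectors, and the helper names `embed` and `cosine` are hypothetical, and averaging per-token vectors is an assumption about how a static embedding model might pool tokens. It shows a table lookup with no attention or matrix multiplication, one pass over the tokens (so cost grows linearly with text length), and Matryoshka-style truncation by keeping only the leading dimensions.

```python
import math

# Hypothetical toy table: each known token maps to a fixed 4-dimensional vector.
# Real models use a much larger vocabulary and many more dimensions.
TABLE = {
    "fast": [0.9, 0.1, 0.0, 0.2],
    "cpu": [0.7, 0.3, 0.1, 0.0],
    "slow": [-0.8, 0.2, 0.1, 0.1],
}
SIZE = 4
UNKNOWN = [0.0] * SIZE  # unknown tokens contribute a zero vector


def embed(text, dim=None):
    """Average per-token vectors, optionally truncate to `dim`, then normalize."""
    keep = SIZE if dim is None else dim
    tokens = text.lower().split()
    if not tokens:
        return [0.0] * keep
    # One lookup per token: work grows linearly with the number of tokens.
    vectors = [TABLE.get(token, UNKNOWN) for token in tokens]
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(SIZE)]
    mean = mean[:keep]  # Matryoshka-style truncation: keep the leading dimensions
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean


def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


full = embed("fast cpu")          # 4 dimensions
short = embed("fast cpu", dim=2)  # truncated to the first 2 dimensions
print(len(full), len(short))
print(cosine(embed("fast"), embed("cpu")) > cosine(embed("fast"), embed("slow")))
```

Renormalizing after truncation keeps the dot product usable as cosine similarity on the shortened vectors.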

Check out the full blogpost if you'd like to 1) use these lightning-fast models or 2) learn how to train them with consumer-level hardware: https://huggingface.co/blog/static-embeddings

The blogpost contains a lengthy list of possible advancements; I'm very confident that our 2 models are only the tip of the iceberg, and we may be able to get even better performance.

Alternatively, check out the models:
* sentence-transformers/static-retrieval-mrl-en-v1
* sentence-transformers/static-similarity-mrl-multilingual-v1
New activity in ibm-granite/granite-3.1-2b-instruct 21 days ago
New activity in allenai/OLMo-2-1124-13B-Instruct-preview about 1 month ago
New activity in HuggingFaceTB/SmolVLM-Instruct about 1 month ago

loading images locally?

5
#8 opened about 2 months ago by
fusi0n