SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")

Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later.
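For concreteness, a minimal sketch of that backend switch, assuming sentence-transformers 3.2+ with the ONNX extras installed (pip install sentence-transformers[onnx]):

from sentence_transformers import SentenceTransformer

# backend="onnx" loads the ONNX export of the model; if the repository
# has no ONNX file yet, it is exported automatically on first load.
model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")

embeddings = model.encode(["ONNX speeds up CPU inference.", "Thank me later."])
print(embeddings.shape)  # (2, 384) for all-MiniLM-L6-v2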
Use from_model2vec, or from_distillation if you want to do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.

Wow, this has quite a short processing time.
Awesome!
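A hedged sketch of both entry points mentioned above, assuming the StaticEmbedding module from sentence-transformers 3.2+ and an example Model2Vec repository (minishlab/M2V_base_output); model names here are illustrative:

from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# Option 1: load a ready-made Model2Vec model from the Hub.
static = StaticEmbedding.from_model2vec("minishlab/M2V_base_output")

# Option 2: distill a static model from any Sentence Transformer yourself;
# no dataset needed, roughly seconds on GPU, a couple of minutes on CPU.
# static = StaticEmbedding.from_distillation("BAAI/bge-base-en-v1.5", device="cuda")

model = SentenceTransformer(modules=[static])
print(model.encode(["Static embeddings are extremely fast."]).shape)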
imatrix quantization in place of QuIP#

Pass prompt_lookup_num_tokens=10 to your generate call, and you'll get faster LLMs.
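A minimal sketch of that call, assuming transformers 4.37+ (which added prompt lookup decoding) and an example model name (mistralai/Mistral-7B-Instruct-v0.2):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2", torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Summarize the following article:\n...", return_tensors="pt").to(model.device)

# prompt_lookup_num_tokens enables prompt lookup decoding: candidate tokens are
# copied from the prompt itself, which speeds up input-grounded tasks such as
# summarization without needing a separate draft model.
outputs = model.generate(**inputs, max_new_tokens=200, prompt_lookup_num_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Prompt lookup works best when the output is likely to repeat spans of the input, e.g. summarization, extraction, or code editing.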