Top Contributors: Dataset Downloads

community

https://huggingface.co/spaces/mvaloatto/TCTF

Activity Feed

AI & ML interests

📊 Creators of datasets with the most cumulative new downloads each month (users only, no orgs)

Recent Activity

Sterzhang authored a paper 26 days ago

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

Sterzhang authored a paper 26 days ago

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

Sterzhang authored a paper about 1 month ago

EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning

View all activity

Sterzhang

authored 2 papers 26 days ago

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

Paper • 2406.07502 • Published Jun 11, 2024 • 1

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

Paper • 2408.12168 • Published Aug 22, 2024

Sterzhang

authored a paper about 1 month ago

EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning

Paper • 2505.16312 • Published May 22 • 3

chansung

posted an update 3 months ago

Post

3868

simple guide on the recipe for GRPO on Open-R1 which is built on top of TRL

I think FastAPI wrapper of vLLM with WeightSyncWorker is pretty cool feature. Also, we have many predefined reward functions out of the box!

5 replies

chansung

posted an update 3 months ago

Post

2649

Mistral AI Small 3.1 24B is not only commercial free but also the best model in a single GPU deployment.

I packed up all the information you need to know in a single picture. Hope this helps! :)

1 reply

chansung

posted an update 4 months ago

Post

1584

Gemma 3 Release in a nutshell
(seems like function calling is not supported whereas the announcement said so)

mvaloatto

updated a Space 5 months ago

Top Contributors: Dataset Downloads

📊

Creators of datasets with the most cumulative new downloads

chansung

posted an update 5 months ago

Post

3024

Simple Paper Review #5

I briefly reviewed the paper "SFT Memorizes, RL Generalizes," which compares SFT and RL in post-training of LLM/VLM from HKU, UC Berkeley, Google DeepMind, and New York University

The conclusion suggests SFT excels in memorization, while RL is better for generalization. However, since LLM/VLM should benefit humans beyond just generalization, a mix of SFT and RL is advisable. Typically, some SFT is followed by RL to understand prompt formats and enhance generalization through trial and error.

The study focused on one model, Llama-3.2-Vision-11B, using environments like General Points for arithmetic reasoning and V-IRL for spatial reasoning. Training data was used for both SFT and RL, with evaluations on in-distribution and out-of-distribution data to assess memorization and generalization.

I want to apply RL extensively, but it requires building a similar simulation environment. For domain-specific models, significant investment in creating a "playground" for the model is crucial, as the effort will directly influence the outcomes.

https://arxiv.org/abs/2501.17161

chansung

posted an update 5 months ago

Post

4443

A brief summary of the o3-mini

The OpenAI o3-mini model is a significant improvement over the o1-mini, reaching o1 performance levels. While generally good, its performance isn't universally better than previous models (o1, o1-prev.) or GPT-4o across all benchmarks. This means workflows should be re-evaluated with each model upgrade.

The o3-mini has "low," "medium," and "high" versions, with "low" being the base model used for benchmarking. It's speculated that the higher versions simply involve more processing. A fair comparison with other models like Gemini 2.0 Thinking or DeepSeek-R1 would likely need to use the "low" version and a similar "think more" mechanism.

The system card is recommended reading due to its comprehensive benchmark data.

https://openai.com/index/openai-o3-mini/

wyu1

authored a paper 5 months ago

OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas

Paper • 2501.15427 • Published Jan 26 • 6

chansung

posted an update 5 months ago

Post

2036

Simple summary on DeepSeek AI's Janus-Pro: A fresh take on multimodal AI!

It builds on its predecessor, Janus, by tweaking the training methodology rather than the model architecture. The result? Improved performance in understanding and generating multimodal data.

Janus-Pro uses a three-stage training strategy, similar to Janus, but with key modifications:
✦ Stage 1 & 2: Focus on separate training for specific objectives, rather than mixing data.
✦ Stage 3: Fine-tuning with a careful balance of multimodal data.

Benchmarks show Janus-Pro holds its own against specialized models like TokenFlow XL and MetaMorph, and other multimodal models like SD3 Medium and DALL-E 3.

The main limitation? Low image resolution (384x384). However, this seems like a strategic choice to focus on establishing a solid "recipe" for multimodal models. Future work will likely leverage this recipe and increased computing power to achieve higher resolutions.

chansung

posted an update 5 months ago

Post

1742

New look for AI powered paper reviews from the list by Hugging Face Daily Papers ( managed by the @akhaliq )

Bookmark the webpage along, check comprehensive reviews by Google DeepMind Gemini 1.5, and listen to audio podcast made by the same tech used in NotebookLM.

Link: https://deep-diver.github.io/ai-paper-reviewer/

This is not an official service by Hugging Face. It is just a service developed by an individual developer using his own money :)

chansung

posted an update 5 months ago

Post

2046

Simple summarization of Evolving Deeper LLM Thinking (Google DeepMind)

The process starts by posing a question.
1) The LLM generates initial responses.
2) These generated responses are evaluated according to specific criteria (program-based checker).
3) The LLM critiques the evaluated results.
4) The LLM refines the responses based on the evaluation, critique, and original responses.

The refined response is then fed back into step 2). If it meets the criteria, the process ends. Otherwise, the algorithm generates more responses based on the refined ones (with some being discarded, some remaining, and some responses potentially being merged).

Through this process, it demonstrated excellent performance in complex scheduling problems (travel planning, meeting scheduling, etc.). It's a viable method for finding highly effective solutions in specific scenarios.

However, there are two major drawbacks:
🤔 An excessive number of API calls are required. (While the cost might not be very high, it leads to significant latency.)
🤔 The evaluator is program-based. (This limits its use as a general method. It could potentially be modified/implemented using LLM as Judge, but that would introduce additional API costs for evaluation.)

https://arxiv.org/abs/2501.09891

chansung

posted an update 5 months ago

Post

2091

Simple Summarization on DeepSeek-R1 from DeepSeek AI

The RL stage is very important.
↳ However, it is difficult to create a truly helpful AI for people solely through RL.
↳ So, we applied a learning pipeline consisting of four stages: providing a good starting point, reasoning RL, SFT, and safety RL, and achieved performance comparable to o1.
↳ Simply fine-tuning other open models with the data generated by R1-Zero (distillation) resulted in performance comparable to o1-mini.

Of course, this is just a brief overview and may not be of much help. All models are accessible on Hugging Face, and the paper can be read through the GitHub repository.

Model:

deepseek-ai
Paper: https://github.com/deepseek-ai/DeepSeek-R1

1 reply

chansung

authored a paper 7 months ago

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

Paper • 2412.06071 • Published Dec 8, 2024 • 9

SaylorTwift

posted an update 7 months ago

Post

798

How do I test an LLM for my unique needs?
If you work in finance, law, or medicine, generic benchmarks are not enough.
This blog post uses Argilla, Distilllabel and 🌤️Lighteval to generate evaluation dataset and evaluate models.

https://github.com/argilla-io/argilla-cookbook/blob/main/domain-eval/README.md

chansung

posted an update 8 months ago

Post

1985

🎙️ Listen to the audio "Podcast" of every single Hugging Face Daily Papers.

Now, "AI Paper Reviewer" project can automatically generates audio podcasts on any papers published on arXiv, and this is integrated into the GitHub Action pipeline. I sounds pretty similar to hashtag#NotebookLM in my opinion.

🎙️ Try out yourself at https://deep-diver.github.io/ai-paper-reviewer/

This audio podcast is powered by Google technologies: 1) Google DeepMind Gemini 1.5 Flash model to generate scripts of a podcast, then 2) Google Cloud Vertex AI's Text to Speech model to synthesize the voice turning the scripts into the natural sounding voices (with latest addition of "Journey" voice style)

"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog on any papers of your interests. Hence, checkout the project repository below if you are interested in!
: https://github.com/deep-diver/paper-reviewer

This project is going to support other models including open weights soon for both text-based content generation and voice synthesis for the podcast. The only reason I chose Gemini model is that it offers a "free-tier" which is enough to shape up this projects with non-realtime batch generations. I'm excited to see how others will use this tool to explore the world of AI research, hence feel free to share your feedback and suggestions!

3 replies

chansung

posted an update 8 months ago

Post

4763

Effortlessly stay up-to-date with AI research trends using a new AI tool, "AI Paper Reviewer" !!

It analyzes a list of Hugging Face Daily Papers(w/ @akhaliq ) and turn them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents.
blog link: https://deep-diver.github.io/ai-paper-reviewer/

Also, here is the link of GitHub repository for parsing and generating pipeline. By using this, you can easily build your own GitHub static pages based on any arXiv papers with your own interest!
: https://github.com/deep-diver/paper-reviewer

wyu1

authored a paper 9 months ago

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Paper • 2410.10813 • Published Oct 14, 2024 • 12

Sterzhang

authored a paper 9 months ago

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 71

AI & ML interests

Recent Activity

Team members 10

TopContributors-DatasetDownloads's activity

Top Contributors: Dataset Downloads