Aymeric Roucher
m-ric
AI & ML interests
Leading Agents at Hugging Face 🤗
Scaling Laws 📏
-
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Paper • 2206.10789 • Published • 4 -
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Paper • 2401.00448 • Published • 31 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 10 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 7
🧑‍⚖️ LLM-as-a-judge
-
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Paper • 2306.05685 • Published • 36 -
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 43 -
Leveraging Large Language Models for NLG Evaluation: A Survey
Paper • 2401.07103 • Published • 4 -
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Paper • 2310.08491 • Published • 55
🤖 Agents
-
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Paper • 2310.03714 • Published • 35 -
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 43 -
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
Paper • 2308.08155 • Published • 8 -
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 223
🛣️ Grammar
-
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
Paper • 2305.13971 • Published • 4 -
Autoregressive Entity Retrieval
Paper • 2010.00904 • Published • 1 -
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models
Paper • 2109.05093 • Published • 1
LLM foundations
-
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 106 -
Textbooks Are All You Need
Paper • 2306.11644 • Published • 145 -
Jamba: A Hybrid Transformer-Mamba Language Model
Paper • 2403.19887 • Published • 111 -
Large Language Models Struggle to Learn Long-Tail Knowledge
Paper • 2211.08411 • Published • 3
🌍 Earth
Mother of all Training Clusters
https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf
Could be useful one day
🚀 Spinning Up in LLMs
-
Lost in the Middle: How Language Models Use Long Contexts
Paper • 2307.03172 • Published • 40 -
Efficient Estimation of Word Representations in Vector Space
Paper • 1301.3781 • Published • 6 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 19 -
Attention Is All You Need
Paper • 1706.03762 • Published • 69
🔎⇒💬 RAG
-
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper • 2005.11401 • Published • 11 -
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
Paper • 2401.08406 • Published • 37 -
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Paper • 2104.08663 • Published • 3 -
Precise Zero-Shot Dense Retrieval without Relevance Labels
Paper • 2212.10496 • Published • 4
👁️ Vision
💡 Interpretability - understanding LLMs
-
Linearity of Relation Decoding in Transformer Language Models
Paper • 2308.09124 • Published • 2 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 110 -
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 106 -
Mission: Impossible Language Models
Paper • 2401.06416 • Published • 3
🔧 Optimization Mechanics 🔧
-
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Paper • 2210.17323 • Published • 8 -
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Paper • 2208.07339 • Published • 5 -
Hydragen: High-Throughput LLM Inference with Shared Prefixes
Paper • 2402.05099 • Published • 20 -
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Paper • 2401.10774 • Published • 59
Open-source AI Releases - August '24
-
nvidia/Mistral-NeMo-Minitron-8B-Base
Text Generation • 8B • Updated • 6.08k • 176 -
Running • 54
Instant SmolLM
🤏 Run SmolLM-360M-Instruct in realtime with MLC WebLLM
-
black-forest-labs/FLUX.1-schnell
Text-to-Image • Updated • 676k • 4.08k -
Running on Zero • 4.89k
FLUX.1 [Schnell]
🏎 Generate images from text prompts
GUI Agents
-
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Paper • 2310.11441 • Published • 28 -
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper • 2501.12326 • Published • 62 -
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
Paper • 2406.08451 • Published • 26 -
GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents
Paper • 2406.10819 • Published • 1