- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
  Paper • 2310.13961 • Published • 4
- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 5
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network
  Paper • 2310.09049 • Published • 1

Collections including paper arxiv:2309.10202

- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 2
- Contrastive Prefence Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47

- Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
  Paper • 2310.00212 • Published • 2
- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- Aligning Language Models with Offline Reinforcement Learning from Human Feedback
  Paper • 2308.12050 • Published • 1
- Secrets of RLHF in Large Language Models Part I: PPO
  Paper • 2307.04964 • Published • 28

- Efficient RLHF: Reducing the Memory Usage of PPO
  Paper • 2309.00754 • Published • 13
- Statistical Rejection Sampling Improves Preference Optimization
  Paper • 2309.06657 • Published • 13
- Aligning Large Multimodal Models with Factually Augmented RLHF
  Paper • 2309.14525 • Published • 30
- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9

- Secrets of RLHF in Large Language Models Part I: PPO
  Paper • 2307.04964 • Published • 28
- Safe RLHF: Safe Reinforcement Learning from Human Feedback
  Paper • 2310.12773 • Published • 28
- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
  Paper • 2310.00212 • Published • 2

- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
  Paper • 2309.13018 • Published • 9
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 23
- Language models in molecular discovery
  Paper • 2309.16235 • Published • 10

- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- MagiCapture: High-Resolution Multi-Concept Portrait Customization
  Paper • 2309.06895 • Published • 27
- Training-free Camera Control for Video Generation
  Paper • 2406.10126 • Published • 12
- An Image is Worth 32 Tokens for Reconstruction and Generation
  Paper • 2406.07550 • Published • 55

- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
  Paper • 2309.10150 • Published • 24
- Robotic Offline RL from Internet Videos via Value-Function Pre-Training
  Paper • 2309.13041 • Published • 8
- Voyager: An Open-Ended Embodied Agent with Large Language Models
  Paper • 2305.16291 • Published • 9

- Stabilizing RLHF through Advantage Model and Selective Rehearsal
  Paper • 2309.10202 • Published • 9
- Multimodal Foundation Models: From Specialists to General-Purpose Assistants
  Paper • 2309.10020 • Published • 40
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 82