-
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
Paper • 2306.07967 • Published • 24 -
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Paper • 2306.07954 • Published • 113 -
TryOnDiffusion: A Tale of Two UNets
Paper • 2306.08276 • Published • 73 -
Seeing the World through Your Eyes
Paper • 2306.09348 • Published • 33
Collections
Discover the best community collections!
Collections including paper arxiv:2311.00618
-
De-Diffusion Makes Text a Strong Cross-Modal Interface
Paper • 2311.00618 • Published • 21 -
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Paper • 2311.10093 • Published • 57 -
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Paper • 2311.13231 • Published • 26 -
Diffusion Model Alignment Using Direct Preference Optimization
Paper • 2311.12908 • Published • 47
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40 -
De-Diffusion Makes Text a Strong Cross-Modal Interface
Paper • 2311.00618 • Published • 21 -
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Paper • 2310.19773 • Published • 19 -
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Paper • 2310.15308 • Published • 22
-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 96 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 74 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 41 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40
-
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
Paper • 2310.08491 • Published • 53 -
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
Paper • 2310.08579 • Published • 14 -
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Paper • 2310.12921 • Published • 19 -
De-Diffusion Makes Text a Strong Cross-Modal Interface
Paper • 2311.00618 • Published • 21
-
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Paper • 2309.03895 • Published • 13 -
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Paper • 2309.16650 • Published • 10 -
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Paper • 2309.16496 • Published • 9 -
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
Paper • 2310.15169 • Published • 9
-
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Paper • 2309.05793 • Published • 50 -
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
Paper • 2309.06380 • Published • 32 -
ImageBind-LLM: Multi-modality Instruction Tuning
Paper • 2309.03905 • Published • 16 -
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
Paper • 2309.06933 • Published • 12