Zhuofan Zong PRO

zongzhuofan

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

upvoted a paper 4 months ago

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

authored a paper about 2 months ago

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Paper • 2505.00703 • Published May 1 • 43

authored 2 papers 6 months ago

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Paper • 2412.11279 • Published Dec 15, 2024 • 12

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Paper • 2412.09618 • Published Dec 12, 2024 • 21

authored 10 papers about 1 year ago

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection

Paper • 2110.12130 • Published Oct 23, 2021

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Paper • 2404.13046 • Published Apr 19, 2024 • 1

Large-batch Optimization for Dense Visual Predictions

Paper • 2210.11078 • Published Oct 20, 2022

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Paper • 2406.11831 • Published Jun 17, 2024 • 22

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Paper • 2404.03653 • Published Apr 4, 2024 • 37

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

Paper • 2403.16999 • Published Mar 25, 2024 • 5

Self-slimmed Vision Transformer

Paper • 2111.12624 • Published Nov 24, 2021 • 1

DETRs with Collaborative Hybrid Assignments Training

Paper • 2211.12860 • Published Nov 22, 2022

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

Paper • 2304.00967 • Published Apr 3, 2023

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

Paper • 2305.18295 • Published May 29, 2023 • 8