jizhongpeng

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

ByteDance/Q-Insight

liked a model 4 days ago

OPPOer/MultilingualFLUX.1-adapter

liked a model 4 days ago

timbrooks/instruct-pix2pix

upvoted 3 papers about 1 month ago

ImgEdit: A Unified Image Editing Dataset and Benchmark

Paper • 2505.20275 • Published May 26 • 17

In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Paper • 2504.20690 • Published Apr 29 • 19

LightLab: Controlling Light Sources in Images with Diffusion Models

Paper • 2505.09608 • Published May 14 • 32

upvoted a collection 3 months ago

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated 3 days ago • 67

upvoted a paper 5 months ago

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 30

upvoted a paper 7 months ago

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20, 2024 • 22

upvoted a paper 9 months ago

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8, 2024 • 112

upvoted a collection 9 months ago

🏆 Leaderboards & Arenas

20 items • Updated 10 days ago • 7

upvoted a collection 10 months ago

Qwen2-VL

Vision-language model series based on Qwen2 • 16 items • Updated Apr 28 • 219

upvoted 3 papers 10 months ago

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Paper • 2408.14468 • Published Aug 26, 2024 • 39

Towards flexible perception with visual memory

Paper • 2408.08172 • Published Aug 15, 2024 • 24

FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting

Paper • 2408.11706 • Published Aug 21, 2024 • 7

upvoted 2 papers 11 months ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6, 2024 • 61

Q-Ground: Image Quality Grounding with Large Multi-modality Models

Paper • 2407.17035 • Published Jul 24, 2024 • 1

upvoted a collection 11 months ago

Magpie-Qwen2 Datasets

Dataset built with Qwen2 72B and Qwen2 7B. • 6 items • Updated Jan 13 • 10

upvoted a paper 11 months ago

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

Paper • 2407.15754 • Published Jul 22, 2024 • 20

upvoted a collection 12 months ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 230

upvoted a paper 12 months ago

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Paper • 2407.04842 • Published Jul 5, 2024 • 57

upvoted a collection 12 months ago

InternVL2.0

Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated Apr 20 • 89

upvoted a paper about 1 year ago

CMC-Bench: Towards a New Paradigm of Visual Signal Compression

Paper • 2406.09356 • Published Jun 13, 2024 • 5