Yasunori Ozaki PRO

alfredplpl

https://alfredplpl.github.io/en/index.html

AI & ML interests

Computer Vision, LLM

Recent Activity

liked a model 4 days ago

NSFW-API/NSFW_Wan_1.3b

liked a model 4 days ago

stockmark/Stockmark-2-VL-100B-beta

liked a model 5 days ago

mmnga/llm-jp-3.1-8x13b-instruct4-gguf

alfredplpl's activity

upvoted a paper 9 days ago

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published 10 days ago • 91

upvoted a paper 10 days ago

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

Paper • 2505.18773 • Published 13 days ago • 7

upvoted a collection 17 days ago

Gemma 3n Preview

2 items • Updated 8 days ago • 110

upvoted an article about 1 month ago

Mixture of Experts Explained

Article • Published Dec 11, 2023 • 666

upvoted 3 papers about 2 months ago

MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft

Paper • 2504.08388 • Published Apr 11 • 40

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10 • 28

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published Apr 8 • 75

upvoted a collection 2 months ago

Llama 4

Llama 4 release • 13 items • Updated Apr 29 • 522

upvoted a paper 2 months ago

AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset

Paper • 2503.19462 • Published Mar 25 • 10

upvoted a paper 3 months ago

VBench: Comprehensive Benchmark Suite for Video Generative Models

Paper • 2311.17982 • Published Nov 29, 2023 • 9

upvoted 2 collections 3 months ago

Gemma 3

4 items • Updated 24 days ago • 15

Gemma 3 Release

24 items • Updated 8 days ago • 380

upvoted 8 papers 4 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 144

Magic 1-For-1: Generating One Minute Video Clips within One Minute

Paper • 2502.07701 • Published Feb 11 • 36

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published Feb 11 • 29

Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published Feb 3 • 70

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

Paper • 2502.05179 • Published Feb 7 • 24

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published Feb 7 • 104

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 232

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4 • 65