Paulson

Pnaomi

AI & ML interests

Yes

Pnaomi's activity

upvoted 19 papers about 11 hours ago

EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions

Paper • 2505.23473 • Published 12 days ago • 1

Evaluating LLMs Robustness in Less Resourced Languages with Proxy Models

Paper • 2506.07645 • Published 1 day ago • 1

MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories

Paper • 2506.04807 • Published 6 days ago • 2

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Paper • 2506.07527 • Published 1 day ago • 3

Dreamland: Controllable World Creation with Simulator and Generative Models

Paper • 2506.08006 • Published 1 day ago • 6

ConfQA: Answer Only If You Are Confident

Paper • 2506.07309 • Published 2 days ago • 8

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

Paper • 2506.08012 • Published 1 day ago • 7

Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation

Paper • 2506.05062 • Published 5 days ago • 11

CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models

Paper • 2506.07463 • Published 2 days ago • 8

Bootstrapping World Models from Dynamics Models in Multimodal Foundation Models

Paper • 2506.06006 • Published 5 days ago • 10

BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation

Paper • 2506.07530 • Published 1 day ago • 12

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

Paper • 2506.07553 • Published 1 day ago • 12

Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

Paper • 2506.06205 • Published 4 days ago • 20

SpatialLM: Training Large Language Models for Structured Indoor Modeling

Paper • 2506.07491 • Published 1 day ago • 29

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published 1 day ago • 37

Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

Paper • 2506.06444 • Published 4 days ago • 60

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published 1 day ago • 55

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published 3 days ago • 80

Reinforcement Pre-Training

Paper • 2506.08007 • Published 1 day ago • 143

upvoted a paper 1 day ago

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Paper • 2506.05579 • Published 5 days ago • 3