renqibing

AI & ML interests

large language model, trustworthy AI

renqibing's activity

authored a paper 12 days ago

One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published 14 days ago • 59

updated a dataset 7 months ago

SafeMTData/SafeMTData

Viewer • Updated Nov 21, 2024 • 2.28k • 85 • 9

authored a paper 8 months ago

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Paper • 2410.10700 • Published Oct 14, 2024 • 2

upvoted 2 papers 8 months ago

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Paper • 2410.10700 • Published Oct 14, 2024 • 2

CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion

Paper • 2403.07865 • Published Mar 12, 2024 • 1

liked a dataset 8 months ago

SafeMTData/SafeMTData

Viewer • Updated Nov 21, 2024 • 2.28k • 85 • 9

liked 2 models about 1 year ago

TheBloke/wizardLM-7B-HF

Text Generation • Updated Jun 5, 2023 • 1.02k • 95

meta-llama/Meta-Llama-3-8B

Text Generation • Updated Sep 27, 2024 • 5.06M • 6.2k