Bertoti

Giuliano

giulianobertoti

AI & ML interests

None yet

Giuliano's collections (11)

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

Paper • 2507.03112 • Published Jul 3, 2025 • 34
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

Paper • 2507.01925 • Published Jul 2, 2025 • 39
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens

Paper • 2506.17218 • Published Jun 20, 2025 • 29
WebSailor: Navigating Super-human Reasoning for Web Agent

Paper • 2507.02592 • Published Jul 3, 2025 • 126

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3, 2025 • 47

CohereLabs/c4ai-command-r7b-12-2024

8B • Updated Oct 30, 2025 • 28.8k • 421
Text2SQL is Not Enough: Unifying AI and Databases with TAG

Paper • 2408.14717 • Published Aug 27, 2024 • 26

LLM Personalization

AI PERSONA: Towards Life-long Personalization of LLMs

Paper • 2412.13103 • Published Dec 17, 2024 • 2

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23, 2024 • 78
CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs

Paper • 2408.02193 • Published Aug 5, 2024 • 1
CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models

Paper • 2411.04329 • Published Nov 7, 2024
SWE-Gym/OpenHands-7B-Agent

Updated Dec 23, 2024 • 234

STaR: Bootstrapping Reasoning With Reasoning

Paper • 2203.14465 • Published Mar 28, 2022 • 9
Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 11
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 94
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 61

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3, 2025 • 47
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought

Paper • 2501.07542 • Published Jan 13, 2025 • 3

LTX-Video: Realtime Video Latent Diffusion

Paper • 2501.00103 • Published Dec 30, 2024 • 51
nvidia/Cosmos-1.0-Diffusion-14B-Text2World

Updated May 7, 2025 • 334 • 61
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3, 2025 • 225

A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?

Paper • 2409.15277 • Published Sep 23, 2024 • 38
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 107

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 11
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

Paper • 2401.00812 • Published Jan 1, 2024 • 12
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Paper • 2310.03714 • Published Oct 5, 2023 • 37
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 35

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 90
OmniParser for Pure Vision Based GUI Agent

Paper • 2408.00203 • Published Aug 1, 2024 • 24
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 71
zai-org/cogagent-9b-20241220

Image-Text-to-Text • 14B • Updated Dec 25, 2024 • 730 • 54
