speech-recognition-community-v2 (Speech Recognition Community Event Version 2)

posted an update 3 days ago

Post

217

🚀 New in smolagents v1.20.0: Remote Python Execution via WebAssembly (Wasm)

We've just merged a major new capability into the smolagents framework: the CodeAgent can now execute Python code remotely in a secure, sandboxed WebAssembly environment!

🔧 Powered by Pyodide and Deno, this new WasmExecutor lets your agent-generated Python code run safely: without relying on Docker or local execution.

Why this matters:
✅ Isolated execution = no host access
✅ No need for Python on the user's machine
✅ Safer evaluation of arbitrary code
✅ Compatible with serverless / edge agent workloads
✅ Ideal for constrained or untrusted environments

This is just the beginning: a focused initial implementation with known limitations. A solid MVP designed for secure, sandboxed use cases. 💡

💡 We're inviting the open-source community to help evolve this executor:
• Tackle more advanced Python features
• Expand compatibility
• Add test coverage
• Shape the next-gen secure agent runtime

🔗 Check out the PR: https://github.com/huggingface/smolagents/pull/1261

Let's reimagine what agent-driven Python execution can look like: remote-first, wasm-secure, and community-built.

This feature is live in smolagents v1.20.0!
Try it out.
Break things. Extend it. Give us feedback.
Let's build safer, smarter agents; together 🧠⚙️

👉 https://github.com/huggingface/smolagents/releases/tag/v1.20.0

#smolagents #WebAssembly #Python #AIagents #Pyodide #Deno #OpenSource #HuggingFace #AgenticAI

albertvillanova

posted an update 19 days ago

Post

1581

🚀 SmolAgents v1.19.0 is live!
This release brings major improvements to agent flexibility, UI usability, streaming architecture, and developer experience: making it easier than ever to build smart, interactive AI agents. Here's what's new:

🔧 Agent Upgrades
- Support for managed agents in ToolCallingAgent
- Context manager support for cleaner agent lifecycle handling
- Output formatting now uses XML tags for consistency

🖥️ UI Enhancements
- GradioUI now supports reset_agent_memory: perfect for fresh starts in dev & demos.

🔄 Streaming Refactor
- Streaming event aggregation moved off the Model class
- ➡️ Better architecture & maintainability

📦 Output Tracking
- CodeAgent outputs are now stored in ActionStep
- ✅ More visibility and structure to agent decisions

🐛 Bug Fixes
- Smarter planning logic
- Cleaner Docker logs
- Better prompt formatting for additional_args
- Safer internal functions and final answer matching

📚 Docs Improvements
- Added quickstart examples with tool usage
- One-click Colab launch buttons
- Expanded reference docs (AgentMemory, GradioUI docstrings)
- Fixed broken links and migrated to .md format

🔗 Full release notes:
https://github.com/huggingface/smolagents/releases/tag/v1.19.0

💬 Try it out, explore the new features, and let us know what you build!

#smolagents #opensource #AIagents #LLM #HuggingFace

gagan3012

authored a paper 26 days ago

Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images

Paper • 2506.13458 • Published 28 days ago

albertvillanova

posted an update about 2 months ago

Post

653

New in smolagents v1.17.0:
- Structured generation in CodeAgent 🧱
- Streamable HTTP MCP support 🌐
- Agent.run() returns rich RunResult 📦

Smarter agents, smoother workflows.
Try it now: https://github.com/huggingface/smolagents/releases/tag/v1.17.0

gagan3012

authored a paper about 2 months ago

Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning

Paper • 2505.16088 • Published May 22 • 3

DrishtiSharma

authored a paper about 2 months ago

Behind Maya: Building a Multilingual Vision Language Model

Paper • 2505.08910 • Published May 13 • 1

w11wo

authored a paper about 2 months ago

Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks

Paper • 2505.11239 • Published May 16

albertvillanova

posted an update about 2 months ago

Post

2504

New in smolagents v1.16.0:
🔍 Bing support in WebSearchTool
🐍 Custom functions & executor_kwargs in LocalPythonExecutor
🔧 Streaming GradioUI fixes
🌐 Local web agents via api_base & api_key
📚 Better docs

👉 https://github.com/huggingface/smolagents/releases/tag/v1.16.0

albertvillanova

posted an update 3 months ago

Post

2793

smolagents v1.14.0 is out! 🚀
🔌 MCPClient: A sleek new client for connecting to remote MCP servers, making integrations more flexible and scalable.
🪨 Amazon Bedrock: Native support for Bedrock-hosted models.
SmolAgents is now more powerful, flexible, and enterprise-ready. 💼

Full release 👉 https://github.com/huggingface/smolagents/releases/tag/v1.14.0
#smolagents #LLM #AgenticAI

DrishtiSharma

authored a paper 3 months ago

Robust and Fine-Grained Detection of AI Generated Texts

Paper • 2504.11952 • Published Apr 16 • 12

gigant

authored 2 papers 3 months ago

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Paper • 2206.15076 • Published Jun 30, 2022 • 5

Summarization of Multimodal Presentations with Vision-Language Models: Study of the Effect of Modalities and Structure

Paper • 2504.10049 • Published Apr 14 • 3

anton-l

authored a paper 3 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 192

emre

authored a paper 4 months ago

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Paper • 2211.05100 • Published Nov 9, 2022 • 32

emre

posted an update 4 months ago

Post

3543

having trouble with auto train
hello there this is the first time i am testing auto train with a 1.8k SFT dataset. Howevery i am not quite sure the training is going smooth. Logs seem quite confusing, token did not match can not auth, generates confusing train splits, do you know how i can check my running job properly?
what is being used for training as data?
any ideas?

1 reply

·

w11wo

authored a paper 4 months ago

COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

Paper • 2503.07259 • Published Mar 10

albertvillanova

posted an update 4 months ago

Post

4140

🚀 New smolagents update: Safer Local Python Execution! 🦾🐍

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒

Here's why this matters & what you need to know! 🧵👇

1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)

3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5️⃣ The Best Practice: Use Sandboxed Execution 🔐
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6️⃣ Upgrade Now & Stay Safe! 🚀
Check out the latest smolagents release and start building safer AI agents today.

🔗 https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Let’s discuss! 👇

#AI #smolagents #Python #Security

2 replies

·

albertvillanova

posted an update 4 months ago

Post

4045

🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒

Here's why this is a game-changer for agent-based systems: 🧵👇

1️⃣ Security First 🔐
Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.

2️⃣ Deterministic & Reproducible Runs 📦
By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!

3️⃣ Resource Control & Limits 🚦
Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.

4️⃣ Safer Code Execution in Production 🏭
Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.

5️⃣ Easy to Integrate 🛠️
With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend—no need for complex security setups!

6️⃣ Perfect for Autonomous AI Agents 🤖
If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.

⚡ Get started now: https://github.com/huggingface/smolagents

What will you build with smolagents? Let us know! 🚀💡

nguyenvulebinh

authored a paper 5 months ago

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

Paper • 2411.18152 • Published Nov 27, 2024 • 1

vumichien

authored a paper 5 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 9

Speech Recognition Community Event Version 2

AI & ML interests

Recent Activity

Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images

Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning

Behind Maya: Building a Multilingual Vision Language Model

Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks

Robust and Fine-Grained Detection of AI Generated Texts

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing

Summarization of Multimodal Presentations with Vision-Language Models: Study of the Effect of Modalities and Structure

SmolVLM: Redefining small and efficient multimodal models

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

Bridging the Data Provenance Gap Across Text, Speech and Video

AI & ML interests

Recent Activity

Team members 199

speech-recognition-community-v2's activity