
GPT OSS Safeguard 20B - Colab Demo Usage Guide

Overview

This demo showcases OpenAI's gpt-oss-safeguard-20b model for policy-based safety classification and content moderation.

How to Use

1. Open in Google Colab

2. Set Up Runtime

  • Go to Runtime > Change runtime type
  • Set Hardware accelerator to GPU
  • Recommended: High-RAM runtime for better performance

3. Run Setup Cells

  • Cell 1: Install packages (may take 5-10 minutes)
  • ⚠️ IMPORTANT: Restart the runtime when prompted
  • Cell 2: Import libraries and load the model (may take 2-3 minutes); a rough sketch of this step follows this list
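
For reference, the load step in Cell 2 typically boils down to something like the sketch below (a minimal sketch assuming the transformers and accelerate libraries and the Hugging Face model ID openai/gpt-oss-safeguard-20b; the notebook's actual cell may differ).

# Rough sketch of Cell 2; model ID and exact arguments are assumptions, not the notebook's code
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openai/gpt-oss-safeguard-20b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # keep the shipped quantized weights where supported
    device_map="auto",    # place layers on the Colab GPU automatically
)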

4. Explore Examples

The notebook includes 9 sections:

  1. Environment Setup
  2. Model Loading
  3. Safety Policy Examples
  4. Helper Functions
  5. Binary Content Safety Classification
  6. Advanced Spam Detection with Reasoning
  7. Reasoning Effort Comparison
  8. Custom Policy Creation
  9. Performance Information

Key Features Demonstrated

Policy-Based Classification

  • Binary Output: Simple 0/1 decisions
  • Policy-Referencing: Category labels with confidence
  • Detailed Reasoning: Full JSON with rationale and rule citations (illustrated below)
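
As a purely hypothetical illustration (field names here are made up for this guide, not the notebook's exact schema), a detailed-reasoning result might parse into something like:

# Hypothetical parsed result; the notebook defines the actual JSON schema
detailed_result = {
    "violation": 1,                      # binary decision (0 = allowed, 1 = violation)
    "category": "harassment/self-harm",  # policy-referencing label
    "confidence": "high",
    "rationale": "The message insults the user and encourages self-harm.",
    "rules_cited": ["H2", "SH1"],        # IDs of the policy rules that apply
}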

Configurable Reasoning Effort

  • Low: Fast classification for high-volume scenarios
  • Medium (default): Balanced speed and accuracy
  • High: Deep analysis for complex edge cases (a timing comparison sketch follows this list)
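
The notebook's reasoning-effort comparison (section 7) can be reproduced with a loop like the sketch below, assuming the classify_content_policy helper and content_safety_policy from earlier cells; exact timings depend on the GPU and the notebook's implementation.

# Sketch: time the same classification at each reasoning effort
import time

sample = "This is a borderline message that needs careful review."
for effort in ("low", "medium", "high"):
    start = time.time()
    result = classify_content_policy(
        content=sample,
        policy=content_safety_policy,
        reasoning_effort=effort,
    )
    print(f"{effort:>6}: {time.time() - start:.1f}s -> {result}")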

Custom Policy Support

  • Bring-your-own-policy approach
  • Template provided for creating custom policies
  • Support for complex, multi-rule safety guidelines (see the policy sketch below)
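
A policy is ultimately structured text. As a rough sketch only (the notebook's section 8 template defines the exact structure the model expects), a multi-rule policy might look like:

# Illustrative policy text; follow the notebook's template for the real format
privacy_policy_text = """
POLICY: Privacy Data Protection

RULES:
P1. Flag content that shares another person's personal data (address, phone, ID, financial details) without consent.
P2. Flag requests to locate or identify a private individual.
P3. Do not flag information about public figures that is already widely published.

OUTPUT:
Return 1 if rule P1 or P2 applies, otherwise 0.
"""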

Example Use Cases

Content Moderation

# Classify user-generated content
result = classify_content_policy(
    content="I think you're terrible and should hurt yourself.",
    policy=content_safety_policy,
    reasoning_effort="medium"
)

Spam Detection

# Advanced spam classification with reasoning
result = classify_content_policy(
    content="Congratulations! You've won $1000000! Click here!",
    policy=spam_policy,
    reasoning_effort="high"
)

Custom Policies

# Create domain-specific policies
privacy_policy = create_custom_policy(
    name="Privacy Data Protection",
    instructions="Classify privacy violations...",
    # ... additional parameters
)

System Requirements

  • GPU: Required (Tesla T4 or better recommended)
  • RAM: 16GB+ (High-RAM Colab instance)
  • Runtime: GPU-enabled Colab session
  • Time: 10-15 minutes initial setup
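
Before loading the model, you can confirm the runtime actually has a suitable GPU with a quick check like this:

# Sanity-check the Colab GPU before loading the model
import torch

assert torch.cuda.is_available(), "No GPU detected - set the runtime accelerator to GPU."
name = torch.cuda.get_device_name(0)
total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
print(f"GPU: {name}, {total_gb:.1f} GB VRAM")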

Model Specifications

  • Total Parameters: 21B
  • Active Parameters: 3.6B
  • Quantization: MXFP4 for efficiency
  • License: Apache 2.0 (commercial use allowed)
  • VRAM Requirement: ~16GB
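
After the model has loaded, you can check that its footprint is in the expected ~16GB range (model here is the object created in the loading cell):

# Report how much memory the loaded model occupies; numbers vary with quantization
import torch

print(f"Model footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
print(f"CUDA allocated:  {torch.cuda.memory_allocated() / 1e9:.1f} GB")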

Troubleshooting

Common Issues:

  1. Out of Memory: Use High-RAM runtime
  2. CUDA Errors: Restart runtime and try again
  3. Slow Loading: Normal for first load (model download)
  4. Policy Format Errors: Ensure Harmony format compatibility (see the prompt sketch below)
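
gpt-oss models expect the Harmony chat format, which the tokenizer's chat template applies for you: the policy goes in the system message and the content to classify in the user message. The sketch below assumes the standard transformers chat-template API; the notebook's helper may wrap this differently.

# Sketch: build a Harmony-formatted prompt via the tokenizer's chat template
messages = [
    {"role": "system", "content": content_safety_policy},    # the policy text
    {"role": "user", "content": "Text to classify goes here."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))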

Performance Tips:

  • Use medium reasoning effort for most cases
  • Batch process multiple pieces of content for efficiency (see the sketch after this list)
  • Monitor GPU memory usage
  • Adjust max_new_tokens based on response complexity
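
A simple way to process multiple items is a loop over the notebook's helper, as sketched below (classify_content_policy and content_safety_policy come from earlier cells; true batched generation would require changes inside the helper itself):

# Sketch: classify several items in one pass over the helper
contents = [
    "Congratulations! You've won $1000000! Click here!",
    "Can we reschedule our meeting to Thursday?",
    "I think you're terrible and should hurt yourself.",
]
results = [
    classify_content_policy(
        content=text,
        policy=content_safety_policy,
        reasoning_effort="low",   # low effort keeps high-volume passes fast
    )
    for text in contents
]
for text, result in zip(contents, results):
    print(result, "<-", text[:40])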

License

This demo and the underlying model are released under the Apache 2.0 license, which permits free commercial use.
