
GPT OSS Safeguard 20B - Colab Demo Usage Guide

Overview

This demo showcases OpenAI's gpt-oss-safeguard-20b model for policy-based safety classification and content moderation.

How to Use

1. Open in Google Colab

2. Set Up Runtime

  • Go to Runtime > Change runtime type
  • Set Hardware accelerator to GPU
  • Recommended: High-RAM runtime for better performance

3. Run Setup Cells

  • Cell 1: Install packages (may take 5-10 minutes)
  • ⚠️ IMPORTANT: Restart the runtime when prompted
  • Cell 2: Import libraries and load the model (may take 2-3 minutes); a rough sketch of this step follows this list
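
For reference, the load step in Cell 2 typically boils down to something like the sketch below (a minimal sketch assuming the transformers and accelerate libraries and the Hugging Face model ID openai/gpt-oss-safeguard-20b; the notebook's actual cell may differ).

# Rough sketch of Cell 2; model ID and exact arguments are assumptions, not the notebook's code
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openai/gpt-oss-safeguard-20b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",   # keep the shipped quantized weights where supported
    device_map="auto",    # place layers on the Colab GPU automatically
)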

4. Explore Examples

The notebook includes 9 sections:

  1. Environment Setup
  2. Model Loading
  3. Safety Policy Examples
  4. Helper Functions
  5. Binary Content Safety Classification
  6. Advanced Spam Detection with Reasoning
  7. Reasoning Effort Comparison
  8. Custom Policy Creation
  9. Performance Information

Key Features Demonstrated

Policy-Based Classification

  • Binary Output: Simple 0/1 decisions
  • Policy-Referencing: Category labels with confidence
  • Detailed Reasoning: Full JSON with rationale and rule citations (illustrated below)
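
As a purely hypothetical illustration (field names here are made up for this guide, not the notebook's exact schema), a detailed-reasoning result might parse into something like:

# Hypothetical parsed result; the notebook defines the actual JSON schema
detailed_result = {
    "violation": 1,                      # binary decision (0 = allowed, 1 = violation)
    "category": "harassment/self-harm",  # policy-referencing label
    "confidence": "high",
    "rationale": "The message insults the user and encourages self-harm.",
    "rules_cited": ["H2", "SH1"],        # IDs of the policy rules that apply
}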

Configurable Reasoning Effort

  • Low: Fast classification for high-volume scenarios
  • Medium (default): Balanced speed and accuracy
  • High: Deep analysis for complex edge cases (a timing comparison sketch follows this list)
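
The notebook's reasoning-effort comparison (section 7) can be reproduced with a loop like the sketch below, assuming the classify_content_policy helper and content_safety_policy from earlier cells; exact timings depend on the GPU and the notebook's implementation.

# Sketch: time the same classification at each reasoning effort
import time

sample = "This is a borderline message that needs careful review."
for effort in ("low", "medium", "high"):
    start = time.time()
    result = classify_content_policy(
        content=sample,
        policy=content_safety_policy,
        reasoning_effort=effort,
    )
    print(f"{effort:>6}: {time.time() - start:.1f}s -> {result}")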

Custom Policy Support

  • Bring-your-own-policy approach
  • Template provided for creating custom policies
  • Support for complex, multi-rule safety guidelines (see the policy sketch below)
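
A policy is ultimately structured text. As a rough sketch only (the notebook's section 8 template defines the exact structure the model expects), a multi-rule policy might look like:

# Illustrative policy text; follow the notebook's template for the real format
privacy_policy_text = """
POLICY: Privacy Data Protection

RULES:
P1. Flag content that shares another person's personal data (address, phone, ID, financial details) without consent.
P2. Flag requests to locate or identify a private individual.
P3. Do not flag information about public figures that is already widely published.

OUTPUT:
Return 1 if rule P1 or P2 applies, otherwise 0.
"""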

Example Use Cases

Content Moderation

# Classify user-generated content
result = classify_content_policy(
    content="I think you're terrible and should hurt yourself.",
    policy=content_safety_policy,
    reasoning_effort="medium"
)

Spam Detection

# Advanced spam classification with reasoning
result = classify_content_policy(
    content="Congratulations! You've won $1000000! Click here!",
    policy=spam_policy,
    reasoning_effort="high"
)

Custom Policies

# Create domain-specific policies
privacy_policy = create_custom_policy(
    name="Privacy Data Protection",
    instructions="Classify privacy violations...",
    # ... additional parameters
)

System Requirements

  • GPU: Required (Tesla T4 or better recommended)
  • RAM: 16GB+ (High-RAM Colab instance)
  • Runtime: GPU-enabled Colab session
  • Time: 10-15 minutes initial setup
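
Before loading the model, you can confirm the runtime actually has a suitable GPU with a quick check like this:

# Sanity-check the Colab GPU before loading the model
import torch

assert torch.cuda.is_available(), "No GPU detected - set the runtime accelerator to GPU."
name = torch.cuda.get_device_name(0)
total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
print(f"GPU: {name}, {total_gb:.1f} GB VRAM")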

Model Specifications

  • Total Parameters: 21B
  • Active Parameters: 3.6B
  • Quantization: MXFP4 for efficiency
  • License: Apache 2.0 (commercial use allowed)
  • VRAM Requirement: ~16GB
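
After the model has loaded, you can check that its footprint is in the expected ~16GB range (model here is the object created in the loading cell):

# Report how much memory the loaded model occupies; numbers vary with quantization
import torch

print(f"Model footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
print(f"CUDA allocated:  {torch.cuda.memory_allocated() / 1e9:.1f} GB")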

Troubleshooting

Common Issues:

  1. Out of Memory: Use High-RAM runtime
  2. CUDA Errors: Restart runtime and try again
  3. Slow Loading: Normal for first load (model download)
  4. Policy Format Errors: Ensure Harmony format compatibility (see the prompt sketch below)
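
gpt-oss models expect the Harmony chat format, which the tokenizer's chat template applies for you: the policy goes in the system message and the content to classify in the user message. The sketch below assumes the standard transformers chat-template API; the notebook's helper may wrap this differently.

# Sketch: build a Harmony-formatted prompt via the tokenizer's chat template
messages = [
    {"role": "system", "content": content_safety_policy},    # the policy text
    {"role": "user", "content": "Text to classify goes here."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))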

Performance Tips:

  • Use medium reasoning effort for most cases
  • Batch process multiple pieces of content for efficiency (see the sketch after this list)
  • Monitor GPU memory usage
  • Adjust max_new_tokens based on response complexity
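
A simple way to process multiple items is a loop over the notebook's helper, as sketched below (classify_content_policy and content_safety_policy come from earlier cells; true batched generation would require changes inside the helper itself):

# Sketch: classify several items in one pass over the helper
contents = [
    "Congratulations! You've won $1000000! Click here!",
    "Can we reschedule our meeting to Thursday?",
    "I think you're terrible and should hurt yourself.",
]
results = [
    classify_content_policy(
        content=text,
        policy=content_safety_policy,
        reasoning_effort="low",   # low effort keeps high-volume passes fast
    )
    for text in contents
]
for text, result in zip(contents, results):
    print(result, "<-", text[:40])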

License

This demo and the underlying model are released under the Apache 2.0 license, which permits free commercial use.
