🆓 Free H200 Training: Nano-Coder on Hugging Face

This guide shows you how to train a nano-coder model using Hugging Face's free H200 GPU access (4 minutes daily).

🎯 What You Get

  • Free H200 GPU: 4 minutes per day
  • No Credit Card Required: Completely free
  • Easy Setup: Just a few clicks
  • Model Sharing: Automatic upload to HF Hub

🚀 Quick Start

Option 1: Hugging Face Space (Recommended)

  1. Create HF Space:

    huggingface-cli repo create nano-coder-free --type space
    
  2. Upload Files:

    • Upload all the Python files to your space
    • Make sure app.py is in the root directory (a sketch of its entry point follows these steps)
  3. Configure Space:

    • Set Hardware: H200 (free tier)
    • Set Python Version: 3.9+
    • Set Requirements: requirements.txt
  4. Launch Training:

    • Go to your space URL
    • Click "🚀 Start Free H200 Training"
    • Wait for training to complete (3.5 minutes)
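
For reference, here is a minimal sketch of what app.py's entry point might look like. It is hypothetical: the actual file may differ, and run_free_training is an assumed helper in hf_free_training.py, not a confirmed name.

import gradio as gr
from hf_free_training import run_free_training  # hypothetical entry point

def start_training():
    # Kick off the time-boxed run and return a status string for the UI.
    return run_free_training()

with gr.Blocks() as demo:
    gr.Markdown("# Nano-Coder: Free H200 Training")
    start = gr.Button("🚀 Start Free H200 Training")
    status = gr.Textbox(label="Status")
    start.click(fn=start_training, outputs=status)

demo.launch()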

Option 2: Local Setup with HF Free Tier

  1. Install Dependencies:

    pip install -r requirements.txt
    
  2. Set HF Token (see the sketch after these steps):

    export HF_TOKEN="your_token_here"
    
  3. Run Free Training:

    python hf_free_training.py
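
Inside the training script, picking up the token might look like the sketch below; the real hf_free_training.py may handle it differently.

import os
from huggingface_hub import login

token = os.environ.get("HF_TOKEN")
if token:
    login(token=token)  # authenticate uploads to the HF Hub
else:
    print("HF_TOKEN not set; checkpoints will stay local.")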
    

📊 Model Configuration (Free Tier)

Parameter       Free Tier   Full Model
Layers          6           12
Heads           6           12
Embedding       384         768
Context         512         1024
Parameters      ~15M        ~124M
Training Time   3.5 min     2-4 hours
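
In nanoGPT's variable names, the free-tier column corresponds roughly to the config below; values are illustrative, so check hf_free_training.py for the exact settings.

# Free-tier model config (nanoGPT-style variables)
n_layer = 6       # transformer blocks (full model: 12)
n_head = 6        # attention heads    (full model: 12)
n_embd = 384      # embedding width    (full model: 768)
block_size = 512  # context length     (full model: 1024)
# roughly 15M parameters vs ~124M for the full GPT-2-small-style model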

⏰ Time Management

  • Daily Limit: 4 minutes of H200 time
  • Training Time: 3.5 minutes (safe buffer)
  • Automatic Stop: Script stops before the time limit (sketched below)
  • Daily Reset: New 4 minutes every day at midnight UTC
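
The automatic stop can be as simple as checking elapsed wall-clock time each iteration, as in this sketch (the actual script may differ):

import time

MAX_TRAINING_TIME = 3.5 * 60  # seconds; leaves a buffer under the 4-min cap
start_time = time.time()

for iter_num in range(10_000):  # upper bound on iterations
    if time.time() - start_time > MAX_TRAINING_TIME:
        print(f"Time limit reached at iteration {iter_num}; stopping.")
        break
    # ... run one training step here ...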

🎨 Features

Training Features

  • ✅ Automatic Time Tracking: Stops before limit
  • ✅ Frequent Checkpoints: Every 200 iterations (see the sketch below)
  • ✅ HF Hub Upload: Models saved automatically
  • ✅ Wandb Logging: Real-time metrics
  • ✅ Progress Monitoring: Time remaining display
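
A sketch of the checkpoint-and-upload pattern; repo_id is a placeholder, and the real script may structure this differently.

import torch
from huggingface_hub import upload_file

def maybe_checkpoint(model, iter_num, repo_id="your-username/nano-coder-free"):
    # Save every 200 iterations and push the checkpoint file to the Hub.
    if iter_num % 200 != 0:
        return
    path = "out-nano-coder-free/ckpt.pt"
    torch.save({"model": model.state_dict(), "iter_num": iter_num}, path)
    upload_file(path_or_fileobj=path, path_in_repo="ckpt.pt", repo_id=repo_id)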

Generation Features

  • ✅ Interactive UI: Gradio interface
  • ✅ Custom Prompts: Any Python code start
  • ✅ Adjustable Parameters: Temperature, tokens
  • ✅ Real-time Generation: Instant results (see the sketch below)
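
Generation itself boils down to encoding a prompt and calling nanoGPT's GPT.generate, roughly as below; this assumes the GPT-2 BPE tokenizer was used during training.

import tiktoken
import torch
from model import GPT  # nanoGPT's model.py

enc = tiktoken.get_encoding("gpt2")

def complete(model: GPT, prompt: str, max_new_tokens=200, temperature=0.8):
    # Encode the prompt, sample a continuation, and decode back to text.
    idx = torch.tensor([enc.encode(prompt)], dtype=torch.long)
    with torch.no_grad():
        out = model.generate(idx, max_new_tokens, temperature=temperature, top_k=200)
    return enc.decode(out[0].tolist())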

πŸ“ File Structure

nano-coder-free/
├── app.py                    # HF Space app
├── hf_free_training.py       # Free H200 training script
├── prepare_code_dataset.py   # Dataset preparation
├── sample_nano_coder.py      # Code generation
├── requirements.txt          # Dependencies
├── model.py                  # nanoGPT model
├── configurator.py           # Configuration
└── README_free_H200.md       # This file

🔧 Customization

Adjust Training Parameters

Edit hf_free_training.py:

# Model size (smaller = faster training)
n_layer = 4      # Even smaller
n_head = 4       # Even smaller
n_embd = 256     # Even smaller

# Training time (be conservative)
MAX_TRAINING_TIME = 3.0 * 60  # 3 minutes

# Batch size (larger = faster)
batch_size = 128  # If you have memory

Change Dataset

# In prepare_code_dataset.py
from datasets import load_dataset

dataset = load_dataset("your-dataset")  # your own dataset
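
nanoGPT-style preparation then tokenizes the text and writes flat train.bin/val.bin files. A rough sketch, where "your-dataset" and the "text" column are placeholders:

import numpy as np
import tiktoken
from datasets import load_dataset

enc = tiktoken.get_encoding("gpt2")
ds = load_dataset("your-dataset", split="train")

ids = []
for example in ds:
    ids.extend(enc.encode_ordinary(example["text"]))

split = int(0.9 * len(ids))  # 90/10 train/val split
np.array(ids[:split], dtype=np.uint16).tofile("train.bin")
np.array(ids[split:], dtype=np.uint16).tofile("val.bin")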

📈 Expected Results

After 3.5 minutes of training on H200:

  • Training Loss: ~2.5-3.0
  • Validation Loss: ~2.8-3.3
  • Model Size: ~15M parameters
  • Code Quality: Basic Python functions
  • Iterations: ~500-1000

🎯 Use Cases

Perfect For:

  • ✅ Learning: Understand nanoGPT training
  • ✅ Prototyping: Test ideas quickly
  • ✅ Experiments: Try different configurations
  • ✅ Small Models: Code generation demos

Not Suitable For:

  • ❌ Production: Too small for real use
  • ❌ Large Models: Limited by time/parameters
  • ❌ Long Training: 4-minute daily limit

🔄 Daily Workflow

  1. Morning: Check if you can train today
  2. Prepare: Have your dataset ready
  3. Train: Run 3.5-minute training session
  4. Test: Generate some code samples
  5. Share: Upload to HF Hub if good
  6. Wait: Come back tomorrow for more training

🚨 Troubleshooting

Common Issues

  1. "Daily limit reached"

    • Wait until tomorrow
    • Check your timezone
  2. "No GPU available"

    • H200 might be busy
    • Try again in a few minutes
  3. "Training too slow"

    • Reduce model size
    • Increase batch size
    • Use smaller context
  4. "Out of memory"

    • Reduce batch_size
    • Reduce block_size
    • Reduce model size

Performance Tips

  • Batch Size: Use the largest that fits in memory
  • Context Length: 512 is a good fit for the free tier
  • Model Size: 6 layers balances quality against the time budget
  • Learning Rate: 1e-3 for fast convergence (see the override example below)
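
If the script wires in nanoGPT's configurator.py (listed in the file tree above), these tips can be applied as command-line overrides; this invocation is an assumption, not a confirmed interface:

    python hf_free_training.py --batch_size=64 --block_size=512 --learning_rate=1e-3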

📊 Monitoring

Wandb Dashboard

  • Real-time loss curves
  • Training metrics
  • Model performance (see the example logging call below)
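
The dashboard is fed by ordinary wandb calls; a minimal example, where the project name and logged values are placeholders:

import wandb

run = wandb.init(project="nano-coder-free")  # placeholder project name
wandb.log({"iter": 200, "train/loss": 2.9, "val/loss": 3.1})  # example values
run.finish()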

HF Hub

  • Model checkpoints
  • Training logs
  • Generated samples

Local Files

  • out-nano-coder-free/ckpt.pt - Latest model
  • daily_limit_YYYY-MM-DD.txt - Usage tracking (sketched below)
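
A sketch of the bookkeeping the daily_limit file implies: one file per UTC day storing seconds of H200 time already used. The actual script may track this differently.

import datetime
import os

def seconds_used_today() -> float:
    # Read today's usage file, if any; 0 means the full budget remains.
    fname = f"daily_limit_{datetime.datetime.utcnow():%Y-%m-%d}.txt"
    if not os.path.exists(fname):
        return 0.0
    with open(fname) as f:
        return float(f.read().strip() or 0)

def record_usage(seconds: float) -> None:
    # Accumulate today's usage so the next run knows how much budget is left.
    fname = f"daily_limit_{datetime.datetime.utcnow():%Y-%m-%d}.txt"
    total = seconds_used_today() + seconds
    with open(fname, "w") as f:
        f.write(str(total))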

🎉 Success Stories

Users have achieved:

  • ✅ Basic Python function generation
  • ✅ Simple class definitions
  • ✅ List comprehensions
  • ✅ Error handling patterns
  • ✅ Docstring generation

🔗 Resources

  • nanoGPT: https://github.com/karpathy/nanoGPT
  • Hugging Face Spaces docs: https://huggingface.co/docs/hub/spaces

🤝 Contributing

Want to improve the free H200 setup?

  1. Optimize Model: Make it train faster
  2. Better UI: Improve the Gradio interface
  3. More Datasets: Support other code datasets
  4. Documentation: Help others get started

πŸ“ License

This project follows the same license as the original nanoGPT repository.


Happy Free H200 Training! 🚀

Remember: 4 minutes a day keeps the AI doctor away! 😄