# Free H200 Training: Nano-Coder on Hugging Face
This guide shows you how to train a nano-coder model using Hugging Face's free H200 GPU access (4 minutes daily).
## What You Get

- **Free H200 GPU**: 4 minutes per day
- **No Credit Card Required**: Completely free
- **Easy Setup**: Just a few clicks
- **Model Sharing**: Automatic upload to the HF Hub
## Quick Start

### Option 1: Hugging Face Space (Recommended)
1. **Create an HF Space:**

   ```bash
   huggingface-cli repo create nano-coder-free --type space
   ```

2. **Upload files:**
   - Upload all the Python files to your Space
   - Make sure `app.py` is in the root directory

3. **Configure the Space:**
   - Hardware: H200 (free tier)
   - Python version: 3.9+
   - Requirements: `requirements.txt`

4. **Launch training:**
   - Go to your Space URL
   - Click "Start Free H200 Training"
   - Wait for training to complete (about 3.5 minutes)
### Option 2: Local Setup with the HF Free Tier

1. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Set your HF token:**

   ```bash
   export HF_TOKEN="your_token_here"
   ```

3. **Run free training:**

   ```bash
   python hf_free_training.py
   ```
## Model Configuration (Free Tier)

| Parameter     | Free Tier | Full Model |
|---------------|-----------|------------|
| Layers        | 6         | 12         |
| Heads         | 6         | 12         |
| Embedding     | 384       | 768        |
| Context       | 512       | 1024       |
| Parameters    | ~15M      | ~124M      |
| Training Time | 3.5 min   | 2-4 hours  |
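The parameter counts in the table can be roughly reproduced from the architecture columns. The sketch below uses the standard GPT-2-style accounting (attention plus MLP weights per block, plus token and position embeddings) and assumes the padded GPT-2 vocabulary of 50,304 that nanoGPT uses by default; the exact totals depend heavily on the tokenizer's vocabulary size.

```python
def estimate_params(n_layer, n_embd, block_size, vocab_size=50304):
    """Rough GPT-2-style parameter count (biases ignored; the
    token embedding is weight-tied with the output head, as in nanoGPT)."""
    tok_emb = vocab_size * n_embd       # token embeddings (shared with lm_head)
    pos_emb = block_size * n_embd       # learned position embeddings
    per_block = 12 * n_embd ** 2        # attention (4*d^2) + MLP (8*d^2) per layer
    return tok_emb + pos_emb + n_layer * per_block

free_tier = estimate_params(6, 384, 512)     # small config from the table
full_model = estimate_params(12, 768, 1024)  # GPT-2 small config, ~124M
```

With the full GPT-2 vocabulary the full model lands right at ~124M; the free-tier figure comes out higher than the table's ~15M unless a smaller, code-specific vocabulary is used.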
## Time Management

- **Daily Limit**: 4 minutes of H200 time
- **Training Time**: 3.5 minutes (leaves a safety buffer)
- **Automatic Stop**: The script stops before the time limit
- **Daily Reset**: A fresh 4 minutes every day at midnight UTC
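The automatic stop can be sketched as a wall-clock-bounded training loop. Here `step_fn` and `save_fn` are hypothetical stand-ins for the real optimization step and checkpoint writer in `hf_free_training.py`; only the time-budget logic is the point.

```python
import time

MAX_TRAINING_TIME = 3.5 * 60   # seconds; leaves a buffer inside the 4-minute quota
CHECKPOINT_EVERY = 200         # iterations between checkpoint saves

def train_with_budget(step_fn, save_fn, max_seconds=MAX_TRAINING_TIME):
    """Run training steps until the time budget is nearly spent.

    step_fn() performs one optimization step; save_fn(it) writes a
    checkpoint. Returns the number of iterations completed.
    """
    start = time.monotonic()
    it = 0
    while time.monotonic() - start < max_seconds:
        step_fn()
        it += 1
        if it % CHECKPOINT_EVERY == 0:
            save_fn(it)
    save_fn(it)  # final checkpoint before the quota runs out
    return it
```

Checking the clock before every step, rather than every N steps, is what guarantees the loop never overshoots the quota by more than one step's duration.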
## Features

### Training Features

- **Automatic Time Tracking**: Stops before the limit
- **Frequent Checkpoints**: Every 200 iterations
- **HF Hub Upload**: Models saved automatically
- **Wandb Logging**: Real-time metrics
- **Progress Monitoring**: Time-remaining display

### Generation Features

- **Interactive UI**: Gradio interface
- **Custom Prompts**: Any Python code start
- **Adjustable Parameters**: Temperature, tokens
- **Real-time Generation**: Instant results
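A minimal sketch of how the temperature and top-k knobs shape sampling, in pure Python over raw logits (the real script samples from the model's output tensor, but the arithmetic is the same):

```python
import math
import random

def sample_next(logits, temperature=0.8, top_k=None, rng=random):
    """Pick the next token id from raw logits.

    temperature < 1.0 sharpens the distribution (more conservative
    code), > 1.0 flattens it; top_k keeps only the k most likely
    tokens before sampling.
    """
    if top_k is not None:
        cutoff = sorted(logits, reverse=True)[top_k - 1]
        logits = [l if l >= cutoff else float("-inf") for l in logits]
    scaled = [l / temperature for l in logits]
    m = max(scaled)                            # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                           # inverse-CDF sampling
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

With `top_k=1` this degenerates to greedy decoding, which is a useful sanity check when the model's output looks suspicious.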
## File Structure

```
nano-coder-free/
├── app.py                   # HF Space app
├── hf_free_training.py      # Free H200 training script
├── prepare_code_dataset.py  # Dataset preparation
├── sample_nano_coder.py     # Code generation
├── requirements.txt         # Dependencies
├── model.py                 # nanoGPT model
├── configurator.py          # Configuration
└── README_free_H200.md      # This file
```
## Customization

### Adjust Training Parameters

Edit `hf_free_training.py`:

```python
# Model size (smaller = faster training)
n_layer = 4   # Even smaller
n_head = 4    # Even smaller
n_embd = 256  # Even smaller

# Training time (be conservative)
MAX_TRAINING_TIME = 3.0 * 60  # 3 minutes

# Batch size (larger = faster)
batch_size = 128  # If you have enough memory
```
### Change the Dataset

```python
# In prepare_code_dataset.py
dataset = load_dataset("your-dataset")  # Your own dataset
```
## Expected Results

After 3.5 minutes of training on an H200:

- **Training Loss**: ~2.5-3.0
- **Validation Loss**: ~2.8-3.3
- **Model Size**: ~15MB
- **Code Quality**: Basic Python functions
- **Iterations**: ~500-1000
## Use Cases

### Perfect For:

- **Learning**: Understand nanoGPT training
- **Prototyping**: Test ideas quickly
- **Experiments**: Try different configurations
- **Small Models**: Code generation demos

### Not Suitable For:

- **Production**: Too small for real use
- **Large Models**: Limited by time and parameters
- **Long Training**: 4-minute daily limit
## Daily Workflow

1. **Morning**: Check whether you can train today
2. **Prepare**: Have your dataset ready
3. **Train**: Run a 3.5-minute training session
4. **Test**: Generate some code samples
5. **Share**: Upload to the HF Hub if the results look good
6. **Wait**: Come back tomorrow for more training
## Troubleshooting

### Common Issues

**"Daily limit reached"**
- Wait until tomorrow
- Check your timezone

**"No GPU available"**
- The H200 might be busy
- Try again in a few minutes

**"Training too slow"**
- Reduce the model size
- Increase the batch size
- Use a smaller context

**"Out of memory"**
- Reduce `batch_size`
- Reduce `block_size`
- Reduce the model size
### Performance Tips

- **Batch Size**: Use the largest that fits in memory
- **Context Length**: 512 is a good fit for the free tier
- **Model Size**: 6 layers is a good balance for the time budget
- **Learning Rate**: 1e-3 for fast convergence
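nanoGPT-style training typically pairs a high peak rate like the 1e-3 above with a short linear warmup and a cosine decay. A sketch in that style, with hypothetical warmup and decay iteration counts scaled down to fit a 3.5-minute session:

```python
import math

def get_lr(it, max_lr=1e-3, min_lr=1e-4, warmup_iters=100, decay_iters=1000):
    """Warmup-then-cosine learning-rate schedule, nanoGPT style."""
    if it < warmup_iters:
        # Linear warmup from ~0 up to max_lr
        return max_lr * (it + 1) / warmup_iters
    if it > decay_iters:
        # After decay ends, hold at the floor
        return min_lr
    # Cosine decay from max_lr down to min_lr
    ratio = (it - warmup_iters) / (decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * ratio))  # goes 1 -> 0
    return min_lr + coeff * (max_lr - min_lr)
```

With only ~500-1000 iterations available, keeping `warmup_iters` small matters: iterations spent warming up are iterations not spent at the peak rate.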
## Monitoring

### Wandb Dashboard

- Real-time loss curves
- Training metrics
- Model performance

### HF Hub

- Model checkpoints
- Training logs
- Generated samples

### Local Files

- `out-nano-coder-free/ckpt.pt`: latest model checkpoint
- `daily_limit_YYYY-MM-DD.txt`: usage tracking
## Success Stories

Users have achieved:

- Basic Python function generation
- Simple class definitions
- List comprehensions
- Error handling patterns
- Docstring generation
## Contributing

Want to improve the free H200 setup?

- **Optimize the Model**: Make it train faster
- **Better UI**: Improve the Gradio interface
- **More Datasets**: Support other code datasets
- **Documentation**: Help others get started
## License

This project follows the same license as the original nanoGPT repository.

**Happy Free H200 Training!**

*Remember: 4 minutes a day keeps the AI doctor away!*