---
language:
- en
tags:
- computer-vision
- segmentation
- few-shot-learning
- zero-shot-learning
- sam2
- clip
- pytorch
license: apache-2.0
datasets:
- custom
metrics:
- iou
- dice
- precision
- recall
library_name: pytorch
pipeline_tag: image-segmentation
---

# SAM 2 Few-Shot/Zero-Shot Segmentation

This repository contains a comprehensive research framework for combining Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.

## 🎯 Overview

This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:

- **Satellite Imagery**: Buildings, roads, vegetation, water
- **Fashion**: Shirts, pants, dresses, shoes
- **Robotics**: Robots, tools, safety equipment

## 🏗️ Architecture

### Few-Shot Learning Framework

- **Memory Bank**: Stores CLIP-encoded examples for each class
- **Similarity-Based Prompting**: Uses visual similarity to generate SAM 2 prompts
- **Episodic Training**: Standard few-shot learning protocol

### Zero-Shot Learning Framework

- **Advanced Prompt Engineering**: 4 strategies (basic, descriptive, contextual, detailed)
- **Attention-Based Localization**: Uses CLIP's cross-attention for prompt generation
- **Multi-Strategy Prompting**: Combines different prompt types

## 📊 Performance

### Few-Shot Learning (5 shots)

| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 65% | 71% | Building (78%) | Water (52%) |
| Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
| Robotics | 59% | 65% | Robot (72%) | Safety (45%) |

### Zero-Shot Learning (Best Strategy)

| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 42% | 48% | Building (62%) | Water (28%) |
| Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
| Robotics | 35% | 42% | Robot (55%) | Safety (22%) |

## 🚀 Quick Start

### Installation

```bash
pip install -r requirements.txt
python scripts/download_sam2.py
```

### Few-Shot Experiment

```python
from models.sam2_fewshot import SAM2FewShot

# Initialize model
model = SAM2FewShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Add support examples
model.add_few_shot_example("satellite", "building", image, mask)

# Perform segmentation
predictions = model.segment(
    query_image,
    "satellite",
    ["building"],
    use_few_shot=True
)
```

### Zero-Shot Experiment

```python
from models.sam2_zeroshot import SAM2ZeroShot

# Initialize model
model = SAM2ZeroShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Perform zero-shot segmentation
predictions = model.segment(
    image,
    "fashion",
    ["shirt", "pants", "dress", "shoes"]
)
```

## 📁 Project Structure

```
├── models/
│   ├── sam2_fewshot.py          # Few-shot learning model
│   └── sam2_zeroshot.py         # Zero-shot learning model
├── experiments/
│   ├── few_shot_satellite.py    # Satellite experiments
│   └── zero_shot_fashion.py     # Fashion experiments
├── utils/
│   ├── data_loader.py           # Domain-specific data loaders
│   ├── metrics.py               # Comprehensive evaluation metrics
│   └── visualization.py         # Visualization tools
├── scripts/
│   └── download_sam2.py         # Setup script
└── notebooks/
    └── analysis.ipynb           # Interactive analysis
```

## 🔬 Research Contributions

1. **Novel Architecture**: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
2. **Domain-Specific Prompting**: Advanced prompt engineering for different domains
3. **Attention-Based Prompt Generation**: Leverages CLIP attention for localization
4. **Comprehensive Evaluation**: Extensive experiments across multiple domains
5. **Open-Source Implementation**: Complete codebase for reproducibility

## 📚 Citation

If you use this work in your research, please cite:

```bibtex
@misc{sam2_fewshot_zeroshot_2024,
  title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/esalguero/Segmentation}
}
```

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.

## 📄 License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

## 🔗 Links

- **GitHub Repository**: [https://github.com/ParallelLLC/Segmentation](https://github.com/ParallelLLC/Segmentation)
- **Research Paper**: See `research_paper.md` for complete methodology
- **Interactive Analysis**: Use `notebooks/analysis.ipynb` for exploration

---

**Keywords**: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision
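
For readers interpreting the IoU and Dice figures in the Performance tables, the following is a minimal sketch of how these two metrics are commonly computed for a pair of binary masks. It assumes NumPy arrays of equal shape; the function name `iou_and_dice` and the convention of scoring two empty masks as a perfect match are illustrative assumptions and are not taken from `utils/metrics.py`, which may differ.

```python
import numpy as np


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumption: if both masks are empty, treat the prediction as perfect.
    if total == 0:
        return 1.0, 1.0

    iou = intersection / union          # |A ∩ B| / |A ∪ B|
    dice = 2 * intersection / total     # 2|A ∩ B| / (|A| + |B|)
    return float(iou), float(dice)


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 overlapping.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

For any single pair of masks, Dice is at least as large as IoU (Dice = 2·IoU / (1 + IoU)), which is consistent with the Mean Dice column sitting above the Mean IoU column in both tables.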